add copilot instructions and rename classes
.github/copilot-instructions.md
@@ -0,0 +1,105 @@
# Voice Branding Quantitative Analysis - Copilot Instructions

## Project Overview
Qualtrics survey analysis for brand personality research. Analyzes voice samples (V04-V91) across speaking style traits, character rankings, and demographic segments. Uses **Marimo notebooks** for interactive analysis and **Polars** for data processing.

## Architecture

### Core Components
- **`QualtricsSurvey`** (`utils.py`): Main class combining data loading, filtering, and plotting via `QualtricsPlotsMixin`
- **Marimo notebooks** (`XX_*.py`): Interactive apps run via `uv run marimo run <file>.py`
- **Data exports** (`data/exports/<date>/`): Qualtrics CSVs with `_Labels.csv` and `_Values.csv` variants
- **QSF files**: Qualtrics survey definitions for mapping QIDs to question text

### Data Flow
```
Qualtrics CSV (3-row header) → QualtricsSurvey.load_data() → LazyFrame with QID columns
↓
filter_data() → get_*() methods → plot_*() methods → figures/<export>/<filter>/
```
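
To make the "3-row header" step concrete, here is a minimal standard-library sketch of how such a file could be split apart. It assumes the usual Qualtrics layout (row 1: column names, row 2: question text, row 3: import metadata); `split_qualtrics_header` and the sample data are hypothetical, and this is not the actual `load_data()` implementation.

```python
import csv
import io


def split_qualtrics_header(text):
    """Split CSV text into three header rows plus the remaining data rows.

    Assumption: row 1 holds column names, row 2 question text,
    row 3 import metadata. The real loader may differ.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 3:
        raise ValueError("expected at least three header rows")
    names, labels, import_ids = rows[0], rows[1], rows[2]
    return names, labels, import_ids, rows[3:]


# Hypothetical two-column export with two responses.
sample = (
    "ResponseId,Voice_Scale_1_10__V48\n"
    "Response ID,How would you rate this voice?\n"
    '"{""ImportId"":""_recordId""}","{""ImportId"":""QID1_1""}"\n'
    "R_001,7\n"
    "R_002,9\n"
)

names, labels, import_ids, data_rows = split_qualtrics_header(sample)

# Map each column name to its question text (row 2).
question_text = dict(zip(names, labels))
```

Only `data_rows` would feed the analysis; the second and third header rows are metadata that would otherwise be read as responses.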

## ⚠️ Critical AI Agent Rules

1. **NEVER modify Marimo notebooks directly** - The `XX_*.py` files are Marimo notebooks and should not be edited by AI agents
2. **NEVER run Marimo notebooks for debugging** - These are interactive apps, not test scripts
3. **For debugging**: Create a standalone temporary Python script (e.g., `debug_temp.py`) to test functions
4. **Reading notebooks is OK** - You may read notebook files to understand how functions are used. Ask the user which notebook they're working in for context
5. **No changelog markdown files** - Do not create new markdown files to document small changes or describe new usage

## Key Patterns

### Polars LazyFrames
Always work with `pl.LazyFrame` until visualization; call `.collect()` only when needed:
```python
data = S.load_data()                         # Returns LazyFrame
subset, meta = S.get_voice_scale_1_10(data)  # Returns (LazyFrame, Optional[dict])
df = subset.collect()                        # Materialize for plotting
```

### Column Naming Convention
Survey columns follow patterns that encode voice/trait info:
- `SS_Green_Blue__V14__Choice_1` → Speaking Style, Voice 14, Trait 1
- `Voice_Scale_1_10__V48` → 1-10 rating for Voice 48
- `Top_3_Voices_ranking__V77` → Ranking position for Voice 77
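
The naming patterns above can be decoded mechanically. The following is a sketch only: `COLUMN_RE`, `ColumnInfo`, and `parse_column` are hypothetical names, and the regular expression is inferred from the three examples listed, so other column families may need extra cases.

```python
import re
from typing import NamedTuple, Optional

# Inferred pattern: <question>__V<voice>[__Choice_<n>]
COLUMN_RE = re.compile(
    r"^(?P<question>.+?)__V(?P<voice>\d+)(?:__Choice_(?P<choice>\d+))?$"
)


class ColumnInfo(NamedTuple):
    question: str           # e.g. "SS_Green_Blue"
    voice: int              # numeric part of the V-code
    choice: Optional[int]   # trait number, when present


def parse_column(name: str) -> Optional[ColumnInfo]:
    """Return the decoded parts of a column name, or None if it does not match."""
    match = COLUMN_RE.match(name)
    if match is None:
        return None
    choice = match.group("choice")
    return ColumnInfo(
        question=match.group("question"),
        voice=int(match.group("voice")),
        choice=int(choice) if choice is not None else None,
    )


info = parse_column("SS_Green_Blue__V14__Choice_1")
```

Columns without a voice code, such as a response ID, yield `None`, which makes the helper usable as a filter over all column names.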

### Filter State & Figure Output
`QualtricsSurvey` stores filter state and auto-generates output paths:
```python
S.filter_data(data, consumer=['Early Professional'])
# Plots save to: figures/<export>/Cons-Early_Professional/<plot_name>.png
```
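
As a rough illustration of how a filter state could map to the output directory shown above, consider the sketch below. Everything here is an assumption except the `Cons-Early_Professional` example: the helper names, the abbreviation table, the separators for multiple values or filters, and the `All` fallback are all hypothetical.

```python
from pathlib import Path

# Hypothetical abbreviation table; only "Cons" is attested in this document.
FILTER_ABBREVIATIONS = {"consumer": "Cons"}


def filter_slug(filters):
    """Build a directory-safe slug from a {filter_name: [values]} mapping."""
    if not filters:
        return "All"  # assumed name for the unfiltered case
    parts = []
    for name, values in filters.items():
        prefix = FILTER_ABBREVIATIONS.get(name, name.capitalize())
        # Spaces become underscores; "+" between values is an assumption.
        joined = "+".join(value.replace(" ", "_") for value in values)
        parts.append(f"{prefix}-{joined}")
    return "__".join(parts)  # separator between filters is an assumption


def figure_path(export, filters, plot_name):
    """Compose figures/<export>/<filter>/<plot_name>.png."""
    return Path("figures") / export / filter_slug(filters) / f"{plot_name}.png"


path = figure_path(
    "example_export", {"consumer": ["Early Professional"]}, "example_plot"
)
```

Deriving the directory from the filter state keeps figures from different segments from overwriting each other.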

### Getter Methods Return Tuples
All `get_*()` methods return `(LazyFrame, Optional[metadata])`:
```python
df, choices_map = S.get_ss_green_blue(data)  # choices_map has trait descriptions
df, _ = S.get_character_ranking(data)        # Second element may be None
```

## Development Commands

```bash
# Run interactive analysis notebook
uv run marimo run 02_quant_analysis.py --port 8080

# Edit notebook in editor mode
uv run marimo edit 02_quant_analysis.py

# Headless mode for shared access
uv run marimo run 02_quant_analysis.py --headless --port 8080
```

## Important Files

| File | Purpose |
|------|---------|
| `utils.py` | `QualtricsSurvey` class, data transformations, PPTX utilities |
| `plots.py` | `QualtricsPlotsMixin` with all Altair plotting methods |
| `theme.py` | `ColorPalette` and `jpmc_altair_theme()` for consistent styling |
| `validation.py` | Data quality checks (progress, duration outliers, straight-liners) |
| `speaking_styles.py` | `SPEAKING_STYLES` dict mapping colors to trait groups |

## Conventions

### Altair Charts & Colors
- **ALL colors MUST come from `theme.py`** - Use `ColorPalette.PRIMARY`, `ColorPalette.RANK_1`, etc.
- If a new color is needed, add it to `ColorPalette` in `theme.py` first, then use it
- Never hardcode hex colors directly in plotting code
- Charts auto-save via `_save_plot()` when `fig_save_dir` is set
- Filter footnotes added automatically via `_add_filter_footnote()`
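
One way to check the color rule in a throwaway debug script is a simple scan for hex literals. This is a heuristic sketch, not part of the project: the `ColorPalette` values below are placeholders (the real ones live in `theme.py`), and `find_hardcoded_hex` is a hypothetical helper that can produce false positives, for example on comments that happen to look like hex codes.

```python
import re
from typing import List


class ColorPalette:
    """Stand-in for the class in theme.py; these hex values are placeholders."""
    PRIMARY = "#0A5EB0"
    RANK_1 = "#2E7D32"


# Matches 6- or 3-digit hex color literals such as #FF0000 or #abc.
HEX_COLOR_RE = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def find_hardcoded_hex(source: str) -> List[str]:
    """Return every hex color literal found in a snippet of plotting code."""
    return HEX_COLOR_RE.findall(source)


bad_snippet = 'chart.mark_bar(color="#FF0000")'
good_snippet = "chart.mark_bar(color=ColorPalette.PRIMARY)"

violations = find_hardcoded_hex(bad_snippet)
clean = find_hardcoded_hex(good_snippet)
```

Running such a scan over `plots.py` only (and not `theme.py`, where hex values are expected) would flag code that bypasses the palette.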

### QSF Parsing
Use `_get_qsf_question_by_QID()` to extract question config:
```python
cfg = self._get_qsf_question_by_QID('QID27')['Payload']
recode_map = cfg['RecodeValues']  # Maps choice numbers to values
```
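
For orientation, a lookup of this kind could work roughly as sketched below. This is not the project's `_get_qsf_question_by_QID()`; the standalone function `get_qsf_question_by_qid` and the sample data are hypothetical, and the sketch assumes the common QSF layout in which question entries sit in `SurveyElements` with `Element` set to `"SQ"` and the QID stored in `PrimaryAttribute`.

```python
import json


def get_qsf_question_by_qid(qsf: dict, qid: str) -> dict:
    """Return the survey element for a question ID, or raise KeyError.

    Assumption: questions are "SQ" elements keyed by PrimaryAttribute.
    """
    for element in qsf.get("SurveyElements", []):
        if element.get("Element") == "SQ" and element.get("PrimaryAttribute") == qid:
            return element
    raise KeyError(f"{qid} not found in QSF")


# Hypothetical, heavily trimmed QSF content.
sample_qsf = json.loads("""
{
  "SurveyElements": [
    {
      "Element": "SQ",
      "PrimaryAttribute": "QID27",
      "Payload": {
        "QuestionText": "Example question",
        "RecodeValues": {"1": "1", "2": "2"}
      }
    }
  ]
}
""")

cfg = get_qsf_question_by_qid(sample_qsf, "QID27")["Payload"]
recode_map = cfg["RecodeValues"]  # Maps choice numbers to values
```

Raising `KeyError` for an unknown QID makes a typo fail at the lookup rather than later, when the payload is accessed.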

### PPTX Image Replacement
Images are matched by perceptual hash, not filename; the alt-text of each image encodes its figure path:
```python
utils.update_ppt_alt_text(ppt_path, image_source_dir)       # Tag images with alt-text
utils.pptx_replace_named_image(ppt, target_tag, new_image)  # Replace by alt-text
```

This process should be run manually by the user ONLY.