Luigi Maiorano 6ba30ff041 add copilot instructions and rename classes

2026-02-02 17:21:57 +01:00

Voice Branding Quantitative Analysis - Copilot Instructions

Project Overview

Qualtrics survey analysis for brand personality research. Analyzes voice samples (V04-V91) across speaking style traits, character rankings, and demographic segments. Uses Marimo notebooks for interactive analysis and Polars for data processing.

Architecture

Core Components

QualtricsSurvey (utils.py): Main class combining data loading, filtering, and plotting via QualtricsPlotsMixin
Marimo notebooks (0X_*.py): Interactive apps run via uv run marimo run <file>.py
Data exports (data/exports/<date>/): Qualtrics CSVs with _Labels.csv and _Values.csv variants
QSF files: Qualtrics survey definitions for mapping QIDs to question text

Data Flow

Qualtrics CSV (3-row header) → QualtricsSurvey.load_data() → LazyFrame with QID columns
                                      ↓
                           filter_data() → get_*() methods → plot_*() methods → figures/<export>/<filter>/
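The 3-row header at the start of this flow can be illustrated with the standard library alone. The sketch below assumes the usual Qualtrics export layout (row 1: column names, row 2: question text, row 3: a JSON cell carrying an ImportId). `read_qualtrics_header` is a hypothetical helper and the sample values are made up; it is not the implementation of `QualtricsSurvey.load_data()`.

```python
import csv
import io
import json


def read_qualtrics_header(lines):
    """Split a Qualtrics CSV into per-column header info and the data rows.

    Assumed layout (not confirmed by this repo): row 1 holds column names,
    row 2 the question text, row 3 a JSON cell such as {"ImportId": "QID1"}.
    """
    reader = csv.reader(lines)
    names = next(reader)
    question_text = next(reader)
    import_ids = []
    for cell in next(reader):
        try:
            import_ids.append(json.loads(cell).get("ImportId"))
        except (json.JSONDecodeError, AttributeError):
            import_ids.append(None)  # cell was not a JSON object
    columns = [
        {"name": n, "text": t, "import_id": i}
        for n, t, i in zip(names, question_text, import_ids)
    ]
    return columns, list(reader)


# Tiny in-memory export with made-up values, for illustration only.
sample = io.StringIO(
    'ResponseId,Voice_Scale_1_10__V48\n'
    'Response ID,Rate voice 48\n'
    '"{""ImportId"":""_recordId""}","{""ImportId"":""QID27""}"\n'
    'R_001,7\n'
)
columns, rows = read_qualtrics_header(sample)
```

Keeping the three header rows together per column makes it straightforward to rename columns by QID or by label before any filtering happens.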

⚠️ Critical AI Agent Rules

NEVER modify Marimo notebooks directly - The 0X_*.py files are Marimo notebooks and should not be edited by AI agents
NEVER run Marimo notebooks for debugging - These are interactive apps, not test scripts
For debugging: Create a standalone temporary Python script (e.g., debug_temp.py) to test functions
Reading notebooks is OK - You may read notebook files to understand how functions are used. Ask the user which notebook they're working in for context
No changelog markdown files - Do not create new markdown files to document small changes or describe new usage

Key Patterns

Polars LazyFrames

Always work with pl.LazyFrame until visualization; call .collect() only when needed:

data = S.load_data()  # Returns LazyFrame
subset, meta = S.get_voice_scale_1_10(data)  # Returns (LazyFrame, Optional[dict])
df = subset.collect()  # Materialize for plotting

Column Naming Convention

Survey columns follow patterns that encode voice/trait info:

SS_Green_Blue__V14__Choice_1 → Speaking Style, Voice 14, Trait 1
Voice_Scale_1_10__V48 → 1-10 rating for Voice 48
Top_3_Voices_ranking__V77 → Ranking position for Voice 77
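The three patterns above share a common shape, so a single regular expression can split a column name into its parts. This is a minimal sketch, assuming every voice column looks like `<question>__V<digits>` with an optional `__Choice_<digits>` suffix; `parse_column` and `ColumnParts` are hypothetical names, not functions from utils.py.

```python
import re
from typing import NamedTuple, Optional


class ColumnParts(NamedTuple):
    question: str          # e.g. "SS_Green_Blue"
    voice: int             # numeric part of the V## token
    choice: Optional[int]  # trait index when a __Choice_N suffix is present


# Assumed shape: <question>__V<digits>[__Choice_<digits>]
_COLUMN_RE = re.compile(
    r"^(?P<question>.+?)__V(?P<voice>\d+)(?:__Choice_(?P<choice>\d+))?$"
)


def parse_column(name: str) -> Optional[ColumnParts]:
    """Return the parsed parts, or None if the name does not follow the pattern."""
    m = _COLUMN_RE.match(name)
    if m is None:
        return None
    choice = m.group("choice")
    return ColumnParts(
        m.group("question"),
        int(m.group("voice")),
        int(choice) if choice else None,
    )
```

For example, `parse_column("SS_Green_Blue__V14__Choice_1")` yields the question prefix, voice 14, and trait 1, while a non-voice column such as `ResponseId` returns `None`.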

Filter State & Figure Output

QualtricsSurvey stores filter state and auto-generates output paths:

S.filter_data(data, consumer=['Early Professional'])
# Plots save to: figures/<export>/Cons-Early_Professional/<plot_name>.png
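One way such a path could be derived from the filter state is sketched below. Only the `Cons` prefix and the space-to-underscore replacement come from the example above. The `+` and `__` separators, the `All` fallback, and the names `filter_slug` and `figure_path` are assumptions for illustration and may differ from what `QualtricsSurvey` actually does.

```python
from pathlib import PurePosixPath

# Hypothetical abbreviations; only "Cons" for `consumer` appears in the example.
FILTER_PREFIXES = {"consumer": "Cons"}


def filter_slug(filters: dict) -> str:
    """Turn active filters into a directory-safe name."""
    parts = []
    for key, values in filters.items():
        if not values:
            continue  # an unset filter contributes nothing
        prefix = FILTER_PREFIXES.get(key, key)
        # "+" between values is an assumed separator.
        joined = "+".join(v.replace(" ", "_") for v in values)
        parts.append(f"{prefix}-{joined}")
    # "__" between filters and the "All" fallback are assumptions.
    return "__".join(parts) if parts else "All"


def figure_path(export: str, filters: dict, plot_name: str) -> PurePosixPath:
    """Build figures/<export>/<filter>/<plot_name>.png."""
    return PurePosixPath("figures") / export / filter_slug(filters) / f"{plot_name}.png"
```

Deriving the directory from the filter state keeps figures from different segments from overwriting each other.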

Getter Methods Return Tuples

All get_*() methods return (LazyFrame, Optional[metadata]):

df, choices_map = S.get_ss_green_blue(data)  # choices_map has trait descriptions
df, _ = S.get_character_ranking(data)  # Second element may be None

Development Commands

# Run interactive analysis notebook
uv run marimo run 02_quant_analysis.py --port 8080

# Edit notebook in editor mode
uv run marimo edit 02_quant_analysis.py

# Headless mode for shared access
uv run marimo run 02_quant_analysis.py --headless --port 8080

Important Files

File	Purpose
`utils.py`	`QualtricsSurvey` class, data transformations, PPTX utilities
`plots.py`	`QualtricsPlotsMixin` with all Altair plotting methods
`theme.py`	`ColorPalette` and `jpmc_altair_theme()` for consistent styling
`validation.py`	Data quality checks (progress, duration outliers, straight-liners)
`speaking_styles.py`	`SPEAKING_STYLES` dict mapping colors to trait groups

Conventions

Altair Charts & Colors

ALL colors MUST come from theme.py - Use ColorPalette.PRIMARY, ColorPalette.RANK_1, etc.
If a new color is needed, add it to ColorPalette in theme.py first, then use it
Never hardcode hex colors directly in plotting code
Charts auto-save via _save_plot() when fig_save_dir is set
Filter footnotes added automatically via _add_filter_footnote()
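The no-hardcoded-hex rule can be checked mechanically. The following is a rough sketch of such a check, not a tool that exists in this repo: `find_hardcoded_colors` is a hypothetical helper that scans source text for quoted hex color literals.

```python
import re

# Matches quoted 3-, 6- or 8-digit hex colors such as "#fff" or '#1A2B3C'.
HEX_COLOR_RE = re.compile(
    r"""["']#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})["']"""
)


def find_hardcoded_colors(source: str):
    """Return (line_number, literal) pairs for quoted hex colors in `source`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in HEX_COLOR_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

It could be run from a temporary debug script against the text of plots.py; theme.py would be excluded, since that is where color values are meant to live.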

QSF Parsing

Use _get_qsf_question_by_QID() to extract question config:

cfg = self._get_qsf_question_by_QID('QID27')['Payload']
recode_map = cfg['RecodeValues']  # Maps choice numbers to values
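For context, a QSF file is JSON whose `SurveyElements` list holds one entry per survey element; question entries are marked with `"Element": "SQ"` and carry the QID in `PrimaryAttribute`. The standalone function below is a guess at what a lookup like `_get_qsf_question_by_QID()` does, assuming that layout. The function name and the sample fragment are illustrative, not taken from utils.py.

```python
from typing import Any, Dict


def get_qsf_question_by_qid(qsf: Dict[str, Any], qid: str) -> Dict[str, Any]:
    """Return the survey element for `qid` from a parsed QSF document."""
    for element in qsf.get("SurveyElements", []):
        # "SQ" marks survey-question elements; PrimaryAttribute carries the QID.
        if element.get("Element") == "SQ" and element.get("PrimaryAttribute") == qid:
            return element
    raise KeyError(f"{qid} not found in QSF")


# Minimal made-up QSF fragment, for illustration only.
qsf = {
    "SurveyElements": [
        {
            "Element": "SQ",
            "PrimaryAttribute": "QID27",
            "Payload": {"RecodeValues": {"1": "1", "2": "2"}},
        }
    ]
}
cfg = get_qsf_question_by_qid(qsf, "QID27")["Payload"]
recode_map = cfg["RecodeValues"]
```

Raising `KeyError` for an unknown QID is a design choice in this sketch; it surfaces typos in QIDs early instead of returning an empty config.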

PPTX Image Replacement

Images are matched by perceptual hash, not by filename; each image's alt-text encodes its figure path:

utils.update_ppt_alt_text(ppt_path, image_source_dir)  # Tag images with alt-text
utils.pptx_replace_named_image(ppt, target_tag, new_image)  # Replace by alt-text

This process must be run manually, and ONLY by the user.
