# Voice Branding Quantitative Analysis

## Running Marimo Notebooks

Running on Ct-105 for shared access:

```bash
uv run marimo run 02_quant_analysis.py --headless --port 8080
```

---

## Batch Report Generation

The quant report can be run with different filter combinations via CLI or automated batch processing.

### Single Filter Run (CLI)

Run the report script directly with JSON-encoded filter arguments:

```bash
# Single consumer segment
uv run python 03_quant_report.script.py --consumer '["Starter"]'

# Single age group
uv run python 03_quant_report.script.py --age '["18 to 21 years"]'

# Multiple filters combined
uv run python 03_quant_report.script.py --age '["18 to 21 years", "22 to 24 years"]' --gender '["Male"]'

# All respondents (no filters = defaults to all options selected)
uv run python 03_quant_report.script.py
```

Available filter arguments:

- `--age`: JSON list of age groups
- `--gender`: JSON list of genders
- `--ethnicity`: JSON list of ethnicities
- `--income`: JSON list of income groups
- `--consumer`: JSON list of consumer segments
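
As a rough illustration of how JSON-encoded list arguments like these can be parsed, the sketch below uses `argparse` with a small `json.loads` wrapper as the argument type. The names `FILTER_NAMES`, `json_list`, and `build_parser` are hypothetical and chosen for illustration; the actual parsing in `03_quant_report.script.py` may differ.

```python
import argparse
import json

# Hypothetical list of filter dimensions; the real script may define these elsewhere.
FILTER_NAMES = ("age", "gender", "ethnicity", "income", "consumer")


def json_list(raw: str) -> list:
    """Parse a JSON-encoded list such as '["Starter"]'."""
    value = json.loads(raw)
    if not isinstance(value, list):
        raise argparse.ArgumentTypeError(f"expected a JSON list, got: {raw}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Quant report filters (sketch)")
    for name in FILTER_NAMES:
        # default=None stands for "no filter", i.e. all options selected
        parser.add_argument(f"--{name}", type=json_list, default=None)
    return parser


args = build_parser().parse_args(
    ["--age", '["18 to 21 years", "22 to 24 years"]', "--gender", '["Male"]']
)
print(args.age)       # ['18 to 21 years', '22 to 24 years']
print(args.consumer)  # None
```

Quoting the JSON in single quotes on the shell side, as in the commands above, keeps the inner double quotes intact.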

### Batch Runner (All Combinations)

Run all single-filter combinations automatically with progress tracking:

```bash
# Preview all combinations without running
uv run python run_filter_combinations.py --dry-run

# Run all combinations (shows progress bar)
uv run python run_filter_combinations.py

# Or use the registered CLI entry point
uv run quant-report-batch
uv run quant-report-batch --dry-run
```

This generates reports for:

- All Respondents (no filters)
- Each age group individually
- Each gender individually
- Each ethnicity individually
- Each income group individually
- Each consumer segment individually

Output figures are saved to `figures/<export_date>/<filter_slug>/`.
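
The set of runs listed above can be pictured as one unfiltered entry plus one entry per option in each dimension. The following is a minimal sketch under that assumption; `single_filter_combinations`, the `All_Respondents` name, and the sample `options` dict are hypothetical and do not reproduce `get_filter_combinations()` itself.

```python
def single_filter_combinations(options: dict[str, list[str]]) -> list[dict]:
    """Return an unfiltered entry plus one entry per individual option."""
    combinations = [{"name": "All_Respondents", "filters": {}}]
    for dimension, values in options.items():
        for value in values:
            combinations.append({
                "name": f"{dimension.capitalize()}-{value}",
                "filters": {dimension: [value]},
            })
    return combinations


# Illustrative option values only; real values come from the survey export.
sample_options = {"gender": ["Male", "Female"], "consumer": ["Starter"]}
for combo in single_filter_combinations(sample_options):
    print(combo["name"], combo["filters"])
```

With these sample options the sketch yields four entries: the unfiltered run, two gender runs, and one consumer run.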

### Jupyter Notebook Debugging

The script auto-detects Jupyter/IPython environments. When running in VS Code's Jupyter extension, CLI args default to `None` (all options selected), so you can debug cell-by-cell normally.
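
One common way to detect a Jupyter/IPython session is to check whether `get_ipython()` returns a shell object. The sketch below assumes that approach; the detection logic actually used by the script is not shown in this README and may differ, and `running_in_ipython` and `cli_args_to_parse` are hypothetical names.

```python
import sys


def running_in_ipython() -> bool:
    """Return True when executed inside an IPython/Jupyter kernel."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed, so this cannot be a notebook
    return get_ipython() is not None


def cli_args_to_parse() -> list[str]:
    """Ignore kernel-supplied argv in notebooks so every filter stays None."""
    return [] if running_in_ipython() else sys.argv[1:]
```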


---

## Adding Custom Filter Combinations

To add new filter combinations to the batch runner, edit `run_filter_combinations.py`:

### Checklist

1. **Open** `run_filter_combinations.py`

2. **Find** the `get_filter_combinations()` function

3. **Add** your combination to the list before the `return` statement:

   ```python
   # Example: Add a specific age + consumer cross-filter
   combinations.append({
       'name': 'Age-18to24_Consumer-Starter',  # Used for output folder naming
       'filters': {
           'age': ['18 to 21 years', '22 to 24 years'],
           'consumer': ['Starter']
       }
   })
   ```

4. **Filter keys** must match CLI argument names (defined in `FILTER_CONFIG` in `03_quant_report.script.py`):
   - `age`: values from `survey.options_age`
   - `gender`: values from `survey.options_gender`
   - `ethnicity`: values from `survey.options_ethnicity`
   - `income`: values from `survey.options_income`
   - `consumer`: values from `survey.options_consumer`

5. **Check available values** by running:

   ```python
   from utils import QualtricsSurvey
   S = QualtricsSurvey('data/exports/2-2-26/...Labels.csv', 'data/exports/.../....qsf')
   S.load_data()
   print(S.options_age)
   print(S.options_consumer)
   # etc.
   ```

6. **Test** with dry-run first:

   ```bash
   uv run python run_filter_combinations.py --dry-run
   ```
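
Because a misspelled key or value may lead to an empty or unintended report, it can help to check a combination against the available options before a full run. The helper below is a hypothetical sketch (`validate_combination` is not part of the repository); it takes the available options as a plain dict, which in practice would be built from the `survey.options_*` attributes.

```python
def validate_combination(combo: dict, options: dict[str, list[str]]) -> list[str]:
    """Return a list of problems found in one combination (empty if none)."""
    problems = []
    for key, selected in combo["filters"].items():
        if key not in options:
            problems.append(f"unknown filter key: {key!r}")
            continue
        for value in selected:
            if value not in options[key]:
                problems.append(f"unknown value for {key!r}: {value!r}")
    return problems


# Illustrative options only; real values come from the loaded survey.
options = {"age": ["18 to 21 years", "22 to 24 years"], "consumer": ["Starter"]}
combo = {
    "name": "Age-18to24_Consumer-Starter",
    "filters": {"age": ["18 to 21 years", "22 to 24 years"], "consumer": ["Starter"]},
}
print(validate_combination(combo, options))  # []
```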

### Example: Adding Multiple Cross-Filters

```python
# In get_filter_combinations(), before return:

# Young professionals
combinations.append({
    'name': 'Young_Professionals',
    'filters': {
        'age': ['22 to 24 years', '25 to 34 years'],
        'consumer': ['Early Professional']
    }
})

# High income males
combinations.append({
    'name': 'High_Income_Male',
    'filters': {
        'income': ['$150,000 - $199,999', '$200,000 or more'],
        'gender': ['Male']
    }
})
```
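
Assuming each filter is applied as a membership test on its own column (the `is_in` pattern shown in the `filter_data()` step of this README suggests this), values within one dimension act as alternatives while separate dimensions must all match. The sketch below illustrates that reading on plain dicts; `matches` and the respondent records are made up for illustration.

```python
def matches(respondent: dict, filters: dict[str, list[str]]) -> bool:
    """True if the respondent satisfies every filter dimension."""
    return all(respondent.get(key) in allowed for key, allowed in filters.items())


# Made-up respondents for illustration only.
respondents = [
    {"age": "22 to 24 years", "consumer": "Early Professional"},
    {"age": "25 to 34 years", "consumer": "Starter"},
    {"age": "35 to 44 years", "consumer": "Early Professional"},
]
young_professionals = {
    "age": ["22 to 24 years", "25 to 34 years"],
    "consumer": ["Early Professional"],
}
selected = [r for r in respondents if matches(r, young_professionals)]
print(len(selected))  # 1
```

Only the first record passes: the second fails the consumer filter and the third fails the age filter.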

### Notes

- **Empty filters dict** = all respondents (no filtering)
- **Omitted filter keys** = all options for that dimension selected
- **Output folder names** are auto-generated from active filters by `QualtricsSurvey.filter_data()`
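
The first two notes can be read as a simple defaulting rule: any dimension missing from `filters` falls back to its full option list. A minimal sketch of that rule follows, with `resolve_filters` and the sample options being hypothetical rather than code from the repository.

```python
def resolve_filters(filters: dict, options: dict[str, list[str]]) -> dict[str, list[str]]:
    """Fill in every omitted dimension with its full list of options."""
    resolved = {}
    for key, all_values in options.items():
        selected = filters.get(key)
        # None (or a missing key) means "no restriction" for this dimension
        resolved[key] = list(all_values) if selected is None else list(selected)
    return resolved


# Illustrative options only; real values come from the loaded survey.
options = {"age": ["18 to 21 years", "22 to 24 years"], "gender": ["Male", "Female"]}
print(resolve_filters({"gender": ["Male"]}, options))
print(resolve_filters({}, options) == options)  # True
```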


---

## Adding a New Filter Dimension

To add an entirely new filter dimension (e.g., a new demographic question), you need to update several files:

### Checklist

1. **Update `QualtricsSurvey.__init__()` in `utils.py`** to initialize the filter state attribute:

   ```python
   # In __init__(), add after existing filter_ attributes (around line 758):
   self.filter_region: list = None  # QID99
   ```

2. **Update `load_data()` in `utils.py`** to populate the `options_*` attribute:

   ```python
   # In load_data(), add after existing options:
   self.options_region = sorted(df['QID99'].drop_nulls().unique().to_list()) if 'QID99' in df.columns else []
   ```

3. **Update `filter_data()` in `utils.py`** to accept and apply the filter:

   ```python
   # Add parameter to function signature:
   def filter_data(self, q: pl.LazyFrame, ..., region: list = None) -> pl.LazyFrame:

       # Add filter logic in function body:
       self.filter_region = region
       if region is not None:
           q = q.filter(pl.col('QID99').is_in(region))
   ```

4. **Update `_get_filter_slug()` in `plots.py`** to include the filter in directory slugs:

   ```python
   # Add to the filters list:
   ('region', 'Reg', getattr(self, 'filter_region', None), 'options_region'),
   ```

5. **Update `_get_filter_description()` in `plots.py`** for human-readable descriptions:

   ```python
   # Add to the filters list:
   ('Region', getattr(self, 'filter_region', None), 'options_region'),
   ```

6. **Update `FILTER_CONFIG` in `03_quant_report.script.py`**:

   ```python
   FILTER_CONFIG = {
       'age': 'options_age',
       'gender': 'options_gender',
       # ... existing filters ...
       'region': 'options_region',  # ← New filter
   }
   ```

   This **automatically**:
   - Adds `--region` CLI argument
   - Includes it in Jupyter mode (defaults to all options)
   - Passes it to `S.filter_data()`
   - Writes it to the `.txt` filter description file

7. **Update `run_filter_combinations.py`** to generate combinations (optional):

   ```python
   # Add after existing filter loops:
   for region in survey.options_region:
       combinations.append({
           'name': f'Region-{region}',
           'filters': {'region': [region]}
       })
   ```
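
To make the role of the slug tuples in step 4 more concrete, here is a hedged sketch of how a directory slug could be assembled from them. It assumes a dimension contributes to the slug only when its selection is narrower than the full option list, and it passes the option list directly instead of an attribute name such as `'options_region'`. The real `_get_filter_slug()` may format values differently; `build_filter_slug`, the `Gen` abbreviation, and the sample values are hypothetical.

```python
import re


def build_filter_slug(filters: list[tuple]) -> str:
    """Join abbreviated active filters into a filesystem-friendly slug.

    Each tuple is (name, abbreviation, selected_values_or_None, all_options).
    """
    parts = []
    for _name, abbrev, selected, all_options in filters:
        # Skip dimensions with no filter or with every option selected
        if selected is None or set(selected) == set(all_options):
            continue
        values = "+".join(re.sub(r"[^A-Za-z0-9]+", "", v) for v in selected)
        parts.append(f"{abbrev}-{values}")
    return "_".join(parts) or "All_Respondents"


print(build_filter_slug([
    ("gender", "Gen", ["Male"], ["Male", "Female"]),
    ("region", "Reg", None, ["North", "South"]),
]))  # Gen-Male
```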

### Currently Available Filters

| CLI Argument | Options Attribute | QID Column | Description |
|--------------|-------------------|------------|-------------|
| `--age` | `options_age` | QID1 | Age groups |
| `--gender` | `options_gender` | QID2 | Gender |
| `--ethnicity` | `options_ethnicity` | QID3 | Ethnicity |
| `--income` | `options_income` | QID15 | Income brackets |
| `--consumer` | `options_consumer` | Consumer | Consumer segments |
| `--business_owner` | `options_business_owner` | QID4 | Business owner status |
| `--employment_status` | `options_employment_status` | QID13 | Employment status |
| `--personal_products` | `options_personal_products` | QID14 | Personal products |
| `--ai_user` | `options_ai_user` | QID22 | AI user status |
| `--investable_assets` | `options_investable_assets` | QID16 | Investable assets |
| `--industry` | `options_industry` | QID17 | Industry |