Start automation of running filter combinations

2026-02-03 14:33:09 +01:00
parent 840cb4e6dc
commit 8dd41dfc96
5 changed files with 354 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -1,5 +1,147 @@
+# Voice Branding Quantitative Analysis
+
+## Running Marimo Notebooks
+
 Running on Ct-105 for shared access:

-```
+```bash
 uv run marimo run 02_quant_analysis.py --headless --port 8080
-```
+```
+
+---
+
+## Batch Report Generation
+
+The quant report can be run with different filter combinations, either one at a time from the CLI or for many combinations through the batch runner.
+
+### Single Filter Run (CLI)
+
+Run the report script directly with JSON-encoded filter arguments:
+
+```bash
+# Single consumer segment
+uv run python 03_quant_report.script.py --consumer '["Starter"]'
+
+# Single age group
+uv run python 03_quant_report.script.py --age '["18 to 21 years"]'
+
+# Multiple filters combined
+uv run python 03_quant_report.script.py --age '["18 to 21 years", "22 to 24 years"]' --gender '["Male"]'
+
+# All respondents (no filter arguments: every option is selected by default)
+uv run python 03_quant_report.script.py
+```
+
+Available filter arguments:
+- `--age` — JSON list of age groups
+- `--gender` — JSON list of genders
+- `--ethnicity` — JSON list of ethnicities
+- `--income` — JSON list of income groups
+- `--consumer` — JSON list of consumer segments
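+
+As a rough illustration of how JSON-encoded arguments like these could be parsed, here is a minimal sketch using `argparse` and `json.loads`. It is not the actual code in `03_quant_report.script.py`: the helper name `parse_filter_args` is hypothetical, and the `None` default is an assumption that mirrors the "no filters means all options" behaviour described above.
+
+```python
+import argparse
+import json
+
+FILTER_KEYS = ["age", "gender", "ethnicity", "income", "consumer"]
+
+
+def parse_filter_args(argv=None):
+    """Hypothetical helper: parse JSON-list filter flags into a dict."""
+    parser = argparse.ArgumentParser()
+    for key in FILTER_KEYS:
+        # json.loads turns '["Starter"]' into ['Starter'];
+        # None (the default) stands for "no filter on this dimension".
+        parser.add_argument(f"--{key}", type=json.loads, default=None)
+    args = parser.parse_args(argv)
+    return {key: getattr(args, key) for key in FILTER_KEYS}
+
+
+filters = parse_filter_args(["--age", '["18 to 21 years"]', "--gender", '["Male"]'])
+print(filters)
+```
+
+Passing `type=json.loads` lets `argparse` report malformed JSON as a normal usage error instead of a traceback.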
+
+### Batch Runner (All Combinations)
+
+Run all single-filter combinations automatically with progress tracking:
+
+```bash
+# Preview all combinations without running
+uv run python run_filter_combinations.py --dry-run
+
+# Run all combinations (shows progress bar)
+uv run python run_filter_combinations.py
+
+# Or use the registered CLI entry point
+uv run quant-report-batch
+uv run quant-report-batch --dry-run
+```
+
+This generates reports for:
+- All Respondents (no filters)
+- Each age group individually
+- Each gender individually
+- Each ethnicity individually
+- Each income group individually
+- Each consumer segment individually
+
+Output figures are saved to `figures/<export_date>/<filter_slug>/`.
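+
+The list above amounts to one combination per option plus an unfiltered baseline. A minimal sketch of how such a list could be built follows. The function name `build_single_filter_combinations` and the naming pattern are assumptions for illustration, not the contents of `run_filter_combinations.py`.
+
+```python
+def build_single_filter_combinations(options):
+    """Hypothetical sketch: one entry per option, plus an unfiltered baseline."""
+    combinations = [{"name": "All_Respondents", "filters": {}}]
+    for key, values in options.items():
+        for value in values:
+            combinations.append({
+                # Assumed naming pattern; the real runner may name entries differently.
+                "name": f"{key}-{value}".replace(" ", "_"),
+                "filters": {key: [value]},
+            })
+    return combinations
+
+
+# Sample values only; the real options come from the loaded survey data.
+sample_options = {
+    "age": ["18 to 21 years", "22 to 24 years"],
+    "consumer": ["Starter"],
+}
+combos = build_single_filter_combinations(sample_options)
+for combo in combos:
+    print(combo["name"], combo["filters"])
+```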
+
+### Jupyter Notebook Debugging
+
+The script auto-detects Jupyter/IPython environments. When running in VS Code's Jupyter extension, CLI args default to `None` (all options selected), so you can debug cell-by-cell normally.
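+
+One common way to implement this kind of detection is to check whether IPython has defined `get_ipython`. The sketch below is an assumption about the approach, not the script's actual code; `running_in_ipython` and `resolve_cli_args` are hypothetical names.
+
+```python
+import sys
+
+
+def running_in_ipython():
+    """Return True when IPython/Jupyter has made get_ipython available."""
+    try:
+        get_ipython  # noqa: F821 - defined only inside IPython/Jupyter
+        return True
+    except NameError:
+        return False
+
+
+def resolve_cli_args(argv, interactive):
+    """Hypothetical helper: ignore kernel-supplied argv in interactive sessions."""
+    return [] if interactive else list(argv)
+
+
+# In a notebook this yields [], so every filter falls back to its default.
+cli_args = resolve_cli_args(sys.argv[1:], running_in_ipython())
+```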
+
+---
+
+## Adding Custom Filter Combinations
+
+To add new filter combinations to the batch runner, edit `run_filter_combinations.py`:
+
+### Checklist
+
+1. **Open** `run_filter_combinations.py`
+
+2. **Find** the `get_filter_combinations()` function
+
+3. **Add** your combination to the list before the `return` statement:
+
+```python
+# Example: Add a specific age + consumer cross-filter
+combinations.append({
+    'name': 'Age-18to24_Consumer-Starter',  # Used for output folder naming
+    'filters': {
+        'age': ['18 to 21 years', '22 to 24 years'],
+        'consumer': ['Starter']
+    }
+})
+```
+
+4. **Filter keys** must match CLI argument names:
+   - `age` — values from `survey.options_age`
+   - `gender` — values from `survey.options_gender`
+   - `ethnicity` — values from `survey.options_ethnicity`
+   - `income` — values from `survey.options_income`
+   - `consumer` — values from `survey.options_consumer`
+
+5. **Check available values** by running:
+```python
+from utils import QualtricsSurvey
+S = QualtricsSurvey('data/exports/2-2-26/...Labels.csv', 'data/exports/.../....qsf')
+S.load_data()
+print(S.options_age)
+print(S.options_consumer)
+# etc.
+```
+
+6. **Test** with dry-run first:
+```bash
+uv run python run_filter_combinations.py --dry-run
+```
+
+### Example: Adding Multiple Cross-Filters
+
+```python
+# In get_filter_combinations(), before return:
+
+# Young professionals
+combinations.append({
+    'name': 'Young_Professionals',
+    'filters': {
+        'age': ['22 to 24 years', '25 to 34 years'],
+        'consumer': ['Early Professional']
+    }
+})
+
+# High income males
+combinations.append({
+    'name': 'High_Income_Male',
+    'filters': {
+        'income': ['$150,000 - $199,999', '$200,000 or more'],
+        'gender': ['Male']
+    }
+})
+```
+
+### Notes
+
+- **Empty filters dict** = all respondents (no filtering)
+- **Omitted filter keys** = all options for that dimension selected
+- **Output folder names** are auto-generated from active filters by `QualtricsSurvey.filter_data()`
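+
+The first two rules can be read as a small defaulting step. This is a sketch under the assumption that omitted or `None` entries fall back to the full option list; `resolve_filters` is a hypothetical name and is not part of `QualtricsSurvey`.
+
+```python
+def resolve_filters(filters, all_options):
+    """Hypothetical sketch: fill omitted (or None) keys with every available option."""
+    resolved = {}
+    for key, values in all_options.items():
+        selected = filters.get(key)
+        resolved[key] = list(values) if selected is None else list(selected)
+    return resolved
+
+
+# Sample values only; the real options come from the loaded survey data.
+all_options = {
+    "age": ["18 to 21 years", "22 to 24 years"],
+    "consumer": ["Starter", "Early Professional"],
+}
+print(resolve_filters({}, all_options))
+print(resolve_filters({"consumer": ["Starter"]}, all_options))
+```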