diff --git a/Architecture_Overview.py b/Architecture_Overview.py
index 19922bc..f9236f3 100644
--- a/Architecture_Overview.py
+++ b/Architecture_Overview.py
@@ -78,7 +78,9 @@ def _(mo):
 
     **Goal:** Convert unstructured text into a structured dataset.
 
-    1. **Input:** All 26 Transcripts + `master_codebook.json`.
+    This will be a dedicated notebook, and be run per transcript.
+
+    1. **Input:** Transcript + `master_codebook.json`.
     2. **Process:**
         * The LLM analyzes each transcript segment-by-segment.
         * It extracts specific quotes that match a Theme Definition.
@@ -86,8 +88,9 @@ def _(mo):
         * **Granular Sentiment Analysis:** For each quote, the model identifies:
             * **Subject:** The specific topic/object being discussed (e.g., "Login Flow", "Brand Tone").
             * **Sentiment:** Positive / Neutral / Negative.
-    3. **Output:** `coded_segments.csv`
+    3. **Output:** `_coded_segments.csv`
         * Columns: `Source_File`, `Speaker`, `Theme`, `Quote`, `Subject`, `Sentiment`, `Context`.
+        * Each transcript produces its own CSV-file, which can be reviewed and adjusted before moving to the next stage
     """)
     return
 
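
The per-transcript output described in this change could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `write_coded_segments`, its signature, and the choice of the transcript's file stem as the prefix in front of `_coded_segments.csv` are all assumptions. The column names and sentiment labels are taken from the notebook text in the diff.

```python
import csv
from pathlib import Path

# Column order taken from the notebook text in the diff.
COLUMNS = ["Source_File", "Speaker", "Theme", "Quote", "Subject", "Sentiment", "Context"]
# Sentiment labels taken from the notebook text in the diff.
SENTIMENTS = {"Positive", "Neutral", "Negative"}


def write_coded_segments(transcript_path, segments, out_dir):
    """Write one transcript's coded segments to its own CSV and return the path.

    Hypothetical helper: the name, the signature, and the
    "<stem>_coded_segments.csv" naming are assumptions, not part of the diff.
    """
    transcript_path = Path(transcript_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{transcript_path.stem}_coded_segments.csv"

    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        for segment in segments:
            # Reject labels outside the three documented sentiment values.
            if segment["Sentiment"] not in SENTIMENTS:
                raise ValueError(f"Unexpected sentiment: {segment['Sentiment']!r}")
            # Source_File is set here so every row traces back to its transcript.
            writer.writerow({**segment, "Source_File": transcript_path.name})
    return out_path
```

Writing one file per transcript keeps each review step small: a reviewer can open a single CSV, correct themes or sentiment labels, and only then pass it on to the next stage.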