From 98202ac3f2132bb0f00591d088d8007ea5840152 Mon Sep 17 00:00:00 2001
From: Luigi Maiorano
Date: Wed, 3 Dec 2025 12:12:23 +0100
Subject: [PATCH] architecture clarification

---
 Architecture_Overview.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Architecture_Overview.py b/Architecture_Overview.py
index 19922bc..f9236f3 100644
--- a/Architecture_Overview.py
+++ b/Architecture_Overview.py
@@ -78,7 +78,9 @@ def _(mo):
 
     **Goal:** Convert unstructured text into a structured dataset.
 
-    1. **Input:** All 26 Transcripts + `master_codebook.json`.
+    This will be a dedicated notebook, and be run per transcript.
+
+    1. **Input:** Transcript + `master_codebook.json`.
     2. **Process:**
         * The LLM analyzes each transcript segment-by-segment.
         * It extracts specific quotes that match a Theme Definition.
@@ -86,8 +88,9 @@ def _(mo):
         * **Granular Sentiment Analysis:** For each quote, the model identifies:
             * **Subject:** The specific topic/object being discussed (e.g., "Login Flow", "Brand Tone").
             * **Sentiment:** Positive / Neutral / Negative.
-    3. **Output:** `coded_segments.csv`
+    3. **Output:** `_coded_segments.csv`
         * Columns: `Source_File`, `Speaker`, `Theme`, `Quote`, `Subject`, `Sentiment`, `Context`.
+        * Each transcript produces its own CSV-file, which can be reviewed and adjusted before moving to the next stage
     """)
     return
 
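
A minimal sketch of the per-transcript output step this patch describes, offered only as an illustration and not as the notebook's actual code. The column names and the three sentiment labels come from the patch text. Everything else is an assumption: the helper name `write_coded_segments`, the shape of `segments` (a list of dicts produced by the LLM step), and in particular the filename prefix. The patch shows only `_coded_segments.csv`, so the sketch uses the transcript's file stem as a stand-in prefix.

```python
import csv
from pathlib import Path

# Column order as listed in the patch.
COLUMNS = ["Source_File", "Speaker", "Theme", "Quote",
           "Subject", "Sentiment", "Context"]
# Sentiment labels as listed in the patch.
ALLOWED_SENTIMENTS = {"Positive", "Neutral", "Negative"}


def write_coded_segments(transcript_path, segments, out_dir):
    """Write one transcript's coded segments to its own CSV.

    Hypothetical helper: `segments` is assumed to be a list of dicts
    keyed by the column names above (without `Source_File`).
    """
    transcript_path = Path(transcript_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # ASSUMPTION: the prefix before `_coded_segments.csv` is not given
    # in the patch; the transcript's file stem is used for illustration.
    out_path = out_dir / f"{transcript_path.stem}_coded_segments.csv"

    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        for seg in segments:
            # Reject labels outside the three values named in the patch,
            # so a reviewer never meets an unexpected sentiment.
            if seg["Sentiment"] not in ALLOWED_SENTIMENTS:
                raise ValueError(f"Unexpected sentiment: {seg['Sentiment']!r}")
            writer.writerow({"Source_File": transcript_path.name, **seg})
    return out_path
```

Writing one file per transcript keeps each run independent, so a single CSV can be reviewed and adjusted before the next stage without touching the others.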