architecture clarification

2025-12-03 12:12:23 +01:00
parent b21f402e1e
commit 98202ac3f2
1 changed file with 5 additions and 2 deletions
--- a/Architecture_Overview.py
+++ b/Architecture_Overview.py
@@ -78,7 +78,9 @@ def _(mo):

    **Goal:** Convert unstructured text into a structured dataset.

-    1.  **Input:** All 26 Transcripts + `master_codebook.json`.
+    This stage will be a dedicated notebook, run once per transcript.
+
+    1.  **Input:** Transcript + `master_codebook.json`.
    2.  **Process:**
        *   The LLM analyzes each transcript segment-by-segment.
        *   It extracts specific quotes that match a Theme Definition.
@@ -86,8 +88,9 @@ def _(mo):
        *   **Granular Sentiment Analysis:** For each quote, the model identifies:
            *   **Subject:** The specific topic/object being discussed (e.g., "Login Flow", "Brand Tone").
            *   **Sentiment:** Positive / Neutral / Negative.
-    3.  **Output:** `coded_segments.csv`
+    3.  **Output:** `<transcript_name>_coded_segments.csv`
        *   Columns: `Source_File`, `Speaker`, `Theme`, `Quote`, `Subject`, `Sentiment`, `Context`.
+        *   Each transcript produces its own CSV file, which can be reviewed and adjusted before moving to the next stage.
    """)
    return
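
Note on the per-transcript output: the naming scheme and column layout described in this change could be sketched roughly as follows. This is only an illustration, not code from the repository. The helper names `output_path_for` and `write_coded_segments`, the `out_dir` parameter, and the use of the transcript's file stem as `<transcript_name>` are all assumptions; only the `_coded_segments.csv` suffix and the column names come from the diff.

```python
import csv
from pathlib import Path

# Column order as listed in the updated architecture text.
COLUMNS = ["Source_File", "Speaker", "Theme", "Quote", "Subject", "Sentiment", "Context"]


def output_path_for(transcript_path, out_dir="."):
    """Hypothetical helper: derive <transcript_name>_coded_segments.csv.

    Assumes <transcript_name> is the transcript file name without its extension.
    """
    return Path(out_dir) / f"{Path(transcript_path).stem}_coded_segments.csv"


def write_coded_segments(transcript_path, rows, out_dir="."):
    """Hypothetical helper: write one CSV per transcript.

    `rows` is an iterable of dicts keyed by COLUMNS; missing keys are
    written as empty cells, unknown keys raise ValueError (csv.DictWriter
    default behaviour).
    """
    out_path = output_path_for(transcript_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Because each transcript gets its own file, a reviewer can open and adjust one CSV without touching the others before the next stage reads them.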