KMS Compilation Pipeline — Requirements
Problem
Uploaded raw .txt files sit in raw/notes/ but never become wiki pages.
The Compile button re-indexes wiki/topics/, but there is no pipeline
to create those wiki pages from raw uploads.
Goals
- User uploads .txt files via /upload
- One-click Compile button transforms raw .txt → wiki pages
- Wiki pages are properly formatted (frontmatter, tags, title)
- LLM is used to extract structure from unstructured notes
- Already-compiled files are skipped (no reprocessing)
- Results shown clearly in the UI
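The "properly formatted" goal above could be pinned down with a small renderer sketch. This is illustrative only: the field names (title, tags, source, compiled) and the function name render_wiki_page are suggestions, not a settled schema.

```python
from datetime import date

def render_wiki_page(title: str, tags: list[str], body: str, source: str) -> str:
    """Render a wiki page with YAML frontmatter that links back to its source file."""
    frontmatter = "\n".join([
        "---",
        f"title: {title}",
        f"tags: [{', '.join(tags)}]",
        f"source: {source}",                      # back-link to the raw .txt artifact
        f"compiled: {date.today().isoformat()}",  # when the page was generated
        "---",
    ])
    return f"{frontmatter}\n\n{body}\n"
```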
Constraints
- LLM API keys come from env vars (Rule 35: pydantic-settings)
- Original .txt files remain untouched (source artifacts)
- Pipeline runs server-side (no manual SSH steps)
- Handles errors: bad files, API failures, duplicate filenames
- Pages link back to source file for reference
- .txt → .md conversion uses pandoc when available (availability to be confirmed); falls back to Python text wrapping if pandoc is not installed
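The pandoc-with-fallback constraint could be sketched as below, assuming pandoc is invoked via subprocess. The function name txt_to_md and the 80-column wrap width are illustrative choices, not requirements.

```python
import shutil
import subprocess
import textwrap

def txt_to_md(text: str, title: str) -> str:
    """Convert raw .txt content to Markdown, preferring pandoc when present."""
    if shutil.which("pandoc"):
        try:
            # Plain text is valid Markdown input, so pandoc mostly normalizes it.
            result = subprocess.run(
                ["pandoc", "--from=markdown", "--to=gfm"],
                input=text, capture_output=True, text=True, check=True,
            )
            return f"# {title}\n\n{result.stdout}"
        except subprocess.CalledProcessError:
            pass  # fall through to the pure-Python path on a pandoc failure
    # Fallback: wrap each paragraph at 80 columns, no external dependency.
    paragraphs = [
        textwrap.fill(p.strip(), width=80)
        for p in text.split("\n\n") if p.strip()
    ]
    return f"# {title}\n\n" + "\n\n".join(paragraphs)
```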
Non-Goals
- Not a full text-to-knowledge AI pipeline (no summary extraction or cross-referencing)
- Not editing existing wiki pages — only creating new ones from unprocessed files
Files Affected
New:
- kms/scripts/compile_pipeline.py — core pipeline logic
- kms/web/pipeline_config.py — LLM settings (API keys, model, prompt)
Modified:
- kms/web/main.py — /compile route updated to call the pipeline
- kms/web/templates/compile.html — richer result display
- kms/web/config.py — maybe add LLM settings
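A hedged sketch of what kms/web/pipeline_config.py could look like under Rule 35 (pydantic-settings). The KMS_LLM_ env prefix, the model default, and the prompt path are assumptions for illustration, not decided values.

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class PipelineSettings(BaseSettings):
    """LLM pipeline settings loaded from environment variables (Rule 35)."""
    model_config = SettingsConfigDict(env_prefix="KMS_LLM_")  # assumed prefix

    api_key: str                       # KMS_LLM_API_KEY, required at startup
    model: str = "gpt-4o-mini"         # KMS_LLM_MODEL, placeholder default
    prompt_path: str = "kms/web/prompts/compile.txt"  # hypothetical location
```

Keeping these in their own settings class means the /compile route can fail fast with a clear error when the API key is missing, instead of mid-pipeline.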
Components (for DP/HLI)
- File tracker — knows which raw files have been compiled (SQLite or hash DB)
- Converter — .txt → .md (pandoc or Python)
- LLM processor — sends .md to LLM, gets frontmatter + structured content
- Wiki writer — writes wiki page to wiki/topics/ with proper format
- Pipeline orchestrator — ties it together, handles errors
- Rebuild — trigger an FTS reindex after new pages are written
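The file tracker component could be a SQLite table keyed by content hash, which also makes the "skip already-compiled files" goal robust to renames. The schema and class name below are illustrative.

```python
import hashlib
import sqlite3
from pathlib import Path

class FileTracker:
    """Records which raw files have been compiled, keyed by SHA-256 of content."""

    def __init__(self, db_path: str = ":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS compiled (sha256 TEXT PRIMARY KEY, source TEXT)"
        )

    @staticmethod
    def digest(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def is_compiled(self, path: Path) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM compiled WHERE sha256 = ?", (self.digest(path),)
        ).fetchone()
        return row is not None

    def mark_compiled(self, path: Path) -> None:
        self.conn.execute(
            "INSERT OR IGNORE INTO compiled VALUES (?, ?)",
            (self.digest(path), str(path)),
        )
        self.conn.commit()
```

Hashing content rather than tracking mtimes means re-uploading an identical file is a no-op, while an edited file is correctly treated as new.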