Overview
Text highlighting on akita-web for clippings and Calibre book chapter summaries, with persistent storage and semantic search.
Architecture
Client-side (akita-web)
- Selection handler on article/clipping/chapter-summary pages
- On text selection: capture selectedText, paragraphIndex, startOffset, endOffset, and surrounding context
- POST to Akita highlights endpoint
- On render: inject <mark> tags at stored offsets
- Distinct styles for user highlights vs. AI-synthesized annotations
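The render step can be sketched as a pure function over one paragraph of plain text. This is a minimal illustration, not the akita-web implementation: the function name inject_marks and the CSS class names hl-user and hl-ai are hypothetical, and it assumes offsets are character offsets into the unescaped paragraph text, with end_offset exclusive.

```python
from html import escape


def inject_marks(paragraph: str, highlights: list[dict]) -> str:
    """Wrap each [start_offset, end_offset) range of a paragraph in <mark> tags.

    Assumption: offsets index the plain, unescaped paragraph text.
    """
    out = []
    cursor = 0
    for h in sorted(highlights, key=lambda h: h["start_offset"]):
        start, end = h["start_offset"], h["end_offset"]
        # Skip empty, out-of-bounds, or overlapping ranges rather than emit broken markup.
        if start < cursor or end > len(paragraph) or start >= end:
            continue
        out.append(escape(paragraph[cursor:start]))
        # Hypothetical class names, used only to separate the two visual styles.
        css = "hl-ai" if h.get("comment_type") == "ai_synthesized" else "hl-user"
        out.append(f'<mark class="{css}">{escape(paragraph[start:end])}</mark>')
        cursor = end
    out.append(escape(paragraph[cursor:]))
    return "".join(out)
```

If the offsets are captured in the browser, note that JavaScript string indices count UTF-16 code units while Python counts code points, so text outside the Basic Multilingual Plane may need an offset conversion before a server-side function like this one is applied.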
Storage
- MongoDB highlights collection
- Schema: { source_id, source_type, text, context, paragraph_index, start_offset, end_offset, comment, comment_type, conversation_id, tags, created_at }
- source_type: clipping | chapter_summary | book_chapter
- comment_type: manual | ai_synthesized
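One way to pin the schema down is a small typed record with validation. The field names and the two enumerations come from the schema above; the field types, which fields are optional, and the offset check are assumptions made for illustration only.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

SOURCE_TYPES = {"clipping", "chapter_summary", "book_chapter"}
COMMENT_TYPES = {"manual", "ai_synthesized"}


@dataclass
class Highlight:
    source_id: str
    source_type: str
    text: str
    context: str
    paragraph_index: int
    start_offset: int
    end_offset: int
    # Assumption: a highlight may exist without a comment or conversation.
    comment: Optional[str] = None
    comment_type: Optional[str] = None
    conversation_id: Optional[str] = None
    tags: list[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source_type: {self.source_type!r}")
        if self.comment_type is not None and self.comment_type not in COMMENT_TYPES:
            raise ValueError(f"unknown comment_type: {self.comment_type!r}")
        # Assumption: end_offset is exclusive, so a valid range is non-empty.
        if not 0 <= self.start_offset < self.end_offset:
            raise ValueError("offsets must satisfy 0 <= start_offset < end_offset")
```

Calling asdict() on a validated instance yields a plain dict with the same keys as the schema, which could then be handed to whatever MongoDB insert call the backend uses.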
Qdrant integration
- Each highlight embedded and indexed for semantic search
- Enables: “find everything I’ve highlighted about consciousness” across books and articles
- Highlights surface in dossiers and briefings via existing corpus search
MCP tools
- highlights__list: list highlights by source, date, tags
- highlights__get: retrieve a highlight with its comment/conversation
- highlights__search: semantic search across all highlights
- highlights__delete: remove a highlight
Anchoring strategy
- Content is controlled (Akita renders it) → simple offset-based anchoring is sufficient
- Store highlighted text string as fallback for re-anchoring after re-ingest
- Fuzzy text match on re-render if offsets are stale
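The three-step fallback above (stored offsets, then the stored text, then a fuzzy match) might look like the sketch below. The function name reanchor, the 0.8 similarity threshold, and the sliding-window comparison are illustrative assumptions, not a description of the shipped matcher; the similarity score comes from difflib in the Python standard library.

```python
from difflib import SequenceMatcher
from typing import Optional


def reanchor(paragraph: str, text: str, start: int, end: int,
             threshold: float = 0.8) -> Optional[tuple[int, int]]:
    """Return (start, end) for a highlight in a re-ingested paragraph, or None."""
    if not text:
        return None
    # 1. Stored offsets still point at the stored text: nothing to do.
    if paragraph[start:end] == text:
        return start, end
    # 2. Offsets are stale but the text survives verbatim: take the
    #    occurrence closest to the old position.
    hits = []
    pos = paragraph.find(text)
    while pos != -1:
        hits.append(pos)
        pos = paragraph.find(text, pos + 1)
    if hits:
        best = min(hits, key=lambda p: abs(p - start))
        return best, best + len(text)
    # 3. Text was edited: slide a window of the same length over the
    #    paragraph and keep the most similar span above the threshold.
    n = len(text)
    best_ratio, best_pos = 0.0, -1
    for pos in range(max(1, len(paragraph) - n + 1)):
        window = paragraph[pos:pos + n]
        ratio = SequenceMatcher(None, text, window, autojunk=False).ratio()
        if ratio > best_ratio:
            best_ratio, best_pos = ratio, pos
    if best_ratio >= threshold:
        return best_pos, min(best_pos + n, len(paragraph))
    return None
```

The window scan is quadratic in the worst case, which should be acceptable for paragraph-sized inputs. A None result means the highlight could not be placed and would need to be shown as orphaned or dropped from the render.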
Key decisions
- No complex W3C Web Annotation anchoring needed — we control the rendered content
- Highlights are first-class corpus objects (embedded in Qdrant)
- Comment field supports both manual notes and AI-synthesized summaries (see: inline LLM discussion ticket)