Corpus & Vector Search

Qdrant-powered semantic search across books, RSS, dossiers, and more

March 20, 2026

Overview

The Corpus is Akita’s semantic search layer, backed by Qdrant, a vector database served on port 6333. It indexes content from multiple sources (books, RSS feeds, research dossiers, chapter summaries) so that a single query can search across all of them.
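To illustrate what "semantic search" means here, the sketch below ranks a few stored vectors by cosine similarity to a query vector. It is a toy model, not Akita's implementation: the three-dimensional vectors, the chunk identifiers, and the `rank` helper are all invented for illustration, and a vector database such as Qdrant would normally use an approximate nearest-neighbour index rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as no match
    return dot / (norm_a * norm_b)


def rank(query_vector, indexed, limit=3):
    """Return (score, chunk_id) pairs, best match first (brute force)."""
    scored = [
        (cosine_similarity(query_vector, vector), chunk_id)
        for chunk_id, vector in indexed.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical toy "embeddings"; real ones come from an embedding model
# and have hundreds of dimensions.
INDEXED = {
    "book:chapter-1": [0.9, 0.1, 0.0],
    "rss:item-42": [0.1, 0.9, 0.0],
    "dossier:topic-a": [0.7, 0.3, 0.1],
}

top = rank([1.0, 0.0, 0.0], INDEXED, limit=2)
for score, chunk_id in top:
    print(f"{score:.3f}  {chunk_id}")
```

Because every source is embedded into the same vector space, one query can surface a book chapter and a dossier side by side.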

Search Tools

Tool	Purpose
`corpus__search`	Semantic search across the entire indexed corpus
`corpus__search_in_book`	Search within a single book, identified by title
`corpus__search_by_author`	Filter search results by author
`corpus__search_by_tags`	Filter by Calibre book tags (match any or all)
`corpus__search_book_summaries`	Search within AI-generated chapter summaries
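The author and tag filters above could plausibly map onto Qdrant payload filters. The sketch below builds such a filter as a plain dictionary in the shape Qdrant's filtering syntax uses (`must`, `match.value`, `match.any`). The payload field names `author` and `tags` and the `build_filter` helper are assumptions; the actual payload schema is not documented on this page.

```python
def build_filter(author=None, tags=None, match_all_tags=False):
    """Build a Qdrant-style payload filter as a plain dict.

    Field names "author" and "tags" are hypothetical.
    """
    must = []
    if author is not None:
        must.append({"key": "author", "match": {"value": author}})
    if tags:
        if match_all_tags:
            # One condition per tag: every tag must be present on the point.
            must.extend({"key": "tags", "match": {"value": tag}} for tag in tags)
        else:
            # A single "any" condition: at least one tag must be present.
            must.append({"key": "tags", "match": {"any": list(tags)}})
    return {"must": must}


any_filter = build_filter(tags=["history", "philosophy"])
all_filter = build_filter(
    author="Jane Doe", tags=["history", "philosophy"], match_all_tags=True
)
```

Expressing "match all" as several `must` conditions works because a `match.value` condition on a list-valued payload field is satisfied when any element of the list equals the value.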

Content Access

Tool	Purpose
`corpus__get_chunk_context`	Return a chunk with surrounding context (configurable window)
`corpus__get_file_chunks`	List all chunks from a specific file path
`corpus__get_chapter_summary`	Raw summary text for a chapter
`corpus__get_chapter_text`	Chapter plaintext (from summary path or direct path)
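The "configurable window" of `corpus__get_chunk_context` can be pictured as a slice of neighbouring chunks around the hit. The following is a minimal sketch under the assumption that chunks of a file are stored in order and that the window counts chunks on each side; the function name and return shape are hypothetical.

```python
def chunk_with_context(chunks, index, window=1):
    """Return the chunk at `index` plus up to `window` neighbours per side."""
    if window < 0:
        raise ValueError("window must be non-negative")
    if not 0 <= index < len(chunks):
        raise IndexError("chunk index out of range")
    start = max(0, index - window)                 # clamp at the file start
    end = min(len(chunks), index + window + 1)     # clamp at the file end
    return {
        "before": chunks[start:index],
        "chunk": chunks[index],
        "after": chunks[index + 1:end],
    }


chunks = ["a", "b", "c", "d", "e"]
middle = chunk_with_context(chunks, 2, window=1)
first = chunk_with_context(chunks, 0, window=2)
```

Clamping at both ends means a hit in the first or last chunk simply returns less context instead of failing.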

Indexing

Tool	Purpose
`corpus__reindex`	Re-index all sources or only specific ones
`corpus__corpus_stats`	Collection counts and index health

Data Sources

Content flows into the corpus from:

Calibre books: chapter text extracted from EPUBs
Chapter summaries: AI-generated summaries of book chapters
RSS feeds: fetched periodically; reports are available via `get_rss_reports`
Research dossiers: output of deep research (see Tracks)
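As a rough picture of the first source in the list, an EPUB is a ZIP archive of XHTML documents, so chapter text can be pulled out with the standard library alone. This sketch is not Akita's extraction pipeline: the `chapter_texts` helper is hypothetical, it treats every (X)HTML file as a chapter, and it ignores the reading order declared in the package manifest.

```python
import io
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the non-empty text nodes of an (X)HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def chapter_texts(epub_file):
    """Map each (X)HTML member of an EPUB to its plain text."""
    texts = {}
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextExtractor()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()
                texts[name] = " ".join(parser.parts)
    return texts


# Build a tiny in-memory archive so the sketch runs without a real book.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr(
        "ch1.xhtml",
        "<html><body><h1>One</h1><p>Hello world.</p></body></html>",
    )
buffer.seek(0)
extracted = chapter_texts(buffer)
```

The resulting plain text is what would then be split into chunks and embedded.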

MCP Tools

See MCP Tools Reference for the full list of `corpus__*` tools.
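For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `corpus__search`; the argument names `query` and `limit` are assumptions, since the tool's input schema is defined in the reference rather than on this page.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call_request(tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" and "limit" are hypothetical argument names.
request = tool_call_request(
    "corpus__search", {"query": "stoic views on grief", "limit": 5}
)
payload = json.dumps(request)
print(payload)
```

In practice an MCP client library handles this framing and transport; the sketch only shows the message shape.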