
Corpus & Vector Search

Qdrant-powered semantic search across books, RSS, dossiers, and more

Overview

The Corpus is Akita’s semantic search layer, backed by the Qdrant vector database (port 6333). It indexes content from multiple sources (books, RSS feeds, research dossiers, chapter summaries) so that a single query can search across all of them.
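As a quick way to confirm that the Qdrant instance behind the Corpus is reachable, one option is to call Qdrant's REST endpoint for listing collections. The sketch below assumes Qdrant is running on localhost without an API key; only the port (6333) comes from this page, and the collection names Akita uses are not documented here.

```python
import json
import urllib.request

QDRANT_PORT = 6333  # port stated on this page


def collections_url(host: str = "localhost", port: int = QDRANT_PORT) -> str:
    """Build the URL of Qdrant's list-collections REST endpoint."""
    # "localhost" is an assumption; adjust to wherever Qdrant is deployed.
    return f"http://{host}:{port}/collections"


def list_collections(host: str = "localhost", port: int = QDRANT_PORT) -> list[str]:
    """Return the collection names reported by a running Qdrant instance."""
    with urllib.request.urlopen(collections_url(host, port), timeout=5) as resp:
        payload = json.load(resp)
    # Qdrant wraps the response body in a top-level "result" object.
    return [c["name"] for c in payload["result"]["collections"]]
```

Calling list_collections() needs a live Qdrant instance; if the call fails with a connection error, the service is probably not listening on that host and port.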

Search Tools

Tool                             Purpose
corpus__search                   Semantic search across the entire indexed corpus
corpus__search_in_book           Search within a specific book title
corpus__search_by_author         Filter search results by author
corpus__search_by_tags           Filter by Calibre book tags (match any or all)
corpus__search_book_summaries    Search within AI-generated chapter summaries
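Since these are MCP tools, a client would typically invoke them with a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of building such a request for corpus__search. The argument names `query` and `limit` are hypothetical (this page does not list the tools' parameters), and the transport used to send the message is left out.

```python
import json
from typing import Any


def build_tool_call(tool: str, arguments: dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "query" and "limit" are hypothetical argument names, not confirmed by this page.
request = build_tool_call(
    "corpus__search",
    {"query": "stoic views on grief", "limit": 5},
)
```

The same helper would work for the other search tools by swapping the tool name and arguments; check the MCP Tools Reference for the actual parameter names before relying on any of them.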

Content Access

Tool                             Purpose
corpus__get_chunk_context        Return a chunk with surrounding context (configurable window)
corpus__get_file_chunks          List all chunks from a specific file path
corpus__get_chapter_summary      Raw summary text for a chapter
corpus__get_chapter_text         Chapter plaintext (from summary path or direct path)
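To illustrate what a "configurable window" of surrounding context means, here is a conceptual sketch that selects a chunk plus its neighbours from an ordered list. It is not Akita's implementation: the function name, the list-of-strings representation, and the symmetric window are all assumptions made for illustration.

```python
def chunk_window(chunks: list[str], index: int, window: int = 1) -> list[str]:
    """Return chunks[index] plus up to `window` neighbours on each side."""
    if not 0 <= index < len(chunks):
        raise IndexError(f"chunk index {index} out of range")
    if window < 0:
        raise ValueError("window must be non-negative")
    # Clamp the slice so it never runs past either end of the file.
    start = max(0, index - window)
    end = min(len(chunks), index + window + 1)
    return chunks[start:end]
```

With window=0 only the matched chunk is returned; near the start or end of a file the result is simply shorter, because the window is clamped rather than padded.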

Indexing

  • corpus__reindex — re-index all or specific sources
  • corpus__corpus_stats — collection counts, index health

Data Sources

Content flows into the corpus from:

  1. Calibre books — chapter text extracted from EPUBs
  2. Chapter summaries — AI-generated summaries of book chapters
  3. RSS feeds — fetched periodically, reports available via get_rss_reports
  4. Research dossiers — output of deep research (see Tracks)

MCP Tools

See MCP Tools Reference for the full list of corpus__* tools.