Getting Started

Install Docmancer, index your first docs, and run a hybrid query.

What is Docmancer?

Docmancer turns any pile of documentation into a local hybrid-search index your coding agent can query through a simple CLI. Point it at a folder of Markdown, PDF, DOCX, RTF, or HTML files, or at a docs URL (GitBook, Mintlify, generic web, GitHub), and your agent gets compact context packs back, sized to a token budget, with source attribution.
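
To make "sized to a token budget, with source attribution" concrete, here is a minimal sketch of one way such a pack could be assembled. It is not Docmancer's actual implementation: the function name pack_context, the (source, text) tuple shape, the example paths, and the whitespace-based token estimate are all assumptions made for illustration.

```python
def pack_context(sections, token_budget):
    """Greedily keep ranked sections until the token budget is spent.

    sections: list of (source, text) tuples, best match first.
    Token counts are approximated by whitespace-separated words, which is
    a crude stand-in for a real tokenizer (an assumption of this sketch).
    """
    pack, used = [], 0
    for source, text in sections:
        cost = len(text.split())
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try the next one
        # Keep the source next to the text so every snippet stays attributable.
        pack.append({"source": source, "text": text, "tokens": cost})
        used += cost
    return pack, used


# Hypothetical ranked sections, best match first.
sections = [
    ("docs/auth.md#api-keys", "Pass the key in the Authorization header."),
    ("docs/auth.md#oauth",
     "OAuth flows require a registered redirect URI and a client secret."),
    ("docs/faq.md#limits", "Rate limits reset hourly."),
]

pack, used = pack_context(sections, token_budget=12)
for item in pack:
    print(item["source"], item["tokens"])
```

With a budget of 12, the second section is skipped because it would overflow the pack, while the shorter third section still fits.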

A fresh install ships everything needed: SQLite FTS5 for lexical search, a Docmancer-owned local Qdrant for dense and sparse vectors, FastEmbed for embeddings (no API key), and a retriever that fuses all three signals with Reciprocal Rank Fusion.
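
For readers new to Reciprocal Rank Fusion, the sketch below shows the general technique: each signal contributes 1 / (k + rank) for every document it returns, and the contributions are summed. The function name rrf_fuse, the sample file names, and k = 60 (a value commonly used for RRF) are assumptions for illustration; the constant and tie-breaking Docmancer actually uses are not stated here.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: dict mapping a signal name to a list of doc IDs, best first.
    k: smoothing constant; 60 is a common choice, not a confirmed
       Docmancer setting.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-signal results for a single query.
rankings = {
    "lexical": ["auth.md", "tokens.md", "intro.md"],
    "dense": ["tokens.md", "auth.md", "faq.md"],
    "sparse": ["auth.md", "faq.md", "tokens.md"],
}

fused = rrf_fuse(rankings)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Because only ranks enter the formula, the three signals can be combined without normalizing their raw scores against each other.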

Docmancer also installs version-pinned API MCP packs so the same agent can call typed API operations through a local MCP server. Use docs retrieval when the agent needs grounded context, and packs when it needs callable tools.

Install

pipx install docmancer --python python3.13

Or with pip:

pip install docmancer

Optional extras pull in heavier dependencies only when you need them:

pip install 'docmancer[local]'     # PDF, DOCX, RTF, HTML loaders
pip install 'docmancer[vector]'    # Qdrant client + sqlite-vec fallback
pip install 'docmancer[browser]'   # Playwright for JS-heavy sites

First run

Three commands take you from a fresh install to a grounded query:

docmancer setup
docmancer ingest ./docs
docmancer query "How do I authenticate?" --explain

setup creates ~/.docmancer/ with the config and SQLite database, auto-detects installed coding agents (Claude Code, Cursor, Codex, Cline, Gemini, OpenCode, Claude Desktop, GitHub Copilot), and installs their skill files.

On the first ingest, Docmancer downloads the pinned Qdrant binary (~60 MB) and the FastEmbed models (~500 MB) into ~/.docmancer/. After that, ingest runs offline.

query --explain shows which signal (lexical, dense, or sparse) placed each result.

Index a docs site instead

docmancer add https://docs.pytest.org
docmancer query "How do I parametrize a fixture?" --mode hybrid

docmancer add is for URLs and GitHub repos. docmancer ingest is for local paths. Both end up in the same hybrid index.

Install an API MCP pack

When your agent needs to call APIs, not just read about them, install a version-pinned pack:

docmancer install-pack open-meteo@v1
docmancer mcp doctor

Open-Meteo is keyless, so the install reports Required credentials: (none). For keyed packs, export the relevant <PACKAGE>_API_KEY env var before running mcp doctor.

Then ask your agent:

Get the current weather in New York Central Park.

The agent selects open_meteo__v1__forecast and dispatches GET /v1/forecast?latitude=40.7812&longitude=-73.9665&current_weather=true through your local docmancer mcp serve process.
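
To show how the dispatched request line maps to its parameters, here is a small sketch that rebuilds the same path and query string with Python's standard library. The helper name forecast_path is hypothetical, and the sketch stops at the path because the host the MCP server targets is not given above.

```python
from urllib.parse import urlencode


def forecast_path(latitude, longitude):
    """Build the forecast request path for one coordinate pair."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current_weather": "true",  # request current conditions
    }
    # urlencode keeps the dict's insertion order and escapes values as needed.
    return "/v1/forecast?" + urlencode(params)


# Coordinates taken from the Central Park example above.
path = forecast_path(40.7812, -73.9665)
print(path)
```

Changing the two coordinates is all it takes to point the same operation at another location.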

What you get

  • Hybrid by default. Every query fans out across FTS5, dense vectors, and sparse vectors, then fuses with RRF.
  • No API keys required. FastEmbed runs locally; cloud providers (OpenAI, Voyage, Cohere) are optional.
  • Inspectable. Sections are written to ~/.docmancer/extracted/ as Markdown + JSON. docmancer inspect shows index stats.
  • Shared index. All agents on the same machine share the same database. Add docs in Claude Code, query from Cursor.