MIT licensed · OKF v0.1 conformant · zero LLM required

Stop feeding your AI agent the whole codebase.

Scans any repo, outputs one markdown file per function, class, module, and dependency. Your agent runs okf lookup <Name> instead of re-reading 600 lines to find one signature. No LLM required to build it.


Reading a whole file to find one function is expensive

Cloud models with huge context windows hide this cost. Local models on a laptop run out of memory immediately.

Before: entire file as context
~14,000 tokens to find one 12-line method

After: exact concept only
CLASS: WorldBankConnector
Description: Fetches World Bank development indicators.
Methods: get_indicator, search
Signature: class WorldBankConnector
~140 tokens. Exact answer, zero guessing.
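
The saving is plain arithmetic on the two figures above. A small Python sketch, using only the approximate token counts quoted here (the function name is illustrative, and real counts depend on the file and the tokenizer):

```python
def context_savings(before_tokens: int, after_tokens: int) -> tuple[float, float]:
    """Return (reduction factor, percent of tokens saved)."""
    factor = before_tokens / after_tokens
    percent_saved = 100 * (1 - after_tokens / before_tokens)
    return factor, percent_saved


# Approximate figures from the before/after comparison above.
factor, percent_saved = context_savings(14_000, 140)
print(f"{factor:.0f}x fewer tokens, {percent_saved:.0f}% saved")
```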

Three commands, one workflow

Generate once. Look up forever. Regenerate whenever the code changes.

01 — scan

okf generate

Tree-sitter and native AST parsers walk the repo and extract every function, class, module, and manifest dependency, with cross-referenced calls and imports.

02 — retrieve

okf lookup

Instant, zero-LLM concept search by name, type, tag, or file. Returns signature, docstring, params, callers, and callees in milliseconds.

03 — integrate

okf install

Wires the bundle into Claude Code, Cursor, Copilot, Windsurf, Cline, or OpenCode so the agent checks the bundle before touching source.
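
The three steps can also be scripted. A minimal Python sketch, assuming okf is installed and on PATH. It uses only invocations that appear on this page (okf generate with a project and bundle path, okf lookup with a name, okf install all). The helper names are hypothetical, and whether okf lookup also accepts a bundle path is not stated here.

```python
import shutil
import subprocess


def workflow_commands(project: str, bundle: str, name: str) -> list[list[str]]:
    """Build the scan, retrieve, and integrate commands in order."""
    return [
        ["okf", "generate", project, bundle],  # 01: scan the repo into a bundle
        ["okf", "lookup", name],               # 02: retrieve one concept by name
        ["okf", "install", "all"],             # 03: wire up every detected agent
    ]


def run_workflow(project: str, bundle: str, name: str) -> None:
    """Run each step in turn, stopping at the first failure."""
    if shutil.which("okf") is None:
        raise RuntimeError("okf not found on PATH; try: pip install okf-generator")
    for cmd in workflow_commands(project, bundle, name):
        subprocess.run(cmd, check=True)
```

Calling run_workflow("./my_project", "./okf_bundle", "WorldBankConnector") would run all three steps. Passing each command as a list avoids shell quoting problems with unusual paths.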

Works with the agent you already use

One command wires the bundle into your agent's rules. One more exposes it over MCP.

MODEL CONTEXT PROTOCOL

Speak MCP? So does the bundle.

Run okf mcp ./okf_bundle and any MCP-compatible client — Claude Desktop, Claude Code, or a custom agent — can query the knowledge graph directly, no CLI wrapper needed.

$ okf mcp ./okf_bundle
MCP server listening on stdio…
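
Claude Desktop registers MCP servers under the mcpServers key of its claude_desktop_config.json. A minimal Python sketch that prints such an entry for the command above. The server label "okf" and the helper name are arbitrary choices, and an absolute bundle path is probably safer in practice because the client picks the working directory.

```python
import json


def mcp_server_entry(bundle_path: str, label: str = "okf") -> dict:
    """Build an mcpServers entry that launches `okf mcp <bundle>` over stdio."""
    return {"mcpServers": {label: {"command": "okf", "args": ["mcp", bundle_path]}}}


# Print the fragment to merge into the client's config file.
print(json.dumps(mcp_server_entry("./okf_bundle"), indent=2))
```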
okf install claude      Claude Code skill
okf install cursor      .cursorrules
okf install opencode    /lookup command
okf install copilot     Copilot instructions
okf install windsurf    .windsurfrules
okf install cline       .clinerules

Or set up every detected agent at once: okf install all

Built for local SLMs, not just the cloud

Cloud models mask the cost of huge context windows. Local models — Gemma, Llama, Phi — running on a laptop don't have that luxury; feed one the whole repo and it runs out of memory. okf lookup sends a ~50-token query and gets back a ~200-token concept card. No embeddings, no vector DB, no RAG pipeline.

local llama.cpp
$ OKF_ENRICH=1 \
  OKF_BASE_URL="http://localhost:8080/v1" \
  OKF_MODEL="gemma-3-4b-it-qat-GGUF:Q4_0" \
  okf generate ./my_project ./okf_bundle
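
The same invocation can be driven from a script. A sketch in Python, assuming okf reads the three variables from its environment as in the command above. The endpoint and model are the example values from this page, and the function names are hypothetical.

```python
import os
import subprocess


def enrichment_env(base_url: str, model: str) -> dict[str, str]:
    """Copy the current environment and add the enrichment settings."""
    env = dict(os.environ)
    env.update({"OKF_ENRICH": "1", "OKF_BASE_URL": base_url, "OKF_MODEL": model})
    return env


def generate_enriched(project: str, bundle: str, base_url: str, model: str) -> None:
    """Run okf generate with enrichment enabled for this one call only."""
    subprocess.run(
        ["okf", "generate", project, bundle],
        env=enrichment_env(base_url, model),
        check=True,
    )
```

Setting the variables per call, rather than exporting them in the shell, keeps later okf generate runs on the default zero-LLM path.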

10 languages, 17 manifest formats

Deterministic extraction — no LLM call needed to index a codebase.

Python JavaScript TypeScript Go Java Rust Ruby C C++ C# SQL
pip / Python: requirements.txt, pyproject.toml, poetry.lock
npm / JS: package.json, yarn.lock, pnpm-lock.yaml
cargo / Rust: Cargo.toml, Cargo.lock
go / Go: go.mod, go.sum
maven & gradle / Java: pom.xml, build.gradle
bundler / Ruby: Gemfile
composer / PHP: composer.json
swiftpm / Swift: Package.swift
other: project.clj, mix.exs

Built for agent workflows, not just documentation

okf_bundle/
├── SUMMARY.md ← bird's-eye view
├── index.md
├── _dependencies/
│   └── pip/npm/cargo/…
└── src/connectors/
    ├── economic_data.md ← Module
    └── economic_data/
        ├── WorldBankConnector.md
        └── get_indicator.md

Layout mirrors your source tree

No flat functions/ / classes/ buckets. Every concept file sits where its source file sits, plus a domain-organized _dependencies/ tree. Diff-friendly, git-friendly, and safe to commit alongside the code it describes.
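
The mirroring rule can be written down as a path mapping. A Python sketch inferred from the example tree above, not taken from okf's source. The function names are hypothetical, and the .py source extension is an assumption, since the tree shows only the bundle side.

```python
from pathlib import PurePosixPath


def module_card(bundle: str, source_file: str) -> PurePosixPath:
    """Module card: same relative path as the source file, with a .md suffix."""
    return PurePosixPath(bundle) / PurePosixPath(source_file).with_suffix(".md")


def concept_card(bundle: str, source_file: str, name: str) -> PurePosixPath:
    """Class or function card: a folder named after the module, one file per concept."""
    return PurePosixPath(bundle) / PurePosixPath(source_file).with_suffix("") / f"{name}.md"


print(module_card("okf_bundle", "src/connectors/economic_data.py"))
# okf_bundle/src/connectors/economic_data.md
print(concept_card("okf_bundle", "src/connectors/economic_data.py", "WorldBankConnector"))
# okf_bundle/src/connectors/economic_data/WorldBankConnector.md
```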

Zero-LLM extraction

Tree-sitter + AST parsing. Nothing calls an API unless you turn on enrichment.

Cross-reference linker

Imports → dependencies, calls → callers/callees. Resolved across every supported language.

Interactive visualizer

One HTML file. Tree nav + local graphs. Opens in a browser, no server.

Bundle diff

Added / removed / changed concepts between two bundle versions, by content hash.
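
To illustrate what a content-hash diff involves, here is a short Python sketch. It is an illustration of the technique, not okf diff itself. It assumes each concept is one .md file and picks SHA-256 as the hash, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def bundle_hashes(bundle: Path) -> dict[str, str]:
    """Map each card's path (relative to the bundle root) to a SHA-256 of its bytes."""
    return {
        p.relative_to(bundle).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(bundle.rglob("*.md"))
    }


def diff_bundles(old: Path, new: Path) -> dict[str, list[str]]:
    """Classify cards as added, removed, or changed between two bundle versions."""
    a, b = bundle_hashes(old), bundle_hashes(new)
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

Comparing hashes rather than timestamps means a regenerated but identical card does not show up as changed.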

MCP server

okf mcp — bundle over Model Context Protocol, any client.

Training pairs

Bundle → JSONL. codegen, QA, doc, summarize, crosslink pair types.

CLI at a glance

okf generate    Scan a codebase and write an OKF v0.1 bundle
okf lookup      Search the bundle by name, type, tag, or source file
okf init        Interactive bundle setup wizard
okf diff        Compare two bundles: added, removed, changed concepts
okf summarize   Regenerate the bundle's SUMMARY.md map
okf visualize   Generate a self-contained interactive HTML explorer
okf serve       Launch a local server and auto-open the visualization
okf mcp         Expose the bundle over Model Context Protocol
okf pairs       Convert a bundle into JSONL fine-tuning pairs
okf install     Wire up Claude Code, Cursor, Copilot, Windsurf, Cline, or OpenCode

Not RAG. Not embeddings. Exact lookup.

RAG retrieves by semantic similarity — approximate, and it can miss exact symbols. okf indexes real functions, classes, and dependencies by name.

                       okf-generator    RAG / vector search     Read whole file
Zero-LLM required      Yes              No, needs embeddings    Yes
Exact symbol match     Yes              Approximate             Yes, if you find it
Vector DB / infra      None needed      Required                None needed
Token cost per query   ~140 tokens      Chunk-dependent         Whole file
Works fully offline    Yes              Depends on embedder     Yes
Git-diffable output    Plain markdown   Opaque vectors          N/A
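
The difference between exact lookup and similarity search is easy to see in code. The Python sketch below is an illustration, not okf lookup's implementation. It assumes each card file is named after its concept, as in the example bundle tree, and the function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def build_name_index(bundle: Path) -> dict[str, list[str]]:
    """Index every card by its file stem, e.g. 'WorldBankConnector'."""
    index: dict[str, list[str]] = defaultdict(list)
    for card in sorted(bundle.rglob("*.md")):
        index[card.stem].append(card.relative_to(bundle).as_posix())
    return dict(index)


def lookup(index: dict[str, list[str]], name: str) -> list[str]:
    """Exact match only: a name is either in the index or it is not."""
    return index.get(name, [])
```

A misspelled name returns an empty list instead of a plausible-looking neighbour, which is the behaviour an agent needs when resolving a symbol.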

Common questions

Does this require an API key or internet connection?

No. Core extraction (okf generate) is fully offline and deterministic — no LLM call is made unless you explicitly enable OKF_ENRICH=1.

What happens if my language isn't supported?

Unsupported files are skipped, but not silently: log.md records what was scanned. Adding a language is a self-contained tree-sitter grammar mapping, and it is listed as a good-first-issue.

Does this work on monorepos or very large codebases?

Yes — the bundle mirrors your source tree, so scanning is linear in file count. For very large repos, scope okf generate to a subdirectory if you only need part indexed.

Is the bundle safe to commit to git?

Yes — that's the intended workflow. Bundles are plain markdown, diff cleanly, and version alongside the code they describe.

One install, works with any agent

pip install okf-generator
macOS / Linux one-liner: curl -fsSL raw.githubusercontent.com/UmairBaig8/okf-generator/main/scripts/install.sh | bash
With LLM enrichment: pip install "okf-generator[llm]"