Docgen is an open-source command-line tool written in C++ that uses locally run large language models to automate the generation of code documentation. The project targets one of software development's most persistent problems: codebases with missing, outdated, or inadequate documentation. By processing source files on the developer's machine and generating inline comments, docstrings, or documentation files, Docgen avoids sending code to external cloud APIs, a meaningful advantage for organizations handling proprietary or sensitive source code.
The choice of C++ as the implementation language reflects a deliberate focus on performance and low overhead. Local LLM inference is resource-intensive, and a native C++ implementation can interface more efficiently with popular inference backends such as llama.cpp and Ollama than tools built on interpreted runtimes can. A self-contained native binary also makes Docgen a reasonable fit for CI/CD pipelines and pre-commit hooks, where pulling in a heavyweight language runtime is impractical.
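To make the pre-commit scenario concrete, the sketch below shows the general shape such a hook could take. It is illustrative only: the `docgen --inline` invocation is hypothetical (the project's actual CLI is not documented here), the file list is hard-coded where a real hook would query `git diff --cached`, and `echo` stands in for the real binary so the script runs anywhere.

```shell
#!/bin/sh
# Illustrative pre-commit hook shape for a local documentation generator.
# The `docgen` invocation is a hypothetical placeholder, not the tool's
# documented CLI.

# In a real hook, the staged file list would come from:
#   git diff --cached --name-only --diff-filter=ACM
staged="src/parser.cpp src/main.cpp README.md"

count=0
for f in $staged; do
    # Only hand C/C++ sources to the generator.
    case "$f" in
        *.cpp|*.cc|*.hpp|*.h)
            # Hypothetical invocation: docgen --inline "$f"
            echo "docgen would process: $f"
            count=$((count + 1))
            ;;
    esac
done
echo "$count C++ files selected"
```

Because everything runs on the local machine, a hook like this adds no network dependency to the commit path, which is the property the project is selling.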
Docgen is not the only tool chasing local-first AI documentation. Projects like Mintlify and Swimm have tackled auto-generated docs from the cloud side, while <a href="/news/2026-03-14-tome-open-source-documentation-platform-with-embedded-ai-chat-and-mcp-server">open-source initiatives</a> like llm-docstring-generator take a Python-based approach to local inference. Docgen's bet is that the overhead savings of native C++ matter enough at the inference layer to justify the implementation cost. Whether that holds across real-world codebases, and across languages beyond C++ itself, remains an open question. The project surfaced on Hacker News as a "Show HN" submission, signaling that it is in the early community-feedback stage, with supported model backends, output quality, and compatibility with established standards like Doxygen and JSDoc still to be demonstrated.