Doctran is a Python package that uses LLMs and open-source
NLP libraries to transform raw text into clean, structured, information-dense documents
optimized for vector space retrieval. You can think of Doctran as a black box:
messy strings go in, and clean, labelled strings come out.
Installation and Setup
Document Transformers
Document Interrogator
See a usage example for DoctranQATransformer.
Property Extractor
See a usage example for DoctranPropertyExtractor.
Document Translator
See a usage example for DoctranTextTranslator.
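As a rough illustration of how the three transformers named above might be combined, here is a minimal sketch. It rests on several assumptions not stated on this page: that the classes are importable from `langchain_community.document_transformers`, that `Document` comes from `langchain_core.documents`, that each transformer exposes a synchronous `transform_documents` method (some releases may only implement the async `atransform_documents`), and that an `OPENAI_API_KEY` environment variable is set. The helper names `make_property`, `PROPERTIES`, and `run_doctran_pipeline` are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List


def make_property(
    name: str, description: str, type_: str = "string", required: bool = True
) -> Dict[str, Any]:
    """Build one property spec as a plain dict.

    The dict shape (name/description/type/required) is an assumption about
    what DoctranPropertyExtractor accepts; check it against your version.
    """
    return {
        "name": name,
        "description": description,
        "type": type_,
        "required": required,
    }


# Hypothetical properties to pull out of each document.
PROPERTIES: List[Dict[str, Any]] = [
    make_property("category", "What type of document this is."),
    make_property("is_confidential", "Whether the text is marked confidential.", "boolean"),
]


def run_doctran_pipeline(text: str, target_language: str = "spanish") -> None:
    """Run the three transformers over one text and print what each returns.

    Imports are deferred so this module loads even when the optional
    packages are missing. Assumed install: pip install doctran langchain-community
    Each call is expected to contact an LLM, so it needs network access and a key.
    """
    from langchain_community.document_transformers import (
        DoctranPropertyExtractor,
        DoctranQATransformer,
        DoctranTextTranslator,
    )
    from langchain_core.documents import Document

    docs = [Document(page_content=text)]

    # Document Interrogator: generates question/answer pairs for the text.
    qa_docs = DoctranQATransformer().transform_documents(docs)
    print(qa_docs[0].metadata)

    # Property Extractor: fills in the properties described above.
    extracted = DoctranPropertyExtractor(properties=PROPERTIES).transform_documents(docs)
    print(extracted[0].metadata)

    # Document Translator: rewrites the page content in the target language.
    translated = DoctranTextTranslator(language=target_language).transform_documents(docs)
    print(translated[0].page_content)
```

Calling `run_doctran_pipeline("some messy text")` would exercise all three steps in order. The function is deliberately not invoked at import time, since every step depends on an external model call.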