DocxLoader allows you to extract text data from Microsoft Word documents. It supports both the modern .docx format and the legacy .doc format. Depending on the file type, additional dependencies are required.
Setup
To use DocxLoader, you'll need the @langchain/community integration package along with either the mammoth or the word-extractor package:
mammoth: For processing .docx files.
word-extractor: For handling .doc files.
Installation
For .docx Files
npm install @langchain/community mammoth
For .doc Files
npm install @langchain/community word-extractor
Usage
Loading .docx Files
For .docx files, there is no need to explicitly specify any parameters when initializing the loader:
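As a minimal sketch of the default case, the loader can be constructed with only a file path. The import path below is assumed to be the loader's entry point in @langchain/community, and `loadDocx` is a hypothetical wrapper introduced only for illustration:

```typescript
// Assumed entry point for the loader within @langchain/community.
import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

// Hypothetical helper: load a .docx file and return the extracted documents.
// No options are passed, so the loader treats the file as .docx.
async function loadDocx(filePath: string) {
  const loader = new DocxLoader(filePath);
  return loader.load();
}
```

Calling `await loadDocx("path/to/file.docx")` (a placeholder path) should resolve to an array of documents whose `pageContent` holds the extracted text.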
Loading .doc Files
For .doc files, you must explicitly specify the type as doc when initializing the loader:
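When a program handles both formats, the type option can be derived from the file extension. The sketch below assumes the options object has the shape `{ type: "docx" | "doc" }`; the `loaderOptionsFor` helper is hypothetical and not part of the library:

```typescript
// Option shape assumed from the text: "doc" for legacy files, "docx" otherwise.
type DocxLoaderOptions = { type: "docx" | "doc" };

// Hypothetical helper: choose the loader type from the file extension.
function loaderOptionsFor(filePath: string): DocxLoaderOptions {
  return filePath.toLowerCase().endsWith(".doc")
    ? { type: "doc" } // legacy format, handled by word-extractor
    : { type: "docx" }; // modern format, handled by mammoth
}
```

The result would be passed as the second constructor argument. For a legacy file this amounts to `new DocxLoader("path/to/file.doc", { type: "doc" })`, where the path is a placeholder.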