This guide will take you through the steps required to load documents from Notion pages and databases using the Notion API.

Overview

Notion is a versatile productivity platform that consolidates note-taking, task management, and data organization tools into one interface. This document loader converts full Notion pages and databases into LangChain Documents ready to be integrated into your projects.

Setup

  1. Install the LangChain packages together with the official Notion client and the notion-to-md package, which are required as peer dependencies:
npm install @langchain/community @langchain/core @notionhq/client notion-to-md
  2. Create a Notion integration and securely record the Internal Integration Secret (used below as NOTION_INTEGRATION_TOKEN).
  3. Add a connection to your new integration on your page or database. To do this, open your Notion page, open the settings menu (the three dots) in the top right, scroll down to Add connections, and select your new integration.
  4. Get the PAGE_ID or DATABASE_ID for the page or database you want to load.
The 32-character hexadecimal string in the URL path is the ID. For example:
PAGE_ID: https://www.notion.so/skarard/LangChain-Notion-API-b34ca03f219c4420a6046fc4bdfdf7b4
DATABASE_ID: https://www.notion.so/skarard/c393f19c3903440da0d34bf9c6c12ff2?v=9c70a0f4e174498aa0f9021e0a9d52de
REGEX: /(?<!=)[0-9a-f]{32}/
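
As a minimal sketch of how the regex above could be applied, the snippet below pulls the ID out of the two example URLs. The helper name extractNotionId is hypothetical (it is not part of LangChain or the Notion client), and the sketch assumes a runtime that supports regex lookbehind, such as a recent Node.js release.

```typescript
// Pattern from the guide: 32 lowercase hex characters that do not start
// immediately after "=" (which skips the "?v=" view ID in database URLs).
const NOTION_ID_REGEX = /(?<!=)[0-9a-f]{32}/;

// Hypothetical helper: returns the first matching ID, or null if none is found.
function extractNotionId(url: string): string | null {
  const match = url.match(NOTION_ID_REGEX);
  return match ? match[0] : null;
}

const pageId = extractNotionId(
  "https://www.notion.so/skarard/LangChain-Notion-API-b34ca03f219c4420a6046fc4bdfdf7b4"
);
const databaseId = extractNotionId(
  "https://www.notion.so/skarard/c393f19c3903440da0d34bf9c6c12ff2?v=9c70a0f4e174498aa0f9021e0a9d52de"
);

console.log({ pageId, databaseId });
// pageId:     "b34ca03f219c4420a6046fc4bdfdf7b4"
// databaseId: "c393f19c3903440da0d34bf9c6c12ff2"
```

The resulting strings can be passed as the id option of the loader in place of the <PAGE_ID> and <DATABASE_ID> placeholders.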

Example Usage

import { NotionAPILoader } from "@langchain/community/document_loaders/web/notionapi";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Loading a page (child pages are included, each as a separate document)
const pageLoader = new NotionAPILoader({
  clientOptions: {
    auth: "<NOTION_INTEGRATION_TOKEN>",
  },
  id: "<PAGE_ID>",
  type: "page",
});

const splitter = new RecursiveCharacterTextSplitter();

// Load the documents
const pageDocs = await pageLoader.load();
// Split the documents using the text splitter
const splitDocs = await splitter.splitDocuments(pageDocs);

console.log({ splitDocs });

// Loading a database (each row is a separate document with all properties as metadata)
const dbLoader = new NotionAPILoader({
  clientOptions: {
    auth: "<NOTION_INTEGRATION_TOKEN>",
  },
  id: "<DATABASE_ID>",
  type: "database",
  onDocumentLoaded: (current, total, currentTitle) => {
    console.log(`Loaded Page: ${currentTitle} (${current}/${total})`);
  },
  callerOptions: {
    maxConcurrency: 64, // Default value
  },
  propertiesAsHeader: true, // Prepends a front matter header of the page properties to the page contents
});

// A database row's contents are likely to be under 1000 characters, so the rows are not split into multiple documents
const dbDocs = await dbLoader.load();

console.log({ dbDocs });
