Fleet AI Context is a dataset of high-quality embeddings of the top 1,200 most popular and permissive Python libraries and their documentation.
The Fleet AI team is on a mission to embed the world's most important data. They've started by embedding the top 1,200 Python libraries to enable code generation with up-to-date knowledge.
Let’s take a look at how we can use these embeddings to power a docs retrieval system and ultimately a simple code-generating chain!
Retriever chunks
As part of their embedding process, the Fleet AI team first chunked long documents before embedding them. This means the vectors correspond to sections of pages in the LangChain docs, not entire pages. By default, when we spin up a retriever from these embeddings, we'll be retrieving these embedded chunks. We'll use Fleet Context's `download_embeddings()` to grab LangChain's documentation embeddings. You can view all supported libraries' documentation at fleet.so/context.
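Here's a minimal sketch of loading the embeddings and standing up a chunk-level retriever. It assumes the `fleet-context` package's `download_embeddings()` helper and its published DataFrame schema (a `dense_embeddings` column plus a `metadata` dict containing the chunk text); `FAISS.from_embeddings` lets us reuse the precomputed vectors instead of re-embedding anything:

```python
# pip install fleet-context langchain-community langchain-openai faiss-cpu pandas
from context import download_embeddings  # fleet-context's download helper
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Download LangChain's documentation embeddings as a pandas DataFrame.
df = download_embeddings("langchain")

# Pair each chunk's text with its precomputed vector so nothing is re-embedded.
# (Column/key names here follow Fleet Context's schema; adjust if yours differ.)
texts_embeddings = [
    (row.metadata["text"], row["dense_embeddings"]) for _, row in df.iterrows()
]
metadatas = [row.metadata for _, row in df.iterrows()]

# The embedding model is only used to embed queries at search time;
# Fleet's vectors were produced with text-embedding-ada-002.
vectorstore = FAISS.from_embeddings(
    texts_embeddings,
    OpenAIEmbeddings(model="text-embedding-ada-002"),
    metadatas=metadatas,
)
chunk_retriever = vectorstore.as_retriever()
```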
Other packages
You can download and use other embeddings from this Dropbox link.

Retrieve parent docs
The embeddings provided by Fleet AI contain metadata that indicates which embedding chunks correspond to the same original document page. If we'd like, we can use this information to retrieve whole parent documents, not just the embedded chunks. Under the hood, we'll use a MultiVectorRetriever and a BaseStore object to search for relevant chunks and then map them to their parent document.
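A sketch of that mapping, assuming Fleet Context's metadata includes a "parent" field identifying each chunk's source page (the grouping and concatenation logic below is illustrative, not Fleet's exact pipeline):

```python
from langchain.retrievers import MultiVectorRetriever
from langchain.storage import InMemoryStore
from langchain_core.documents import Document

# Reassemble full pages by concatenating chunks that share a "parent" id.
# (The "parent" metadata key is assumed from Fleet Context's schema; naive
# concatenation order is a simplification.)
parent_docs = {}
for _, row in df.iterrows():
    parent_id = row.metadata["parent"]
    if parent_id not in parent_docs:
        parent_docs[parent_id] = Document(page_content="", metadata=row.metadata)
    parent_docs[parent_id].page_content += row.metadata["text"] + "\n\n"

docstore = InMemoryStore()
docstore.mset(list(parent_docs.items()))

# Searches still run against the chunk vectors, but each hit is mapped back
# to its parent document via the id_key metadata field.
parent_retriever = MultiVectorRetriever(
    vectorstore=vectorstore,  # the chunk-level FAISS store from above
    docstore=docstore,
    id_key="parent",
)
```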
Putting it in a chain

Let's try using our retrieval systems in a simple chain!
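For example, here is a minimal retrieval-augmented chain wired up with LCEL (the model choice and prompt wording are illustrative):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a great software engineer who is very familiar with Python. "
            "Given a user question about LangChain and snippets of the LangChain "
            "documentation, answer the question or generate the requested code.\n\n"
            "LangChain docs:\n{context}",
        ),
        ("human", "{question}"),
    ]
)

model = ChatOpenAI(model="gpt-4")

chain = (
    {
        "question": RunnablePassthrough(),
        # Flatten the retrieved Documents into a single context string.
        "context": parent_retriever
        | (lambda docs: "\n\n".join(d.page_content for d in docs)),
    }
    | prompt
    | model
    | StrOutputParser()
)

print(
    chain.invoke(
        "How do I create a FAISS vector store retriever that returns "
        "10 documents per search query?"
    )
)
```

Swapping `parent_retriever` for `chunk_retriever` gives the same chain over the finer-grained embedded chunks.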