Overview
KnowledgeBase enables you to build Retrieval-Augmented Generation (RAG) systems by automatically processing documents, creating embeddings, and storing them in vector databases. It integrates seamlessly with Agent and Task to provide relevant context for AI-powered queries.
Key Features
- Automatic Processing: Loads documents, chunks text, creates embeddings, and stores in vector databases
- Multiple Formats: Supports PDFs, Markdown, DOCX, CSV, JSON, HTML, and more
- Intelligent Chunking: Auto-detects optimal text splitting strategies based on file type and use case
- Flexible Storage: Works with Chroma, Milvus, Qdrant, Pinecone, Weaviate, FAISS, PGVector, and SuperMemory
- Hybrid Search: Combines dense vector search with full-text search for better results
- Tool Integration: Can be used as a tool, allowing agents to actively search and retrieve information
- Named Knowledge Bases: Use name, description, and topics to help agents intelligently route queries across multiple knowledge bases
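The processing steps listed above (chunk, embed, store, retrieve) and the hybrid-search idea can be sketched with the Python standard library alone. This is a toy illustration under stated assumptions, not KnowledgeBase's API: ToyStore, chunk, embed, cosine, keyword_score, and the alpha weight are hypothetical names, and the bag-of-words "embedding" stands in for a real embedding model.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (hypothetical stand-in)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Full-text signal: fraction of query terms found verbatim in the text."""
    terms = set(query.lower().split())
    found = set(text.lower().split())
    return len(terms & found) / len(terms) if terms else 0.0


class ToyStore:
    """In-memory stand-in for a vector database (hypothetical)."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        # Chunk the document, embed each chunk, and store both.
        for piece in chunk(text):
            self.rows.append((piece, embed(piece)))

    def search(self, query: str, top_k: int = 1, alpha: float = 0.5) -> list[str]:
        # Hybrid score: weighted sum of the dense and full-text signals.
        q = embed(query)
        scored = [
            (alpha * cosine(q, vec) + (1 - alpha) * keyword_score(query, text), text)
            for text, vec in self.rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


store = ToyStore()
store.add("Refunds are processed within five business days")
store.add("Shipping is free for orders over fifty dollars")
print(store.search("how are refunds processed"))
```

The alpha parameter shows the design choice behind hybrid search: at 1.0 only the vector similarity counts, at 0.0 only the keyword overlap, and values in between blend the two.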
Installation
To use KnowledgeBase, you'll need to install the required dependencies for your chosen vector database and, optionally, for document loaders and embedding providers.
Example: Setting up KnowledgeBase with Chroma
A complete RAG setup using Chroma as the vector database, the PDF loader, and OpenAI embeddings requires three optional dependency groups.
What each optional group provides:
- [chroma] - ChromaDB vector database client
- [pdf-loader] - PDF document loader (PyPDF)
- [embeddings] - Embedding providers (OpenAI, etc.)
For a different vector database, replace chroma with qdrant, milvus, weaviate, pinecone, faiss, pgvector, or supermemory. For other loaders, see the Loaders documentation.
Quick Start
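Create a KnowledgeBase from your documents, then use it with an Agent. The Getting Started guide listed under Navigation walks through building a first RAG system.
Integrations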
KnowledgeBase supports a rich ecosystem of integrations for vector stores, embedding providers, document loaders, and text splitters.
Vector Stores
Chroma, Qdrant, Pinecone, Milvus, PGVector, FAISS, Weaviate, SuperMemory
Embedding Providers
OpenAI, Azure, Google, AWS Bedrock, HuggingFace, FastEmbed, Ollama
Document Loaders
PDF, DOCX, CSV, JSON, Markdown, HTML, XML, YAML, Text & more
Text Splitters
Recursive, Semantic, Agentic, Character, Markdown, HTML, JSON, Python
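To illustrate what a recursive splitter generally does, the sketch below tries coarse separators first (paragraph breaks) and falls back to finer ones (line breaks, then spaces) only for pieces that are still too long. The function name recursive_split, its parameters, and the separator order are assumptions made for this example; it is not the library's implementation.

```python
def recursive_split(
    text: str,
    max_len: int,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Split text into pieces of at most max_len characters."""
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # No separator left to try: fall back to a hard cut.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_len:
            # Keep merging neighbouring parts while they still fit.
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(part) <= max_len:
            current = part
        else:
            # This part alone is too long: retry with finer separators.
            chunks.extend(recursive_split(part, max_len, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


pieces = recursive_split("alpha beta\n\ngamma delta epsilon", max_len=12)
print(pieces)
```

Trying coarse separators first keeps paragraphs and lines together where possible, which tends to preserve more context per chunk than cutting at a fixed character count.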
Navigation
- Getting Started - Build your first RAG system in 5 minutes
- Attributes - Configuration options for KnowledgeBase
- Putting Files - How to add documents to your knowledge base
- Using as Tool - Use KnowledgeBase as a tool in Agent or Task
- Query Control - Control when RAG context is injected
- Examples - Practical runnable examples
- Vector Stores - Choose and configure your vector database
Advanced Features
- Auto-Detection - Intelligent loader and splitter selection
- Indexed Processing - Per-source loaders and splitters
- Direct Content - Ingest raw text without files
- Vector Search Tuning - Fine-tune retrieval per Task
- Document Management - Add, remove, refresh documents dynamically
- Storage Persistence - Persist document metadata to a storage backend
- Isolated Search - Scope queries to a single KnowledgeBase in a shared collection

