
Chonkie LlamaIndex Integration

Hafedh Hichri · Technical Staff · February 19, 2026 · 4 min read

Chonkie is now officially integrated into the LlamaIndex ecosystem as a native node parser and embedding module. This means you can use Chonkie's algorithms for chunking and embeddings without leaving the LlamaIndex API.

The Chunker node parser

LlamaIndex's node parsers turn raw documents into TextNode objects that the rest of the pipeline can index and retrieve. The new Chunker node parser wraps Chonkie's chunkers behind that same interface, so every Chonkie strategy is available as a standard node parser.

Installation (Chunker)

bash
pip install llama-index-node-parser-chonkie

Quick start

The simplest way to use it is to pass a chunker alias and any keyword arguments the underlying Chonkie chunker accepts:

python
from llama_index.node_parser.chonkie import Chunker

parser = Chunker("recursive", chunk_size=2048)
nodes = parser.get_nodes_from_documents(documents)

If you already have a Chonkie chunker instance, you can pass it directly:

python
from chonkie import RecursiveChunker
from llama_index.node_parser.chonkie import Chunker

chonkie_chunker = RecursiveChunker(chunk_size=512)
parser = Chunker(chonkie_chunker)
nodes = parser.get_nodes_from_documents(documents)

Supported chunkers

The chunker parameter accepts an alias string for any of the supported strategies:

  • recursive (default) — recursively splits text using a hierarchy of separators
  • sentence — splits at sentence boundaries
  • token — splits by token count
  • word — splits by word count
  • semantic — splits by semantic similarity
  • late — late chunking strategy
  • neural — neural-based chunking
  • code — optimized for source code
  • fast — high-performance basic chunking
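To build intuition for the default recursive strategy, here is a minimal, illustrative sketch of separator-hierarchy splitting in plain Python. This is not Chonkie's actual implementation (Chonkie counts tokens rather than characters and preserves separators); the function name and separator list are our own for illustration:

```python
def recursive_split(text: str, separators: list[str], chunk_size: int) -> list[str]:
    """Illustrative recursive splitter: try the coarsest separator first,
    then fall back to finer ones for any piece that is still too large.
    Note: str.split drops the separators themselves; a real chunker keeps them."""
    if len(text) <= chunk_size:
        return [text]
    if not separators:
        # No separators left: hard-split at the size limit.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    head, *rest = separators
    chunks: list[str] = []
    for piece in text.split(head):
        if len(piece) <= chunk_size:
            if piece:
                chunks.append(piece)
        else:
            # Piece still too big: descend to the next, finer separator.
            chunks.extend(recursive_split(piece, rest, chunk_size))
    return chunks

text = "Para one.\n\nPara two is a bit longer."
chunks = recursive_split(text, ["\n\n", ". ", " "], chunk_size=20)
```

The key property is that well-formed units (paragraphs, then sentences, then words) are kept intact whenever they fit, and only oversized pieces get split at the next-finer boundary.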

To see the full list at runtime:

python
from llama_index.node_parser.chonkie import Chunker

print(Chunker.valid_chunker_types)

Using it in an ingestion pipeline

Chunker is a standard LlamaIndex transformation, so it slots directly into IngestionPipeline:

python
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.node_parser.chonkie import Chunker

pipeline = IngestionPipeline(
    transformations=[
        Chunker("recursive", chunk_size=512),
        # ... embed, store, etc.
    ]
)

nodes = pipeline.run(documents=[Document.example()])

Semantic chunking with a custom model

Keyword arguments are forwarded to the underlying Chonkie chunker, so you can configure any strategy in full:

python
from llama_index.node_parser.chonkie import Chunker

parser = Chunker(
    "semantic",
    chunk_size=512,
    embedding_model="all-MiniLM-L6-v2",
    threshold=0.5,
)
nodes = parser.get_nodes_from_documents(documents)

AutoEmbeddings

Chonkie's AutoEmbeddings module picks the right embeddings backend from a model name string — the same way HuggingFace's AutoModel works. The LlamaIndex integration exposes this as ChonkieAutoEmbedding, a drop-in replacement for any LlamaIndex embedding model.

Supported providers:

  • Sentence Transformers — local, no API key needed
  • Model2Vec — fast static embeddings
  • OpenAI — text-embedding-3-small, text-embedding-3-large, etc.
  • Cohere — embed-english-v3.0, etc.
  • Jina AI — jina-embeddings-v3, etc.
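Conceptually, this kind of auto-resolution is just a dispatch on the model name string. The sketch below is illustrative only — the prefix table and function are hypothetical, not Chonkie's internals — but it captures the idea of routing a name like text-embedding-3-large to the matching backend and falling back to a local one:

```python
# Hypothetical prefix-to-backend table, for illustration only.
PREFIX_TO_BACKEND = {
    "text-embedding-": "openai",
    "embed-english-": "cohere",
    "jina-embeddings-": "jina",
}

def resolve_backend(model_name: str) -> str:
    """Pick a backend label from the model name; default to a local backend."""
    for prefix, backend in PREFIX_TO_BACKEND.items():
        if model_name.startswith(prefix):
            return backend
    # Unrecognized names are assumed to be local sentence-transformers models.
    return "sentence-transformers"
```

This is why the examples below never name a provider explicitly: the model name alone is enough to select the right backend.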

Shout out to Clelia for adding this integration to LlamaIndex!

Installation (AutoEmbeddings)

bash
pip install llama-index-embeddings-autoembeddings

Local embeddings (no API key)

python
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

embedder = ChonkieAutoEmbedding(model_name="all-MiniLM-L6-v2")
vector = embedder.get_text_embedding(
    "The quick brown fox jumps over the lazy dog."
)

Cloud embeddings

Set the provider's API key as an environment variable and pass the model name — ChonkieAutoEmbedding figures out the rest:

python
import os
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

os.environ["OPENAI_API_KEY"] = "YOUR-API-KEY"

embedder = ChonkieAutoEmbedding(model_name="text-embedding-3-large")
vector = embedder.get_text_embedding(
    "The quick brown fox jumps over the lazy dog."
)

Putting it all together

Here is a complete RAG pipeline using both integrations — Chonkie for chunking and ChonkieAutoEmbedding for embeddings:

python
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.vector_stores import SimpleVectorStore
from llama_index.node_parser.chonkie import Chunker
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

pipeline = IngestionPipeline(
    transformations=[
        Chunker("semantic", chunk_size=512, threshold=0.5),
        ChonkieAutoEmbedding(model_name="all-MiniLM-L6-v2"),
    ],
    vector_store=SimpleVectorStore(),
)

documents = [Document(text="Your corpus goes here...")]
nodes = pipeline.run(documents=documents)

Why this matters

Before these integrations, using Chonkie inside LlamaIndex meant writing glue code to convert chunks into nodes and wiring up a custom embedding class. Now both are first-class LlamaIndex components: installable from PyPI, composable with the rest of the ecosystem, and documented in the official LlamaIndex module guide.

If you are building RAG pipelines with LlamaIndex and want better chunking, this is the path of least resistance.