
Chonkie LlamaIndex Integration

Hafedh Hichri · Technical Staff · February 19, 2026 · 4 min read

Chonkie is now officially integrated into the LlamaIndex ecosystem as a native node parser and embedding module. This means you can use Chonkie's algorithms for chunking and embeddings without leaving the LlamaIndex API.

The Chunker node parser

LlamaIndex's node parsers turn raw documents into TextNode objects that the rest of the pipeline can index and retrieve. The new Chunker node parser wraps Chonkie's chunkers behind that same interface, so every Chonkie strategy is available as a standard node parser.

Installation (Chunker)

bash
pip install llama-index-node-parser-chonkie

Quick start

The simplest way to use it is to pass a chunker alias and any keyword arguments the underlying Chonkie chunker accepts:

python
from llama_index.node_parser.chonkie import Chunker

parser = Chunker("recursive", chunk_size=2048)
nodes = parser.get_nodes_from_documents(documents)

If you already have a Chonkie chunker instance, you can pass it directly:

python
from chonkie import RecursiveChunker
from llama_index.node_parser.chonkie import Chunker

chonkie_chunker = RecursiveChunker(chunk_size=512)
parser = Chunker(chonkie_chunker)
nodes = parser.get_nodes_from_documents(documents)

Supported chunkers

The chunker parameter accepts an alias string for any of the supported strategies:

  • recursive (default) — recursively splits text using a hierarchy of separators
  • sentence — splits at sentence boundaries
  • token — splits by token count
  • word — splits by word count
  • semantic — splits by semantic similarity
  • late — late chunking strategy
  • neural — neural-based chunking
  • code — optimized for source code
  • fast — high-performance basic chunking
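To build intuition for the default recursive strategy, here is a minimal, illustrative sketch of separator-hierarchy splitting in plain Python. This is not Chonkie's actual implementation (Chonkie counts tokens rather than characters and preserves separators); the function name and separator list are our own for illustration:

```python
def recursive_split(text: str, separators: list[str], chunk_size: int) -> list[str]:
    """Illustrative recursive splitter: try the coarsest separator first,
    then fall back to finer ones for any piece that is still too large.
    Note: str.split drops the separators themselves; a real chunker keeps them."""
    if len(text) <= chunk_size:
        return [text]
    if not separators:
        # No separators left: hard-split at the size limit.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    head, *rest = separators
    chunks: list[str] = []
    for piece in text.split(head):
        if len(piece) <= chunk_size:
            if piece:
                chunks.append(piece)
        else:
            # Piece still too big: descend to the next, finer separator.
            chunks.extend(recursive_split(piece, rest, chunk_size))
    return chunks

text = "Para one.\n\nPara two is a bit longer."
chunks = recursive_split(text, ["\n\n", ". ", " "], chunk_size=20)
```

The key property is that well-formed units (paragraphs, then sentences, then words) are kept intact whenever they fit, and only oversized pieces get split at the next-finer boundary.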

To see the full list at runtime:

python
from llama_index.node_parser.chonkie import Chunker

print(Chunker.valid_chunker_types)

Using it in an ingestion pipeline

Chunker is a standard LlamaIndex transformation, so it slots directly into IngestionPipeline:

python
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.node_parser.chonkie import Chunker

pipeline = IngestionPipeline(
    transformations=[
        Chunker("recursive", chunk_size=512),
        # ... embed, store, etc.
    ]
)

nodes = pipeline.run(documents=[Document.example()])

Semantic chunking with a custom model

Keyword arguments are forwarded to the underlying Chonkie chunker, so you can configure any strategy in full:

python
from llama_index.node_parser.chonkie import Chunker

parser = Chunker(
    "semantic",
    chunk_size=512,
    embedding_model="all-MiniLM-L6-v2",
    threshold=0.5,
)
nodes = parser.get_nodes_from_documents(documents)

AutoEmbeddings

Chonkie's AutoEmbeddings module picks the right embeddings backend from a model name string — the same way HuggingFace's AutoModel works. The LlamaIndex integration exposes this as ChonkieAutoEmbedding, a drop-in replacement for any LlamaIndex embedding model.

Supported providers:

  • Sentence Transformers — local, no API key needed
  • Model2Vec — fast static embeddings
  • OpenAI — text-embedding-3-small, text-embedding-3-large, etc.
  • Cohere — embed-english-v3.0, etc.
  • Jina AI — jina-embeddings-v3, etc.
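Conceptually, this kind of auto-resolution is just a dispatch on the model name string. The sketch below is illustrative only — the prefix table and function are hypothetical, not Chonkie's internals — but it captures the idea of routing a name like text-embedding-3-large to the matching backend and falling back to a local one:

```python
# Hypothetical prefix-to-backend table, for illustration only.
PREFIX_TO_BACKEND = {
    "text-embedding-": "openai",
    "embed-english-": "cohere",
    "jina-embeddings-": "jina",
}

def resolve_backend(model_name: str) -> str:
    """Pick a backend label from the model name; default to a local backend."""
    for prefix, backend in PREFIX_TO_BACKEND.items():
        if model_name.startswith(prefix):
            return backend
    # Unrecognized names are assumed to be local sentence-transformers models.
    return "sentence-transformers"
```

This is why the examples below never name a provider explicitly: the model name alone is enough to select the right backend.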

Shout out to Clelia for adding this integration to LlamaIndex!

Installation (AutoEmbeddings)

bash
pip install llama-index-embeddings-autoembeddings

Local embeddings (no API key)

python
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

embedder = ChonkieAutoEmbedding(model_name="all-MiniLM-L6-v2")
vector = embedder.get_text_embedding(
    "The quick brown fox jumps over the lazy dog."
)

Cloud embeddings

Set the provider's API key as an environment variable and pass the model name — ChonkieAutoEmbedding figures out the rest:

python
import os
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

os.environ["OPENAI_API_KEY"] = "YOUR-API-KEY"

embedder = ChonkieAutoEmbedding(model_name="text-embedding-3-large")
vector = embedder.get_text_embedding(
    "The quick brown fox jumps over the lazy dog."
)

Putting it all together

Here is a complete RAG pipeline using both integrations — Chonkie for chunking and ChonkieAutoEmbedding for embeddings:

python
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.vector_stores import SimpleVectorStore
from llama_index.node_parser.chonkie import Chunker
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

pipeline = IngestionPipeline(
    transformations=[
        Chunker("semantic", chunk_size=512, threshold=0.5),
        ChonkieAutoEmbedding(model_name="all-MiniLM-L6-v2"),
    ],
    vector_store=SimpleVectorStore(),
)

documents = [Document(text="Your corpus goes here...")]
nodes = pipeline.run(documents=documents)

Why this matters

Before these integrations, using Chonkie inside LlamaIndex meant writing glue code to convert chunks into nodes and wiring up a custom embedding class. Now both are first-class LlamaIndex components: installable from PyPI, composable with the rest of the ecosystem, and documented in the official LlamaIndex module guide.

If you are building RAG pipelines with LlamaIndex and want better chunking, this is the path of least resistance.