Chroma
Since Camel 4.17
Only producer is supported
The Chroma Component provides support for interacting with the Chroma Vector Database.
Chroma is an open-source embedding database designed for AI applications. It allows you to store, search, and retrieve embeddings along with their associated metadata and documents.
URI format
chroma:collection[?options]
Where collection represents a named collection in your Chroma database.
Configuring Options
Camel components are configured on two separate levels:
- component level
- endpoint level
Configuring Component Options
At the component level, you set general and shared configurations that are then inherited by the endpoints. It is the highest configuration level.
For example, a component may have security settings, credentials for authentication, URLs for network connections, and so forth.
Some components have only a few options, and others may have many. Because components typically ship with commonly used defaults, you often need to configure only a few options on a component, or none at all.
You can configure components using:
- the Component DSL.
- in a configuration file (application.properties, *.yaml files, etc.).
- directly in the Java code.
Configuring Endpoint Options
You usually spend more time setting up endpoints because they have many options. These options help you customize what you want the endpoint to do. The options are also categorized into whether the endpoint is used as a consumer (from), as a producer (to), or both.
Configuring endpoints is most often done directly in the endpoint URI as path and query parameters. You can also use the Endpoint DSL and DataFormat DSL as a type-safe way of configuring endpoints and data formats in Java.
A good practice when configuring options is to use Property Placeholders.
Property placeholders provide a few benefits:
- They help prevent using hardcoded URLs, port numbers, sensitive information, and other settings.
- They allow externalizing the configuration from the code.
- They help the code to become more flexible and reusable.
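To make the idea concrete, here is a minimal plain-Java sketch of what placeholder substitution does to an endpoint URI. It only illustrates the concept and is not Camel's actual resolver. The class name PlaceholderSketch, the property key chroma.host, and the host query option on the URI are assumptions made for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {

    // Matches {{key}} tokens, the syntax Camel uses for property placeholders.
    private static final Pattern TOKEN = Pattern.compile("\\{\\{([^{}]+)}}");

    // Replaces every {{key}} in the template with its value from props.
    // Unknown keys fail fast instead of leaving the token in place.
    static String resolve(String template, Map<String, String> props) {
        Matcher m = TOKEN.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String key = m.group(1).trim();
            String value = props.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for property: " + key);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical externalized setting; in practice it would live in a configuration file.
        Map<String, String> props = Map.of("chroma.host", "http://localhost:8000");
        // prints chroma:myCollection?host=http://localhost:8000
        System.out.println(resolve("chroma:myCollection?host={{chroma.host}}", props));
    }
}
```

The route definition keeps only the placeholder, so the same code can point at a different server by changing the property value.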
The following two sections list all the options, first for the component and then for the endpoint.
Component Options
The Chroma component supports 5 options, which are listed below.
| Name | Description | Default | Type |
|---|---|---|---|
| | The configuration. | | ChromaConfiguration |
| | The Chroma server host URL. | | String |
| | Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. | false | boolean |
| | Max results for similarity search. | 10 | int |
| | Whether autowiring is enabled. This is used for automatic autowiring options (the option must be marked as autowired) by looking up in the registry to find if there is a single instance of matching type, which then gets configured on the component. This can be used for automatic configuring JDBC data sources, JMS connection factories, AWS Clients, etc. | true | boolean |
Endpoint Options
The Chroma endpoint is configured using URI syntax:
chroma:collection
With the following path and query parameters:
Query Parameters (3 parameters)
| Name | Description | Default | Type |
|---|---|---|---|
| | The Chroma server host URL. | | String |
| | Max results for similarity search. | 10 | int |
| | Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. | false | boolean |
Message Headers
The Chroma component supports 12 message headers, which are listed below:
| Name | Description | Default | Type |
|---|---|---|---|
| | The action to be performed. | | String |
| CamelChromaCollectionName (producer) | The collection name. | | String |
| | The embedding IDs. | | List |
| CamelChromaEmbeddings (producer) | The embeddings. | | List |
| CamelChromaMetadatas (producer) | The metadata for embeddings. | | List |
| CamelChromaDocuments (producer) | The documents for embeddings. | | List |
| CamelChromaQueryEmbeddings (producer) | The query embeddings for similarity search. | | List |
| CamelChromaNResults (producer) | The number of results to return. | 10 | Integer |
| | Chroma where filter. | | Map |
| CamelChromaWhereDocument (producer) | Chroma where document filter. | | Map |
| | The fields to include in the result. | | List |
| CamelChromaOperationStatus (producer) | The operation status. | | String |
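The where filter header takes a Map. In the simplest case, as in the metadata-filter query example later in this document, each entry can be read as "this metadata key must equal this value". The plain-Java sketch below illustrates only that equality reading on the client side. Chroma's filter language also has operators (for example $and, $or, $gt) that the sketch ignores, and the class and method names are hypothetical, not part of the component.

```java
import java.util.Map;
import java.util.Objects;

public class WhereFilterSketch {

    // True when every key/value pair in `where` is present, with an equal value, in `metadata`.
    static boolean matches(Map<String, String> metadata, Map<String, String> where) {
        for (Map.Entry<String, String> condition : where.entrySet()) {
            if (!Objects.equals(metadata.get(condition.getKey()), condition.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> where = Map.of("category", "integration");
        // prints true: the category matches, the extra "source" key is ignored
        System.out.println(matches(Map.of("source", "wiki", "category", "integration"), where));
        // prints false: the category differs
        System.out.println(matches(Map.of("source", "docs", "category", "database"), where));
    }
}
```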
Embedding Function
Chroma requires an EmbeddingFunction to convert documents into vector embeddings. You must configure an embedding function for the component to work properly.
The chromadb-java-client provides several embedding function implementations:
- DefaultEmbeddingFunction - Uses ONNX runtime for local embeddings
- OpenAIEmbeddingFunction - Uses OpenAI API
- CohereEmbeddingFunction - Uses Cohere API
- OllamaEmbeddingFunction - Uses Ollama for local embeddings
- HuggingFaceEmbeddingFunction - Uses Hugging Face API
You can also provide your own implementation of the EmbeddingFunction interface.
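To show what an embedding function does in principle, the sketch below turns each document into a fixed-length, L2-normalized vector by hashing its words into buckets. This is a toy for illustration, not a useful semantic embedding. The TextEmbedder interface is a hypothetical stand-in defined inside the sketch; it is not the chromadb-java-client EmbeddingFunction interface, whose exact method signatures should be checked against that library before writing a real implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class HashingEmbedderSketch {

    // Hypothetical contract: one vector per input document.
    interface TextEmbedder {
        List<float[]> embed(List<String> documents);
    }

    // Toy embedder: counts words per hash bucket, then scales the vector to unit length.
    static class HashingEmbedder implements TextEmbedder {
        private final int dimensions;

        HashingEmbedder(int dimensions) {
            this.dimensions = dimensions;
        }

        @Override
        public List<float[]> embed(List<String> documents) {
            List<float[]> vectors = new ArrayList<>();
            for (String doc : documents) {
                float[] v = new float[dimensions];
                for (String word : doc.toLowerCase(Locale.ROOT).split("\\W+")) {
                    if (word.isEmpty()) {
                        continue;
                    }
                    v[Math.floorMod(word.hashCode(), dimensions)] += 1f;
                }
                double norm = 0;
                for (float x : v) {
                    norm += x * x;
                }
                norm = Math.sqrt(norm);
                if (norm > 0) {
                    for (int i = 0; i < v.length; i++) {
                        v[i] /= (float) norm;
                    }
                }
                vectors.add(v);
            }
            return vectors;
        }
    }

    public static void main(String[] args) {
        TextEmbedder embedder = new HashingEmbedder(8);
        List<float[]> vectors = embedder.embed(List.of(
                "Camel is an open source integration framework",
                "Chroma is a vector database"));
        System.out.println(vectors.size() + " vectors of length " + vectors.get(0).length);
    }
}
```

A real embedding function would call a model (local or remote) instead of hashing, but the shape of the contract is the same: a list of texts in, a list of equally sized numeric vectors out.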
Configuring Embedding Function
Java
// Using OpenAI embeddings
OpenAIEmbeddingFunction embeddingFunction = new OpenAIEmbeddingFunction(
    System.getenv("OPENAI_API_KEY"),
    "text-embedding-ada-002"
);
ChromaComponent component = context.getComponent("chroma", ChromaComponent.class);
component.getConfiguration().setHost("http://localhost:8000");
component.getConfiguration().setEmbeddingFunction(embeddingFunction);
Examples
Collection Examples
Create Collection
In the route below, we use the chroma component to create a collection named myCollection:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.CREATE_COLLECTION)
    .to("chroma:myCollection");
Get Collection
In the route below, we use the chroma component to get information about a collection named myCollection:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.GET_COLLECTION)
    .to("chroma:myCollection")
    .process(exchange -> {
        Collection collection = exchange.getMessage().getBody(Collection.class);
        log.info("Collection name: {}", collection.getName());
    });
Document Examples
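The document examples in this section pass IDs, documents, and metadata as parallel lists, where position i in each list describes the same record. A small pre-flight check, sketched below in plain Java, can catch mismatched list sizes before a message reaches the endpoint. The class and method names are hypothetical and not part of the component, and treating duplicate IDs within one batch as an error is an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchCheckSketch {

    // Verifies that ids, documents, and (optional) metadatas line up one-to-one.
    // Pass null for metadatas when the batch carries no metadata.
    static void validate(List<String> ids,
                         List<String> documents,
                         List<Map<String, String>> metadatas) {
        if (ids == null || ids.isEmpty()) {
            throw new IllegalArgumentException("ids must not be empty");
        }
        if (documents == null || documents.size() != ids.size()) {
            throw new IllegalArgumentException("documents and ids must have the same size");
        }
        if (metadatas != null && metadatas.size() != ids.size()) {
            throw new IllegalArgumentException("metadatas and ids must have the same size");
        }
        Set<String> seen = new HashSet<>();
        for (String id : ids) {
            if (!seen.add(id)) {
                throw new IllegalArgumentException("duplicate id in batch: " + id);
            }
        }
    }

    public static void main(String[] args) {
        // Aligned batch: passes silently.
        validate(List.of("doc1", "doc2"),
                 List.of("Camel is an open source integration framework",
                         "Chroma is a vector database"),
                 List.of(Map.of("source", "wiki"), Map.of("source", "docs")));
        System.out.println("batch is consistent");
    }
}
```

Such a check could run in a processor placed before the chroma endpoint, so that a malformed batch fails with a clear message instead of a server-side error.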
Add Documents
In the route below, we use the chroma component to add documents to a collection. The embedding function will automatically generate embeddings for the documents:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.ADD)
    .setHeader(ChromaHeaders.IDS)
    .constant(Arrays.asList("doc1", "doc2", "doc3"))
    .setBody()
    .constant(Arrays.asList(
        "Camel is an open source integration framework",
        "Chroma is a vector database for AI applications",
        "Apache Camel supports many components"
    ))
    .to("chroma:myCollection");
Add Documents with Metadata
You can also add metadata to documents for filtering during queries:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.ADD)
    .setHeader(ChromaHeaders.IDS)
    .constant(Arrays.asList("doc1", "doc2"))
    .setHeader(ChromaHeaders.METADATAS)
    .constant(Arrays.asList(
        Map.of("source", "wiki", "category", "integration"),
        Map.of("source", "docs", "category", "database")
    ))
    .setBody()
    .constant(Arrays.asList(
        "Camel is an open source integration framework",
        "Chroma is a vector database"
    ))
    .to("chroma:myCollection");
Add Documents with Pre-computed Embeddings
If you have pre-computed embeddings, you can pass them directly via headers:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.ADD)
    .setHeader(ChromaHeaders.IDS)
    .constant(Arrays.asList("doc1"))
    .setHeader(ChromaHeaders.EMBEDDINGS)
    .constant(Arrays.asList(
        Arrays.asList(0.1f, 0.2f, 0.3f, 0.4f) // Your embedding vector
    ))
    .setBody()
    .constant(Arrays.asList("Document text"))
    .to("chroma:myCollection");
Query Examples
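Conceptually, a query ranks stored embeddings by how close they are to the query embedding and returns the best N_RESULTS matches. The plain-Java sketch below shows that idea with brute-force cosine similarity over an in-memory map. It illustrates the concept only: it says nothing about how Chroma indexes or scores vectors internally, and the distance metric Chroma actually applies depends on how the collection is configured. All names are hypothetical, and the sketch assumes non-zero vectors of equal length.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TopNSketch {

    // Cosine similarity of two equally sized, non-zero vectors.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the n stored vectors most similar to the query, best match first.
    static List<String> topN(Map<String, float[]> store, float[] query, int n) {
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, float[]> e) -> -cosine(e.getValue(), query)))
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, float[]> store = new LinkedHashMap<>();
        store.put("doc1", new float[] {1f, 0f});
        store.put("doc2", new float[] {0f, 1f});
        store.put("doc3", new float[] {0.9f, 0.1f});
        // prints [doc1, doc3]
        System.out.println(topN(store, new float[] {1f, 0f}, 2));
    }
}
```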
Query Documents
In the route below, we use the chroma component to perform a semantic search query:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.QUERY)
    .setHeader(ChromaHeaders.N_RESULTS)
    .constant(5)
    .setBody()
    .constant(Collections.singletonList("integration framework"))
    .to("chroma:myCollection")
    .process(exchange -> {
        Collection.QueryResponse response = exchange.getMessage()
            .getBody(Collection.QueryResponse.class);
        // Process query results
    });
Query with Metadata Filter
You can filter query results using metadata:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.QUERY)
    .setHeader(ChromaHeaders.N_RESULTS)
    .constant(10)
    .setHeader(ChromaHeaders.WHERE)
    .constant(Map.of("category", "integration"))
    .setBody()
    .constant(Collections.singletonList("apache camel"))
    .to("chroma:myCollection");
Query with Include Options
You can specify which fields to include in the response:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.QUERY)
    .setHeader(ChromaHeaders.N_RESULTS)
    .constant(5)
    .setHeader(ChromaHeaders.INCLUDE)
    .constant(Arrays.asList("documents", "metadatas", "distances"))
    .setBody()
    .constant(Collections.singletonList("search query"))
    .to("chroma:myCollection");
Get Examples
Get Documents by IDs
In the route below, we retrieve specific documents by their IDs:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.GET)
    .setHeader(ChromaHeaders.IDS)
    .constant(Arrays.asList("doc1", "doc2"))
    .to("chroma:myCollection")
    .process(exchange -> {
        Collection.GetResult result = exchange.getMessage()
            .getBody(Collection.GetResult.class);
        List<String> documents = result.getDocuments();
        List<String> ids = result.getIds();
    });
Update Examples
Update Documents
In the route below, we update existing documents:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.UPDATE)
    .setHeader(ChromaHeaders.IDS)
    .constant(Arrays.asList("doc1"))
    .setHeader(ChromaHeaders.METADATAS)
    .constant(Arrays.asList(
        Map.of("source", "wiki", "updated", "true")
    ))
    .setBody()
    .constant(Arrays.asList("Updated document content"))
    .to("chroma:myCollection");
Upsert Examples
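As a mental model for how the upsert action differs from the update action, the plain-Java sketch below treats a collection as a map from ID to document text. It is a simplification with hypothetical names, not the component's implementation. In particular, silently ignoring an update for an unknown ID is an assumption of the sketch, not a statement about how Chroma reports that case.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpsertSketch {

    // Update: changes a document only when its id already exists.
    static void update(Map<String, String> store, String id, String document) {
        store.computeIfPresent(id, (key, old) -> document);
    }

    // Upsert: inserts the document when the id is new, overwrites it otherwise.
    static void upsert(Map<String, String> store, String id, String document) {
        store.put(id, document);
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("doc1", "Original content for doc1");

        update(store, "doc4", "New document doc4");   // no effect: doc4 does not exist yet
        System.out.println(store.containsKey("doc4")); // prints false

        upsert(store, "doc1", "Updated content for doc1"); // overwrites doc1
        upsert(store, "doc4", "New document doc4");        // inserts doc4
        System.out.println(store.keySet());                // prints [doc1, doc4]
    }
}
```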
Upsert Documents
Upsert will insert new documents or update existing ones:
Java
from("direct:in")
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.UPSERT)
    .setHeader(ChromaHeaders.IDS)
    .constant(Arrays.asList("doc1", "doc4"))
    .setBody()
    .constant(Arrays.asList(
        "Updated content for doc1",
        "New document doc4"
    ))
    .to("chroma:myCollection");
RAG (Retrieval-Augmented Generation) Example
Here’s an example of using Chroma for RAG with a language model:
Java
from("direct:ask")
    // Keep the incoming question: the query response replaces the message body
    .setProperty("question", body())
    // First, query Chroma for relevant documents
    .setHeader(ChromaHeaders.ACTION)
    .constant(ChromaAction.QUERY)
    .setHeader(ChromaHeaders.N_RESULTS)
    .constant(3)
    .to("chroma:knowledgeBase")
    // Build context from retrieved documents
    .process(exchange -> {
        Collection.QueryResponse response = exchange.getMessage()
            .getBody(Collection.QueryResponse.class);
        String context = String.join("\n", response.getDocuments().get(0));
        String question = exchange.getProperty("question", String.class);
        String prompt = "Context:\n" + context + "\n\nQuestion: " + question;
        exchange.getMessage().setBody(prompt);
    })
    // Send to LLM for answer generation
    .to("langchain4j-chat:myModel");
Spring Boot Auto-Configuration
When using chroma with Spring Boot, make sure to use the following Maven dependency to get support for auto-configuration:
<dependency>
    <groupId>org.apache.camel.springboot</groupId>
    <artifactId>camel-chroma-starter</artifactId>
    <version>x.x.x</version>
    <!-- use the same version as your Camel core version -->
</dependency>
The component supports 6 options, which are listed below.
| Name | Description | Default | Type |
|---|---|---|---|
| | Whether autowiring is enabled. This is used for automatic autowiring options (the option must be marked as autowired) by looking up in the registry to find if there is a single instance of matching type, which then gets configured on the component. This can be used for automatic configuring JDBC data sources, JMS connection factories, AWS Clients, etc. | true | Boolean |
| | The configuration. The option is an org.apache.camel.component.chroma.ChromaConfiguration type. | | ChromaConfiguration |
| | Whether to enable auto configuration of the chroma component. This is enabled by default. | | Boolean |
| | The Chroma server host URL. | | String |
| | Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. | false | Boolean |
| | Max results for similarity search. | 10 | Integer |