Debezium MongoDB Connector Component

Since Camel 3.0

Only consumer is supported

The Debezium MongoDB component is wrapper around Debezium using Debezium Embedded, which enables Change Data Capture from MongoDB database using Debezium without the need for Kafka or Kafka Connect.

Note: The Debezium MongoDB connector uses MongoDB’s oplog to capture the changes, so the connector works only with MongoDB replica sets or with sharded clusters where each shard is a separate replica set, therefore you will need to have your MongoDB instance running either in replica set mode or sharded clusters mode.

Note on handling failures: Per Debezium Embedded Engine documentation, the engines is actively recording source offsets and periodically flushes these offsets to a persistent storage, so when the application is restarted or crashed, the engine will resume from the last recorded offset. Thus, at normal operation, your downstream routes will receive each event exactly once, however in case of an application crash (not having a graceful shutdown), the application will resume from the last recorded offset, which may result in receiving duplicate events immediately after the restart. Therefore, your downstream routes should be tolerant enough of such case and deduplicate events if needed.

Note: The Debezium Mongodb component is currently not supported in OSGi

Maven users will need to add the following dependency to their pom.xml for this component.

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-debezium-mongodb</artifactId>
    <version>x.x.x</version>
    <!-- use the same version as your Camel core version -->
</dependency>

URI format

debezium-mongodb:name[?options]

Options

The Debezium MongoDB Connector component supports 4 options, which are listed below.

Name Description Default Type

configuration (consumer)

Allow pre-configured Configurations to be set.

MongoDbConnectorEmbeddedDebeziumConfiguration

basicPropertyBinding (advanced)

Whether the component should use basic property binding (Camel 2.x) or the newer property binding with additional capabilities

false

boolean

lazyStartProducer (producer)

Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing.

false

boolean

bridgeErrorHandler (consumer)

Allows for bridging the consumer to the Camel routing Error Handler, which mean any exceptions occurred while the consumer is trying to pickup incoming messages, or the likes, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the org.apache.camel.spi.ExceptionHandler to deal with exceptions, that will be logged at WARN or ERROR level and ignored.

false

boolean

The Debezium MongoDB Connector endpoint is configured using URI syntax:

debezium-mongodb:name

with the following path and query parameters:

Path Parameters (1 parameters):

Name Description Default Type

name

Required Unique name for the connector. Attempting to register again with the same name will fail.

String

Query Parameters (43 parameters):

Name Description Default Type

bridgeErrorHandler (consumer)

Allows for bridging the consumer to the Camel routing Error Handler, which mean any exceptions occurred while the consumer is trying to pickup incoming messages, or the likes, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the org.apache.camel.spi.ExceptionHandler to deal with exceptions, that will be logged at WARN or ERROR level and ignored.

false

boolean

internalKeyConverter (consumer)

The Converter class that should be used to serialize and deserialize key data for offsets. The default is JSON converter.

org.apache.kafka.connect.json.JsonConverter

String

internalValueConverter (consumer)

The Converter class that should be used to serialize and deserialize value data for offsets. The default is JSON converter.

org.apache.kafka.connect.json.JsonConverter

String

offsetCommitPolicy (consumer)

The name of the Java class of the commit policy. It defines when offsets commit has to be triggered based on the number of events processed and the time elapsed since the last commit. This class must implement the interface 'OffsetCommitPolicy'. The default is a periodic commit policy based upon time intervals.

io.debezium.embedded.spi.OffsetCommitPolicy.PeriodicCommitOffsetPolicy

String

offsetCommitTimeoutMs (consumer)

Maximum number of milliseconds to wait for records to flush and partition offset data to be committed to offset storage before cancelling the process and restoring the offset data to be committed in a future attempt. The default is 5 seconds.

5000

long

offsetFlushIntervalMs (consumer)

Interval at which to try committing offsets. The default is 1 minute.

60000

long

offsetStorage (consumer)

The name of the Java class that is responsible for persistence of connector offsets.

org.apache.kafka.connect.storage.FileOffsetBackingStore

String

offsetStorageFileName (consumer)

Path to file where offsets are to be stored. Required when offset.storage is set to the FileOffsetBackingStore

String

offsetStoragePartitions (consumer)

The number of partitions used when creating the offset storage topic. Required when offset.storage is set to the 'KafkaOffsetBackingStore'.

int

offsetStorageReplication Factor (consumer)

Replication factor used when creating the offset storage topic. Required when offset.storage is set to the KafkaOffsetBackingStore

int

offsetStorageTopic (consumer)

The name of the Kafka topic where offsets are to be stored. Required when offset.storage is set to the KafkaOffsetBackingStore.

String

exceptionHandler (consumer)

To let the consumer use a custom ExceptionHandler. Notice if the option bridgeErrorHandler is enabled then this option is not in use. By default the consumer will deal with exceptions, that will be logged at WARN or ERROR level and ignored.

ExceptionHandler

exchangePattern (consumer)

Sets the exchange pattern when the consumer creates an exchange.

ExchangePattern

basicPropertyBinding (advanced)

Whether the endpoint should use basic property binding (Camel 2.x) or the newer property binding with additional capabilities

false

boolean

synchronous (advanced)

Sets whether synchronous processing should be strictly used, or Camel is allowed to use asynchronous processing (if supported).

false

boolean

collectionBlacklist (mongodb)

Description is not available here, please check Debezium website for corresponding key 'collection.blacklist' description.

String

collectionWhitelist (mongodb)

The collections for which changes are to be captured

String

connectBackoffInitialDelay Ms (mongodb)

The initial delay when trying to reconnect to a primary after a connection cannot be made or when no primary is available. Defaults to 1 second (1000 ms).

1000

long

connectBackoffMaxDelayMs (mongodb)

The maximum delay when trying to reconnect to a primary after a connection cannot be made or when no primary is available. Defaults to 120 second (120,000 ms).

120000

long

connectMaxAttempts (mongodb)

Maximum number of failed connection attempts to a replica set primary before an exception occurs and task is aborted. Defaults to 16, which with the defaults for 'connect.backoff.initial.delay.ms' and 'connect.backoff.max.delay.ms' results in just over 20 minutes of attempts before failing.

16

int

databaseBlacklist (mongodb)

The databases for which changes are to be excluded

String

databaseHistoryFileFilename (mongodb)

The path to the file that will be used to record the database history

String

databaseWhitelist (mongodb)

The databases for which changes are to be captured

String

fieldBlacklist (mongodb)

Description is not available here, please check Debezium website for corresponding key 'field.blacklist' description.

String

fieldRenames (mongodb)

Description is not available here, please check Debezium website for corresponding key 'field.renames' description.

String

heartbeatIntervalMs (mongodb)

Length of an interval in milli-seconds in in which the connector periodically sends heartbeat messages to a heartbeat topic. Use 0 to disable heartbeat messages. Disabled by default.

0

int

heartbeatTopicsPrefix (mongodb)

The prefix that is used to name heartbeat topics.Defaults to __debezium-heartbeat.

__debezium-heartbeat

String

initialSyncMaxThreads (mongodb)

Maximum number of threads used to perform an intial sync of the collections in a replica set. Defaults to 1.

1

int

maxBatchSize (mongodb)

Maximum size of each batch of source records. Defaults to 2048.

2048

int

maxQueueSize (mongodb)

Maximum size of the queue for change events read from the database log but not yet recorded or forwarded. Defaults to 8192, and should always be larger than the maximum batch size.

8192

int

mongodbHosts (mongodb)

The hostname and port pairs (in the form 'host' or 'host:port') of the MongoDB server(s) in the replica set.

String

mongodbMembersAutoDiscover (mongodb)

Specifies whether the addresses in 'hosts' are seeds that should be used to discover all members of the cluster or replica set ('true'), or whether the address(es) in 'hosts' should be used as is ('false'). The default is 'true'.

true

boolean

mongodbName (mongodb)

Required Unique name that identifies the MongoDB replica set or cluster and all recorded offsets, andthat is used as a prefix for all schemas and topics. Each distinct MongoDB installation should have a separate namespace and monitored by at most one Debezium connector.

String

mongodbPassword (mongodb)

Required Password to be used when connecting to MongoDB, if necessary.

String

mongodbSslEnabled (mongodb)

Should connector use SSL to connect to MongoDB instances

false

boolean

mongodbSslInvalidHostname Allowed (mongodb)

Whether invalid host names are allowed when using SSL. If true the connection will not prevent man-in-the-middle attacks

false

boolean

mongodbUser (mongodb)

Database user for connecting to MongoDB, if necessary.

String

pollIntervalMs (mongodb)

Frequency in milliseconds to wait for new change events to appear after receiving no events. Defaults to 500ms.

500

long

snapshotDelayMs (mongodb)

The number of milliseconds to delay before a snapshot will begin.

0

long

snapshotFetchSize (mongodb)

The maximum number of records that should be loaded into memory while performing a snapshot

int

snapshotMode (mongodb)

The criteria for running a snapshot upon startup of the connector. Options include: 'initial' (the default) to specify the connector should always perform an initial sync when required; 'never' to specify the connector should never perform an initial sync

initial

String

sourceStructVersion (mongodb)

A version of the format of the publicly visible source part in the message

v2

String

tombstonesOnDelete (mongodb)

Whether delete operations should be represented by a delete event and a subsquenttombstone event (true) or only by a delete event (false). Emitting the tombstone event (the default behavior) allows Kafka to completely delete all events pertaining to the given key once the source record got deleted.

false

boolean

Spring Boot Auto-Configuration

When using Spring Boot make sure to use the following Maven dependency to have support for auto configuration:

<dependency>
  <groupId>org.apache.camel.springboot</groupId>
  <artifactId>camel-debezium-mongodb-starter</artifactId>
  <version>x.x.x</version>
  <!-- use the same version as your Camel core version -->
</dependency>

The component supports 44 options, which are listed below.

Name Description Default Type

camel.component.debezium-mongodb.basic-property-binding

Whether the component should use basic property binding (Camel 2.x) or the newer property binding with additional capabilities

false

Boolean

camel.component.debezium-mongodb.bridge-error-handler

Allows for bridging the consumer to the Camel routing Error Handler, which mean any exceptions occurred while the consumer is trying to pickup incoming messages, or the likes, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the org.apache.camel.spi.ExceptionHandler to deal with exceptions, that will be logged at WARN or ERROR level and ignored.

false

Boolean

camel.component.debezium-mongodb.configuration.collection-blacklist

Description is not available here, please check Debezium website for corresponding key 'collection.blacklist' description.

String

camel.component.debezium-mongodb.configuration.collection-whitelist

The collections for which changes are to be captured

String

camel.component.debezium-mongodb.configuration.connect-backoff-initial-delay-ms

The initial delay when trying to reconnect to a primary after a connection cannot be made or when no primary is available. Defaults to 1 second (1000 ms).

1000

Long

camel.component.debezium-mongodb.configuration.connect-backoff-max-delay-ms

The maximum delay when trying to reconnect to a primary after a connection cannot be made or when no primary is available. Defaults to 120 second (120,000 ms).

120000

Long

camel.component.debezium-mongodb.configuration.connect-max-attempts

Maximum number of failed connection attempts to a replica set primary before an exception occurs and task is aborted. Defaults to 16, which with the defaults for 'connect.backoff.initial.delay.ms' and 'connect.backoff.max.delay.ms' results in just over 20 minutes of attempts before failing.

16

Integer

camel.component.debezium-mongodb.configuration.connector-class

The name of the Java class for the connector

Class

camel.component.debezium-mongodb.configuration.database-blacklist

The databases for which changes are to be excluded

String

camel.component.debezium-mongodb.configuration.database-history-file-filename

The path to the file that will be used to record the database history

String

camel.component.debezium-mongodb.configuration.database-whitelist

The databases for which changes are to be captured

String

camel.component.debezium-mongodb.configuration.field-blacklist

Description is not available here, please check Debezium website for corresponding key 'field.blacklist' description.

String

camel.component.debezium-mongodb.configuration.field-renames

Description is not available here, please check Debezium website for corresponding key 'field.renames' description.

String

camel.component.debezium-mongodb.configuration.heartbeat-interval-ms

Length of an interval in milli-seconds in in which the connector periodically sends heartbeat messages to a heartbeat topic. Use 0 to disable heartbeat messages. Disabled by default.

0

Integer

camel.component.debezium-mongodb.configuration.heartbeat-topics-prefix

The prefix that is used to name heartbeat topics.Defaults to __debezium-heartbeat.

__debezium-heartbeat

String

camel.component.debezium-mongodb.configuration.initial-sync-max-threads

Maximum number of threads used to perform an intial sync of the collections in a replica set. Defaults to 1.

1

Integer

camel.component.debezium-mongodb.configuration.internal-key-converter

The Converter class that should be used to serialize and deserialize key data for offsets. The default is JSON converter.

org.apache.kafka.connect.json.JsonConverter

String

camel.component.debezium-mongodb.configuration.internal-value-converter

The Converter class that should be used to serialize and deserialize value data for offsets. The default is JSON converter.

org.apache.kafka.connect.json.JsonConverter

String

camel.component.debezium-mongodb.configuration.max-batch-size

Maximum size of each batch of source records. Defaults to 2048.

2048

Integer

camel.component.debezium-mongodb.configuration.max-queue-size

Maximum size of the queue for change events read from the database log but not yet recorded or forwarded. Defaults to 8192, and should always be larger than the maximum batch size.

8192

Integer

camel.component.debezium-mongodb.configuration.mongodb-hosts

The hostname and port pairs (in the form 'host' or 'host:port') of the MongoDB server(s) in the replica set.

String

camel.component.debezium-mongodb.configuration.mongodb-members-auto-discover

Specifies whether the addresses in 'hosts' are seeds that should be used to discover all members of the cluster or replica set ('true'), or whether the address(es) in 'hosts' should be used as is ('false'). The default is 'true'.

true

Boolean

camel.component.debezium-mongodb.configuration.mongodb-name

Unique name that identifies the MongoDB replica set or cluster and all recorded offsets, andthat is used as a prefix for all schemas and topics. Each distinct MongoDB installation should have a separate namespace and monitored by at most one Debezium connector.

String

camel.component.debezium-mongodb.configuration.mongodb-password

Password to be used when connecting to MongoDB, if necessary.

String

camel.component.debezium-mongodb.configuration.mongodb-ssl-enabled

Should connector use SSL to connect to MongoDB instances

false

Boolean

camel.component.debezium-mongodb.configuration.mongodb-ssl-invalid-hostname-allowed

Whether invalid host names are allowed when using SSL. If true the connection will not prevent man-in-the-middle attacks

false

Boolean

camel.component.debezium-mongodb.configuration.mongodb-user

Database user for connecting to MongoDB, if necessary.

String

camel.component.debezium-mongodb.configuration.name

Unique name for the connector. Attempting to register again with the same name will fail.

String

camel.component.debezium-mongodb.configuration.offset-commit-policy

The name of the Java class of the commit policy. It defines when offsets commit has to be triggered based on the number of events processed and the time elapsed since the last commit. This class must implement the interface 'OffsetCommitPolicy'. The default is a periodic commit policy based upon time intervals.

io.debezium.embedded.spi.OffsetCommitPolicy.PeriodicCommitOffsetPolicy

String

camel.component.debezium-mongodb.configuration.offset-commit-timeout-ms

Maximum number of milliseconds to wait for records to flush and partition offset data to be committed to offset storage before cancelling the process and restoring the offset data to be committed in a future attempt. The default is 5 seconds.

5000

Long

camel.component.debezium-mongodb.configuration.offset-flush-interval-ms

Interval at which to try committing offsets. The default is 1 minute.

60000

Long

camel.component.debezium-mongodb.configuration.offset-storage

The name of the Java class that is responsible for persistence of connector offsets.

org.apache.kafka.connect.storage.FileOffsetBackingStore

String

camel.component.debezium-mongodb.configuration.offset-storage-file-name

Path to file where offsets are to be stored. Required when offset.storage is set to the FileOffsetBackingStore

String

camel.component.debezium-mongodb.configuration.offset-storage-partitions

The number of partitions used when creating the offset storage topic. Required when offset.storage is set to the 'KafkaOffsetBackingStore'.

Integer

camel.component.debezium-mongodb.configuration.offset-storage-replication-factor

Replication factor used when creating the offset storage topic. Required when offset.storage is set to the KafkaOffsetBackingStore

Integer

camel.component.debezium-mongodb.configuration.offset-storage-topic

The name of the Kafka topic where offsets are to be stored. Required when offset.storage is set to the KafkaOffsetBackingStore.

String

camel.component.debezium-mongodb.configuration.poll-interval-ms

Frequency in milliseconds to wait for new change events to appear after receiving no events. Defaults to 500ms.

500

Long

camel.component.debezium-mongodb.configuration.snapshot-delay-ms

The number of milliseconds to delay before a snapshot will begin.

0

Long

camel.component.debezium-mongodb.configuration.snapshot-fetch-size

The maximum number of records that should be loaded into memory while performing a snapshot

Integer

camel.component.debezium-mongodb.configuration.snapshot-mode

The criteria for running a snapshot upon startup of the connector. Options include: 'initial' (the default) to specify the connector should always perform an initial sync when required; 'never' to specify the connector should never perform an initial sync

initial

String

camel.component.debezium-mongodb.configuration.source-struct-version

A version of the format of the publicly visible source part in the message

v2

String

camel.component.debezium-mongodb.configuration.tombstones-on-delete

Whether delete operations should be represented by a delete event and a subsquenttombstone event (true) or only by a delete event (false). Emitting the tombstone event (the default behavior) allows Kafka to completely delete all events pertaining to the given key once the source record got deleted.

false

Boolean

camel.component.debezium-mongodb.enabled

Whether to enable auto configuration of the debezium-mongodb component. This is enabled by default.

Boolean

camel.component.debezium-mongodb.lazy-start-producer

Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing.

false

Boolean

Message headers

Consumer headers

The following headers are available when consuming change events from Debezium.

Header constant Header value Type Description

DebeziumConstants.HEADER_IDENTIFIER

"CamelDebeziumIdentifier"

String

The identifier of the connector, normally is this format "{server-name}.{database-name}.{table-name}".

DebeziumConstants.HEADER_KEY

"CamelDebeziumKey"

Struct

The key of the event, normally is the table Primary Key.

DebeziumConstants.HEADER_SOURCE_METADATA

"CamelDebeziumSourceMetadata"

Map

The metadata about the source event, for example table name, database name, log position, etc, please refer to the Debezium documentation for more info.

DebeziumConstants.HEADER_OPERATION

"CamelDebeziumOperation"

String

If presents, the type of event operation. Values for the connector are c for create (or insert), u for update, d for delete or r for read (in the case of a initial sync).

DebeziumConstants.HEADER_TIMESTAMP

"CamelDebeziumTimestamp"

Long

If presents, the time (using the system clock in the JVM) at which the connector processed the event.

Note: Debezium Mongodb uses MongoDB’s oplog to populate the CDC events, the update events in MongoDB’s oplog don’t have the before or after states of the changed document, so there’s no way for the Debezium connector to provide this information, therefore header key CamelDebeziumBefore is not available in this component.

Message body

The message body if is not null (in case of tombstones), it contains the state of the row after the event occurred as String JSON format and you can unmarchal using Camel JSON Data Format.

Samples

Consuming events

Here is a very simple route that you can use in order to listen to Debezium events from MongoDB connector.

from("debezium-mongodb:dbz-test-1?offsetStorageFileName=/usr/offset-file-1.dat&mongodbHosts=rs0/localhost:27017&mongodbUser=debezium&mongodbPassword=dbz&mongodbName=dbserver1&databaseHistoryFileName=/usr/history-file-1.dat")
    .log("Event received from Debezium : ${body}")
    .log("    with this identifier ${headers.CamelDebeziumIdentifier}")
    .log("    with these source metadata ${headers.CamelDebeziumSourceMetadata}")
    .log("    the event occured upon this operation '${headers.CamelDebeziumSourceOperation}'")
    .log("    on this database '${headers.CamelDebeziumSourceMetadata[db]}' and this table '${headers.CamelDebeziumSourceMetadata[table]}'")
    .log("    with the key ${headers.CamelDebeziumKey}")
    .choice()
        .when(header(DebeziumConstants.HEADER_OPERATION).in("c", "u", "r"))
            .unmarshal().json()
            .log("Event received from Debezium : ${body}")
         .end()
    .end();

By default, the component will emit the events in the body String JSON format in case of u, c or r operations, this can be easily converted to JSON using Camel JSON Data Format e.g: .unmarshal().json() like the above example. In case of operation d, the body will be null.

Important Note: This component is a thin wrapper around Debezium Engine as mentioned, therefore before using this component in production, you need to understand how Debezium works and how configurations can reflect the expected behavior, especially in regards to handling failures.