Any23 DataFormat

Available as of Camel version 3.0

Any23 is a data format that converts HTML from a site (or file) into RDF. Its main functionality is the unmarshal operation, which extracts RDF triples from compatible pages and supports a wide variety of RDF syntaxes.

Any23 Options

The Any23 data format supports 4 options, which are listed below.

| Name | Default | Java Type | Description |
| --- | --- | --- | --- |
| outputFormat | RDF4JMODEL | Any23Type | The RDF syntax to unmarshal to. One of: NTRIPLES, TURTLE, NQUADS, RDFXML, JSONLD, RDFJSON, RDF4JMODEL. The default is RDF4JMODEL. |
| extractors | | List | List of Any23 extractors to be used in the unmarshal operation. A list of the available extractors can be found in the Any23 documentation. If not provided, all the available extractors are used. |
| baseURI | | String | The URI to use as base for building RDF entities if only relative paths are provided. |
| contentTypeHeader | false | Boolean | Whether the data format should set the Content-Type header with the type from the data format if the data format is capable of doing so. For example, application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSON. |

Spring Boot Auto-Configuration

When using Spring Boot make sure to use the following Maven dependency to have support for auto configuration:

<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-any23-starter</artifactId>
  <version>x.x.x</version>
  <!-- use the same version as your Camel core version -->
</dependency>

The Spring Boot auto-configuration supports 4 options, which are listed below.

| Name | Description | Default | Type |
| --- | --- | --- | --- |
| camel.dataformat.any23.base-u-r-i | The URI to use as base for building RDF entities if only relative paths are provided. | | String |
| camel.dataformat.any23.content-type-header | Whether the data format should set the Content-Type header with the type from the data format if the data format is capable of doing so. For example, application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSON. | false | Boolean |
| camel.dataformat.any23.enabled | Whether to enable auto configuration of the any23 data format. This is enabled by default. | | Boolean |
| camel.dataformat.any23.extractors | List of Any23 extractors to be used in the unmarshal operation. A list of the available extractors can be found in the Any23 documentation. If not provided, all the available extractors are used. | | List |
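
As a minimal sketch, the properties above could be set in a Spring Boot application.properties file along these lines. The property names are the ones listed in this section. The values are illustrative assumptions: the base URI and the html-head-title extractor are borrowed from the examples in this document, and the comma-separated form for list values relies on Spring Boot's standard list binding.

```properties
# Illustrative values only; adjust them to your application.

# Base URI used to build RDF entities when only relative paths are provided
camel.dataformat.any23.base-u-r-i=http://mock.foo/bar

# Leave the Content-Type header untouched (false is the documented default)
camel.dataformat.any23.content-type-header=false

# Auto configuration of the any23 data format (enabled by default)
camel.dataformat.any23.enabled=true

# Restrict extraction to a single extractor; omit to use all available extractors
camel.dataformat.any23.extractors=html-head-title
```

Leaving out camel.dataformat.any23.extractors altogether keeps the documented behaviour of running every available extractor.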

Java DSL Example

An example where the consumer provides some HTML, with http://mock.foo/bar used as the base URI:

from("direct:start").unmarshal().any23("http://mock.foo/bar").to("mock:result");

Spring XML Example

The following example shows how to use Any23 to unmarshal using Spring XML:

<camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">
  <dataFormats>
    <!-- Any23 data format that emits Turtle and runs only the html-head-title extractor -->
    <any23 id="any23" baseURI="http://mock.foo/bar" outputFormat="TURTLE">
      <configurations>
        <entry>
          <key>any23.extraction.metadata.nesting</key>
          <value>off</value>
        </entry>
      </configurations>
      <extractors>html-head-title</extractors>
    </any23>
  </dataFormats>
  <route>
    <from uri="direct:start"/>
    <!-- Fetch the HTML page, then unmarshal it with the data format defined above -->
    <to uri="http://microformats.org/2009/08"/>
    <unmarshal>
      <custom ref="any23"/>
    </unmarshal>
    <to uri="mock:result"/>
  </route>
</camelContext>

Dependencies

To use Any23 in your Camel routes, you need to add a dependency on camel-any23, which implements this data format.

If you use Maven, add the following to your pom.xml, substituting the version number for the latest release (see the download page for the latest versions).

<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-any23</artifactId>
  <version>x.x.x</version>
</dependency>