Camel K Integration Scaling

Manual Scaling

An Integration can be scaled using the kubectl scale command, e.g.:

$ kubectl scale it <integration_name> --replicas <number_of_replicas>

This can also be achieved by editing the Integration resource directly, e.g.:

$ kubectl patch it <integration_name> -p '{"spec":{"replicas":<number_of_replicas>}}'

The Integration also reports its number of replicas in the .status.replicas field, e.g.:

$ kubectl get it <integration_name> -o jsonpath='{.status.replicas}'

Autoscaling with Knative

An Integration that deploys as a Knative Service can automatically scale based on incoming traffic, including scaling to zero.

The incoming traffic measures either as:

The number of simultaneous requests, that are processed by each replica at any given time;
Or the number of requests that are processed per second, per replica.

That implies the Integration must expose a container port, that receives incoming requests, and complies with the Knative runtime contract. This is the case when the Integration either:

Exposes an HTTP endpoint, using the Camel HTTP component or the REST DSL, e.g.:

rest('/')
  .produces("text/plain")
  .get()
    .route()
    .transform().constant("Response");

Or consumes Knative events, from a Broker or a Channel, using the Knative component, e.g.:
```
from("knative:channel/events")
  .convertBodyTo(String.class)
  .to("log:info")
```

The Knative Autoscaler can be configured using the Knative Service trait, e.g., to set the scaling upper bound (the maximum number of replicas):

$ kamel run -t knative-service.max-scale=10

More information can be found in the Knative Autoscaling documentation.

When manually scaling an Integration, that deploys as a Knative Service, both scale bounds, i.e., minScale and maxScale, are set to the specified number of replicas. Scale bounds can be reset by removing the .spec.replicas field from the Integration, e.g., with:

$ kubectl patch it <integration_name> --type=json -p='[{"op": "remove", "path": "/spec/replicas"}]'

Autoscaling with HPA

An Integration can automatically scale based on its CPU utilization and custom metrics using horizontal pod autoscaling (HPA).

For example, executing the following command creates an autoscaler for the Integration, with target CPU utilization set to 80%, and the number of replicas between 2 and 5:

$ kubectl autoscale it <integration_name> --min=2 --max=5 --cpu-percent=80

Integration metrics can also be exported for horizontal pod autoscaling (HPA), using the custom metrics Prometheus adapter, so that the Integration can scale automatically based on its own metrics.

If you have an OpenShift cluster, you can follow Exposing custom application metrics for autoscaling to set it up.

Assuming you have the Prometheus adapter up and running, you can create a HorizontalPodAutoscaler resource based on a particular Integration metric, e.g.:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: camel-k-autoscaler
spec:
  scaleTargetRef:
    apiVersion: camel.apache.org/v1
    kind: Integration
    name: example
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: application_camel_context_exchanges_inflight_count
      target:
        type: AverageValue
        averageValue: 1k

the HPA can work when the Integration replica field needs to be specified. You need to scale the Integration via kubectl scale it my-it --replicas 1 or edit the .spec.replicas field of your Integration to 1. This is due to a Kubernetes behavior which does not allow an empty value on the resource to scale.

More information can be found in Horizontal Pod Autoscaler from the Kubernetes documentation.

HPA can also be used with Knative, by installing the HPA autoscaling Serving extension.