Who Owns the OpenTelemetry Config?
A Kubernetes operator for splitting OpenTelemetry collector ownership between platform and developer teams.
Observability on a multi-tenant platform
On the Intility Developer Platform, each team provisions their own Kubernetes clusters on-demand. They control their deployments, scaling, and configuration, all without affecting other tenants.
Observability works differently. Traces and metrics need to end up somewhere central, and that destination is managed by the platform team. We use OpenTelemetry collectors to receive traces and metrics from applications and export them to backends like Logfire and Elasticsearch.
Deploying collectors isn't hard. The question is who owns the configuration.
The configuration problem
The platform team needs to control where telemetry goes. Write tokens for Logfire, endpoints for Elasticsearch, TLS certificates. Developers shouldn't have to manage these, or even see them.
Developers do have legitimate reasons to customize processing. One team wants to inject environment attributes. Another needs aggressive batching for high-throughput services. A third wants to drop debug spans before they leave the cluster.
Our first approach was a shared GitOps repository for collector configs. Want a change? Open a PR. It worked, but every request needed platform team review. We had to understand each team's context, merge processor configs carefully, and mediate when settings conflicted.
We considered giving each team their own collector. For unusual requirements like custom receivers or non-standard pipelines, that still makes sense. But most teams just want to tweak processors. Full collector ownership for everyone means more operational overhead and fragmented config.
Splitting the config
What we built is a Kubernetes operator with two custom resources.
The Collector resource represents a collector deployment. The platform team creates these with the base configuration: receivers accepting OTLP, exporters configured with backend credentials, resource limits, replica counts. Developer teams don't need access to any of this.
apiVersion: otel.intility.io/v1alpha1
kind: Collector
metadata:
  name: platform-collector
  namespace: observability
spec:
  replicas: 2
  allowProcessorOverride: true
  secrets:
    - name: logfire-credentials
      mountPath: /secrets/logfire
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
        timeout: 10s
      memory_limiter:
        limit_mib: 512
    exporters:
      otlphttp:
        endpoint: https://logfire.example.com
        headers:
          Authorization: Bearer ${file:/secrets/logfire/token}
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlphttp]

The CollectorProcessor resource lets developers override the processors section. They create it in their own namespace, pointing to whichever collector they want to customize. They don't need access to the observability namespace or visibility into export credentials.
apiVersion: otel.intility.io/v1alpha1
kind: CollectorProcessor
metadata:
  name: my-team-processors
  namespace: my-namespace
spec:
  priority: 10
  collectorRef:
    name: platform-collector
    namespace: observability
  processorConfig: |
    attributes:
      actions:
        - key: team
          value: my-team
          action: insert

The operator merges this into the base config and rolls out the change.
How the merge works
The processor configuration is merged, not replaced. In the example above, the developer only specifies an attributes processor. The batch and memory_limiter processors from the base Collector config remain intact. The final merged config contains all three.
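The merge semantics can be sketched in a few lines. This is a minimal illustration of the behavior described above, not the operator's actual code: processors are merged by name, with a same-named developer processor replacing the platform default wholesale and all other base processors preserved. The function name `mergeProcessors` is an assumption for illustration.

```go
package main

import "fmt"

// mergeProcessors merges developer overrides into the platform base config
// at the processor-name level: an override with the same name replaces the
// base processor entirely; base processors without an override are kept.
func mergeProcessors(base, override map[string]any) map[string]any {
	merged := make(map[string]any, len(base)+len(override))
	for name, cfg := range base {
		merged[name] = cfg // platform defaults
	}
	for name, cfg := range override {
		merged[name] = cfg // developer override wins per processor name
	}
	return merged
}

func main() {
	base := map[string]any{
		"batch":          map[string]any{"timeout": "10s"},
		"memory_limiter": map[string]any{"limit_mib": 512},
	}
	override := map[string]any{
		"attributes": map[string]any{
			"actions": []any{map[string]any{"key": "team", "value": "my-team", "action": "insert"}},
		},
	}
	merged := mergeProcessors(base, override)
	fmt.Println(len(merged)) // 3: batch, memory_limiter, attributes
}
```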
If the developer wants to change batch behavior, they can override it explicitly:
processorConfig: |
  batch:
    timeout: 5s
  attributes:
    actions:
      - key: team
        value: my-team
        action: insert

Now batch uses the developer's 5s timeout instead of the platform team's 10s default. The memory_limiter still comes from the base config.
The pipeline definition also gets updated. New processors are added to the configured pipelines automatically.
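The pipeline update can be sketched the same way. Assuming new processor names are simply appended to the existing pipeline order (the function name and ordering are illustrative, not the operator's actual behavior):

```go
package main

import "fmt"

// addToPipeline appends processor names introduced by a CollectorProcessor
// to the pipeline's processor list, skipping names already present so the
// platform team's ordering is preserved.
func addToPipeline(pipeline, mergedNames []string) []string {
	present := make(map[string]bool, len(pipeline))
	for _, name := range pipeline {
		present[name] = true
	}
	out := append([]string(nil), pipeline...)
	for _, name := range mergedNames {
		if !present[name] {
			out = append(out, name) // new processor goes at the end
		}
	}
	return out
}

func main() {
	fmt.Println(addToPipeline(
		[]string{"memory_limiter", "batch"},
		[]string{"batch", "memory_limiter", "attributes"},
	)) // [memory_limiter batch attributes]
}
```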
The resulting processors section in the ConfigMap:
processors:
  batch:
    timeout: 5s # overridden by CollectorProcessor
  memory_limiter:
    limit_mib: 512 # preserved from Collector
  attributes: # added by CollectorProcessor
    actions:
      - key: team
        value: my-team
        action: insert
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, attributes] # attributes added
      exporters: [otlphttp]

If multiple CollectorProcessors target the same collector, the highest priority wins. Most setups have a single CollectorProcessor per collector, but the field keeps behavior predictable when that's not the case.
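The selection rule is easy to picture. A minimal sketch, assuming a larger priority value ranks higher (the type and function names here are illustrative, not the operator's reconciler code):

```go
package main

import "fmt"

// CollectorProcessor carries only the fields relevant to selection;
// the real resource also holds the processor config and collectorRef.
type CollectorProcessor struct {
	Name     string
	Priority int
}

// selectWinner returns the CollectorProcessor with the highest priority,
// or ok=false when none target the collector.
func selectWinner(procs []CollectorProcessor) (winner CollectorProcessor, ok bool) {
	if len(procs) == 0 {
		return CollectorProcessor{}, false
	}
	winner = procs[0]
	for _, p := range procs[1:] {
		if p.Priority > winner.Priority {
			winner = p // higher priority value wins
		}
	}
	return winner, true
}

func main() {
	w, _ := selectWinner([]CollectorProcessor{
		{Name: "team-a-processors", Priority: 10},
		{Name: "team-b-processors", Priority: 20},
	})
	fmt.Println(w.Name) // team-b-processors
}
```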
This lets the platform team set sensible defaults while giving developers room to tune what matters for their workload.
Control stays with the platform team
The allowProcessorOverride field on the Collector determines whether processor overrides are permitted at all. Setting it to false locks the collector down completely. Any CollectorProcessor targeting it will fail with a clear error in its status.
This is useful when the platform team needs full control over a collector's behavior. Maybe it's handling sensitive data, or the processing pipeline is tuned for specific performance characteristics that shouldn't be modified.
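The gate itself amounts to a single check before any merge happens. A sketch under the assumption that rejection surfaces as an error written to the CollectorProcessor's status (names are illustrative):

```go
package main

import "fmt"

// Collector carries only the field relevant to the gate; the real
// resource also holds replicas, secrets, and the base config.
type Collector struct {
	Name                   string
	AllowProcessorOverride bool
}

// validateOverride rejects a CollectorProcessor when the referenced
// Collector has overrides disabled; the returned error would be
// reported on the CollectorProcessor's status.
func validateOverride(c Collector) error {
	if !c.AllowProcessorOverride {
		return fmt.Errorf("collector %q does not allow processor overrides", c.Name)
	}
	return nil
}

func main() {
	err := validateOverride(Collector{Name: "platform-collector"})
	fmt.Println(err) // collector "platform-collector" does not allow processor overrides
}
```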
When teams need more
Some teams have requirements that processor overrides can't satisfy. Maybe they need a custom receiver for a legacy protocol, or they're experimenting with a different export format. For these cases, they can deploy their own Collector resource in their namespace.
How to get started
The operator comes pre-installed on the Intility Developer Platform. Each cluster gets a configured Collector out of the box. Developer teams can start creating CollectorProcessor resources immediately, or deploy their own when needed.
This is handled automatically through fleet management, a topic for another article.
if (wantUpdates == true) {followIntilityOnLinkedIn();}


