On our path toward graduation, the OpenTelemetry project is currently undergoing a security audit sponsored by the CNCF, facilitated by OSTIF, and performed by 7ASecurity. During this process, we have received a few ideas about things that we could do better, like using specific compiler flags when preparing our OpenTelemetry Collector binaries. On 31 May 2024, we received a more serious report: a malicious user could cause a denial of service (DoS) by sending a specially crafted HTTP or gRPC request. The advisory was assigned the following CVE identifier: CVE-2024-36129.
When receiving an HTTP request with a compressed payload, the Collector would verify only whether the compressed payload exceeded a certain limit, without checking the size of its decompressed form. A malicious client could then send a “compression bomb”, a small compressed payload that expands to a very large one, causing the Collector to crash.
Similarly, when receiving a gRPC request using zstd compression, the decompression mechanism would not respect the limits imposed by gRPC, also causing the Collector to crash while decompressing the malicious payload.
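To make the failure mode concrete, here is a minimal sketch of the general technique of bounding the decompressed size of a payload, rather than only its compressed size. It is not the Collector's actual fix: it uses gzip from Go's standard library, and the names decompressWithLimit, gzipBytes, and errTooLarge are hypothetical helpers invented for this illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// errTooLarge is a hypothetical sentinel error used only in this sketch.
var errTooLarge = errors.New("decompressed payload exceeds limit")

// decompressWithLimit inflates a gzip payload but refuses to return
// more than limit bytes of decompressed data.
func decompressWithLimit(compressed []byte, limit int64) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	// Read at most limit+1 bytes. Getting the extra byte means the
	// stream is larger than allowed, so we stop without inflating the rest.
	out, err := io.ReadAll(io.LimitReader(zr, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(out)) > limit {
		return nil, errTooLarge
	}
	return out, nil
}

// gzipBytes compresses data in memory; used here to build test payloads.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(data) // writes to a bytes.Buffer do not fail
	_ = zw.Close()
	return buf.Bytes()
}

func main() {
	// 10 MiB of zeros compresses to a tiny payload that would pass a
	// check on the compressed size alone.
	bomb := gzipBytes(make([]byte, 10<<20))
	fmt.Printf("compressed size: %d bytes\n", len(bomb))

	_, err := decompressWithLimit(bomb, 1<<20) // allow 1 MiB decompressed
	fmt.Println("result:", err)
}
```

The key design choice is that the limit is enforced while reading from the decompressor, so memory use stays bounded no matter how small the compressed payload is.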
A few business hours after the report, on 03 June 2024, Collector developers were able to reproduce the HTTP part of the report and worked together on a fix that was merged the same day. Given the high severity score of this issue, we decided to hold the release that would have happened on the same day, completing it on 04 June 2024 instead.
After the release, we got a confirmation that gRPC with zstd was also affected. Within a few business hours of the confirmation, we worked on a fix that also got merged the same day. We released v0.102.1 right after that.
You are affected by this vulnerability if you have an OpenTelemetry Collector with one or more HTTP or gRPC receivers on a public port, such as the OTLP Receiver with the “HTTP” or “gRPC” protocol enabled (typically on ports 4318 and 4317, respectively) AND the receiver has version 0.102.0 or below. The vulnerability is exploitable only by attackers who can send payloads to your HTTP/gRPC endpoint(s). This usually means that the port needs to be exposed to the public internet or another network segment that’s available to the attacker.
Note that if you require authentication, an attacker would need to have valid credentials in order to exploit the vulnerability using the HTTP protocol. For gRPC, the exploitable code is executed before authentication.
If you manage a Collector that has an interface to the public internet, you should upgrade it as soon as feasible, and consider setting the parameter max_request_body_size on HTTP receivers, such as the OTLP receiver, to a value that makes sense for your workload. Up to v0.101.0, this setting applied only to the payload size sent by the client, which could often be compressed.
Starting from v0.102.0, this setting applies to uncompressed, compressed, and decompressed payload sizes, and we are establishing a default value of 20 MiB for it. This new default is a breaking change, as clients sending payloads bigger than 20 MiB will start seeing an error. While we believe most legitimate requests will be well within this limit, it’s still wise to monitor your Collector for increased error rates after this update. Here’s an example of a configuration setting a different limit for this field:
receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:4318
        max_request_body_size: 10485760 # 10 MiB
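To get a feel for what a client might observe once a body size limit is enforced, the following sketch applies a limit with http.MaxBytesReader from Go's standard library and exercises it with net/http/httptest. This is an illustration under assumptions, not the Collector's implementation: limitBody, consume, and statusFor are hypothetical helpers, and the 413 status code is a choice made for this sketch rather than documented Collector behavior.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// limitBody is hypothetical middleware that caps how many bytes of the
// request body the wrapped handler is allowed to read.
func limitBody(maxBytes int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
		next.ServeHTTP(w, r)
	})
}

// consume reads the whole body and maps an over-limit read to a 413
// response (an assumption of this sketch).
func consume(w http.ResponseWriter, r *http.Request) {
	if _, err := io.Copy(io.Discard, r.Body); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends an in-memory request with an n-byte body through the
// limited handler and returns the resulting HTTP status code.
func statusFor(n int, maxBytes int64) int {
	req := httptest.NewRequest(http.MethodPost, "/v1/traces", bytes.NewReader(make([]byte, n)))
	rec := httptest.NewRecorder()
	limitBody(maxBytes, http.HandlerFunc(consume)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	const limit = 10 * 1024 * 1024 // 10 MiB, matching the example value above

	fmt.Println(statusFor(1024, limit))    // body under the limit
	fmt.Println(statusFor(limit+1, limit)) // body one byte over the limit
}
```

Watching for a rise in rejected requests like the second one is one way to spot clients that are affected by a newly enforced limit.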
For gRPC receivers, it’s sufficient to upgrade to v0.102.1, because a default maximum message size of 4 MiB is already applied.
If your Collector instances are receiving data only from trusted clients, like your own applications, you are still encouraged to upgrade to the latest Collector version but you can do it at your regular pace.
If you are using a custom distribution and building it with the OpenTelemetry Collector Builder (ocb), you can add “replaces” entries pointing to the latest versions of the confighttp and configgrpc Go modules. If your base Collector version is at v0.96.0 or higher, we do not expect any compatibility issues from just bumping to the latest version.
During this process, we found a couple of gaps in the telemetry for the Collector, as well as in the options we provide to Collector admins. Concretely, we noticed that we do not have a good way to observe the distribution of request sizes received by the Collector, which would have been useful to determine whether the change would break clients for a given Collector. We also noticed that we don’t provide a way for admins to completely disable compression, which would be a good way to mitigate an attack without having to upgrade the Collector. We are working to fill those gaps over the next releases.
We are also working on stabilizing the component.UseLocalHostAsDefaultHost feature gate to reduce exposure of all Collector endpoints by default. This feature gate was motivated by a previous, similar vulnerability in Go’s standard library and has been in alpha for several months. You can follow the discussion surrounding stabilization at issue 8510.
This issue was identified by Miroslav Stampar, from 7ASecurity. We’d like to thank 7ASecurity for the responsible reporting of this vulnerability.