Which middleware has the bug?
@hono/otel
What version of the middleware?
1.1.1
What version of Hono are you using?
4.12.14
What runtime/platform is your app running on? (with version if possible)
Node.js 24.15.0 (Railway)
What steps can reproduce the bug?
@hono/otel creates its http.server.request.duration histogram without
passing advice.explicitBucketBoundaries:
```ts
// packages/otel/src/index.ts
const histogram = getMeter(config).createHistogram(
  METRIC_HTTP_SERVER_REQUEST_DURATION,
  {
    unit: 's',
    description: '...',
  },
);
```
With no advice, the OTel SDK falls back to its static default bucket
boundaries, which are legacy ms-scale values:
```
[0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000]
```

(see sdk-metrics/src/view/Aggregation.ts).
Since PR #1784 correctly converts the recorded value to seconds, every
sub-second request (the common case) lands in the first bucket [0, 5s].
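To make the mismatch concrete, here is a small illustrative sketch (the `bucketFor` helper is mine, not SDK code; the boundary arrays are the SDK default quoted above and the stable-semconv list from instrumentation-http quoted below) showing where a typical sub-millisecond request lands under each grid:

```typescript
// SDK static default (legacy ms-scale values) vs. stable HTTP semconv boundaries (seconds)
const defaultBoundaries = [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000];
const semconvBoundaries = [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10];

// Index of the first boundary >= value, i.e. the `le` bucket the observation counts toward.
const bucketFor = (value: number, bounds: number[]): number =>
  bounds.findIndex((b) => value <= b);

const seconds = 0.00068; // a 0.68 ms request, recorded in seconds after PR #1784

bucketFor(seconds, defaultBoundaries); // → 1 (le=5): same bucket as every other sub-5s request
bucketFor(seconds, semconvBoundaries); // → 0 (le=0.005): fast requests stay distinguishable
```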
Minimal repro:

- Use @hono/otel >= 1.1.1 with any OTLP metric exporter.
- Serve any route that returns in <5 seconds (i.e. ~all routes).
- Query Prometheus / VictoriaMetrics / any backend:

  ```
  histogram_quantile(0.99,
    sum by(le) (rate(http_server_request_duration_seconds_bucket[5m]))
  )
  ```

- Observe the result is always ~4.95 regardless of actual latency.
What is the expected behavior?
histogram_quantile(0.99, ...) should approximate the real p99 latency.
For a service where requests typically take a few ms, p99 should be on the
order of tens-to-hundreds of milliseconds — not ~4.95 seconds.
What do you see instead?
histogram_quantile(0.99) reports ~4.95s for every route, because
histogram_quantile interpolates linearly within the [0, 5s] bucket: 99% of
the way through that bucket is 0 + 5 * 0.99 = 4.95s.
Direct query on a sample histogram confirms the bucket layout:

```
le=0     count=0
le=5     count=29
le=10    count=29
le=25    count=29
... (all identical through le=10000)
le=+Inf  count=29
```
All 29 requests (mean latency 0.68ms per _sum / _count) sit in the first
bucket. histogram_quantile has no signal to differentiate fast from slow
within that bucket.
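The ~4.95 figure falls out directly from histogram_quantile's linear interpolation rule, which can be reproduced in a few lines. This is a sketch over the bucket dump above (cumulative counts are the reported ones), not Prometheus's actual implementation:

```typescript
// Bucket upper bounds (`le`) and cumulative counts from the dump above:
// all 29 observations sit in the (0, 5] bucket.
const les = [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000];
const cumulative = les.map((le) => (le >= 5 ? 29 : 0)); // le=0 → 0, every other bucket → 29

function histogramQuantile(q: number, bounds: number[], cum: number[]): number {
  const total = cum[cum.length - 1];
  const rank = q * total; // how many observations fall at or below the quantile
  let i = 0;
  while (cum[i] < rank) i++; // first bucket whose cumulative count covers the rank
  const lower = i === 0 ? 0 : bounds[i - 1];
  const below = i === 0 ? 0 : cum[i - 1];
  const inBucket = cum[i] - below;
  // Linear interpolation within the bucket: no signal about where
  // observations actually sit, so assume they are spread uniformly.
  return lower + (bounds[i] - lower) * ((rank - below) / inBucket);
}

histogramQuantile(0.99, les, cumulative); // ≈ 4.95, whatever the true latencies were
```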
Additional information
The OTel JS SDK's static histogram default was inherited from the pre-stable-semconv era, when HTTP durations were recorded in milliseconds. Stable HTTP semconv later switched units to seconds, but the SDK default boundaries were never updated — instead, well-behaved instrumentations pass their own advice.explicitBucketBoundaries to override.
@opentelemetry/instrumentation-http does this correctly:
```ts
this._stableHttpServerDurationHistogram = this.meter.createHistogram(
  METRIC_HTTP_SERVER_REQUEST_DURATION,
  {
    description: '...',
    unit: 's',
    advice: {
      explicitBucketBoundaries: [
        0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1,
        2.5, 5, 7.5, 10,
      ],
    },
  },
);
```
@hono/otel needs the same advice block. PR #1784 fixed the value unit (dividing by 1000 before recording) but did not add bucket advice — a one-line oversight that leaves the histogram effectively unreadable via histogram_quantile for the common sub-second request case.
Proposed fix: add advice.explicitBucketBoundaries with the stable HTTP semconv defaults (same values instrumentation-http uses), so Hono apps and apps instrumented via instrumentation-http land on identical bucket grids and share PromQL queries / dashboards.
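The change can be sketched against the createHistogram call quoted earlier (a sketch, not the exact patch; the description string is elided as in the original snippet):

```typescript
// packages/otel/src/index.ts — proposed shape: add only the advice block
const histogram = getMeter(config).createHistogram(
  METRIC_HTTP_SERVER_REQUEST_DURATION,
  {
    unit: 's',
    description: '...',
    advice: {
      // Stable HTTP semconv boundaries, in seconds — the same grid
      // @opentelemetry/instrumentation-http passes.
      explicitBucketBoundaries: [
        0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1,
        2.5, 5, 7.5, 10,
      ],
    },
  },
);
```

Advice is only a hint to the SDK, so any user-supplied View still takes precedence over these boundaries.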
Backward compatibility: similar character to #1784 — which was itself a behavior change shipped as a patch. Any user relying on the current (broken) bucket layout for alerts would see le labels change. Users who already override via an SDK-level View are unaffected (View precedence wins over advice).
Workaround for users hitting this today — override at the SDK level via a View in your telemetry bootstrap:
```ts
import { NodeSDK } from '@opentelemetry/sdk-node';
import { AggregationType } from '@opentelemetry/sdk-metrics';

const sdk = new NodeSDK({
  views: [{
    instrumentName: 'http.server.request.duration',
    aggregation: {
      type: AggregationType.EXPLICIT_BUCKET_HISTOGRAM,
      options: {
        boundaries: [
          0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1,
          2.5, 5, 7.5, 10,
        ],
      },
    },
  }],
  // ...
});
```