SWIP-11 Support iOS App Monitoring via OpenTelemetry
Motivation
iOS (including iPadOS) is one of the most important client-side platforms. Monitoring iOS app performance
— HTTP request latency, crash rates, app launch time — is as important as browser monitoring, which
SkyWalking has supported since v8.x.
The OpenTelemetry Swift SDK (v2.3.0, tracing
stable) provides auto-instrumentation for iOS apps including HTTP request tracing (URLSession), device/OS
resource attributes, and Apple MetricKit integration. All data is exported via standard OTLP.
Unlike browser monitoring which requires a custom SkyWalking protocol (BrowserPerf.proto) and a dedicated
receiver plugin, the OTel Swift SDK speaks standard OTLP. SkyWalking already has an OTLP receiver, so this
feature primarily requires layer detection, a MetricKit span analyzer, LAL rules for crash diagnostics, and
UI dashboards.
This SWIP also establishes a Mobile menu group in the UI, preparing for future Android monitoring
(via opentelemetry-android).
No OTel Collector is required, though one can be used for buffering.
Key challenge: The OTel Swift SDK does not set service.layer or service.instance.id — and
this is common for most OTLP sources. Rather than hardcoding layer inference in the handler, this
SWIP introduces a general-purpose mechanism: sourceAttributes on LogMetadata + LAL script-level
layer assignment.
Proposed Changes
1. New Layer: IOS
Add in Layer.java:
```java
/**
 * iOS/iPadOS app monitoring via OpenTelemetry Swift SDK.
 */
IOS(47, true),
```
Normal layer (isNormal=true) because the iOS app is directly instrumented.
2. Source Attributes on LogMetadata (General Enhancement)
OTLP resource attributes (e.g., os.name, device.model.identifier) are currently read by OpenTelemetryLogHandler only to extract service.name, service.layer, and service.instance.id, and are then discarded. They are not passed into LogData tags and are not available to LAL scripts.
This is a problem not only for iOS but for any OTLP source where service.layer is absent — the
LAL script has no information to determine the layer.
Solution: sourceAttributes on LogMetadata
Add a non-persistent sourceAttributes field to LogMetadata (Java bean, not proto):
```java
@Data
@Builder
public class LogMetadata {
    private String service;
    private String serviceInstance;
    private String endpoint;
    private String layer;
    private long timestamp;

    @Builder.Default
    private TraceContext traceContext = TraceContext.EMPTY;

    /**
     * Non-persistent attributes from the log source (e.g., OTLP resource attributes,
     * ALS node context). Available to LAL scripts via sourceAttribute() but NOT stored
     * in tagsRawData.
     */
    @Builder.Default
    private Map<String, String> sourceAttributes = Collections.emptyMap();
}
```
Why sourceAttributes not resourceAttributes: Different receivers have different source
contexts — OTLP has resource attributes, Envoy ALS has node info, etc. sourceAttributes is
generic.
Why on LogMetadata, not LogData: LogData is a proto object (from Logging.proto). Its tags field gets serialized into tagsRawData and persisted to storage. LogMetadata is a Java
bean used only as a transient carrier during LAL processing — adding fields here has no storage
impact.
Handler Change: OpenTelemetryLogHandler
Pass all resource attributes into LogMetadata.sourceAttributes:
```java
// Existing: extract specific fields from resource attributes
final var service = attributes.get("service.name");
final var layer = attributes.getOrDefault("service.layer", "");
final var serviceInstance = attributes.getOrDefault("service.instance.id", "");

// New: pass ALL resource attributes as sourceAttributes
final var metadata = LogMetadata.builder()
    .service(service)
    .serviceInstance(serviceInstance)
    .layer(layer)
    .timestamp(logRecord.getTimeUnixNano() / 1_000_000)
    .sourceAttributes(attributes) // <-- all resource attrs, non-persistent
    .build();
logAnalyzerService().doAnalysis(metadata, logDataBuilder);
```
LAL DSL: sourceAttribute() Function
Add a new function to the LAL DSL that reads from LogMetadata.sourceAttributes. This is similar to tag() but reads from the non-persistent source context instead of LogData tags.
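The lookup semantics can be sketched as follows. This is a minimal illustration only; the class shape and method names are assumptions, not the real OAP code or the actual Groovy DSL binding:

```java
import java.util.Collections;
import java.util.Map;

// Illustrative: sourceAttribute() resolves against the non-persistent map carried
// on LogMetadata, while tag() (not shown) resolves against persisted LogData tags.
public class SourceAttributeLookup {
    private final Map<String, String> sourceAttributes;

    public SourceAttributeLookup(Map<String, String> sourceAttributes) {
        this.sourceAttributes = sourceAttributes == null
                ? Collections.emptyMap() : sourceAttributes;
    }

    /** What a LAL script's sourceAttribute("key") call resolves to. */
    public String sourceAttribute(String key) {
        return sourceAttributes.get(key);
    }
}
```

A receiver that does not populate sourceAttributes simply yields null for every key, which keeps existing LAL rules unaffected.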
3. LAL Script-Level Layer Assignment (layer: auto)
Currently, layer in a LAL rule YAML serves as both a routing key (only rules matching the log's
layer are evaluated) and output metadata. This creates a chicken-and-egg problem: a rule that wants
to SET the layer cannot be reached if the layer is absent.
Solution: layer: auto mode
A new layer: auto declaration indicates the layer is determined by the script. Rules with layer: auto match logs where service.layer is absent (empty/unset). The script is expected to
set the layer in the extractor:
```yaml
rules:
  - name: ios-metrickit-diagnostics
    layer: auto  # layer determined by script; dropped if not set
    dsl: |
      filter {
        // Determine if this is an iOS log
        if (sourceAttribute("os.name") != "iOS" && sourceAttribute("os.name") != "iPadOS") {
          abort {}
        }
        extractor {
          layer IOS  // LAL script sets the layer
          // ...
        }
        sink {
        }
      }
```
Drop policy: In auto mode, if the script does not set the layer (either because the script
aborted or because the extractor omitted layer), a warning is logged and the record is dropped
before persistence. layer: auto means "I take responsibility for setting the layer" — if no layer
is set, it is either a non-matching log (abort) or a script bug (warn).
This enforces that every OTLP log source either:
- sets service.layer explicitly (like Envoy AI Gateway), or
- has a matching layer: auto LAL rule that determines the layer from source attributes.
Backward compatibility: Existing OTLP log sources that set service.layer are unaffected —
their logs have a concrete layer and are routed to layer-specific rules as before. layer: auto
rules only see logs with absent layer. The existing default.yaml rule (layer: GENERAL) continues
to catch logs that have layer = GENERAL.
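The routing rule described above can be sketched as follows. This is an illustrative model only; Rule and route are hypothetical names, not the actual OAP classes:

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative: a log with a concrete layer is routed to rules declaring that
// layer; a log with an absent layer is routed only to `layer: auto` rules.
public class LalRouting {
    record Rule(String name, String layer) {        // layer == "auto" marks auto mode
        boolean isAuto() { return "auto".equals(layer); }
    }

    static List<Rule> route(String logLayer, List<Rule> rules) {
        boolean absent = logLayer == null || logLayer.isEmpty();
        return rules.stream()
                .filter(r -> absent ? r.isAuto()
                                    : !r.isAuto() && r.layer().equals(logLayer))
                .collect(Collectors.toList());
    }
}
```

With rules [default (GENERAL), ios-metrickit-diagnostics (auto)], an empty layer routes only to the auto rule, while layer = GENERAL routes to the default rule as before.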
4. Resource Attributes Available to LAL (via sourceAttribute())
The OTel Swift SDK sets the following resource attributes, all available via sourceAttribute():

| Attribute | Example | Source |
|---|---|---|
| os.name | iOS, iPadOS, macOS | UIDevice.current.systemName |
| os.type | darwin | |
| os.version | 17.4.1 | ProcessInfo.operatingSystemVersion |
| device.model.identifier | iPhone15,2 | sysctl(HW_MACHINE) |
| service.name | MyApp | CFBundleName |
| service.version | 2.1.0 (45) | CFBundleShortVersionString + build |
| telemetry.sdk.language | swift | |
5. OTLP Span Listener Mechanism (General Enhancement)
Currently, SpanForward hardcodes GenAI-specific logic (processGenAILogic()) inline. Adding iOS
MetricKit handling as another hardcoded case would be unmaintainable. This SWIP introduces a general
span listener mechanism to support extensible span-based metric extraction and trace persistence
control.
Problems with the current design:
- GenAI logic is hardcoded — adding iOS/Android/etc. would keep growing the inline special cases
- Listeners can't see the original OTLP structure (InstrumentationScope name, resource attributes as separate fields) — everything is already flattened into Zipkin tags
- Spans that should NOT be persisted (e.g., the 24-hour MetricKit span) still get converted to Zipkin format
Solution: OTLPSpanListener Interface — Before Zipkin Conversion
Listeners operate on the raw OTLP span + resource attributes, before Zipkin conversion.
This gives listeners access to:
InstrumentationScope name and version (lost in Zipkin conversion)
Resource attributes as a separate map (not flattened with span attributes)
Original OTLP span structure
```java
/**
 * Listener for OTLP spans. Called BEFORE Zipkin conversion.
 * Implementations can:
 *   1. Extract metrics or other data from spans
 *   2. Modify resource/span attributes before Zipkin conversion
 *   3. Control whether the span should be converted and persisted as a trace
 */
public interface OTLPSpanListener {
    /**
     * Process an OTLP span.
     *
     * @param span               the raw OTLP span
     * @param resourceAttributes resource-level attributes (service.name, os.name, etc.)
     * @param scopeName          InstrumentationScope name (e.g., "NSURLSession", "MetricKit")
     * @param scopeVersion       InstrumentationScope version
     * @return result controlling persistence and tag modifications
     */
    OTLPSpanListenerResult onSpan(
        io.opentelemetry.proto.trace.v1.Span span,
        Map<String, String> resourceAttributes,
        String scopeName,
        String scopeVersion
    );
}
```
```java
public class OTLPSpanListenerResult {
    /** Whether this span should be converted to Zipkin and persisted. Default: true */
    private boolean persistTrace = true;

    /** Additional tags to inject before Zipkin conversion (e.g., estimated_cost) */
    private Map<String, String> additionalTags = Collections.emptyMap();

    /** Layer override — if set, the service is assigned this layer */
    private Layer layer = null;
}
```
Listeners are registered via SPI (META-INF/services/) and loaded at handler initialization. Two listeners are registered:
- GenAISpanListener — matches spans carrying a gen_ai.system or gen_ai.provider.name attribute; injects the estimated_cost tag
- IOSMetricKitSpanListener — matches scopeName == "MetricKit" + span.name == "MXMetricPayload"
The existing processGenAILogic() is refactored into GenAISpanListener — no behavior change,
just better structure.
Key design points:
Listeners see raw OTLP data — InstrumentationScope name, resource attributes as separate map
Any listener can veto trace persistence — prevents Zipkin conversion entirely (no wasted work)
Any listener can inject tags — merged before Zipkin conversion
Multiple listeners can process the same span (e.g., a GenAI span on iOS triggers both)
If ANY listener vetoes persistence, the span is not converted or stored
Note: No IOSLayerSpanListener is needed. The IOS layer is registered automatically
when the MAL expSuffix with Layer.IOS processes MetricKit metrics. The OTLP→Zipkin trace
pipeline (SpanForward) emits Zipkin-specific sources (not OAL sources), so there are no
OAL traffic metrics for OTLP traces.
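The dispatch semantics above (merge tags, any veto wins) can be sketched as follows. Types are simplified and hypothetical; the real interface is the OTLPSpanListener shown earlier:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative: every listener sees the span; additional tags are merged before
// Zipkin conversion; a single persistence veto skips conversion entirely.
public class ListenerDispatch {
    interface SpanListener {
        Result onSpan(Map<String, String> resourceAttributes, String scopeName);
    }
    record Result(boolean persistTrace, Map<String, String> additionalTags) {}

    /** Returns the merged tags to inject, or null if any listener vetoed persistence. */
    static Map<String, String> dispatch(List<SpanListener> listeners,
                                        Map<String, String> resourceAttributes,
                                        String scopeName) {
        boolean persist = true;
        Map<String, String> merged = new HashMap<>();
        for (SpanListener l : listeners) {
            Result r = l.onSpan(resourceAttributes, scopeName);
            persist &= r.persistTrace();       // ANY veto wins
            merged.putAll(r.additionalTags()); // tags merged before conversion
        }
        return persist ? merged : null;        // null -> skip Zipkin conversion
    }
}
```

Returning early on the first veto would also be valid; running all listeners anyway lets a metrics-extracting listener (like the MetricKit one) still do its work on a vetoed span.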
6. Entity Model

| SkyWalking Entity | Source | Example |
|---|---|---|
| Service | service_name label in MAL expSuffix | MyApp |
| Service Instance | service_instance_id label in MAL expSuffix | 2.1.0 |
No endpoint entity — MetricKit metrics are service/instance scoped only.
7. HTTP Span Processing (Trace Path)
HTTP spans from InstrumentationScope NSURLSession flow through the existing OTLP → Zipkin → SpanForward
trace pipeline. They are stored as Zipkin spans and queryable via the Zipkin query API.
Note: The OTLP→Zipkin trace pipeline (SpanForward) emits Zipkin-specific sources
(ZipkinService, ZipkinServiceSpan, ZipkinServiceRelation), not OAL sources. There are no OAL traffic metrics (e.g., service_cpm, service_resp_time) generated from OTLP traces.
HTTP trace metrics for iOS may be added in the future via MAL extraction in a SpanListener.
OTLP Export Feedback Loop
The URLSession auto-instrumentation captures all HTTP calls including the OTLP export calls
themselves. This creates an exponential feedback loop — validated in our POC: 4 real HTTP requests
generated 41,213 spurious export spans.
Recommended mitigation (documented in user guide): use the SDK's shouldInstrument callback to
exclude the collector URL from URLSession instrumentation.
8. Metrics Overview
iOS monitoring metrics come from MetricKit — daily aggregated device statistics delivered once
per day per device via the OTel Swift SDK's MetricKit instrumentation.
9. MetricKit Span Listener (IOSMetricKitSpanListener)
Apple's MetricKit delivers pre-aggregated app statistics once per day. The OTel Swift SDK encodes
this as a single span with startTime = 24h ago, endTime = now, with all statistics as span
attributes. These are not trace spans — they must be intercepted and converted to metrics.
IOSMetricKitSpanListener implements the SpanListener SPI (Section 5):
- Detection: scopeName == "MetricKit" AND span.spanName() == "MXMetricPayload" — uses the raw OTLP InstrumentationScope name, available because listeners run before Zipkin conversion
- Action: extract span attributes as SampleFamily samples with 4 labels (service_name, service_instance_id, device_model, os_version) and push them into the shared MAL pipeline via OpenTelemetryMetricRequestProcessor.toMeter() — no duplicate rule loading
- Persistence: returns shouldPersist = false — a 24-hour span must not be stored as a trace
- Required module: receiver-otel — the listener uses the otel-receiver's MAL converters configured via enabledOtelMetricsRules

MetricKit span attributes consumed by the listener:
- metrickit.app_launch.time_to_first_draw_average
- metrickit.app_responsiveness.hang_time_average
- metrickit.cpu.cpu_time
- metrickit.memory.peak_memory_usage
- metrickit.network_transfer.wifi_download / wifi_upload / cellular_download / cellular_upload
- metrickit.app_exit.foreground.abnormal_exit_count / normal_app_exit_count
- metrickit.app_exit.background.abnormal_exit_count / normal_app_exit_count / memory_pressure_exit_count
- metrickit.animation.scroll_hitch_time_ratio
- metrickit.gpu.time
- metrickit.diskio.logical_write_count
- metrickit.metadata.device_type
- metrickit.metadata.os_version
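The Action step can be sketched as follows. This is an illustrative conversion only; Sample and toSamples are hypothetical names, and the real listener builds OAP SampleFamily objects instead:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative: each metrickit.* span attribute becomes one labeled sample,
// carrying the four labels (service_name, service_instance_id, device_model,
// os_version) so MAL rules can aggregate across the device fleet.
public class MetricKitSamples {
    record Sample(String name, Map<String, String> labels, double value) {}

    static List<Sample> toSamples(Map<String, Double> spanAttributes,
                                  Map<String, String> labels) {
        List<Sample> samples = new ArrayList<>();
        spanAttributes.forEach((attr, value) -> {
            if (attr.startsWith("metrickit.")) {
                samples.add(new Sample(attr, labels, value));
            }
        });
        return samples;
    }
}
```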
MetricKit data is inherently daily — each device reports once per day. Multiple devices running the
same app produce multiple data points per day. The analyzer uses the span's end time as the data
point timestamp with day-level time bucket (TimeBucket.getDayTimeBucket()).
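The day-bucket derivation can be sketched as below. This is illustrative: the real TimeBucket.getDayTimeBucket() lives in the OAP core and follows the server's timezone handling; UTC is fixed here only for determinism:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative: derive a day-level time bucket (yyyyMMdd) from the span's
// end timestamp in epoch milliseconds.
public class DayBucket {
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    static long getDayTimeBucket(long endTimeMillis) {
        return Long.parseLong(DAY.format(Instant.ofEpochMilli(endTimeMillis)));
    }
}
```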
Different metrics require different cross-device aggregation:
| Metric Category | Aggregation | Reasoning |
|---|---|---|
| Pre-averaged values (launch time, hang time) | longAvg | Apple already averaged per-device; average across fleet |
| Peak values (memory) | max | Want the worst-case device |
| Counts (crash count, exit count) | sum | Total events across fleet |
| Cumulative volumes (network bytes, disk writes, CPU time) | sum | Total fleet resource usage |
| Ratios (scroll hitch) | doubleAvg | Fleet-wide average jank |
Span-to-Sample Conversion
The listener converts each MXMetricPayload span into labeled SampleFamily samples. Labels are extracted from:
- service_name → resource attribute service.name
- service_instance_id → resource attribute service.version (instance fallback)
- device_model → span attribute metrickit.metadata.device_type, falling back to resource attribute device.model.identifier
- os_version → span attribute metrickit.metadata.os_version, falling back to resource attribute os.version

MAL Rules
MAL rules are created in oap-server/server-starter/src/main/resources/otel-rules/ios/ios-metrickit.yaml.
The listener emits histogram-bucketed samples (with le labels) for app launch time and
hang time, enabling histogram_percentile to compute P50/P75/P90/P95/P99 across the device fleet.
Bucket ceiling: both histograms top out at a finite 30 s bucket rather than +Inf. MAL
parses le="Infinity" to (long) Double.POSITIVE_INFINITY = Long.MAX_VALUE and surfaces it
verbatim in percentile queries; on a dashboard that renders as ~9.2×10¹⁸, which is worse than
a visibly alarming but human-readable cap. Values above 30 s are vanishingly rare for iOS app
launch / hang observations (MetricKit itself hard-caps hangs near 30 s), so the finite sentinel
preserves percentile accuracy without breaking the UI.
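The saturation behavior described above is standard Java narrowing conversion and is easy to verify:

```java
// Demonstrates why a le="Infinity" bucket surfaces as ~9.2e18 in queries:
// casting Double.POSITIVE_INFINITY to long saturates at Long.MAX_VALUE.
public class InfinityBucket {
    public static void main(String[] args) {
        long sentinel = (long) Double.POSITIVE_INFINITY;
        System.out.println(sentinel);                   // 9223372036854775807
        System.out.println(sentinel == Long.MAX_VALUE); // true
    }
}
```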
Aggregation Example
Given 3 devices reporting on the same day for service "MyApp", the resulting daily metrics include:
- ios_app_launch_time_percentile (P50, P90)
- ios_peak_memory
- ios_foreground_abnormal_exit_count
- ios_wifi_download
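A worked sketch of the three main aggregation modes, with made-up values from three devices (illustrative numbers, not POC data):

```java
import java.util.List;

// Illustrative fleet aggregation for one day of MetricKit reports from three
// devices of the same service, applying the per-category rules from the table
// above. All values are made up.
public class FleetAggregation {
    static long longAvg(List<Long> perDevice) { // pre-averaged values (launch/hang time)
        return (long) perDevice.stream().mapToLong(Long::longValue).average().orElse(0);
    }
    static long max(List<Long> perDevice) {     // peak values (memory)
        return perDevice.stream().mapToLong(Long::longValue).max().orElse(0);
    }
    static long sum(List<Long> perDevice) {     // counts and cumulative volumes
        return perDevice.stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(longAvg(List.of(800L, 1200L, 1000L))); // launch time ms -> 1000
        System.out.println(max(List.of(310L, 512L, 420L)));       // peak memory MB -> 512
        System.out.println(sum(List.of(0L, 2L, 1L)));             // crash count    -> 3
    }
}
```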
10. MetricKit Diagnostic Log Processing (LAL)
MetricKit diagnostic payloads arrive as OTLP log records with InstrumentationScope: MetricKit.
The diagnostic type is identified by the name log record attribute.
LogData Input to LAL
After the changes in Sections 2–3, a crash diagnostic reaches LAL with a clear split between non-persistent source attributes and persistent tags.

Key distinction:
- sourceAttributes → readable via sourceAttribute() in LAL, NOT persisted
- tags → readable via tag() in LAL, persisted in tagsRawData
- tag 'key': value in the extractor → adds to both persistent tags and searchable tags

Diagnostic Types

| name Attribute | Key Tags |
|---|---|
| metrickit.diagnostic.crash | exception.type, exception.message, exception.stacktrace, metrickit.diagnostic.crash.exception.signal.name |
| metrickit.diagnostic.hang | exception.stacktrace, metrickit.diagnostic.hang.hang_duration |
| metrickit.diagnostic.cpu_exception | metrickit.diagnostic.cpu_exception.total_cpu_time |
| metrickit.diagnostic.disk_write_exception | metrickit.diagnostic.disk_write_exception.total_writes_caused |
| metrickit.diagnostic.app_launch | metrickit.diagnostic.app_launch.launch_duration |

LAL Rules
Create oap-server/server-starter/src/main/resources/lal/ios-metrickit.yaml:
```yaml
rules:
  - name: ios-metrickit-diagnostics
    layer: auto  # layer determined by script; dropped if not set
    dsl: |
      filter {
        // Only match iOS/iPadOS logs
        if (sourceAttribute("os.name") != "iOS" && sourceAttribute("os.name") != "iPadOS") {
          abort {}
        }
        // Only match MetricKit diagnostic logs
        if (tag("name") == null || !tag("name").startsWith("metrickit.diagnostic.")) {
          abort {}
        }
        extractor {
          layer IOS
          // Set instance from service.version (SDK doesn't set service.instance.id)
          instance sourceAttribute("service.version")
          // Selectively copy useful source attributes into persistent tags
          tag 'device.model': sourceAttribute("device.model.identifier")
          tag 'os.version': sourceAttribute("os.version")
          // Copy diagnostic details from log record tags
          tag 'diagnosticType': tag("name")
          tag 'exception.type': tag("exception.type")
          tag 'exception.message': tag("exception.message")
          tag 'exception.stacktrace': tag("exception.stacktrace")
          tag 'signal.name': tag("metrickit.diagnostic.crash.exception.signal.name")
          tag 'hang.duration': tag("metrickit.diagnostic.hang.hang_duration")
        }
        sink {
          // Store all diagnostics — they are already rare (once/day batches from real devices)
        }
      }
```
11. UI Menu and Dashboards
Menu Configuration
Add to oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml:
```yaml
- title: Mobile
  icon: mobile
  description: Mobile application monitoring via OpenTelemetry SDKs.
  i18nKey: mobile
  menus:
    - title: iOS
      layer: IOS
      description: iOS/iPadOS app monitoring via OpenTelemetry Swift SDK.
      documentLink: https://skywalking.apache.org/docs/main/next/en/setup/service-agent/ios-monitoring/
      i18nKey: ios
```
Dashboard Templates
Create dashboards under ui-initialized-templates/ios/:
- ios-root.json — Root list view of all iOS app services.
- ios-service.json — Per-app dashboard: service_cpm, service_resp_time, service_sla, service_percentile, meter_ios_app_launch_time, meter_ios_foreground_abnormal_exit_count, meter_ios_background_oom_kill_count, meter_ios_peak_memory, meter_ios_wifi_download, meter_ios_cellular_download, meter_ios_hang_time, etc.
- ios-instance.json — Per-version dashboard (instance = app version): service_instance_cpm, service_instance_resp_time, service_instance_sla.
- ios-endpoint.json — Per-domain dashboard (endpoint = net.peer.name domain): endpoint_cpm, endpoint_resp_time, endpoint_sla, endpoint_percentile (from OAL).
UI Side
A separate PR in skywalking-booster-ui is needed
for i18n menu entries for the "Mobile" group and "iOS" sub-item.
Imported Dependencies libs and their licenses.
No new dependencies. All processing uses existing OTLP receiver, OAL, LAL, and meter infrastructure.
Compatibility
Configuration: New layer IOS and menu entry — additive, no breaking change.
Storage: No new storage structures. Uses existing trace, metrics, and log storage.
Protocols: No protocol changes. Uses existing OTLP receiver.
LogMetadata: New sourceAttributes field — backward compatible. Existing receivers that don't
populate it get an empty map. Existing LAL rules that don't call sourceAttribute() are unaffected.
LAL layer: auto mode: Additive. Existing rules with specific layers (GENERAL, MESH, etc.)
are unaffected. Only new rules can opt into auto mode to match logs with absent layer.
Drop policy for auto rules: In auto mode, logs where the script does not set a layer are
warned and dropped. This only affects logs routed to auto rules — logs with explicit layers
are unaffected.
General Usage Docs
The user guide covers: prerequisites, iOS app setup, SkyWalking OAP configuration (enabling the OTLP receiver and LAL rules in application.yml), what you'll see, and limitations (including UIViewController / SwiftUI lifecycle coverage).