Rust component metrics #8819
CROSS JOIN UNNEST(metrics.{{ metric.table }}.{{ category }}_{{ metric.name }}.values) as values
-- This generates multiple rows based on the `value` field. This is needed to make the `APPROX_QUANTILES`
-- weigh `value.key` correctly.
CROSS JOIN UNNEST(GENERATE_ARRAY(1, `values`.value))
Is this the best way to get percentiles for a distribution metric? I tried to find a UDF for this, but couldn't.
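To make the question concrete, here is a minimal Python sketch of the two ways to weight a bucketed distribution. `expand_histogram` mirrors the `GENERATE_ARRAY` row expansion from the query, and `weighted_percentile` gets the same nearest-rank answer from cumulative counts without expanding. Both function names are hypothetical, the bucket keys are shown as integers for simplicity, and this computes exact nearest-rank percentiles, whereas `APPROX_QUANTILES` is approximate.

```python
import math
from bisect import bisect_left
from itertools import accumulate


def expand_histogram(values: dict[int, int]) -> list[int]:
    """Repeat each bucket key `count` times, like the GENERATE_ARRAY unnest."""
    expanded: list[int] = []
    for key in sorted(values):
        expanded.extend([key] * values[key])
    return expanded


def weighted_percentile(values: dict[int, int], q: float) -> int:
    """Nearest-rank percentile over bucket keys, weighted by bucket count.

    Uses running totals instead of materializing one row per sample.
    """
    keys = sorted(values)
    cumulative = list(accumulate(values[k] for k in keys))
    total = cumulative[-1]
    # 1-based rank of the sample we want; at least 1 so q=0 returns the minimum.
    rank = max(1, math.ceil(q * total))
    # First bucket whose running total reaches that rank.
    return keys[bisect_left(cumulative, rank)]
```

The expansion approach costs one row per recorded sample, so it grows with the counts rather than with the number of buckets. The cumulative approach only touches each bucket once.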
4c0d58d to
45b1c7c
Created the `rust_component_metrics` dataset, which contains data for shared Rust components that ship on Desktop, iOS, and/or Android. See https://github.com/mozilla/application-services/ for examples of these components.

Added SQL generators to create derived datasets that aggregate Glean metrics. The aggregate tables will be used to create dashboards for teams that own the Rust components. There's already some code in the application-services repo to generate these dashboards, but the queries run slowly and often time out. Currently supported metrics are counters, distributions, labeled distributions, and events.

SQL generators were used to make it easy for teams to add metrics for their components in the future. We can tell them to update `sql_generators/rust_component_metrics/__init__.py` and open a PR.

I could almost have just used the GLAM ETL tables, but we need some extra capabilities:

* firefox-ios support (mozilla/glam#1830)
* Aggregation by submission date (mozilla/glam#1073)
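As a rough illustration of what a metric declaration in the generator could look like, the sketch below defines stand-ins for `LabeledDistribution` and `DistributionType` and builds the column path used in the SQL template (`metrics.{{ metric.table }}.{{ category }}_{{ metric.name }}`). The class bodies, the `table` naming rule, and the `column_path` helper are assumptions made for illustration, not the code in `sql_generators/rust_component_metrics/__init__.py`.

```python
from dataclasses import dataclass
from enum import Enum


class DistributionType(Enum):
    # Assumed set of distribution kinds; only `timing` appears in the PR.
    timing = "timing"
    memory = "memory"
    custom = "custom"


@dataclass(frozen=True)
class LabeledDistribution:
    name: str
    type: DistributionType

    @property
    def table(self) -> str:
        # Assumed naming for the struct under `metrics` that holds this type.
        return f"labeled_{self.type.value}_distribution"


def column_path(category: str, metric: LabeledDistribution) -> str:
    """Build the column reference the SQL template would render (hypothetical helper)."""
    return f"metrics.{metric.table}.{category}_{metric.name}"
```

Under these assumptions, a team adding a metric would only append one declaration, and the generator would derive the column name from the component category and the metric name.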
Fixed the YAML formatting and the partition date field name. Hopefully this makes CI green.
metrics=[
    LabeledDistribution("ingest_download_time", DistributionType.timing),
    LabeledDistribution("ingest_time", DistributionType.timing),
    LabeledDistribution("ingest_query_time", DistributionType.timing),
The dry-run CI task is failing because the `suggest_ingest_query_time` field doesn't actually exist in the source table.
This is the full dryrun error for the query:
sql/*****************************/rust_component_derived_moz_fx_data_shared_prod_2707a00cf2b29f0afd4e378935d3378c99f6599e_761c7b67/ingest_query_time_v1/query.sql ERROR
[{'code': 400, 'errors': [{'message': 'Field name suggest_ingest_query_time does not exist in STRUCT<network_http3_complete_load ARRAY<STRUCT<key STRING, value STRUCT<bucket_count INT64, count INT64, histogram_type STRING, ...>>>, network_http3_first_sent_to_last_received ARRAY<STRUCT<key STRING, value STRUCT<bucket_count INT64, count INT64, histogram_type STRING, ...>>>, network_http3_open_to_first_received ARRAY<STRUCT<key STRING, value STRUCT<bucket_count INT64, count INT64, histogram_type STRING, ...>>>, ...>; Did you mean suggest_ingest_time? at [21:50]', 'domain': 'global', 'reason': 'invalidQuery', 'location': 'q', 'locationType': 'parameter'}], 'response': {'headers': {'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'content-encoding': 'gzip', 'content-type': 'application/json; charset=UTF-8', 'date': 'Mon, 09 Feb 2026 18:29:08 GMT', 'server': 'ESF', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'x-xss-protection': '0'}}, 'message': 'Field name suggest_ingest_query_time does not exist in STRUCT<network_http3_complete_load ARRAY<STRUCT<key STRING, value STRUCT<bucket_count INT64, count INT64, histogram_type STRING, ...>>>, network_http3_first_sent_to_last_received ARRAY<STRUCT<key STRING, value STRUCT<bucket_count INT64, count INT64, histogram_type STRING, ...>>>, network_http3_open_to_first_received ARRAY<STRUCT<key STRING, value STRUCT<bucket_count INT64, count INT64, histogram_type STRING, ...>>>, ...>; Did you mean suggest_ingest_time? at [21:50]'}]
Sorry about that typo. I changed it to query_time, which should fix it.
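A typo like this could be caught before the dry run by checking the generated field names against the source table's known fields. The sketch below is one way to do that with Python's standard `difflib`. The `missing_fields` helper is hypothetical, and the caller is assumed to supply the list of schema fields (for example, from a schema file or a table lookup).

```python
import difflib


def missing_fields(
    category: str, metric_names: list[str], schema_fields: list[str]
) -> dict[str, list[str]]:
    """Map each generated field name absent from the schema to close matches."""
    known = set(schema_fields)
    problems: dict[str, list[str]] = {}
    for name in metric_names:
        # Same naming rule as the SQL template: {{ category }}_{{ metric.name }}
        field = f"{category}_{name}"
        if field not in known:
            problems[field] = difflib.get_close_matches(
                field, schema_fields, n=3, cutoff=0.6
            )
    return problems
```

An empty result means every declared metric maps to an existing field. A non-empty result lists the bad names with suggestions, similar to the "Did you mean" hint in the BigQuery error.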
* Rust component metrics
* Apply suggestion from @scholtzan

Co-authored-by: Anna Scholtz <anna@scholtzan.net>