-
Notifications
You must be signed in to change notification settings - Fork 859
DB and HTTP semantic convention stability migration for SQLAlchemy #4110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
DB and HTTP semantic convention stability migration for SQLAlchemy #4110
Conversation
| # Extract db_name for operation name | ||
| # Prefer old semconv (db.name) over new semconv (db.namespace) for backwards compatibility | ||
| db_name = attrs.get(DB_NAME) or attrs.get(DB_NAMESPACE) or "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a small design choice; lmk if should be another way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the db_name should be the same regardless, right? We should be able to simply pull the DB name from the cursor, since pulling it back from the attributes seems like a very roundabout way to do this.
My preference would be to create a separate function to pull the DB name from the cursor. This can be used in _get_attributes_from_cursor and here for the Span name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added _get_db_name_from_cursor in 66e0dbd. It looks a little odd for a few reasons:
- From existing
_normalize_vendorand_get_attributes_from_cursormethods, it seems this instrumentor only "supports" postgresql and sqlite connection. So I followed this to keep scope to semconv migration. - The new helper gets postgres cursor connection the same way
_get_attributes_from_cursordoes. - We use
conn.engine.url.databaseif sqlite, instead of from-cursor, else we'd have to create a new cursor (execute) and do PRAGMA database_list and that's not ideal.
...ntelemetry-instrumentation-sqlalchemy/src/opentelemetry/instrumentation/sqlalchemy/engine.py
Outdated
Show resolved
Hide resolved
...ntelemetry-instrumentation-sqlalchemy/src/opentelemetry/instrumentation/sqlalchemy/engine.py
Outdated
Show resolved
Hide resolved
| __version__, | ||
| tracer_provider, | ||
| schema_url="https://opentelemetry.io/schemas/1.11.0", | ||
| schema_url=_get_schema_url(sem_conv_opt_in_mode), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we need to worry about this scenario, and if it is even possible to define a schema URL for it, but what is the expected behavior if only of of DB or HTTP is opt in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Initial expected behaviour would have only been DB opt-in, but now both HTTP and DB are (and should be) considered in 653a1e3. Right now, DB opt-in dictates whether schema url is default or new. This is ok if mode is http,database or dup or both default, but a mix (http,default or default,database) will mix network and db attributes from the same instrumentor.
But I think this is ok. If users/vendors are knowledgeable enough to be migrating before the breaking change to opt-in-by-default, then it can be dup for both anyway. THoughts?
This doesn't fix things but it's more consistent now in b9ef17a for doing the same for meter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is more that once we start emitting stable HTTP attributes by default, the schema URL should be using 1.21 (since this is the version HTTP became made stable), now if DB opt is is enabled then we should be using 1.25. If we only check the DB opt in, then it could fallback on 1.11.
Of course, this is assuming we're using version 1.25 of DB semconv and not 1.21. Do we know which one we plan on targeting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think 1.11 (old hardcode), 1.21, and 1.25 make sense. How about this:
| opt-in | semconv |
|---|---|
| default | 1.11.0 |
| http,default | 1.21.0 |
| http/dup,default | 1.21.0 |
| *,database | 1.25.0 |
| *,database/dup | 1.25.0 |
"default" case would not exist anymore if stable HTTP is new default and no option to opt out (iirc from last sig). DB would dictate highest and HTTP takes lower precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly what I had in mind! I understand that this adds some scope, but figured we should do this correctly. Perhaps, we can add either a function directly in _OpenTelemetrySemanticConventionStability to get the schema URL to be used or expose a way to return the Semcov stability mapping in _OpenTelemetrySemanticConventionStability (e.g. with a function _get_stability_signal_type_mapping), and make a backwards compatible change to _get_schema_url to have the ability to accept this mapping and return the highest semconv version found. I also agree that we should simply use the highest associated semconv version based on the opted in semconv types. For example, if we add rpc that uses semconv 1.38, if this is set then it will take precedence over everything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a step in that direction, in 5477ff8 I've added a separate _get_schema_url_for_signal_types to choose the highest version if default, http, database, or gen_ai. In that commit, SQLAlchemy instrumentor is the only one that uses it while all other instrumentors are using the classic _get_schema_url.
Lmk what you think!
| # Extract db_name for operation name | ||
| # Prefer old semconv (db.name) over new semconv (db.namespace) for backwards compatibility | ||
| db_name = attrs.get(DB_NAME) or attrs.get(DB_NAMESPACE) or "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the db_name should be the same regardless, right? We should be able to simply pull the DB name from the cursor, since pulling it back from the attributes seems like a very roundabout way to do this.
My preference would be to create a separate function to pull the DB name from the cursor. This can be used in _get_attributes_from_cursor and here for the Span name.
| from enum import Enum | ||
| from typing import Container, Mapping, MutableMapping | ||
|
|
||
| from packaging import version as package_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
packaging is already a dep of opentelemetry-instrumentation so might as well use its built-in equality checks.
|
|
||
|
|
||
| class TestOpenTelemetrySemConvSchemaUrl(TestCase): | ||
| @stability_mode("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These decorators are needed for clean env at each test
Description
Add SQLAlchemy instrumentor support for when
OTEL_SEMCONV_STABILITY_OPT_INincludesThis changes the
SQLComment in span attributefeature if opted in. Inclusion of sqlcomment on query span would be ondb.statementand/ordb.query.textattribute if default, "database", or "database/dup".Fixes #2679
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.