SNOW-1763555: OpenTelemetry integration utterly breaks driver if using DataDog #2084

Closed
lattwood opened this issue Oct 25, 2024 · 16 comments · Fixed by #2106
Assignees
Labels
bug status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. status-triage_done Initial triage done, will be further handled by the driver team

Comments

@lattwood

lattwood commented Oct 25, 2024

Python version

N/A

Operating system and processor architecture

N/A

Installed packages

ddtrace
opentelemetry-api

What did you do?

I used the driver w/ Datadog APM, which installs the opentelemetry packages that are used as the check to see if we can inject trace context.

With DataDog, this happens on every query, and the query fails to run.

This is because the only exception caught is ModuleNotFoundError, while importing opentelemetry raises a ValueError, which isn't caught.

(please forgive any typos, I copied the backtrace from a screenshot)

ValueError: Propagator tracecontext not found. It is either misspelled or not installed.
  File "/var/task/lambda_function.py", line 116, in lambda_handler
    process_scan(
  File "/var/task/store.py", line 92, in process_scan
    upload_file_to_snowflake(cursor, secrets, file_name=sv_file, table=snowflake_table)
  File "/var/task/store.py", line 55, in upload_file_to_snowflake
    cursor.execute(
  File "/var/task/snowflake/connector/cursor.py", line 984, in execute
    ret = self._execute_helper(query, **kwargs)
  File "/var/task/snowflake/connector/cursor.py", line 695, in _execute_helper
    ret = self._connection.cmd_query(
  File "/var/task/snowflake/connector/connection.py", line 1350, in cmd_query
    ret = self.rest.request(
  File "/var/task/snowflake/connector/network.py", line 483, in request
    from opentelemetry.propagate import inject
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "/var/task/ddtrace/internal/module.py", line 295, in _exec_module
    self.loader.exec_module(module)
  File "/var/task/opentelemetry/propagate/__init__.py", line 149, in <module>
    raise ValueError(

What did you expect to see?

I would expect that opentelemetry integration would be done via this- https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation

and not via a hack in the actual library that breaks the driver for anyone that has opentelemetry installed but not in-use.

I hope that this issue results in that change being reverted, and the original requestor of the feature gets this done properly in the open-telemetry project.

Can you set logging to DEBUG and collect the logs?

We already have it at DEBUG and that's the problem.
@github-actions github-actions bot changed the title OpenTelemetry integration done noisily/wrong SNOW-1763555: OpenTelemetry integration done noisily/wrong Oct 25, 2024
@lattwood
Author

This was directly caused by #1989

@lattwood
Author

Aside- what kind of commit title is test?

image

@lattwood lattwood changed the title SNOW-1763555: OpenTelemetry integration done noisily/wrong SNOW-1763555: OpenTelemetry integration utterly breaks driver if using DataDog Oct 25, 2024
@lattwood
Author

User has to import opentelemetry and configure it to have this. This code executes only if the import succeeds which requires user to import and configure opentelemetry.

Specifically, this is the erroneous assumption: there are many ways the import can fail.
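
To illustrate the point, here is a minimal sketch of how an import guarded only by ModuleNotFoundError can still blow up. It does not use the connector or opentelemetry at all: `fake_propagate_demo` is a made-up module that raises ValueError at import time, standing in for the failure in the traceback above, and `try_import_narrow` and `demo` are hypothetical helper names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a package that is installed but whose
# import-time setup fails, loosely modelled on the traceback above.
MODULE_NAME = "fake_propagate_demo"
MODULE_SOURCE = 'raise ValueError("Propagator tracecontext not found.")\n'


def try_import_narrow(name: str) -> str:
    """Guard the import the narrow way: only ModuleNotFoundError is handled."""
    try:
        importlib.import_module(name)
        return "imported"
    except ModuleNotFoundError:
        return "not installed"


def demo() -> str:
    """Write the failing module to a temp dir and import it via the narrow guard."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return try_import_narrow(MODULE_NAME)
        except ValueError as exc:
            # The narrow guard did not stop this; it reached the caller.
            return f"escaped: {exc}"
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)


if __name__ == "__main__":
    print(try_import_narrow("surely_not_installed_module_xyz"))
    print(demo())
```

A module that is missing is handled, but a module that is present and fails while executing its top-level code raises whatever it likes, and that exception propagates straight into the caller's query path.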

@schammah

affected by this as well
any estimate to resolve?

@fbexiga

fbexiga commented Nov 11, 2024

Seems like a straightforward issue to fix... Can we get some eyes on this one?
@sfc-gh-mkeller @sfc-gh-yuwang @sfc-gh-aling @sfc-gh-sghosh @sfc-gh-yixie @sfc-gh-aalam

@sfc-gh-dszmolka
Contributor

thanks folks for drawing attention to this, we'll take a look.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage_done Initial triage done, will be further handled by the driver team and removed needs triage labels Nov 12, 2024
@sfc-gh-mkeller sfc-gh-mkeller self-assigned this Nov 12, 2024
@sfc-gh-bdrutu
Contributor

Thanks for your bug report; this is very valuable input. Unfortunately, after further investigation, this proves to be an issue specific to DataDog: just including a clean opentelemetry-api dependency does not reproduce the problem, which means that what they include is not the "full" opentelemetry-api (in opentelemetry-api the tracecontext propagator is available: see here and here, and here, where you can see the exception being thrown).

This is still a problem that Snowflake can fix by catching all exceptions there. That is not the right way of writing code, since you cannot wrap every call to an API that should not raise in a catch-all handler, but Snowflake will do it anyway, since tracing should not break your data application.
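
For readers following along, a catch-all guard of the kind described above might look roughly like the sketch below. This is an illustration only, not the code from the connector or from the eventual fix: `inject_trace_context` is a hypothetical helper name, and the only real API used is `opentelemetry.propagate.inject`, which is called on a plain dict of headers.

```python
import logging
from typing import Dict

logger = logging.getLogger(__name__)


def inject_trace_context(headers: Dict[str, str]) -> Dict[str, str]:
    """Best-effort trace propagation (hypothetical helper, not connector code).

    Any failure, whether raised while importing opentelemetry or inside
    inject(), is logged at DEBUG level and otherwise ignored, so the
    request can still be sent without trace headers.
    """
    try:
        from opentelemetry.propagate import inject

        inject(headers)
    except Exception:  # deliberately broad: tracing must not break a query
        logger.debug("Trace context injection skipped", exc_info=True)
    return headers


if __name__ == "__main__":
    # Works the same whether opentelemetry is absent, broken, or healthy.
    print(inject_trace_context({"Accept": "application/json"}))
```

The trade-off is the one named above: a broad handler can hide genuine bugs, so logging the swallowed exception at DEBUG keeps it discoverable without failing the query.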

@lattwood
Author

@sfc-gh-bdrutu Based on what you've said, I've taken the liberty of opening an issue on the dd-trace-py repository: DataDog/dd-trace-py#11407

@bogdandrutu

I would expect that opentelemetry integration would be done via this- https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation

and not via a hack in the actual library that breaks the driver for anyone that has opentelemetry installed but not in-use.

I hope that this issue results in that change being reverted, and the original requestor of the feature gets this done properly in the open-telemetry project.

Based on my experience with OpenTelemetry in different environments, OpenTelemetry was designed with flexibility in mind, supporting scenarios where telemetry is directly embedded within libraries, rather than requiring separate instrumentation libraries.

@PaulBormanTR

I'm facing this as well. Specifically, incompatible with https://github.com/DataDog/datadog-lambda-python/releases/tag/v6.99.0
Was this introduced by 0d00519?

@lattwood
Author

lattwood commented Nov 14, 2024

@bogdandrutu you literally have the same bio on your two accounts.

Image
Image

@sfc-gh-mkeller
Collaborator

Since I was unable to reproduce the issue, I'd really like someone who is affected by it to test drive the change I proposed in #2106.
Please don't forget to let us know whether it worked properly!

@sfc-gh-dszmolka sfc-gh-dszmolka added status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. and removed status-in_progress Issue is worked on by the driver team labels Nov 17, 2024
@sfc-gh-dszmolka
Contributor

Merging the PR auto-closed this issue, but the fix needs a release first, so I've reopened it until the release is out.
Folks, if you're able, please test the fix (by installing the driver from main) and let us know how it worked for you. Thank you in advance!

@sfc-gh-dszmolka
Contributor

released in PythonConnector v3.12.4

@lattwood
Author

lattwood commented Dec 4, 2024

https://github.com/open-telemetry/opentelemetry-python/blob/415c94fb9834ce38459e58d5ee192a98494c4c76/opentelemetry-api/src/opentelemetry/context/__init__.py#L42-L64

We're getting a ton of log spam due to this.

Failed to load context: contextvars_context, fallback to contextvars_context
Traceback (most recent call last):
  File "/var/task/opentelemetry/context/__init__.py", line 43, in _load_runtime_context
    return next(  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^
StopIteration

Which it appears is due to datadog setting up a different context provider for if someone enables their opentelemetry integration.

Please implement opentelemetry in a way that it can be disabled with an environment variable, or put the functionality into a separate package.
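
In the meantime, one possible application-side stopgap for the log spam is a standard-library logging filter. The sketch below assumes the message is emitted by a logger named `opentelemetry.context` (inferred from the file path in the traceback, not verified here); `DropContextFallbackNoise` is a made-up class name. It only hides the log record and does not change which context implementation gets loaded.

```python
import logging


class DropContextFallbackNoise(logging.Filter):
    """Drop the 'Failed to load context' records quoted above."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False tells the logging machinery to discard the record.
        return not record.getMessage().startswith("Failed to load context")


# Assumption: opentelemetry logs this message on "opentelemetry.context".
logging.getLogger("opentelemetry.context").addFilter(DropContextFallbackNoise())
```

A filter attached to a logger only applies to records logged on that exact logger, so other opentelemetry messages still come through.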

@sfc-gh-bdrutu
Contributor

Hi @lattwood

You should talk to the DD folks about this problem since our usage is a legit and correct usage of the opentelemetry-api.

Which it appears is due to datadog setting up a different context provider for if someone enables their opentelemetry integration.

Talk to them; maybe they shouldn't do that, or they should do it correctly.

Please implement opentelemetry in a way that it can be disabled with an environment variable, or put the functionality into a separate package.

We are not "implementing" opentelemetry; we only propagate the context to Snowflake and, as mentioned, our usage is correct and it is DD causing your issue.

8 participants