Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/geoipprocessor] Add attributes parameter and consider both source.address and client.address by default #37008

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

bencehornak
Copy link

@bencehornak bencehornak commented Jan 2, 2025

Description

Currently, the geoipprocessor reads the source.address attribute of the processed record, which is used to describe network exchanges/packets only (see the semantic conventions). For HTTP requests there is another attribute, called client.address with the same format (see the semantic conventions).

In this PR I am changing the geoipprocessor take both attributes into account by default (if both are defined client.attributes wins). Additionally, the attributes are configurable via the new attributes parameter to allow geo ip lookup based on custom attributes.

Testing

I duplicated the source address located in inner attributes test case for client.address, as they should work the same way.

Documentation

Updated the README with the second attribute.

@bencehornak bencehornak requested review from andrzej-stencel and a team as code owners January 2, 2025 15:55
Copy link

linux-foundation-easycla bot commented Jan 2, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Contributor

@rogercoll rogercoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The client.address attribute closely resembles source.address, so I support adding it to the default lookup attributes. (It would be great to have a configuration option to allow any arbitrary attribute key).

My assumption is that if both attributes are defined, they will likely have the same value, making the lookup order less critical.

@bencehornak bencehornak changed the title [processor/geoipprocessor] Read both the client.address and the source.address attributes [processor/geoipprocessor] Add attributes parameter and consider both source.address and client.address by default Jan 6, 2025
@bencehornak
Copy link
Author

@rogercoll I've implemented a new configuration parameter (attributes), which allows specifying the desired attributes.

@bencehornak
Copy link
Author

I think the attributes should default to something other than [source.address, client.address] when context: resource, because these attributes apply to spans (i.e. records), not resources (see the General Attributes Semantic Conventions).

I could imagine the host.ip for example, but I don't see it right away, why somebody would look up server IP addresses. @rogercoll can you give me an idea about what the intended purpose of context: resource is?

Thanks in advance!

Copy link
Contributor

@rogercoll rogercoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added some nits, overall LGTM!

processor/geoipprocessor/README.md Outdated Show resolved Hide resolved
processor/geoipprocessor/README.md Outdated Show resolved Hide resolved
processor/geoipprocessor/geoip_processor_test.go Outdated Show resolved Hide resolved
@rogercoll
Copy link
Contributor

I could imagine the host.ip for example, but I don't see it right away, why somebody would look up server IP addresses. @rogercoll can you give me an idea about what the intended purpose of context: resource is?

For example, the redis receiver creates metrics that contain the server.address as resource attribute (resource context). One could use the geoIP processor to add geographical metadata about their internal Redis servers. The same concept could be applied to the mentioned host.ip, add location metadata of their distributed servers.

I think the attributes should default to something other than [source.address, client.address] when context: resource, because these attributes apply to spans (i.e. records), not resources (see the General Attributes Semantic Conventions).

That's a good point, would you mind creating an issue with these findings? We can definitely consider changing the attributes defaults depending on the signal type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants