Generate span metrics at the collector level #33

Open
phxnsharp opened this issue Feb 20, 2020 · 6 comments

@phxnsharp

I am using Jaeger from ephemeral but long running (minutes to days) processes to trace execution of engineering workflows. The tracing part is working great.

I would like to additionally track metrics from these processes. Since Prometheus is notoriously bad at handling ephemeral processes, and since Jaeger already provides a high performance, reliable, and scalable data path for the trace data, I would like to collect the metrics on the server side much along the lines of https://medium.com/jaegertracing/data-analytics-with-jaeger-aka-traces-tell-us-more-973669e6f848 . However, I would prefer to not add the additional requirements of running and maintaining Kafka.

I have created a prototype gRPC storage plugin which accepts trace data, but does not handle read operations. Since Jaeger allows multiple storage plug-ins but only reads from the first, it can be installed behind Cassandra or Elasticsearch plug-ins.

This plug-in uses the Golang Prometheus client to provide metrics on the spans that it sees. Currently it is hardcoded to collect the metrics that I particularly need and is not generic.

The metrics I am currently collecting do not require assimilating multiple spans. The main ones we are looking to get are average duration, run count, and failure count for particular span types. For us, our durations are long so latency effects between trace spans aren't that interesting. I am converting some, but not all, of the span tags into labels so that I can issue the required queries out of Prometheus.
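
For concreteness, here is a minimal sketch of the span-to-metrics idea described above, using the Go Prometheus client. It is not the author's actual plugin: the metric names, label choices, and the local `span` struct are hypothetical stand-ins, and a real implementation would receive Jaeger's `model.Span` on the gRPC storage plugin's write path rather than in a standalone `main()`.

```go
// A minimal sketch only: metric names, labels, and the local span struct are
// hypothetical. A real plugin would be wired into Jaeger's gRPC storage plugin
// framework and receive model.Span values on the write path.
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// span is a stand-in for the few fields of a Jaeger span the metrics need.
type span struct {
	Service   string
	Operation string
	Duration  time.Duration
	Failed    bool // e.g. derived from an error=true span tag
}

var (
	spanCount = prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "workflow_spans_total", Help: "Spans seen, by service and operation."},
		[]string{"service", "operation"},
	)
	spanFailures = prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "workflow_span_failures_total", Help: "Spans marked as failed."},
		[]string{"service", "operation"},
	)
	spanDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{Name: "workflow_span_duration_seconds", Help: "Span duration in seconds."},
		[]string{"service", "operation"},
	)
)

// recordSpan is the write-path hook: it only updates metrics, it never stores the span.
func recordSpan(s span) {
	labels := prometheus.Labels{"service": s.Service, "operation": s.Operation}
	spanCount.With(labels).Inc()
	spanDuration.With(labels).Observe(s.Duration.Seconds())
	if s.Failed {
		spanFailures.With(labels).Inc()
	}
}

func main() {
	prometheus.MustRegister(spanCount, spanFailures, spanDuration)
	recordSpan(span{Service: "workflow-runner", Operation: "solve", Duration: 42 * time.Second})

	// Expose the aggregated span metrics for Prometheus to scrape.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":2112", nil))
}
```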

One difficulty with this solution is that Prometheus expects each collection target to be definitive. With Jaeger scalability, the collector can be replicated. Prometheus currently expects that a single scrape target contains all the values for a particular time series/label combination. It has no ability to sum or aggregate values from different scrape targets that match, even with the honor_labels option (if you try, it ends up flopping back and forth between the values each scrape target provides). Without honor_labels, you can easily write labels for the actual source instance/ip and write queries to sum the results however you want, but there is a significant implication for the Prometheus time-series storage. In my case, if I have n computers reporting traces and m replicas of the jaeger-collector, I'll end up with n*m time series in Prometheus' storage.

@yurishkuro
Member

Did you have a question to go with this use case?

One comment on monitoring ephemeral processes: you don't have to use the scraping/pull model; a push model might be more appropriate in your case.
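
One common way to act on this suggestion is the Prometheus Pushgateway, which an ephemeral process can push to before it exits. A hedged Go sketch follows; the Pushgateway address, job name, and metric are assumptions, not anything prescribed in this thread.

```go
// A hedged sketch of the push model, assuming a Prometheus Pushgateway;
// the address, job name, and metric are hypothetical.
package main

import (
	"log"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/push"
)

func main() {
	completed := prometheus.NewCounter(prometheus.CounterOpts{
		Name: "workflow_runs_completed_total",
		Help: "Workflow runs completed by this ephemeral process.",
	})
	completed.Inc()

	// Push once before the process exits; the Pushgateway keeps the last value
	// around so Prometheus can still scrape it after the process is gone.
	if err := push.New("http://pushgateway:9091", "engineering_workflow").
		Collector(completed).
		Grouping("instance", "workflow-host-1").
		Push(); err != nil {
		log.Fatalf("push to Pushgateway failed: %v", err)
	}
}
```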

@jotak

jotak commented Feb 21, 2020

Hi @phxnsharp,
Can you clarify your running environment a bit more? Are you using some Prometheus service discovery or not? For instance, if you're in Kubernetes, the k8s service discovery would probably do what you expect by dynamically scraping whichever replica pods it finds.
Maybe you can share your Prometheus scrape config?

@jotak

jotak commented Feb 21, 2020

Sorry, reading again, I understand the issue better: it's a problem of metric cardinality. Basically you don't want n*m metrics; you'd like to get rid of the replica instance label, but you cannot do that on the exposed metrics because the scrapes would then overwrite each other's replica datapoints instead of summing them. Correct?

I'm not sure if Prometheus provides something for post-scraping / pre-ingestion aggregation (basically: remove the replica label before storing), but this is certainly something you could do yourself: instead of having every replica exposed to the scraper, you could have them expose (or push) to a "bridge" of your own, which performs the aggregation you want and re-exposes the consolidated data to Prometheus.

See for example this: https://blog.codeship.com/monitoring-your-synchronous-python-web-applications-using-prometheus/ - I think it solves a very similar problem. It uses statsd as a bridge.
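
As a hedged illustration of that bridge pattern (metric and label names are hypothetical): each collector replica would push statsd-style metrics to a single statsd_exporter, whose mapping config re-exposes them to Prometheus under one consolidated scrape target.

```yaml
# Hypothetical statsd_exporter mapping config: replicas push e.g.
# "workflow.<operation>.span.duration" timers, the exporter exposes them
# to Prometheus as one consolidated metric with an "operation" label.
mappings:
  - match: "workflow.*.span.duration"
    name: "workflow_span_duration_seconds"
    labels:
      operation: "$1"
```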

PS: Also, without a bridge, Prometheus relabelling is probably what comes closest to a solution for this kind of problem, but I'm not certain you can perform whatever aggregation you want on values when dropping a label - maybe that should be investigated too: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
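
For reference, a hedged sketch of what that relabelling looks like, assuming the plugin exposed a hypothetical collector_replica label (ports and job name are also assumptions): a labeldrop rule in metric_relabel_configs removes the label before ingestion, but, as noted above, relabelling only rewrites labels; it does not sum the series that then collide.

```yaml
scrape_configs:
  - job_name: "jaeger-span-metrics"
    static_configs:
      - targets: ["collector-0:2112", "collector-1:2112"]
    metric_relabel_configs:
      # Drop the per-replica label before storage. Caveat: series from different
      # replicas that become identical after the drop will overwrite each other;
      # relabelling does not aggregate their values.
      - action: labeldrop
        regex: "collector_replica"
```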

@phxnsharp
Author

phxnsharp commented Feb 21, 2020

@yurishkuro I created this issue at the request of Pavol Loffay @pavolloffay over on the Gitter Jaegertracing lobby. He asked me to document the request to have metrics collected server-side instead of client-side (some metrics are available via the jaegertracing client library).

@jotak To be clear, the n*m problem annoys me, but at this point it is not a blocker for me and not the purpose of this issue. This issue is about asking for server-side metrics. The n*m problem is an unfortunate side effect that I'm happy to live with for now in order to achieve my other goals.

The other solutions you have mentioned would of course work but have various pros and cons for this use case. For example, in your statsd example the statsd exporter becomes a single point of failure and requires setting up additional protocols, ports, and services. The approach described here is entirely scalable and re-uses the communication paths that Jaeger already provides.

Thanks to all for the help and discussion!

@jotak

jotak commented Feb 28, 2020

hey @phxnsharp I just saw this article that might give you some more ideas, this time with federation: https://karlstoney.com/2020/02/25/federated-prometheus-to-reduce-metric-cardinality/

Basically the idea is:

  • Have a short-retention (TTL) Prometheus instance scrape the replicas with all label information (n * m series)
  • Have a recording rule aggregate the replica metrics (i.e. keep the source label and drop the replica label)
  • Have the master Prometheus scrape the /federate endpoint of that instance, with a longer retention
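
A hedged sketch of that setup, reusing the hypothetical workflow_spans_total metric from the earlier example; these are two separate configuration files.

```yaml
# 1) Rules file on the short-retention Prometheus: aggregate away the
#    collector-replica "instance" label, keeping only the source labels.
groups:
  - name: span_metrics_aggregation
    rules:
      - record: source:workflow_spans:rate5m
        expr: sum without (instance) (rate(workflow_spans_total[5m]))

# 2) Scrape config on the long-retention "master" Prometheus: federate only
#    the aggregated recording-rule series from the short-retention instance.
scrape_configs:
  - job_name: "federate-span-metrics"
    honor_labels: true
    metrics_path: "/federate"
    params:
      "match[]":
        - '{__name__=~"source:.*"}'
    static_configs:
      - targets: ["short-retention-prometheus:9090"]
```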

@phxnsharp
Author

@jotak That looks like a really good idea. Thank you!
