Any tips to speed up some Prometheus queries?

vitobotta · October 2024

I wonder if there are any Prometheus/Thanos/Grafana gurus here who could offer some advice.

Our current setup is two Prometheus instances with two shards each, fronted by Thanos, which is configured to use Wasabi for long-term storage.

Grafana uses Thanos for querying by default and all works beautifully.

However, some queries concerning web traffic still take a long time (several minutes), even with downsampling enabled (just by using the built-in $__interval variable).

For example, with an estimated 164 million requests in 30 days, a query for the per-second rate of requests with error-related status codes can still take 8 minutes, even with downsampling and only 200 data points.
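To put rough numbers on that, here is a minimal Python sketch of the arithmetic and of the range-query parameters such a panel might send. The metric name http_requests_total, the status label, the 5xx matcher and the end timestamp are assumptions for illustration, not the actual query; only the 30 days, 200 points and 164 million requests come from the figures above.

```python
from datetime import timedelta

def range_query_params(expr, start_ts, end_ts, max_points=200):
    """Build Prometheus-style query_range parameters (query, start, end, step)
    with a step sized so the range yields roughly max_points samples."""
    span = end_ts - start_ts
    step = max(1, span // max_points)  # seconds between returned points
    return {"query": expr, "start": start_ts, "end": end_ts, "step": f"{step}s"}

window = int(timedelta(days=30).total_seconds())  # 30 days in seconds
end_ts = 1_730_000_000                            # arbitrary example timestamp
start_ts = end_ts - window
step_s = window // 200                            # step for about 200 points

# Hypothetical metric and label names; the rate window is set equal to the
# step, similar in spirit to using an interval variable in the panel.
expr = f'sum(rate(http_requests_total{{status=~"5.."}}[{step_s}s]))'
params = range_query_params(expr, start_ts, end_ts)

# Average request rate implied by 164 million requests over the window.
avg_rps = 164_000_000 / window
```

With these inputs the step works out to 12960 seconds (3.6 hours), so each of the 200 points asks for a rate over a 3.6 hour window, and every raw or downsampled sample inside that window still has to be read for every matching series.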

Thanos query instances have 8 CPUs and 32 GB of RAM each, and each Prometheus instance also has 8 CPUs and 32 GB of RAM.

What am I doing wrong?

tentor · October 2024

I am not a Prometheus expert, but from what I've seen, Prometheus uses only one CPU thread for a heavy query.
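If that is the bottleneck, one general workaround is to split a long range into shorter sub-ranges that can be evaluated concurrently and merged afterwards; Thanos Query Frontend can split range queries by interval in a similar spirit. Below is a minimal sketch of the splitting step only. The function name and the one-day interval are hypothetical, and issuing the sub-queries and merging their results is not shown.

```python
def split_range(start_ts, end_ts, interval_s=86_400):
    """Return (start, end) sub-ranges of [start_ts, end_ts), each ending on
    an interval_s boundary except possibly the last one."""
    chunks = []
    cur = start_ts
    while cur < end_ts:
        # Next boundary strictly after cur, capped at the end of the range.
        nxt = min((cur // interval_s + 1) * interval_s, end_ts)
        chunks.append((cur, nxt))
        cur = nxt
    return chunks

# A 30-day range starting on a day boundary becomes 30 one-day sub-ranges.
day_chunks = split_range(0, 30 * 86_400)
```

Each sub-range could then be sent as its own range query, so that several CPU cores work on the 30-day window at once instead of one.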
