In this project, our main goal was to review and improve the viability of the client system’s infrastructure.
Challenge: Collecting and monitoring data smoothly
Besides using Prometheus as the main collector of monitoring data, we wanted to measure service performance so that we could predict and track bottlenecks. To do so, we worked with the development teams to implement a /metrics endpoint for each service. This approach let us collect more useful data from each application. To make that data even more detailed and traceable, we also added OpenTelemetry instrumentation to the services.
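To illustrate what a per-service /metrics endpoint can look like, here is a minimal sketch that uses only the Python standard library and the Prometheus text exposition format. The metric name `http_requests_total`, the helper names and the port are assumptions made for this example, not details of the client's services; in practice a client library such as `prometheus_client` would usually handle this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter state, keyed by (method, status).
REQUEST_COUNTS = {("GET", "200"): 0}


def record_request(method: str, status: str) -> None:
    """Increment the counter for one handled request."""
    key = (method, status)
    REQUEST_COUNTS[key] = REQUEST_COUNTS.get(key, 0) + 1


def render_metrics() -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, status), value in sorted(REQUEST_COUNTS.items()):
        lines.append(
            f'http_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; every other path returns 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start a blocking HTTP server that Prometheus can scrape (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` exposes the endpoint; Prometheus would then be pointed at `http://<host>:8000/metrics` through a scrape job.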
Solution: Implementing tools for data aggregation, visualization and alerting
A monitoring system isn't complete without aggregating logs and parsing them to extract business-logic information. That's why we suggested Fluent Bit as the log delivery tool, with output stored in an Elasticsearch cluster and archived in S3 storage. Crucially, adding different kinds of storage to the Elasticsearch cluster and applying a retention policy helped us cut costs.
Collecting the data is one thing; visualizing it is a different challenge. To address that, we recommended Grafana (well suited to plotting metrics) along with Kibana, a leading tool for browsing logs and visualizing their content on graphs.
We had the data and could monitor it, but we were still missing alerting. The client asked us to set up alerts via Opsgenie and deliver them to Slack. To make this more flexible, we decided to use Ansible to generate both Prometheus alert rules and Grafana dashboards. Thanks to that, adding a new application to the customer stack no longer requires manual work on the monitoring side.
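The idea of generating alert rules from a list of services can be sketched as follows. The project used Ansible for this; the example below swaps in plain Python string templating purely for illustration. The alert name, the metric `http_requests_total`, the 5% threshold and the service names are all assumptions, and service names are assumed to be simple identifiers that need no YAML escaping.

```python
from textwrap import indent

# Hypothetical rule template: fires when the 5xx ratio of a service
# stays above the threshold for 10 minutes.
RULE_TEMPLATE = """\
- alert: HighErrorRate
  expr: |
    sum(rate(http_requests_total{{job="{service}",status=~"5.."}}[5m]))
      /
    sum(rate(http_requests_total{{job="{service}"}}[5m])) > {threshold}
  for: 10m
  labels:
    severity: critical
    service: {service}
  annotations:
    summary: "High 5xx ratio on {service}"
"""


def render_rule_file(services, threshold=0.05):
    """Return the text of a Prometheus rule file with one alert per service."""
    rules = "".join(
        RULE_TEMPLATE.format(service=name, threshold=threshold) for name in services
    )
    header = "groups:\n- name: generated-service-alerts\n  rules:\n"
    # Nest the rendered rules under the "rules:" key.
    return header + indent(rules, "  ")


if __name__ == "__main__":
    # "checkout" and "search" are placeholder service names.
    print(render_rule_file(["checkout", "search"]))
```

With this shape, onboarding a new application amounts to adding its name to the service list and regenerating the file, which is the property the Ansible-based setup was after.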
Results in numbers:
- gathering monitoring data from 5 Kubernetes clusters
- calculating SLOs based on SLIs for more than 40 services
- aggregating 20 GB of logs daily
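As a rough illustration of how an SLO can be evaluated from an SLI, the sketch below computes an availability SLI as the ratio of good to total requests and then the share of the error budget that remains. The function names, the request counts and the 99.5% target are assumptions for the example, not figures from the project.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        # Assumption: with no traffic, treat the service as fully available.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = 1.0 - slo_target   # allowed failure ratio
    burned = 1.0 - sli          # observed failure ratio
    return 1.0 - burned / budget


if __name__ == "__main__":
    sli = availability_sli(good_requests=9_990, total_requests=10_000)
    remaining = error_budget_remaining(sli, slo_target=0.995)
    print(f"SLI={sli:.4f}, SLO met={sli >= 0.995}, budget left={remaining:.0%}")
```

In a Prometheus-based setup the two request counts would typically come from a query over a rolling window rather than from hard-coded numbers.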
Prometheus, Grafana, Elasticsearch, Kibana, Fluent Bit, OpenTelemetry