What is Observability in Kyma?
Out of the box, Kyma provides tools to collect and ship telemetry data using the Telemetry Module. Of course, you'll want to view and analyze the data you're collecting. This is where observability tools come in.
Data collection
Kyma collects telemetry data with the following in-cluster components:
- Fluent Bit collects logs, provided by the Telemetry Module.
- An OTel Collector collects traces, provided by the Telemetry Module.
The collected telemetry data are exposed so that you can view and analyze them with observability tools.
NOTE: Kyma's telemetry component supports providing your own output configuration for your application's logs and traces. With this, you can connect your own observability systems inside or outside the Kyma cluster with the Kyma backend.
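For example, a custom trace backend can be wired up with a TracePipeline resource. The following is a minimal sketch, assuming the Telemetry module's v1alpha1 API and a hypothetical in-cluster OTLP endpoint; check the Telemetry module reference for the exact fields your module version supports:

```yaml
# Sketch of a TracePipeline that ships traces to a custom OTLP backend.
# The endpoint URL is a hypothetical example; point it at your own collector.
apiVersion: telemetry.kyma-project.io/v1alpha1
kind: TracePipeline
metadata:
  name: custom-backend
spec:
  output:
    otlp:
      endpoint:
        value: http://my-otel-collector.observability.svc.cluster.local:4317
```

A LogPipeline resource works analogously for application logs, with its own set of supported outputs.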
Data analysis
You can use the following in-cluster components to observe your applications' telemetry data:
- Prometheus, a lightweight backend for metrics.
NOTE: The Prometheus integration has been deprecated and is planned to be removed.
- Grafana, which provides dashboards and a query editor to visualize the metrics collected by Prometheus.
NOTE: The Grafana integration has been deprecated and is planned to be removed.
Monitoring
NOTE: Prometheus and Grafana are deprecated and are planned to be removed. If you want to install a custom stack, take a look at Install a custom kube-prometheus-stack in Kyma.
Overview
For in-cluster monitoring, Kyma uses Prometheus as the open source monitoring and alerting toolkit that collects and stores metrics data. This data is consumed by several addons, including Grafana for analytics and monitoring, and Alertmanager for handling alerts.
Monitoring in Kyma is configured to collect all metrics relevant for observing the in-cluster Istio Service Mesh. For diagrams of the default setup and the monitoring flow including Istio, see Monitoring Architecture.
Learn how to enable Grafana visualization and enable mTLS for custom metrics.
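To give an idea of what scraping custom metrics over Istio mTLS involves, the following sketch uses the prometheus-operator ServiceMonitor API with scraping over HTTPS and Istio-provisioned certificates. All names, labels, and certificate paths are placeholders and depend on how the certificates are mounted into your Prometheus instance; follow the linked guide for the exact setup:

```yaml
# Sketch of a ServiceMonitor that scrapes a workload inside the Istio service mesh.
# All names, labels, and certificate paths are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-metrics
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http-metrics
      scheme: https             # scrape through the workload's Istio sidecar using mTLS
      tlsConfig:
        caFile: /etc/prometheus/secrets/istio.default/root-cert.pem    # placeholder paths
        certFile: /etc/prometheus/secrets/istio.default/cert-chain.pem
        keyFile: /etc/prometheus/secrets/istio.default/key.pem
        insecureSkipVerify: true  # workload certificates are not issued for Pod IPs
```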
Limitations
In the production profile, Prometheus stores up to 15 GB of data for a maximum period of 30 days. If the default size or time is exceeded, the oldest records are removed first. The evaluation profile has lower limits. For more information about profiles, see Install Kyma: Choose resource consumption.
The configured memory limits of the Prometheus and Prometheus-Istio instances define the number of time series samples that can be ingested.
The default resource configuration of the monitoring component in the production profile is sufficient to serve 800K time series in the Prometheus Pod, and 400K time series in the Prometheus-Istio Pod. The samples are deleted after 30 days or when reaching the storage limit of 15 GB.
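If you replace the built-in stack with a custom kube-prometheus-stack (see the note above), you can set equivalent limits yourself. A minimal sketch of the corresponding Helm values, assuming the kube-prometheus-stack chart layout and reusing the production-profile defaults described here:

```yaml
# values.yaml sketch for a custom kube-prometheus-stack installation.
# The retention values mirror Kyma's production-profile defaults; tune them to your cluster.
prometheus:
  prometheusSpec:
    retention: 30d        # delete samples older than 30 days
    retentionSize: 15GB   # or when the TSDB exceeds 15 GB, whichever comes first
    resources:
      limits:
        memory: 4Gi       # placeholder; the memory limit caps how many time series can be ingested
```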
The amount of generated time series in a Kyma cluster depends on the following factors:
- Number of Pods in the cluster
- Number of Nodes in the cluster
- Amount of exported (custom) metrics
- Label cardinality of metrics
- Number of buckets for histogram metrics
- Frequency of Pod recreation
- Topology of the Istio Service Mesh
You can check the number of ingested time series samples with the prometheus_tsdb_head_series metric, which is exported by Prometheus itself. Furthermore, you can identify expensive metrics on the TSDB Status page.
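To get notified before the limit is reached, you could alert on that metric. The following is a minimal sketch using the prometheus-operator PrometheusRule API; the threshold, names, and labels are illustrative, and the rule must carry whatever labels your Prometheus instance's ruleSelector expects:

```yaml
# Sketch of an alert on the number of ingested time series.
# Threshold, name, namespace, and labels are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: tsdb-cardinality
  namespace: kyma-system
spec:
  groups:
    - name: tsdb-cardinality
      rules:
        - alert: HighTimeSeriesCount
          expr: prometheus_tsdb_head_series > 700000   # warn below the ~800K the production profile is sized for
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: Prometheus is approaching the number of time series it is sized for.
```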
Telemetry
This page has moved to the Telemetry - Logs section.
Useful links
If you're interested in learning more about the Observability area, check out these links:
- Learn how to set up the Monitoring Flow for your services in Kyma.
- Install a custom Loki stack.
- Install a custom Jaeger stack.
- Install a custom Prometheus stack.
- To collect and ship workload metrics to an OTLP endpoint, see Install an OTLP-based metrics collector.
- Learn how to access and expose the services Grafana, Jaeger, and Kiali.
- Troubleshoot Observability-related issues.
- Understand the architecture of Kyma's monitoring, logging, and tracing components.
- Find the configuration parameters for Monitoring.
- Deploy Kiali to a Kyma cluster.