quarkus-workshop/docs/monitoring.adoc

= Monitoring Quarkus Apps using MicroProfile Metrics
:experimental:

This exercise demonstrates how your Quarkus application can utilize the https://github.com/eclipse/microprofile-metrics[MicroProfile Metrics,window=_blank] specification through the SmallRye Metrics extension.

MicroProfile Metrics allows applications to gather various metrics and statistics that provide insights into what is happening inside the application. They ey serve to pinpoint issues, provide long term trend data for capacity planning and pro-active discovery of issues (e.g. disk usage growing without bounds). Metrics can also help those scheduling systems decide when to scale the application to run on more or fewer machines.

The metrics can be read remotely using JSON format or the https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format[OpenMetrics text format,window=_blank], so that they can be processed by additional tools such as Prometheus, and stored for analysis and visualisation. You can then use tools like http://prometheus.io[Prometheus,window=_blank] and http://grafana.com[Grafana,window=_blank] to collect and display metrics for your Quarkus apps.

== Install Prometheus

First, let's install Prometheus. Prometheus is an open-source systems monitoring and alerting toolkit featuring:

* a multi-dimensional https://prometheus.io/docs/concepts/data_model/[data model,window=_blank] with time series data identified by metric name and key/value pairs
* https://prometheus.io/docs/prometheus/latest/querying/basics/[PromQL,window=_blank], a flexible query language to leverage this dimensionality
* time series collection happens via a pull model over HTTP

To install it, first create a Kubernetes ConfigMap that will hold the Prometheus configuration. In the Terminal, run the following:

[source,sh,role="copypaste"]
----
oc create configmap prom --from-file=prometheus.yml=src/main/kubernetes/prometheus.yml
----

This will create a ConfigMap using the contents of the `src/main/kubernetes/prometheus.yml` file in your project (we've created this file for you). It contains basic Prometheus configuration, plus a specific _target_ which instructs it to look for application metrics from both Prometheus itself, and our `people` app, on HTTP port 8080 at the `/metrics` endpoint. Here's a snippet of that file:

[source,yml]
----
scrape_configs:
  - job_name: 'prometheus' # <1>
    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'people_app' # <2>
    static_configs:
    - targets: ['people:8080']
----
<1> Configures Prometheus to scrape metrics from itself
<2> COnfigures Prometheus to scrape metrics from Kubernetes service `people` on port `8080` with HTTP, at the default metrics endpoint of `/metrics`

Next, deploy and expose Prometheus using its public Docker Hub image:

[source,sh,role="copypaste"]
----
oc new-app prom/prometheus && oc expose svc/prometheus
----

And finally, mount the ConfigMap into the running container:

[source,sh,role="copypaste"]
----
oc set volume dc/prometheus --add -t configmap --configmap-name=prom -m /etc/prometheus/prometheus.yml --sub-path=prometheus.yml
----

This will cause the contents of the ConfigMap's `prometheus.yml` data to be mounted at `/etc/prometheus/prometheus.yml` where Prometheus is expecting it.

Verify Prometheus is up and running:

[source,sh,role="copypaste"]
----
oc rollout status -w dc/prometheus
----

You should see `replication controller "prometheus-2" successfully rolled out`.

== Add Metrics to Quarkus

Like other exercises, we'll need another extension to enable metrics. Install it with:

[source,sh,role="copypaste"]
----
mvn quarkus:add-extension -Dextensions="metrics"
----

This will add the necessary entries in your `pom.xml` to bring in the Metrics capability. It will import the `smallrye-metrics` extension which is an implementation of the MicroProfile Metrics specification used in Quarkus.

== Start the app

Next, Run the app once more by using the command palette and select **Start Live Coding** to start the app up in dev local mode.

== Test Metrics endpoint

You will be able to immediately see the raw metrics generated from Quarkus apps. Run this in the Terminal:

[source,sh,role="copypaste"]
----
curl http://localhost:8080/metrics
----

You will see a bunch of metrics in the OpenMetrics format:

[source, none]
----
# HELP base:jvm_uptime_seconds Displays the time from the start of the Java virtual machine in milliseconds.
# TYPE base:jvm_uptime_seconds gauge
base:jvm_uptime_seconds 5.631
# HELP base:gc_ps_mark_sweep_count Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
# TYPE base:gc_ps_mark_sweep_count counter
base:gc_ps_mark_sweep_count 2.0
----

This is what Prometheus will use to access and index the metrics from our app when we deploy it to the cluster.

== Add additional metrics

Out of the box, you get a lot of basic JVM metrics which are useful, but what if you wanted to provide metrics for your app? Let's add a few using the MicroProfile Metrics APIs.

Open the `GreetingService` class (in the `org.acme.people.service` package). Let's add a metric to count the number of times we've greeted someone. Add the following annotation to the `greeting()` method:

[source,java,role="copypaste"]
----
@Counted(name = "greetings", monotonic = true, description = "How many greetings we've given.")
----

(You'll need to import the new `Counted` class using _Assistant > Organize Imports_).

Trigger a greeting:

[source,sh,role="copypaste"]
----
curl http://localhost:8080/hello/greeting/quarkus
----

And then access the metrics again, this time we'll look for our new metric, specifying a scope of _application_ in the URL so that only metrics in that _scope_ are returned:

[source,sh,role="copypaste"]
----
curl http://localhost:8080/metrics/application
----

You'll see:

[source, none]
----
# HELP application:org_acme_people_service_greeting_service_greetings How many greetings we've given.
# TYPE application:org_acme_people_service_greeting_service_greetings counter
application:org_acme_people_service_greeting_service_greetings 1.0
----

This shows we've accessed the greetings once (`1.0`). Repeat the `curl` greeting a few times and then access metrics again, and you'll see the number rise.

[NOTE]
====
The comments in the metrics output starting with `#` are part of the format and give human-readable descriptions to the metrics which you'll see later on.
====

[NOTE]
====
In the OpenMicroProfile Metrics names are prefixed with things like `vendor:` or `application:` or `base:`. These _scopes_ can be selectively accessed by adding the name to the accessed endpoint, e.g. `curl http://localhost:8080/metrics/application` or `curl http://localhost:8080/metrics/base`.
====

== Add a few more

Let's add a few more metrics for our Kafka stream we setup in the previous exercise. Open the `NameConverter` class (in the `org.acme.people.stream` package), and add these metrics annotations to the `process()` method:

[source,java,role="copypaste"]
----
@Counted(name = "convertedNames", monotonic = true, description = "How many names have been converted.") // <1>
@Timed(name = "converter", description = "A measure how long it takes to convert names.", unit = MetricUnits.MILLISECONDS) // <2>
----
<1> This metric will count the number of times this method is called
<2> This metric will measure how long it takes the method to run

Don't forget to import the correct classes as before.

== Rebuild Executable JAR

Now we are ready to run our application on the cluster and look at the generated metrics. Using the command palette, select **Create Executable JAR**. You should see a bunch of log output that ends with a `SUCCESS` message.

== Deploy

Let's deploy our app to the cluster and see if Prometheus picks up our metrics! To do this, start the container build using our executable JAR:

[source,sh,role="copypaste"]
----
oc start-build people --from-file target/*-runner.jar --follow
----

== Confirm deployment

Run and wait for the app to complete its rollout:

[source,sh,role="copypaste"]
----
oc rollout status -w dc/people
----

== Test

You'll need to trigger the methods that we've instrumented, so first run this command to get the URL to the word cloud page we previously created, and then open it in your browser, which will start producing names (and generating metrics):

[source,sh,role="copypaste"]
----
echo http://$(oc get route people -o=go-template --template={% raw %}'{{ .spec.host }}'{% endraw %})/names.html
----

Within about 15-30 seconds, Prometheus should start scraping the metrics. Run this command to output the URL to the Prometheus GUI:

[source,sh,role="copypaste"]
----
echo http://$(oc get route prometheus -o=go-template --template={% raw %}'{{ .spec.host }}'{% endraw %})
----

Open a separate browser tab and navigate to that URL. This is the Prometheus GUI which lets you issue queries to retrieve metrics Prometheus has gathered. Start typing in the query box to look for 'acme':

[NOTE]
====
If you do not see any `acme` metrics when querying, wait 15 seconds, reload the Prometheus page, and try again. They will eventually show up!
====

image:prom.png[Prometheus,800]

These are the metrics exposed by our application, both raw numbers (like number of converted names in the `application:org_acme_people_stream_name_converter_converted_names` metric) along with quantiles of the same data across different time periods (e.g. `application:org_acme_people_stream_name_converter_converter_rate_per_second`).

Select `application:org_acme_people_stream_name_converter_converted_names` in the box, and click **Execute**. This will fetch the values from our metric showing the number of converted names:

image:promnames.png[names,800]

Click the **Graph** tab to see it visually, and adjust the time period to `5m`:

image:promg1.png[names,800]

Cool! You can try this with some of the JVM metrics as well, e.g. try to graph the `process_resident_memory_bytes` to see how much memory our app is using over time:

image:promg2.png[names,800]

Of course Quarkus apps use very little memory, even for apps stuffed with all sorts of extensions and code.

== Visualizing with Grafana

https://grafana.com/[Grafana,window=_blank] is commonly used to visualize metrics and provides a flexible, graphical frontend which has support for Prometheus (and many other data sources) and can display https://prometheus.io/docs/visualization/grafana/[customized, realtime dashboards,window=_blank]:

image::https://grafana.com/api/dashboards/3308/images/2099/image[Grafana dashboard,800]

Let's create a Grafana Dashboard for our Quarkus App!

== Install Grafana

Run the following command to deploy Grafana to our cluster:

[source,sh,role="copypaste"]
----
oc new-app grafana/grafana && oc expose svc/grafana
----

Verify Grafana is up and running:

[source,sh,role="copypaste"]
----
oc rollout status -w dc/grafana
----

You should see `replication controller "grafana-1" successfully rolled out`.

== Open Grafana Dashboard

Obtain the URL to the Grafana dashboard using this command:

[source,sh,role="copypaste"]
----
echo http://$(oc get route grafana -o=go-template --template={% raw %}'{{ .spec.host }}'{% endraw %})
----

Open that URL in your browser, and login using the default credentails:

* Username: `admin`
* Password: `admin`

image::graflogin.png[grafana,700]

At the password change prompt, use any password you wish.

== Add Prometheus as a data source

You'll land on the Data Source screen. Click **Add Data Source**, and select **Prometheus** as the _Data Source Type_. In the URL box, type `http://prometheus:9090` (this is the hostname and port of our running Prometheus in our namespace):

image::grafds.png[datasource, 700]

Click **Save and Test**. You should see:

image::grafworking.png[working, 500]

With our data source working, let's make a dashboard.

== Create Dashboard

Hover over the `+` button on the left, and select _Create > Dashboard_:

image::grafcreate.png[create, 600]

This will create a new dashboard with a single Panel. Each Panel can visualize a computed metric (either a single metric, or a more complex query) and display the results in the Panel.

Click **Add Query**. In the Query box, type `acme` to again get an autocompleted list of available metrics from our app:

image::grafquery.png[query,600]

Choose the first one `application:org_acme_people_stream_name_converter_converted_names`. The metrics should immediately begin to show in the graph above:

image::grafgraf.png[graf,800]

Next click on the _Visualization_ tab on the left:

image::grafvis.png[graf,800]

This lets you fine tune the display, along with the type of graph (bar, line, gauge, etc). Leave them for now, and click on the _General_ tab. Change the name of the panel to `Converted Names`.

image::grafgen.png[graf,800]

There is an _Alerts_ tab you can configure to send alerts (email, etc) when conditions are met for this and other queries. We'll skip this for now.

Click the _Save_ icon at the top to save our new dashboard (you can enter a change comment if you want):

image::grafsave.png[graf,800]

Give your new dashboard a name and click **Save**.

== Add more Panels

See if you can add additional Panels. Use the **Add Panel** button to add a new Panel:

image::grafmorepanels.png[graf,800]

Follow the same steps as before to create a few more panels, and **don't forget to Save each panel when you've created it.**

Add Panels for:

* The different quantiles of time it takes to process names `application:org_acme_people_stream_name_converter_converter_seconds` (configure it to _stack_ its values on the _Visualization_ tab, and name it "Converter Performance" on the _General_ tab).
* The JVM RSS Value `process_resident_memory_bytes` (set the visualization type to `Gauge` and the Field Units to `bytes` on the _Visualization_ tab, and the title to `Memory` on the _General_ tab.

image::grafjvm.png[jvm,500]

== Fix layout

After saving, go back to the main dashboard (click on **My Dashboard** at the top and then select it under _Recent Dashboards_). Change the time value to _Last 30 Minutes_ at the top-right:

image::graftime.png[time,500]

Finally, move the _Converted Names_ Dashboard to the right of the _Converter Performance_ by dragging its title bar to the right, and then expand the memory graph to take up the full width.

Click **Save Dashboad** again to save it. Your final Dashboard should look like:

image::graffinal.png[final,500]

Beautiful, and useful! You can add many more metrics to monitor and alert for Quarkus apps using these tools.

== Cleanup

Go to the first Terminal tab and press kbd:[CTRL+C] to stop our locally running app (or close the Terminal window).

== Congratulations!

This exercise demonstrates how your Quarkus application can utilize the https://github.com/eclipse/microprofile-metrics[MicroProfile Metrics,window=_blank] specification through the SmallRye Metrics extension. You also consumed these metrics using a popular monitoring stack with Prometheus and Grafana.

There are many more possibilities for application metrics, and it's a useful way to not only gather metrics, but act on them through alerting and other features of the monitoring stack you may be using.