Monosoul's Dev Blog A blog to write down dev-related stuff I face
Who stole my Spring Boot system metrics?!

Recently I’ve spent a lot of time making my team’s Grafana dashboards great again. It was all nice & fun adding domain-specific metrics to the dashboard until I realized that I couldn’t find any values for Spring Boot system metrics like CPU usage or memory usage. What happened to them? Who stole my system metrics?! Let’s find out.

It’s been quite a while since the last post here, isn’t it? Life can be tough in Berlin. One moment you’re just living your life and the next thing you know – you got to find a new apartment, then you have a month-long vacation and – boom! – you got too lazy to write a new blog post. But anyway, here I am, doing it.

The setup

Let’s start by looking at the setup we have.

On the metering side we have:

  • Prometheus to collect the metrics.
  • Grafana to show them on dashboards.

On the app side we have:

  • OpenJDK 11 as Java runtime.
  • Kotlin as the language of our team’s choice.
  • Spring Boot as the main magician in the app. 🙂
  • Micrometer as the metering API, together with its Prometheus-specific implementation.

The problem

So, like I mentioned before, the problem I was having was that the system metrics for the service were gone. Take process.cpu.usage as an example.

This is what I saw when I was querying this metric without any filter:

System Metrics provided by Spring Boot are there for other services

And this is what I saw when I added a filter by service name:

System Metrics provided by Spring Boot are missing in Prometheus when filtering by the service name

Nothing. It was nothing and it was troubling me.

My thought process was like this:

  • I knew that there were values for this metric for other services, so I was pretty sure it’s not a Prometheus issue.
  • Values for the domain-specific metrics that we register in the code were there, so it shouldn’t be a communication problem between the service and Prometheus. Prometheus was clearly able to collect the data.
  • Could it be a misconfiguration in our service’s application.yml?
  • Could it be that a dependency on Spring Boot Actuator, which is responsible for automatic system metric registration, was missing?

So, I knew that the problem was definitely not on the Prometheus end, but somewhere on our end. And I had a few hypotheses on what it could be. Let’s check them out one by one!

Hypothesis #1. Bad application.yml

I started by checking our application.yml file for anything metrics-related that might catch my attention. At first glance everything looked fine:

application.yml

More or less it was configured according to this documentation. The two main differences were:

  • root path that was used as the base path for the management endpoints;
  • prometheus endpoint was mapped to /metrics path.

Was that enough to break something? I didn’t think so. But the next thing I did was start the service locally to check the /metrics endpoint output.

System Metrics provided by Spring Boot are missing

Some of the system metrics were clearly there: datasource-related, like HikariCP connections usage; JDBC connections; resilience4j metrics. But not a single one with “cpu” in it. Once again I was asking myself: who stole my system metrics?!

Hypothesis #2. A missing dependency

This was my next guess. I checked the service’s classpath. Obviously, both spring-boot-actuator and micrometer-core were there. micrometer-registry-prometheus was also there, otherwise I would not have seen any metrics at all. Was there something else I was missing?

I decided to resort to the world’s best troubleshooting guide – Google! After a few minutes of googling for various combinations of the words “Spring”, “Boot” and “process.cpu.usage”, I found this answer on StackOverflow. And even though it wasn’t an answer to my question, it had a link to the class responsible for the CPU usage metric registration – ProcessorMetrics. At least it was something to keep the investigation going.

Hypothesis #3. A bug in Spring Boot

So, I had a look at ProcessorMetrics class. It has a method bindTo() that seemingly was doing exactly what I needed – registering the CPU-related meters.

ProcessorMetrics.java::bindTo

So, the method should have registered a gauge for metric process.cpu.usage, but it wasn’t there. I decided to search for usages of this class.

What I found was SystemMetricsAutoConfiguration, an autoconfiguration class that was pretty much just registering a bean for ProcessorMetrics.

SystemMetricsAutoConfiguration.java

The configuration itself had a few conditional annotations, but I was sure that all conditions were satisfied. Also, this configuration was just registering the bean, but it wasn’t calling bindTo() method, so I was looking for something else.

Searching for usages of bindTo() led me to MeterRegistryConfigurer.

MeterRegistryConfigurer.java

I was on the right track, as it was definitely the class that was supposed to bind the ProcessorMetrics instance to the MeterRegistry instance. But its configure() method was supposed to be called from somewhere else, right?!

And so I searched for usages of that method next. I was untangling the call chain step by step.

The method had only one call site. It was used in MeterRegistryPostProcessor.

MeterRegistryPostProcessor.java

MeterRegistryPostProcessor is a bean post processor. You can learn more about different kinds of post processors in Spring from my previous blog posts: How to customize dependency injection in Spring Part 1 and Part 2.

So, this class was supposed to be applied to MeterRegistry and bind the processor metrics to it. But clearly it wasn’t doing that.

I even ran it in debug mode with a breakpoint inside the if statement body, but I didn’t get there even once, even though the post processor was definitely in the context and the meter registry bean was there too (remember, we have other metrics that work).

So, my next big guess was that there’s a bug in the version of Spring Boot that we use. It wouldn’t have been the first time I faced a bug in Spring Boot, so I wouldn’t have been surprised. In our service we depend on Spring Boot version 2.3.7, and the first thing I did was check whether there was a newer patch version in the 2.3.x line. The latest version available there was 2.3.9.

I changed the version to 2.3.9, ran the service locally again and called the /metrics endpoint. To my surprise the picture I saw was the same as before – there were no metrics with “cpu” in their name.

System Metrics provided by Spring Boot are missing

I was pretty sure that if it had been a bug, it would’ve been fixed by then. There was no chance that we were the only ones affected by it.

Hypothesis #4. Something is messed up in our Spring context

At that point I knew it was not an issue with Prometheus, not an issue with our config and not a bug in Spring Boot. My only guess at that point was that something was messing with the Spring context in our service. But what was that? How would I find it?

I resorted to Google once again. This time I was trying to find any mentions of MeterRegistryPostProcessor not working. Once again, Google didn’t let me down and didn’t give me up. I found an issue on GitHub with a problem very similar to the one I had. Andy Wilkinson there was pointing out that there were messages in the logs saying that a bean “is not eligible for post-processing”. I checked my service logs and found quite a few of these messages. But most importantly, I found a message saying that the meter registry is not eligible for post processing!

Spring Boot can not apply post processors to the meter registry

Gotcha! But what can cause that? What can make a bean not eligible for post processing?

A bean becomes not eligible for post processing if it gets instantiated before Spring has a chance to apply post processors. But what can cause that? One possible reason is that there’s a post processor implementation that depends on such a bean. Since post processors are among the first beans to be instantiated by Spring, the dependencies of post processors have to be instantiated even before that! And because of that, post processors that are not ready yet can never be applied to such beans!
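
To make the ordering problem easier to see, here is a toy model in plain Java. It is a sketch, not Spring’s API and not code from our service: ToyContainer, ToyMeterRegistry and the bean names meterBinder and customProcessor are all made up for illustration, and the startup sequence is heavily simplified (all post processors get created first and only then become active).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy bean container for illustration only; this is NOT Spring's API. */
final class ToyContainer {

    /** Gets a chance to touch every bean created after it became active. */
    interface PostProcessor {
        Object process(String name, Object bean);
    }

    /** Stand-in for a meter registry: it only remembers which meters were bound to it. */
    static final class ToyMeterRegistry {
        final List<String> meters = new ArrayList<>();
    }

    private final Map<String, Function<ToyContainer, Object>> definitions = new LinkedHashMap<>();
    private final Map<String, Object> singletons = new LinkedHashMap<>();
    private final List<PostProcessor> activeProcessors = new ArrayList<>();
    private boolean creatingProcessors = false;

    /** Names of regular beans that were created too early to be post-processed. */
    final List<String> notEligible = new ArrayList<>();

    void define(String name, Function<ToyContainer, Object> factory) {
        definitions.put(name, factory);
    }

    Object getBean(String name) {
        Object existing = singletons.get(name);
        if (existing != null) {
            return existing;
        }
        Object bean = definitions.get(name).apply(this);
        if (creatingProcessors && !(bean instanceof PostProcessor)) {
            notEligible.add(name); // a regular bean dragged in by a post processor
        }
        for (PostProcessor processor : activeProcessors) {
            bean = processor.process(name, bean);
        }
        singletons.put(name, bean);
        return bean;
    }

    /** Simplified startup: create every post processor first, then all other beans. */
    void refresh(List<String> processorNames) {
        creatingProcessors = true;
        List<PostProcessor> created = new ArrayList<>();
        for (String name : processorNames) {
            created.add((PostProcessor) getBean(name));
        }
        creatingProcessors = false;
        activeProcessors.addAll(created);
        for (String name : new ArrayList<>(definitions.keySet())) {
            getBean(name);
        }
    }

    /** Builds a container where the custom post processor may or may not need the registry. */
    static ToyContainer start(boolean processorNeedsRegistry) {
        ToyContainer container = new ToyContainer();
        container.define("meterRegistry", ctx -> new ToyMeterRegistry());
        // Plays the role of MeterRegistryPostProcessor: binds a meter to the registry.
        container.define("meterBinder", ctx -> (PostProcessor) (name, bean) -> {
            if (bean instanceof ToyMeterRegistry) {
                ((ToyMeterRegistry) bean).meters.add("process.cpu.usage");
            }
            return bean;
        });
        // Plays the role of a custom post processor with an injected dependency.
        container.define("customProcessor", ctx -> {
            if (processorNeedsRegistry) {
                ctx.getBean("meterRegistry"); // forces early creation of the registry
            }
            return (PostProcessor) (name, bean) -> bean;
        });
        container.refresh(List.of("meterBinder", "customProcessor"));
        return container;
    }

    public static void main(String[] args) {
        ToyMeterRegistry early = (ToyMeterRegistry) start(true).getBean("meterRegistry");
        ToyMeterRegistry onTime = (ToyMeterRegistry) start(false).getBean("meterRegistry");
        System.out.println("processor depends on registry: " + early.meters);
        System.out.println("no such dependency:            " + onTime.meters);
    }
}
```

In this model the registry that gets pulled in by customProcessor ends up with no meters at all and lands in the notEligible list, while the registry created at the usual time gets process.cpu.usage bound to it. The real container is more sophisticated, but the ordering trap is the same.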

And I knew we had at least one bean post processor with dependencies…

The bean post processor

One of the libraries that we use in our service provides us with a very specific instance of RestTemplate. We have to use that rest template instance to connect to a third party service. We weren’t quite happy with the exact configuration of that instance, and wanted to make some changes to it. What do you do when you have a bean definition that you cannot change? Right, you implement BeanPostProcessor to customize the bean after it has been instantiated.

To be more specific, we were changing the URI template handler implementation in that bean. But by doing that we were breaking URI template handler customization done automatically by MetricsRestTemplateCustomizer. This is a class that adds some common metrics to every rest template in the context, like call time, count etc. So our bean post processor implementation was depending on MetricsRestTemplateCustomizer bean to do the customization again.

This is what it looked like:

BadRestTemplatePostProcessor.kt

So, it all went wrong like a chain reaction. BadRestTemplatePostProcessor depends on MetricsRestTemplateCustomizer, causing its premature instantiation. And MetricsRestTemplateCustomizer depends on MeterRegistry (among other dependencies).

MetricsRestTemplateCustomizer.java

Because of that, MeterRegistry was also instantiated prematurely, making it not eligible for post processing, even by MeterRegistryPostProcessor.

Okay, I figured that out. But what’s next? I still needed a way to customize the rest template, but without breaking the system metrics.

The solution

At that point I had at least 2 options. Which one to pick depends on whether there’s a Java config class available for the bean or not.

In my case the rest template bean was defined with a Java config class.

With Java Config

The configuration class for the rest template was providing a method to get a new instance of the rest template. So I could instantiate the configuration class myself, and then call that method to create a new instance.

BadRestTemplateReplacementConfiguration.kt

This is what I do here:

  1. Define a new RestTemplate bean called badRestTemplateReplacement using BadRestTemplateConfig().badRestTemplate() call;
  2. Define a bean implementing BeanDefinitionRegistryPostProcessor;
  3. Remove the original bean definition of the bad rest template;
  4. Register the bean definition for badRestTemplateReplacement bean as badRestTemplate.

The idea here is to make sure that any class that depends on a bean with name badRestTemplate, will still keep working without any changes while using the new bean I defined in the config.
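
The swap itself can be pictured with a small plain-Java model. This is only a sketch of the idea and not Spring’s API: ToyDefinitionRegistry is a made-up class, the beans are plain strings, and whether the replacement keeps its own name after step 4 is a detail this toy glosses over by simply moving the definition.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy bean definition registry for illustration only; this is NOT Spring's API. */
final class ToyDefinitionRegistry {
    private final Map<String, Supplier<Object>> definitions = new LinkedHashMap<>();
    private final Map<String, Object> singletons = new LinkedHashMap<>();

    /** How many beans have actually been created so far. */
    int created = 0;

    void register(String name, Supplier<Object> definition) {
        definitions.put(name, definition);
    }

    Supplier<Object> remove(String name) {
        return definitions.remove(name);
    }

    /** Beans are created lazily, on first lookup. */
    Object getBean(String name) {
        Object bean = singletons.get(name);
        if (bean == null) {
            bean = definitions.get(name).get();
            singletons.put(name, bean);
            created++;
        }
        return bean;
    }

    static ToyDefinitionRegistry withReplacement() {
        ToyDefinitionRegistry registry = new ToyDefinitionRegistry();
        // The definition that comes from the library and cannot be edited.
        registry.register("badRestTemplate", () -> "template as configured by the library");
        // Step 1: our own definition, built the way we want it.
        registry.register("badRestTemplateReplacement", () -> "template with our customizations");
        // Step 3: drop the library's definition.
        registry.remove("badRestTemplate");
        // Step 4: expose our definition under the old name (this toy moves it).
        registry.register("badRestTemplate", registry.remove("badRestTemplateReplacement"));
        return registry;
    }

    public static void main(String[] args) {
        ToyDefinitionRegistry registry = withReplacement();
        System.out.println("beans created during the swap: " + registry.created);
        System.out.println("badRestTemplate -> " + registry.getBean("badRestTemplate"));
    }
}
```

The point to notice is that the whole swap only shuffles definitions around: not a single bean is created until somebody actually asks for badRestTemplate.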

Without Java Config

Let’s imagine that I’m in a situation where a Java config for the bean is not available. For example, I use a very old library where all the beans are defined with XML configuration.

BadRestTemplateReplacementConfiguration.kt

This is what I do here:

  1. Define a new RestTemplate bean called badRestTemplateReplacement;
  2. Define a dependency on a bean of type RestTemplate and name originalBadRestTemplate and customize this bean in the method definition;
  3. Define a bean implementing BeanDefinitionRegistryPostProcessor;
  4. Register the bean definition for badRestTemplate with a new name – originalBadRestTemplate;
  5. Remove the original definition with name badRestTemplate;
  6. Register the bean definition for badRestTemplateReplacement bean as badRestTemplate.

This way I keep the original bean definition, but with a different name. So all classes that depend on badRestTemplate will get the new bean I defined.
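
The renaming variant can be modelled in plain Java as well. Again, this is a sketch under assumptions rather than Spring’s API: ToyRenamingRegistry is a made-up class, beans are plain strings, and the “customization” is just a suffix appended to the original bean.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy registry for illustration only; this is NOT Spring's API. */
final class ToyRenamingRegistry {
    private final Map<String, Function<ToyRenamingRegistry, Object>> definitions = new LinkedHashMap<>();
    private final Map<String, Object> singletons = new LinkedHashMap<>();

    /** How many beans have actually been created so far. */
    int created = 0;

    void register(String name, Function<ToyRenamingRegistry, Object> definition) {
        definitions.put(name, definition);
    }

    Function<ToyRenamingRegistry, Object> remove(String name) {
        return definitions.remove(name);
    }

    /** Beans are created lazily; a definition may look up other beans while it runs. */
    Object getBean(String name) {
        Object bean = singletons.get(name);
        if (bean == null) {
            bean = definitions.get(name).apply(this);
            singletons.put(name, bean);
            created++;
        }
        return bean;
    }

    static ToyRenamingRegistry withRenamedOriginal() {
        ToyRenamingRegistry registry = new ToyRenamingRegistry();
        // The definition that comes from the library (think: an XML file we cannot touch).
        registry.register("badRestTemplate", ctx -> "library template");
        // Steps 1 and 2: the replacement depends on the original under its future name.
        registry.register("badRestTemplateReplacement",
                ctx -> ctx.getBean("originalBadRestTemplate") + " + custom URI template handler");
        // Steps 4 and 5: keep the original definition, but under a new name.
        registry.register("originalBadRestTemplate", registry.remove("badRestTemplate"));
        // Step 6: expose the replacement under the old name (this toy moves it).
        registry.register("badRestTemplate", registry.remove("badRestTemplateReplacement"));
        return registry;
    }

    public static void main(String[] args) {
        ToyRenamingRegistry registry = withRenamedOriginal();
        System.out.println("beans created during the rename: " + registry.created);
        System.out.println("badRestTemplate -> " + registry.getBean("badRestTemplate"));
    }
}
```

Here the original definition survives as originalBadRestTemplate and is only instantiated when the replacement asks for it, so anything that depends on badRestTemplate gets the customized bean.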

The result

Okay, now that I’ve implemented the solution it’s time to check that it works. To do that I will run the service locally again.

First, I will check the logs for any mentions of the meter registry being not eligible for post-processing:

Spring Boot can apply post processors to the meter registry again

There are no mentions of meter registry in the logs whatsoever. That looks promising!

Now let’s check the /metrics endpoint.

System Metrics provided by Spring Boot are there again

Great! The metrics are there now!

Wait, but why does it work?

Why does this solution work while the one with the bean post processor doesn’t?

The answer is pretty simple. The rest template defined in the Java config is just a regular bean that will be instantiated by Spring only when it’s needed. Since I replace the beans by manipulating the bean definitions, and not the beans themselves, no premature bean instantiation is happening anymore. So all the beans are eligible for post-processing again. Ain’t that beautiful? 🙂

Summary

Sometimes it’s really easy to get yourself into trouble with Spring Framework. But that’s just the price of flexibility.

There are situations when you might really need to inject something into a BeanPostProcessor implementation. And that’s totally fine, as long as you keep in mind the consequences it comes with. 🙂

Thanks for reading it up to this point, I hope you liked it!

Happy hacking!
