At the root of the OctoPerf project is the will to provide realistic load testing. We continue to take steps to improve the JMeter experience and make it accessible. But there’s a limit to what pure performance testing can achieve.
To push this limit it is vital to assess what happens on the hardware during the tests.
Of course, a realistic test must reproduce the users' expected behavior, but that's not all. Once your test is running, you need to know what is happening on your servers if you want to fix bottlenecks. This is something that always bothered us when generating only the load from OctoPerf.
There are existing monitoring solutions on the market but they have their own agenda and do not always provide the flexibility required when load testing. This is why we have been working on a way to monitor the servers while you run a test in OctoPerf.
As I said before, while running load against your servers will tell you when they break, it will not help you understand what happened. To understand the root cause of any issue, you need to know what's going on on your servers. It's only by comparing this monitoring information with response times and other metrics that you can tell how to improve performance.
Let's take an example that may or may not be based on real-life experience:
Kevin has developed a website for one of his customers. As the budget is tight, he decides to run a few performance tests with an open source tool and notices that past 50 concurrent users the response times degrade. He looks at his Tomcat server and notices that the CPU is overloaded approximately when he reaches 50 users. He then decides to upgrade the CPU and run another test, but it also fails around 50 users. As time runs out, he decides that since the application will never have to handle more than 100 concurrent users, he will use two servers with the same configuration.
As you can guess, the CPU was not the problem in this configuration; it is probably linked to another internal Tomcat metric. If I had to bet, it would be on the memory allocated to Tomcat (the JVM heap), a setting that can easily be increased without any hardware change. Which means that upgrading the CPU and adding another server was probably useless.
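For example, if the bottleneck really were the memory allocated to Tomcat, fixing it is a one-line configuration change rather than a hardware upgrade. A minimal sketch (the heap sizes below are purely illustrative; size them from your own monitoring data):

```shell
# CATALINA_BASE/bin/setenv.sh -- sourced by catalina.sh at startup.
# Illustrative values only: set the JVM heap to a fixed 2 GB so
# Tomcat is not starved for memory under load.
export CATALINA_OPTS="$CATALINA_OPTS -Xms2g -Xmx2g"
```

Restart Tomcat after the change and rerun the test to confirm the bottleneck has moved.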
## Optimize your costs
![Cut costs](/img/blog/monitoring-introduction/cut-costs.jpg "Cut costs")
Built-in monitoring allows you to quickly spot an issue and take action, then move forward to the next bottleneck. It also helps reduce hosting costs by showing you how to make better use of each machine's resources. The previous example shows this perfectly: by monitoring Tomcat's internal metrics, the issue would have been clear early on. On top of that, in my personal experience, most performance issues are linked to a bad configuration, which means you can easily fix them and move on with your tests.
Also, as performance testing is usually done just before the release, it is important to be able to identify quick solutions that do not require a last-minute hardware upgrade. Of course, this is becoming less of a problem as testing early becomes the norm and cloud environments can scale. But you still pay for the resources you use when hosting your application in the cloud.
## One agent to rule them all
To achieve this, you could use the JMeter monitoring plugins, but they sometimes have prerequisites that are difficult to fulfill and are not always up to date. This is why we developed our own monitoring independently of JMeter. It relies on an OctoPerf monitoring agent that is part of all our on-premise agents, so you can install it simply by copy/pasting a single command line.
We use this agent as a proxy to reach the monitored servers. So you only have to install one agent, and it can monitor all the machines around it. And of course, if your infrastructure is split across several networks, you can install one agent in each to avoid opening too many ports.

![Monitoring schema](/img/blog/monitoring-introduction/monitoring-schema.png "Monitoring schema")
## Preselection of counters
We know that when you test, you already have to focus on many different tasks. So, to further improve ease of use, we preselect counters for you on every technology, letting you benefit directly from our expertise. When you have a large infrastructure to monitor, this saves a lot of time.
## Computed counters and alerts
On top of that, we added many computed counters that take into account the number of cores of a server or simply give you a percentage of use. For instance, the available memory percentage on Linux must take cached and buffered memory into account:

![Linux used memory](/img/blog/monitoring-introduction/used-memory.png "Used memory")

But this still leaves a lot of counters to navigate through, and when you deal with many servers it is difficult to identify bottlenecks. For this reason, we also added automatic thresholds on every relevant counter, and you can configure them to your liking:

![Alert configuration](/img/blog/monitoring-introduction/alert-config.png "Alert configuration")
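As a rough sketch of the kind of computation such a counter performs (the field names come from Linux's `/proc/meminfo`; this deliberately ignores newer fields like `MemAvailable` to keep the idea simple):

```python
def available_memory_percent(mem_total, mem_free, buffers, cached):
    """Percentage of memory still available on a Linux host.

    Free memory alone is misleading: the kernel uses spare RAM for
    buffers and page cache, which it reclaims as soon as applications
    need it. All arguments use the same unit (e.g. kB, as reported
    by /proc/meminfo).
    """
    available = mem_free + buffers + cached
    return 100.0 * available / mem_total

# Example: 8 GB total, 1 GB free, 0.5 GB buffers, 2.5 GB cached
print(available_memory_percent(8192, 1024, 512, 2560))  # -> 50.0
```

A host reporting only 12% free memory may therefore still have 60% available once the reclaimable cache is counted, which is exactly why a raw "free memory" counter would trigger false alerts.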
## Live monitoring and alerting during the test
The monitoring data is available live during all your tests and can be compared to response times, hits, etc.

![Graph](/img/blog/monitoring-introduction/graph.png "Graph example")
There is also an alert section that lists every threshold that failed:

![Thresholds](/img/blog/monitoring-introduction/thresholds.png "Thresholds")
## Monitoring for everyone
We hope it will be useful to all our users; that's why we included this monitoring feature in every existing OctoPerf account as well as new ones. Even free users can define monitors and use them to calibrate their tests before the big day.
Also, next week we will show you a detailed use case of server monitoring and tuning with OctoPerf that you can reproduce yourself. So stay tuned!