After multiple setup sessions with our customers, we came to the conclusion that we needed to improve the way load generators and monitoring agents are managed. We had to make core improvements to make OctoPerf EE much easier to install. Let’s see:
- How OctoPerf currently works and why it’s not optimal,
- And the changes we’ve made in the upcoming OctoPerf v9.0.0 to greatly improve the situation.
2014-2015: Apache Mesos Agent
Why the hell have we based OctoPerf on Rancher? That’s a fair question I’m going to answer.
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual).
The entire stack was running on Amazon EC2. In fact, it could not run elsewhere because of the architectural choices we made back then. Those choices were driven entirely by two constraints:
- OctoPerf had to be based on existing software to improve time to market: we had no time to write our own engine to manage hosts and run JMeter on them,
- As a small dev team back then, our resources were strictly limited. We could not afford to spend too much time on something which already existed as open-source.
Apache Mesos worked pretty well overall, but a new need appeared: the ability to run on-premise load tests. And, to be honest, there were several issues preventing us from doing so:
- Apache Mesos is difficult to install: I didn’t see how I could make it (ZooKeeper included) easy to install for the uninitiated,
- Our servers were designed to run on EC2 exclusively: some scripts and other stuff were heavily based on Amazon EC2 features. That was a pretty bad design decision, motivated by the need to get working software on the market quickly.
2015 - Mid 2018: Rancher Agent
Rancher had everything we wanted at the end of 2015:
- Run both Cloud and On-Premise Hosts: several customers were asking for it,
- Support Multiple Cloud Providers: we only supported Amazon EC2 back then, but planned to support multiple ones to extend the number of locations available in the world,
- Based on Docker: Docker is flexible and easy to set up. We can run our own containerized JMeter load generators easily.
Rancher was still in beta stages but looked promising. After a few months of development, we got rid of Mesos and were running our hosts through Rancher.
But, while being much easier to install than Apache Mesos, it still suffered from several issues:
- Rancher is still difficult to install: Rancher is powerful but the setup (even for Rancher v1.6.x) is quite difficult and time-consuming,
- Setup is hard to automate: some customers want to fully automate the installation process using tools like Ansible (and that’s our mantra too),
- Multiple agents are required: Rancher’s agent to manage the host, plus the OctoPerf Monitoring Agent just to monitor your infrastructure,
- Rancher High Availability setup is hard: you need to set up MySQL replication,
- And firewall and proxy setup is cumbersome (not to say really annoying).
Needless to say we went through several painful Rancher upgrades along the way. I’m not saying Rancher is bad: Rancher is just not suited to our needs anymore.
Mid 2018 - ?: OctoPerf Agent
v8.x.x is based on Rancher v1.6.x (but we started supporting Rancher when v0.42 was released). In early 2018, Rancher v2.0.0 was released, and it completely changed the direction of the software.
Rancher v2.0.0 and above is based on Kubernetes, an open-source, large-scale Docker orchestration tool released by Google. While very powerful, Kubernetes is much more complicated and harder to set up than Rancher v1.6.x.
We had to make a decision:
- Support Rancher v2.0.0: upgrade our backend server to be able to use hosts through Rancher v2.0.0 and Kubernetes,
- Or Write our own agent to run containers on remote hosts.
Trust me, it wasn’t an easy decision. I hate writing software I don’t need to. It’s the easiest way to burn thousands of dollars while trying to save a few hundred.
The line of code which costs you the least to maintain is the one you haven’t written.
So, why have we decided to write our own Docker Agent? Let’s understand this by seeing how it works today.
OctoPerf v8.x.x uses Rancher to manage the hosts used as load generators. The schema above shows how OctoPerf Saas works (but is about the same for OctoPerf EE):
- Load Generation Agent: it’s the Rancher Agent. It connects the computer to our Rancher server running on https://rancher.octoperf.com:8080. It uses both HTTP and WebSocket connections,
- Monitoring Agent: Our own Monitoring Agent written in Java and based on Spring Boot. It monitors your infrastructure. It communicates with our backend server on https://api.octoperf.com via Push Technology.
Because we use Rancher as third-party software to manage our hosts, we have 2 agents.
How It Works
If you take a look at the on-premise providers management panel, you see both of these agents.
You have 3 different tables:
- Providers: Lists your On-Premise providers. A Provider is a group of computers located in several regions,
- Hosts: Hosts connected to Rancher and queried by our backend via Rancher’s Rest API,
- Monitoring Agents: our own monitoring agents running on your hardware, also packaged as a docker container.
This setup raises several questions:
- Why do we need to set up 2 agents to use server-side monitoring? I must admit: this is confusing,
- We need to set up Rancher to run OctoPerf Enterprise-Edition: that makes the setup much more complicated.
Those are just some of the issues we had with Rancher. Let me go through them.
OctoPerf v9.0.0 uses our own Docker Agent to manage the hosts used as load generators. The schema above shows how OctoPerf Saas works (but is about the same for OctoPerf EE):
- Agent: Our own Docker Agent written in Java and based on Spring Boot. It monitors your infrastructure AND runs your load tests. It communicates with our backend server on https://api.octoperf.com via Push Technology.
There are several reasons why we chose to create our own agent to manage docker hosts:
- Easy to install: by merging the host and monitoring agents into a single one, the agent can be run with a single command line,
- Easy to automate: Rancher setup was inherently difficult to automate (although we managed to do so via our Vagrant image). By removing Rancher, we remove a bunch of painful setup steps,
- High Availability: Rancher was the Single Point Of Failure of the architecture. It’s based on MySQL, a relational database which is hard to scale. Our backend, on the other hand, is highly available thanks to Hazelcast,
- Firewall and Proxy friendly: our Docker Agent uses Server Push via HTTP Polling which is compatible with most proxies. In addition to that, the traffic is 100% outgoing from Docker Agent to a single host. That makes the setup in big companies much easier,
- HTTPS traffic only: when using OctoPerf Saas, our agent only makes calls to https://api.octoperf.com, nothing else.
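To make the outgoing-only traffic pattern concrete, here is a minimal shell sketch of a polling loop. It is not the agent's real protocol: the function names, the polling interval, and the way responses are handled are all assumptions made for illustration, and the actual agent is a Java application that does much more. The only point it shows is that every exchange starts with a request sent by the agent.

```shell
#!/usr/bin/env bash
# Outbound-only polling sketch. This is NOT the OctoPerf agent protocol:
# function names, delay and response handling are illustrative assumptions.

# fetch_pending URL: sends one outgoing request and prints the response body.
fetch_pending() {
  curl --silent --fail --max-time 60 "$1"
}

# poll_for_work URL MAX_POLLS [DELAY_SECONDS]
# Asks the server for pending work MAX_POLLS times. The server never opens a
# connection to the agent: each exchange is initiated by this loop.
poll_for_work() {
  local url="$1" max_polls="$2" delay="${3:-5}"
  local polls=0 body
  while [ "$polls" -lt "$max_polls" ]; do
    polls=$((polls + 1))
    if body="$(fetch_pending "$url")" && [ -n "$body" ]; then
      echo "poll $polls: received $body"
    else
      echo "poll $polls: nothing to do"
    fi
    # Wait between polls, except after the last one.
    if [ "$polls" -lt "$max_polls" ]; then
      sleep "$delay"
    fi
  done
}
```

Because the loop only ever issues outgoing requests to one URL, a proxy or firewall in front of the host only has to allow outbound HTTPS to that single destination.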
Of course, there are drawbacks when you decide to maintain a system yourself:
- Additional Maintenance Cost: we need to maintain compatibility with most Docker versions ourselves,
- Increased Complexity: Rancher was a black-box making our life much easier. Things Rancher used to do for us are now managed by ourselves (like distributing containers on machines evenly).
We believe the additional burden on our side is worth it because it makes our customers’ lives much easier.
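As an illustration of the kind of job Rancher used to handle, spreading containers evenly over machines can be as simple as a round-robin assignment. The sketch below is only that: the function name, the container naming, and the round-robin strategy itself are assumptions, not a description of the scheduler shipped in v9.0.0.

```shell
#!/usr/bin/env bash
# Round-robin placement sketch. NOT OctoPerf's actual scheduler: the function
# name and the "container-N" naming are hypothetical.

# assign_round_robin COUNT HOST...
# Prints one "container-N host" line per container, cycling through the hosts
# so that no host gets more than one container above any other.
assign_round_robin() {
  if [ "$#" -lt 2 ]; then
    echo "usage: assign_round_robin COUNT HOST..." >&2
    return 1
  fi
  local count="$1"
  shift
  local i=0 host
  while [ "$i" -lt "$count" ]; do
    # Take the first host, then move it to the end of the list.
    host="$1"
    shift
    set -- "$@" "$host"
    i=$((i + 1))
    echo "container-$i $host"
  done
}
```

For example, `assign_round_robin 5 host-a host-b` places three containers on host-a and two on host-b. A real scheduler would also have to account for the CPU and memory available on each machine, which this sketch ignores.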
How It Works
OctoPerf v9.0.0 features a much simpler On-Premise host setup:
- Providers: nothing changes here, providers represent a group of machines in different locations
- Agents: a unified monitoring + docker agent used to both monitor your infrastructure and manage
That’s it! Host installation on a Linux box is pretty much the same, except for the command line to run.
Instead of launching Rancher’s Agent, you now launch our own Docker Agent. Nothing else.
OctoPerf Enterprise-Edition is the first version to receive the upgrade. As Rancher is no longer required, the setup is much simpler. We can now leverage docker-compose along with a simple Makefile to install and run OctoPerf EE.
    ubuntu@desktop:~/git/octoperf/Conf/docker/enterprise-edition$ make
    docker-compose up --build -d
    Creating volume "enterprise-edition_elasticsearch-data" with local driver
    Creating volume "enterprise-edition_octoperf-data" with local driver
    Building haproxy
    Step 1/2 : FROM haproxy:alpine
     ---> 86db55d8d675
    Step 2/2 : COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
     ---> Using cache
     ---> 113bb10f6d61
    Successfully built 113bb10f6d61
    Successfully tagged enterprise-edition_haproxy:latest
    Creating enterprise-ui ... done
    Creating enterprise-documentation ... done
    Creating elasticsearch ... done
    Creating enterprise-edition ... done
    Creating haproxy ... done
Logs displayed when launching OctoPerf EE with make.
In fact, the make command is just a shortcut to run OctoPerf EE.
    default: serve

    serve:
    	docker-compose up --build -d

    clean:
    	docker-compose down
docker-compose.yml itself is pretty straightforward too. You can download the enterprise-edition.zip and give it a try!
It contains the following files:
- Makefile: shortcuts to run OctoPerf EE via docker-compose,
- docker-compose.yml: defines the OctoPerf EE services to run,
- haproxy folder: contains the HAProxy configuration to unify OctoPerf EE services behind a single port.
Replacing Rancher host management with our own has one consequence: hosts must be reinstalled.
This procedure explains how to upgrade a host:
- Remove Host from Rancher: Deactivate and remove each host from Rancher’s Environment through Rancher’s UI,
- Remove Rancher Agent from host: run docker rm -f CONTAINER_ID where CONTAINER_ID is the id of Rancher’s agent container,
- Remove Rancher Agent Files: rm -rf /var/lib/rancher/state on each host where the agent was running,
- Launch OctoPerf Agent: for each host, go through the host command to get the command-line to run. Typically something like docker run -d ....
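The two host-side cleanup steps above can be wrapped in a small shell function. This is a sketch: the function names and the DRY_RUN switch are made up for illustration, while the docker rm and rm commands are the ones from the procedure. Removing the host in Rancher’s UI and launching the OctoPerf Agent remain separate steps.

```shell
#!/usr/bin/env bash
# Host cleanup sketch. Function names and the DRY_RUN switch are hypothetical;
# the two commands executed are the ones listed in the upgrade procedure.

# run CMD...: prints the command when DRY_RUN=1, otherwise executes it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# remove_rancher_agent CONTAINER_ID
# Removes the Rancher agent container, then its state directory.
# CONTAINER_ID is the id of the Rancher agent container on this host.
remove_rancher_agent() {
  local container_id="$1"
  if [ -z "$container_id" ]; then
    echo "usage: remove_rancher_agent CONTAINER_ID" >&2
    return 1
  fi
  run docker rm -f "$container_id"
  run rm -rf /var/lib/rancher/state
}
```

Running it once with DRY_RUN=1 prints the two commands without executing them, which is a cheap way to double-check the container id before anything is deleted.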
Moving to our own Docker agent also brings the following benefits:
- Less CPU / Memory Usage: previously, each JMeter container embedded its own monitoring agent. Now, the unified agent handles all the work,
- Tests start faster: by communicating directly with our own agents (instead of using Rancher as a proxy to access hosts), we’re able to reach the agents much faster.
OctoPerf v9.0.0 should be available in a few weeks. Once released, users who have on-premise hosts connected to our Saas platform are invited to upgrade their hosts to our new unified agent.