Our Motto: Automate Everything
Devops

You may have recognized the famous indie game Factorio used as the picture for this article. Factorio is best described as:

Factorio is a game in which you build and maintain factories. You will be mining resources, researching technologies, building infrastructure, automating production and fighting enemies.

Basically, the goal of this game is to build a fully automated supply chain to produce a rocket and escape the hostile planet where you crashed. There are obvious similarities with software development: when you are a small team facing huge challenges, you know your time is precious.

At OctoPerf, we strongly believe the more you automate, the more time you can spend on tasks that really need your attention. We try to automate every single thing we can. From building cloud images to deploying a version, from testing the software to checking the code quality. This is how a compact team can have a huge throughput: delegate part of your work to routines executed by a machine.

Saving time is not the only reason to automate workflows: automation also prevents human error. Try it yourself: make a paper ball and try to throw it into the recycling bin from 2 meters away. I’m pretty sure you won’t succeed every time.

In the coming sections, I’m going to explain the different problems we have faced and how we try to automate as much as possible.

KISS

I believe the key to automation is to Keep It Simple, Stupid. Anyone should be able to run any automated process with little to no knowledge of how it works. What matters is what the process does, not how it does it. How many times have you said to yourself: Oh man, I wish I had a program to do this instead of doing it myself. If you catch yourself saying this, you should probably take the time to actually write that program instead of spending your time doing the task by hand.

By keeping things simple enough, you won’t spend too much time messing around to get the automated build working. Step by step, replace things you do yourself with programs that work for you. An automated routine costs you once: when you write it. A manual task costs you every time you need to perform it.
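As a trivial illustration (a hypothetical example, not one of our actual scripts), even a repetitive cleanup chore is worth turning into a few lines of shell:

```shell
# Hypothetical example: automate a cleanup chore instead of repeating it by hand.
# Create a scratch directory containing some leftover temporary files.
mkdir -p /tmp/build-demo
touch /tmp/build-demo/a.tmp /tmp/build-demo/b.tmp

# The "program that works for you": delete every *.tmp file in one pass.
find /tmp/build-demo -name '*.tmp' -delete

# The directory is clean again, with zero chance of missing a file.
ls /tmp/build-demo
```

Written once, it never misses a file; done by hand, it fails exactly as often as the paper-ball throw above.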

Makefile all the things

We use Makefiles extensively to write our build routines. Why? Simply because make is one of the most basic build tools, available on any Linux system. We have standard targets to build our backend and frontend:

  • clean: cleans everything by removing temporary files and build files,
  • test: executes the unit tests,
  • package: packages the application, usually as an archive or a JAR file (Java),
  • docker.image: builds a self-contained Docker image with the application inside,
  • docker.push: pushes the Docker image to Docker Hub,
  • deploy: deploys the application to the production environment.

Editing a Makefile with Nano

Makefiles are the foundation of all our builds: website, frontend, backend, cloud images and more. Why? Simplicity. Once you know that make package packages the application, you can package any of our components without any other knowledge.
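To give an idea, a minimal Makefile following these conventions could look like this (a sketch: the image name and Maven commands below are placeholders, not our actual build file):

```makefile
# Sketch of a component Makefile using our standard targets.
# IMAGE and TAG are placeholder values.
IMAGE := example/backend
TAG   ?= latest

clean:
	mvn clean

test:
	mvn test

package:
	mvn package -DskipTests

docker.image: package
	docker build -t $(IMAGE):$(TAG) .

docker.push: docker.image
	docker push $(IMAGE):$(TAG)

.PHONY: clean test package docker.image docker.push
```

Because every component exposes the same targets, switching from one repository to another costs nothing.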

Software Testing

Testing is one of the easiest things to automate, simply because there are so many tools available to achieve this goal.

Our backend is written in Java and tested with the JUnit testing framework. At the time of writing, we have almost 6200 unit tests covering 100% of our backend code base. The entire backend is tested in 7 minutes. And while the tests are running on Jenkins, I can do something else.

Our Quality Check Jenkins Pipeline

We have built a very simple Jenkins pipeline which checks the software quality. It runs the following tasks:

  • Checkout: retrieves our codebase from Git,
  • Clean: cleans up the build directories by running mvn clean,
  • Tests: once again nothing fancy, the unit tests are run with test coverage,
  • SonarQube Analysis: performs the code quality analysis and sends the reports to our SonarQube server,
  • SonarQube Quality Gate: checks that the code passes the quality standards configured on the SonarQube server.

All the steps are written inside a Jenkinsfile, here is the content for those who are interested:


node {
    stage('Checkout') {
        checkout scm
    }

    stage('Clean') {
        sh 'make clean'
    }

    stage('Tests') {
        try {
            sh 'make test'
        } finally {
            junit '**/target/surefire-reports/TEST-*.xml'

            if (currentBuild.result == 'UNSTABLE') {
                currentBuild.result = 'FAILURE'
            }
        }
    }

    stage('SonarQube Analysis') {
        withSonarQubeEnv('sonarqube') {
            sh 'make sonar'
        }
    }

    stage('SonarQube Quality Gate') {
        timeout(5) {
            def qg = waitForQualityGate()
            if (qg.status != 'OK') {
                error "SonarQube Quality Gate: ${qg.status}"
            }
        }
    }
}

As you can see, everything is delegated to Makefile targets, which do the real job. We could go further with Continuous Delivery: deploy the code to production whenever it passes the quality checks. But we wanted to keep things separate and still have manual control over this step.

Deploying in Production

The tasks needed to deploy OctoPerf in production vary depending on which code has changed. There are several tasks to perform to deploy a new version.

Deploying Frontend

Our frontend is an AngularJS application deployed on Amazon CloudFront. This task is entirely automated:

  1. Build the AngularJS app with NPM, which compiles the TypeScript classes to JavaScript and packages everything together,
  2. Push the changes to the Amazon S3 bucket containing all the files,
  3. Invalidate the CloudFront cache to reflect the changes.

We use an amazing tool named S3 Website. It makes it easy to push a static website (like our frontend) to an Amazon S3 bucket while invalidating the right content on CloudFront.

All these steps are executed by simply running make deploy. Our frontend is also available as a Docker image.
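For reference, S3 Website is driven by a small YAML file. A sketch with placeholder values might look like this (bucket name and distribution ID below are hypothetical; check the s3_website documentation for the exact keys):

```yaml
# s3_website.yml -- placeholder values; credentials are read from the environment
s3_id: <%= ENV['S3_ID'] %>
s3_secret: <%= ENV['S3_SECRET'] %>
s3_bucket: app.example.com
cloudfront_distribution_id: E1XXXXXXXXXXXX
```

With this in place, a single s3_website push uploads the changed files and invalidates them on CloudFront, which is exactly what our make deploy target wraps.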

Deploying SaaS Edition

Our SaaS backend is packaged as a Docker image. The deployment consists of building the Docker image and pushing it to Docker Hub. Again, this can be done with a single command line, which only requires the version of the image:

make saas-edition.docker.push TAG=7.0.0

Deploying Enterprise Edition

Our Enterprise backend is packaged exactly the same way as the SaaS edition. What differs is the set of modules included within.

Deploying JMeter Plugin

This is the trickiest part of our deployment process. When we need to upgrade the JMeter plugin (for example, when we add new metrics), there are a bunch of tasks which must be done:

  1. Build the JMeter plugin Docker images and push them to Docker Hub,
  2. Rebuild the cloud provider (Amazon, DigitalOcean) images with the new Docker images in all regions,
  3. Update the cloud provider configuration to use the new cloud images.

We have custom cloud provider images with pre-loaded Docker images to speed up machine launch. We also noticed that Docker Hub is quite unreliable: machines would randomly fail to launch, or take a huge amount of time to do so, whenever Docker Hub is offline or slow.

Imagine if you had to create cloud provider images manually in each region every time a new JMeter plugin Docker image is rolled out:

  1. Launch an instance in every single region,
  2. SSH into the instance, install docker, pull images,
  3. Stop the instance and create an AMI from it,
  4. Terminate all instances.

Don’t do this unless you like wasting your time. We rebuild our cloud images using Packer. It takes a couple of minutes to build the images in all regions. Here is a sample Packer script:


{
  "_comment": "Builds the rancher machine images in all Digital Ocean regions.",

  "builders": [
    {
      "type": "digitalocean",
      "name": "Amsterdam (2)",
      "api_token": "{{user `digital_ocean_token`}}",
      "image": "{{user `digital_ocean_base_image`}}",
      "region": "ams2",
      "size": "{{user `digital_ocean_size`}}",
      "snapshot_name": "{{user `ami_name`}}-{{timestamp}}",
      "ssh_username": "root"
    }
  ],

  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sudo docker pull {{user `rancher_agent`}}",
        "sudo docker pull {{user `jmeter-standard`}}",
        "sudo docker pull {{user `jmeter-webdriver`}}"
      ]
    }
  ],

  "post-processors": [
    {
      "type": "manifest",
      "output": "digitalocean-manifest.json",
      "strip_path": true
    }
  ]
}

This is the Packer script we use for DigitalOcean. Packer is a great tool for automating cloud image creation. Once Packer has finished, it outputs a JSON file containing the result of the build. To update our cloud provider definitions, we have an endpoint on our backend:


@RestController
@AllArgsConstructor(access=PACKAGE)
@FieldDefaults(level=PRIVATE, makeFinal=true)
@RequestMapping("/workspaces/docker-providers")
class DockerProvidersController extends AbstractUserCrudController<DockerProvider> {

...

  @PutMapping("/update-images/{providerId}")
  @PreAuthorize("hasPermission(#providerId, 'DockerProvider', 'Update')")
  public DockerProvider updateImages(
    @PathVariable("providerId") final String providerId,
    @RequestBody final PackerManifest manifest) {
    return packer.updateImages(providerId, manifest).orElseThrow(notFound(providerId));
  }
}

We simply have to upload the manifest.json file produced by Packer via curl once the build is finished to update the Amazon AMIs and DigitalOcean images. We spent some time writing a Packer manifest parser.

Why? Before we had this endpoint, we had to manually copy and paste every single cloud image ID and update the provider JSON definition in the Elasticsearch database using Kibana. We made errors multiple times, and the task was tedious.
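Put together, a rollout boils down to two commands, sketched here with placeholder file names, host name and provider ID (the real values depend on your setup, and authentication is omitted):

```shell
# Build the cloud images in all regions; the variables file holds
# tokens and image names referenced by the template.
packer build -var-file=variables.json digitalocean.json

# Upload the resulting manifest to the backend endpoint to update the
# provider definition (URL and provider ID below are hypothetical).
curl -X PUT \
     -H 'Content-Type: application/json' \
     --data @digitalocean-manifest.json \
     https://api.example.com/workspaces/docker-providers/update-images/my-provider-id
```

No more copy-pasting image IDs into Kibana: the parser reads them straight from the manifest.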

Machines Provisioning

OctoPerf automatically launches and stops cloud machines on cloud providers depending on the needs. No one would do this manually. The code which does the job is quite tricky because there are many corner cases, and it takes time to get it working properly in most situations. But it’s absolutely worth the time.

Online Payment

There are great payment platforms available now. We use Stripe to automate pretty much everything related to payments:

  • Recurring Subscriptions: managed by Stripe, our server only listens to changes pushed by Stripe through a webhook,
  • Online Payment: required quite a bit of work, but it’s great to be able to accept money even on Sundays,
  • Subscription Cancellation: Stripe automatically handles this once you configure the dates properly.

Application Fail-Over

Our backend runs on Rancher, a Docker cluster orchestration system. It has a great feature named Health Check, which periodically checks that the application is up and running. Our backend is distributed over several machines to ensure high availability: Rancher automatically destroys and relaunches our backend containers in case they stop responding.
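In Rancher’s compose format, a health check is declared per service. A sketch might look like this (service name, port and thresholds below are placeholder values, not our actual configuration):

```yaml
# rancher-compose.yml -- health check sketch, all values are placeholders
backend:
  scale: 3
  health_check:
    port: 8080
    request_line: GET / HTTP/1.0
    interval: 2000
    response_timeout: 2000
    unhealthy_threshold: 3
    healthy_threshold: 2
    strategy: recreate
```

Any container failing the check a few times in a row is recreated without anyone being paged.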

Self-healing applications are great because they require no human intervention to be relaunched in case of failure.

Database Migration

When we roll out a new version which requires a database migration, the process is automated by code inside the backend. This way, our customers can seamlessly update to the latest version without having to care about data migration. The backend automatically detects the data version and applies the migration processes accordingly. This is pretty common in the software industry.

Conclusion

We see automation as a way to hire a machine to do your own job. This is how we can deliver software at a fast pace while keeping the team compact.

By - CTO.
Tags: Amazon Digital Ocean Automation Jenkins Testing Cloud JMeter Json Java AngularJS Licensing Elasticsearch Database Server