Website Monitoring using Golang and Prometheus
This is the story of how I got fed up with the commercial monitoring solution I inherited, and came up with one of my own.
I came in to help a client reinvent its dev organization to be more agile and adopt DevOps methodologies.
One of the things that drove everyone crazy was an unreasonable number of false alerts about website availability, coming from a commercial monitoring service that had been purchased before I arrived.
When trying to understand the source of the false alerts, I realized that the monitoring solution we had in place was lacking and did not provide a clear picture.
I started searching for an alternative.
I found a few. Most of them claimed they could do everything one might expect from a monitoring service.
I even tried a few that had a trial period.
When the trials ended and I was presented with quotes for the monitors I had set up, I was shocked by the high pricing.
I then decided to build my own monitoring solution.
The rationale behind this decision:
We already had a large cloud account — one I could use to distribute my monitors on.
We were already using Prometheus to collect metrics from our app and Grafana to visualize them.
Having complete control over our monitors would give me the insight I was missing.
I decided to write my monitor in Golang (because Golang is so cool) and distribute it as a Docker image.
The vision was to have a graph showing website load time from multiple locations.
I started out by writing my monitor.
The monitor had to do one thing:
Send an HTTP request to the website, time the response, and expose the result as a metric for Prometheus to collect.
I decided not to post any code in this article.
All the code (all 90 lines of it) can be found in the following GitHub repo:
When running the monitor you should get output similar to this:
The metrics will be published on the configured port.
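The real code lives in the repo. Purely to illustrate the idea, here is a minimal sketch of what such a monitor could look like. It is not the code from the repo: the metric name (website_load_seconds), the environment variables (TARGET_URL, MONITOR_FROM), the port and the 30-second interval are all assumptions made up for this example, and it writes the Prometheus text format by hand to stay dependency-free, where a real monitor would more likely use the official Prometheus Go client.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"sync"
	"time"
)

// NOTE: metric name, env vars, port and interval are hypothetical,
// chosen for this sketch only. They are not taken from the repo.

// timeRequest fetches url and returns how long the full response took.
func timeRequest(client *http.Client, url string) (time.Duration, error) {
	start := time.Now()
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	// Drain the body so the measurement covers the whole download.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, err
	}
	return time.Since(start), nil
}

// formatMetric renders one gauge sample in the Prometheus text format,
// tagged with a "from" label naming the monitor's location.
func formatMetric(from string, seconds float64) string {
	return fmt.Sprintf(
		"# TYPE website_load_seconds gauge\nwebsite_load_seconds{from=%q} %g\n",
		from, seconds)
}

// getenv returns the value of key, or fallback if it is unset or empty.
func getenv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

func main() {
	target := getenv("TARGET_URL", "https://example.com")
	from := getenv("MONITOR_FROM", "local")
	client := &http.Client{Timeout: 10 * time.Second}

	var mu sync.Mutex
	var last float64 // most recent load time, in seconds

	// Probe the site in the background every 30 seconds.
	go func() {
		for {
			d, err := timeRequest(client, target)
			if err != nil {
				log.Printf("request to %s failed: %v", target, err)
			} else {
				mu.Lock()
				last = d.Seconds()
				mu.Unlock()
				log.Printf("%s loaded in %v", target, d)
			}
			time.Sleep(30 * time.Second)
		}
	}()

	// Serve the latest measurement for Prometheus to scrape.
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		fmt.Fprint(w, formatMetric(from, last))
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Running one copy of this per region, each with a different MONITOR_FROM value, is what would make it possible to tell the locations apart on a single graph.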
I went on to Dockerize the app (the Dockerfile can be found in the repo) and pushed the image to ECR (Amazon’s Docker registry).
After that, I could easily deploy my Docker image in any region in the world I wanted — and have my site load time monitored from that region as well.
To make that happen, I added a new Service in Kubernetes to describe my monitors. I then added Endpoints for each Docker instance of my monitor. Finally, I created a ServiceMonitor to scrape my new Service (the YAML for the Kubernetes resources is... in the repo :))
All that was left after that was to create a dashboard in Grafana to show the beautiful lines :)
The query for the graph above:
And I set the label to be the "from" label that the monitor adds.
I couldn’t actually post something without a single line of code :)
Now that I have monitors in different locations across the world timing how long it takes to load our website, all watched through Grafana (I also set Grafana alerts, of course, for long load times), I can definitely sleep better :)
Update: since writing this article, I decided to work a bit more on my monitor.
You can find the Docker image at:
I will also be creating a new release of the monitor soon.