Hosted Prometheus

The pull-based architecture of Prometheus makes it quite hard to build a SaaS offering around it. Few people are willing or able to open ports to the internet so that their metrics can be scraped. Equally, businesses built around running an entire environment per customer fall more traditionally under managed services.

I was surprised to hear somebody ask, at the end of the Kubernetes cloud native online meetup, about companies providing a hosted solution. In my experience people generally fall into two categories: they either want to build and run a solution themselves, or they want to buy a service. Good luck trying to convince either kind of person to switch over to the other side. The decision is generally based on a mix of project context and past experience, although people do sometimes change their minds on their own.

So for those who want to buy a service that’s kind of similar to hosted Prometheus, but not quite, I’d say Outlyer is pretty close. We share much of the same philosophy around pull-based monitoring, where the central monitoring server knows what should be up. We’re also operationally focused on helping teams of people run online services, which was the driving force behind Prometheus.

As with most SaaS services, we require an agent to be installed on each server. That’s our way of collecting metrics securely over an outbound connection, avoiding the need to open ports. Unlike most SaaS services, we hold a websocket connection open, a bit like a chat client would, so that we get bidirectional communication. Getting this working with our agent wasn’t a small effort. However, it does give us a platform for pulling metrics from services and then forwarding them back to us over the websocket. So we’re sort of pull then push based, with some technology making sure the push part works reliably.
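To make the "pull then push" idea a bit more concrete, here is a minimal Python sketch of the push half. It is not our agent's actual code: the Forwarder class, its send callable and the max_buffered limit are all made up for illustration, and the transport is left abstract rather than tied to any particular websocket library. The only point is that samples pulled locally are buffered and re-sent in order once the outbound connection comes back.

```python
from collections import deque


class Forwarder:
    """Hypothetical buffer-and-forward helper (illustration only)."""

    def __init__(self, send, max_buffered=1000):
        # `send` is any callable that pushes one sample over the outbound
        # connection and raises ConnectionError when the link is down.
        self._send = send
        # Bounded buffer: when full, the oldest samples are dropped first.
        self._buffer = deque(maxlen=max_buffered)

    def push(self, sample):
        """Queue a freshly pulled sample, then try to drain the queue."""
        self._buffer.append(sample)
        return self.flush()

    def flush(self):
        """Send buffered samples in order; stop at the first failure."""
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return False  # keep the sample for the next attempt
            self._buffer.popleft()
        return True
```

A sample is only removed from the buffer after send returns without raising, so a dropped connection delays delivery rather than losing data, at least until the buffer limit is reached.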

We’ve never been fans of creating new standards, so we have always adopted the most commonly used ones. Nagios check scripts are abundant, simple to write and perform the task of checking custom things very efficiently. Graphite and StatsD are in use by many organisations but simply aren’t as powerful as dimensional metrics. So when Prometheus started to gain popularity, we were happy to add Prometheus format scraping support to our agent.

Right now you can set up all of your Prometheus metric endpoints as you normally would. Then install the Outlyer agent on those hosts and scrape the endpoints using a few lines in a shell script. These metrics get stored at native resolution and there’s no limit on the number you can send. You also get fine-grained access control, dashboards you can share around and alert rules, among various other features. There’s also a Grafana 3.0 plugin that is approaching the same level of features as the Prometheus query language.
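If you're wondering what scraping a Prometheus endpoint actually involves, the text format is simple enough to read by hand. The Python sketch below is an illustration only, not the Outlyer plugin API: the function names are invented, it fetches a URL with the standard library and handles the common case of name{labels} value [timestamp] lines, skipping comments. It does not unescape label values or do anything special with histogram and summary types.

```python
import re
from urllib.request import urlopen

# One sample line: metric_name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
# One label pair inside the braces; escapes are matched but not unescaped.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank line, or a HELP/TYPE comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this sketch does not understand
        labels = dict(LABEL_RE.findall(match.group('labels') or ''))
        samples.append((match.group('name'), labels,
                        float(match.group('value'))))
    return samples


def scrape(url, timeout=5):
    """Fetch a metrics endpoint (the URL is whatever your service exposes)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_exposition(response.read().decode('utf-8'))
```

Calling scrape on a local metrics endpoint gives you a list of dimensional samples that can then be forwarded however you like.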

Is it hosted Prometheus? Not really. It’s just Outlyer scraping Prometheus metric endpoints, although if you’re looking to buy rather than build, it’s probably the closest you’ll get right now.