Looking Back on our TechTalkThursdays #11 and #12

Written by Thomas Hug

Published September 22, 2020

Besides our usual evening TechTalkThursdays, we tried two new time slots: one over lunch and one at 08:00 in the morning. Neither proved better than the evening slot, as neither drew the same number of participants.

This article summarizes the talks given by Demian Thoma and Daniel Lorch.

How a Titan empowers our Cloud Monitoring Infrastructure

Nine hosts and manages thousands of servers for its customers and recently moved to a new monitoring solution built on the open-source tools around Prometheus. In his talk, Nine’s Demian Thoma explains how Nine implemented the new solution and how it gave the team more insight into their infrastructure.

Nine was using Nagios before switching to Prometheus. Changing the monitoring stack allowed them to simplify the setup, gain more insight into their services, and remove a separate analytics stack from their infrastructure.

Site Reliability Engineering: What you need to know about Service Level Indicators, Service Level Objectives and Error Budgets

What does reliability mean to you? In his talk, Daniel Lorch reiterates the claim that reliability is the most important feature of any system. However, a service only needs to be reliable enough to keep its users happy: investing too much in reliability raises costs (engineering time and infrastructure) without added benefit, while investing too little results in unhappy users.

How do you determine and agree upon what “reliable enough” means for your services and your organization? Site Reliability Engineering provides tools and concepts to formalize this discussion, notably:

Service Level Indicators (SLIs): a monitoring metric that is indicative of a user’s goal
Service Level Objectives (SLOs): a target on an SLI that if barely met, keeps the users happy
Error Budgets: the maximum amount of time the system can fail without contractual consequences. It is the remainder of the SLO, i.e. 100% minus the SLO target
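
To make the relationship between these three concepts concrete, here is a minimal Python sketch. It is not taken from the talk: the 99.9% target, the 30-day window, the request counts, and all class and function names are assumptions chosen purely for illustration. It treats the SLI as the fraction of good requests and derives the error budget as 100% minus the SLO target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """A target on an SLI over a window (hypothetical illustration values)."""
    target: float      # e.g. 0.999 means 99.9% of requests should be good
    window_days: int   # length of the evaluation window in days

    def __post_init__(self) -> None:
        # A target of exactly 1.0 would leave no error budget at all.
        if not 0.0 < self.target < 1.0:
            raise ValueError("target must be strictly between 0 and 1")


def sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of good events; 1.0 when there was no traffic."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_fraction(slo: SLO) -> float:
    """The error budget is the remainder of the SLO: 1 minus the target."""
    return 1.0 - slo.target


def error_budget_minutes(slo: SLO) -> float:
    """The same budget expressed as minutes of full outage per window."""
    return error_budget_fraction(slo) * slo.window_days * 24 * 60


def budget_remaining(slo: SLO, good_events: int, total_events: int) -> float:
    """Share of the error budget still unspent; negative means the SLO is missed."""
    if total_events == 0:
        return 1.0
    bad_events = total_events - good_events
    allowed_bad = error_budget_fraction(slo) * total_events
    return 1.0 - bad_events / allowed_bad


if __name__ == "__main__":
    slo = SLO(target=0.999, window_days=30)
    print(f"SLI: {sli(999_500, 1_000_000):.4%}")
    print(f"Error budget: {error_budget_minutes(slo):.1f} minutes per window")
    print(f"Budget remaining: {budget_remaining(slo, 999_500, 1_000_000):.0%}")
```

With these assumed numbers, a 99.9% target over 30 days leaves roughly 43 minutes of tolerated failure, and 500 failed requests out of one million use up about half of the budget. Measuring the budget in requests rather than minutes is a design choice of this sketch; both views express the same remainder of the SLO.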

Watch the 30-minute talk below to learn about these concepts and to see how an example SLI/SLO is defined for a fictitious game platform. Links to further information are provided at the end of the talk.

On this occasion, we would like to once again thank our speakers for presenting!
