Titus: Adventures in Multi-tenant Scheduling
Titus is a multi-tenant scheduler that runs a variety of workloads, from online workloads that serve customer traffic to big data workloads that perform machine learning. Getting all of these workloads to cooperate on a shared pool of resources is a challenge. Just to add a bit of complexity to the mix, these workloads all run on the cloud, on a shared storage and network fabric. Come to this talk to learn how our approach to multi-tenancy works, as well as some of the challenges we faced along the way.
Titus is a system that allows users to submit arbitrary container workloads to the cloud and run them across many thousands of cores. This comes with a variety of challenges, which we attacked with a three-pronged approach.
Our approach to scheduling is multi-tenant first. Our scheduler understands different workloads and the fact that different workloads have different Service Level Objectives. In addition to this, it understands the cloud, and the fact that it’s a shared control plane. Lastly, we’ve had to teach our scheduler to handle situations such as failover, where scaling up quickly matters more than it does in traditional scheduling.
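To make the idea of SLO-aware placement concrete, here is a minimal sketch in Python. It is not the Titus scheduler; the tier constants, the `Job` type, and the `place` function are all hypothetical names chosen for illustration, and the greedy policy is an assumption rather than the policy the talk describes.

```python
from dataclasses import dataclass

# Hypothetical tiers: a lower number means a stricter SLO.
TIER_ONLINE = 0
TIER_BATCH = 1


@dataclass
class Job:
    name: str
    tier: int
    cpus: int


def place(jobs, free_cpus):
    """Greedily admit jobs, strictest tier first, until capacity runs out."""
    placed, pending = [], []
    # sorted() is stable, so submission order is kept within a tier.
    for job in sorted(jobs, key=lambda j: j.tier):
        if job.cpus <= free_cpus:
            free_cpus -= job.cpus
            placed.append(job.name)
        else:
            pending.append(job.name)
    return placed, pending
```

With 8 free CPUs and a 6-CPU batch job submitted ahead of a 6-CPU online job, the online job is admitted first and the batch job waits, which is the behavior a multi-tenant-first scheduler would want under contention.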
Our approach to systems is evolving. Historically, our fleet was many single-tenant VMs. We’ve attacked systems-level multi-tenancy from multiple perspectives. The first of these involved giving our users APIs that were as close as possible to what they had on the VM. Subsequently, we’ve tried to enable security mechanisms like seccomp and AppArmor that allow us to run nearly any workload on Titus. Lastly, we’re still figuring out resource isolation. Cgroups have come a long way, but there is still a long way to go before we can be as good as VMs.
All of our infrastructure runs on the cloud. We decided that our approach to scheduling and systems-level multi-tenancy should be cloud native, and should leverage as many mechanisms as possible that already exist in the cloud rather than invent our own. Although this gave us a massive head start, it didn’t come for free. We had to solve problems like coordination-free optimistic interactions with our SDN, and find solutions for shared storage.
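As a rough illustration of what an optimistic interaction with a shared resource can look like, the sketch below uses a versioned conditional write with retry. This is a generic optimistic-concurrency pattern, not the mechanism Titus uses with its SDN; `VersionedPool`, `claim`, and `allocate` are hypothetical names, and the in-memory pool is a toy stand-in for a cloud API.

```python
class VersionedPool:
    """Toy stand-in for a shared pool of cloud resources (e.g. addresses)."""

    def __init__(self, free):
        self.free = set(free)
        self.version = 0

    def snapshot(self):
        """Read the current version and free items without taking a lock."""
        return self.version, sorted(self.free)

    def claim(self, item, seen_version):
        """Conditional write: succeed only if nothing changed since the snapshot."""
        if seen_version != self.version or item not in self.free:
            return False
        self.free.remove(item)
        self.version += 1
        return True


def allocate(pool, max_attempts=5):
    """Pick an item optimistically and retry from a fresh snapshot on conflict."""
    for _ in range(max_attempts):
        version, free = pool.snapshot()
        if not free:
            return None
        if pool.claim(free[0], version):
            return free[0]
    return None
```

The caller never holds a lock across the read and the write; a stale snapshot simply causes the claim to fail, and the caller tries again with fresh state.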