In this paper, we analyze the challenges of maintaining high QoS for low-latency workloads when sharing servers with other workloads.
The additional workloads can contend for shared resources such as processing cores, cache space, and memory or I/O bandwidth.
The goal of this work is to investigate whether workload colocation and good quality of service for latency-critical services are fundamentally incompatible in modern systems, or whether the two can instead be reconciled.
The challenges fall into three categories:
- queuing delay: increased queuing delay due to interference on shared resources
- scheduling delay: long scheduling delays when timesharing processor cores
- load imbalance: poor tail latency due to thread load imbalance
1. Queuing delay
What: Queuing delay occurs when requests arrive simultaneously or in rapid succession. Interference from co-located workloads worsens queuing delay by increasing service time, which lowers the service rate. Even if the co-located workload runs on separate processor cores, its footprint on shared caches, memory channels, and I/O channels slows the service rate of the latency-critical workload.
How: We propose that load be provisioned to services in an interference-aware manner that takes into account the reduction in throughput a service might experience when deployed on servers with co-located workloads.
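The link between service rate and queuing delay can be made concrete with a small calculation. The sketch below assumes an M/M/1 queue, which is a simplification chosen here for illustration and not a model taken from the paper; the function names and the `slowdown` parameter are likewise hypothetical.

```python
def mm1_mean_latency(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (waiting + service) for an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def interference_aware_load(isolated_service_rate: float,
                            slowdown: float,
                            target_utilization: float) -> float:
    """Load (requests/s) to provision when interference inflates service time.

    slowdown is the assumed service-time inflation factor (1.0 = no interference);
    target_utilization is the fraction of the *degraded* capacity to use.
    """
    if slowdown < 1.0:
        raise ValueError("slowdown must be >= 1.0")
    if not 0.0 < target_utilization < 1.0:
        raise ValueError("target_utilization must be in (0, 1)")
    effective_service_rate = isolated_service_rate / slowdown
    return target_utilization * effective_service_rate


if __name__ == "__main__":
    isolated = 100_000.0  # requests/s the service sustains when running alone (assumed)
    load = 80_000.0       # requests/s provisioned without regard to interference
    print(mm1_mean_latency(load, isolated))        # latency when running alone
    print(mm1_mean_latency(load, isolated / 1.2))  # same load, 20% slower service
    print(interference_aware_load(isolated, 1.2, 0.8))
```

Under these assumed numbers, a 20% increase in service time at an unchanged load of 80,000 requests/s raises the mean latency from 50 µs to about 300 µs, which is why the provisioned load has to shrink along with the effective service rate.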
2. Scheduling delay
What: Scheduling delay has two components:
- scheduler wait time
- context switch latency
The biggest problem with the Linux kernel's default CFS scheduler is that its wakeup placement algorithm allows sporadic tasks to induce long wait times on latency-sensitive tasks like memcached.
How: Fortunately, several strategies can mitigate this wait time for latency-sensitive services, including
- adjusting task share values in CFS
- using Linux’s POSIX real-time scheduling disciplines instead of CFS
- using a general-purpose scheduler with support for latency-sensitive tasks, like BVT (Borrowed Virtual Time)
- applying CPU bandwidth limits to enforce fairness
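As an illustration of the second strategy, the sketch below moves a process from CFS to the POSIX `SCHED_FIFO` real-time class using Python's Linux-only `os.sched_*` wrappers. The function names and the default priority of 10 are assumptions made for this example, not values from the paper. Changing the policy normally requires root or `CAP_SYS_NICE`, so the helper reports failure instead of raising.

```python
import os


def clamp_rt_priority(priority: int, policy: int = os.SCHED_FIFO) -> int:
    """Clamp a requested priority into the range the kernel accepts for `policy`."""
    low = os.sched_get_priority_min(policy)
    high = os.sched_get_priority_max(policy)
    return max(low, min(high, priority))


def try_set_realtime(pid: int = 0, priority: int = 10) -> bool:
    """Try to switch `pid` (0 = the caller) from CFS to SCHED_FIFO.

    Returns True on success and False if the kernel refuses the request,
    which typically means the caller lacks CAP_SYS_NICE.
    """
    param = os.sched_param(clamp_rt_priority(priority))
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, param)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    if try_set_realtime():
        print("now running under SCHED_FIFO")
    else:
        print("could not change policy; still running under the default scheduler")
```

A `SCHED_FIFO` task preempts every CFS task on its core and can monopolize the CPU, which is one reason to pair this approach with CPU bandwidth limits.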
3. Load imbalance
What: A latency-sensitive service’s vulnerability to load imbalance can be easily ascertained by purposely placing it in a situation where its threads are unbalanced across cores.
How: One solution to this problem is particularly straightforward and effective: threads can be pinned explicitly to distinct cores, so that Linux can never migrate them onto the same core.
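Pinning can be sketched with `os.sched_setaffinity`, which on Linux applies to the calling thread when given pid 0. The helper name `run_pinned_workers` is hypothetical, and the example pins to logical CPUs as reported by the kernel; mapping those to distinct physical cores is left as an assumption about the machine's topology.

```python
import os
import threading


def run_pinned_workers(work, cpus):
    """Run work(cpu) in one thread per CPU, each thread pinned to its own CPU.

    Returns {cpu: (affinity_mask_seen_by_the_thread, result_of_work)}.
    """
    results = {}

    def body(cpu):
        # pid 0 means "the calling thread" for sched_setaffinity on Linux,
        # so each worker restricts only itself to a single CPU.
        os.sched_setaffinity(0, {cpu})
        results[cpu] = (os.sched_getaffinity(0), work(cpu))

    threads = [threading.Thread(target=body, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))  # CPUs this process may run on
    for cpu, (mask, value) in sorted(run_pinned_workers(lambda c: c * c, allowed).items()):
        print(cpu, mask, value)
```

Because every worker's mask contains exactly one CPU and no two workers share a mask, the scheduler has no freedom to stack two of them on the same CPU. A server written in C would typically achieve the same effect with `pthread_setaffinity_np` or by launching under `taskset`.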