Article Archive

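Why do we need Pod and Service?

Pod and Service are two of the core concepts in Kubernetes. To understand them, we need to look at the problems they solve.

The problem Pod sets out to solve is data sharing and interaction.

Service, on the other hand, is another approach to the naming problem.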


CPI2: CPU performance isolation for shared compute clusters

Paper: http://research.google.com/pubs/pub40737.html

This paper describes CPI2, a system that builds on the useful properties of CPI measures to automate all of the following:

  1. observe the run-time performance of hundreds to thousands of tasks belonging to the same job, and learn to distinguish normal performance from outliers
  2. identify performance interference within a few minutes by detecting such outliers
  3. determine which antagonist applications are the likely cause with an online cross-correlation analysis
  4. (if desired) ameliorate the bad behavior by throttling or migrating the antagonists.
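Steps 1 and 2 of the list above can be illustrated with a small sketch. It assumes a simple rule that is this sketch's own choice, not something stated in the excerpt: a task counts as an outlier when its CPI (cycles per instruction) lies more than k standard deviations above the mean CPI of its job. The names cpi, findOutliers and k are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPI of one task sample: CPU cycles divided by instructions retired.
double cpi(double cycles, double instructions) {
  return cycles / instructions;
}

// Returns the indices of the tasks whose CPI is more than `k` standard
// deviations above the mean CPI of all tasks in the same job.
// `k` is a hypothetical tuning knob chosen for this sketch.
std::vector<std::size_t> findOutliers(const std::vector<double>& samples,
                                      double k) {
  std::vector<std::size_t> outliers;
  if (samples.size() < 2) {
    return outliers;  // Not enough data to call anything an outlier.
  }

  double sum = 0.0;
  for (double s : samples) sum += s;
  const double mean = sum / samples.size();

  double squared = 0.0;
  for (double s : samples) squared += (s - mean) * (s - mean);
  const double stddev = std::sqrt(squared / samples.size());

  for (std::size_t i = 0; i < samples.size(); ++i) {
    if (samples[i] > mean + k * stddev) {
      outliers.push_back(i);
    }
  }
  return outliers;
}
```

A real system would collect these samples continuously from hardware performance counters and would still need the cross-correlation step to find the antagonist; this sketch only covers the "distinguish normal from outlier" part.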


Heracles: Improving Resource Efficiency at Scale

Paper:

  1. http://csl.stanford.edu/~christos/publications/2015.heracles.isca.pdf
  2. https://cs.stanford.edu/~davidlo/resources/2015.heracles.isca.slides.pdf

Average server utilization in most datacenters is low, ranging from 10% to 50%. It is difficult to consolidate latency-critical (LC) services onto a subset of highly utilized servers. Heracles instead increases server utilization by launching best-effort (BE) tasks on the same server as a latency-critical job.

Goal: Eliminate SLO violations at all levels of load for the LC job while maximizing the throughput for BE tasks.


Reconciling High Server Utilization and Sub-millisecond Quality-of-Service

Paper: http://csl.stanford.edu/~christos/publications/2014.mutilate.eurosys.pdf

In this paper, we analyze the challenges of maintaining high QoS for low-latency workloads when sharing servers with other workloads.

The additional workloads can interfere with resources such as processing cores, cache space, memory or I/O bandwidth.

The goal of this work is to investigate if workload colocation and good quality-of-service for latency-critical services are fundamentally incompatible in modern systems, or if instead we can reconcile the two.


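Network virtualization with veth

I won't go over how network namespaces work here; see: Namespaces in operation, part 7: Network namespaces

Note, however, that the network namespace operations in that article rely on the ip netns tool, a convenience that only recent kernels and operating system releases provide; older releases do not ship it. If you really need network namespaces there, it is better to program against netlink and do all of the device virtualization directly through system calls.

We know that passing CLONE_NEWNET when cloning a process creates a new, independent network namespace. That alone is far from enough, though: none of the network devices are initialized or brought up, so at this point the container is completely offline. It belongs to no network and cannot reach any network.

To connect the container to the outside network, we need to create and initialize some devices so that the network inside the container and the external network can reach each other. veth is one of the simpler ways to do this.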


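Kernel crash triggered by destroying a PID namespace

While testing PID namespaces recently, I ran into a problem: when the machine ran out of memory, the OOM killer killed a process that had its own PID namespace, and reclaiming that process triggered a kernel crash.

The kernel crashed on a null pointer dereference while freeing the process's pid. The kernel version is 2.6.32.

So on older kernels, using PID namespaces incorrectly can cause fatal stability problems.

PID namespaces are one of the basic technologies behind containers. Both docker and lxc use them to give each container an independent, isolated process tree, so that PIDs are isolated between containers. However, I looked at how docker and lxc do this: their implementations are very simple and are very likely to trigger this bug on older kernels. Let's look at how docker and lxc implement process isolation.

For how PID namespaces are implemented, see: Namespaces in operation, part 3: PID namespaces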


tc: Linux traffic control

http://lartc.org

1. Simple, classless Queueing Disciplines

1.1 pfifo_fast

This queue is, as the name says, First In, First Out, which means that no packet receives special treatment. At least, not quite. This queue has 3 so called 'bands'. Within each band, FIFO rules apply. However, as long as there are packets waiting in band 0, band 1 won't be processed. Same goes for band 1 and band 2.

The kernel honors the so called Type of Service flag of packets, and takes care to insert 'minimum delay' packets in band 0.
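The band rule described above can be sketched as a toy queue. This is only an illustration of the dequeue order, not the kernel's implementation: the class name ThreeBandQueue is hypothetical, packets are plain strings, and the caller picks the band directly, whereas the real qdisc derives the band from the packet's Type of Service and priority.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Toy model of a three-band queue: FIFO inside each band, and a band is
// only served when every lower-numbered band is empty.
class ThreeBandQueue {
 public:
  // Adds a packet to band 0, 1 or 2. Out-of-range bands are clamped to
  // band 2 (an arbitrary choice made for this sketch).
  void enqueue(std::size_t band, std::string packet) {
    if (band > 2) band = 2;
    bands_[band].push_back(std::move(packet));
  }

  // Removes the next packet to send and stores it in `out`.
  // Returns false if all three bands are empty.
  bool dequeue(std::string& out) {
    for (auto& band : bands_) {  // Scans band 0 first, then 1, then 2.
      if (!band.empty()) {
        out = std::move(band.front());
        band.pop_front();
        return true;
      }
    }
    return false;
  }

 private:
  std::array<std::deque<std::string>, 3> bands_;
};
```

With one packet in each band plus a second packet in band 1, the queue drains band 0 first, then both band 1 packets in arrival order, and only then band 2.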

Do not


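Concurrent programming with libprocess

libprocess is a very important base library in mesos. It provides a number of convenient helper functions as well as the basic primitives needed for concurrent programming, such as future/promise, which I focus on below.

To better explain what future/promise is, I took a piece of code from mesos as an example: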

Future<Socket> PollSocketImpl::accept()
{
  return io::poll(get(), io::READ)
    .then(lambda::bind(&internal::accept, get()));
}

What this function does is register an io::READ event with io::poll() and, once the event is ready, call internal::accept().

Clearly, this is an asynchronous accept() method.
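To make the idea of a continuation concrete, here is a minimal single-threaded sketch of a promise/future pair with a then() method. It is not libprocess's implementation: the State, Promise and Future types below are hypothetical and heavily simplified, with no threading, failure or discard handling. It only shows how a callback registered with then() runs once the promise is fulfilled.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// State shared between a Promise (producer) and its Futures (consumers).
template <typename T>
struct State {
  bool ready = false;
  T value{};
  std::vector<std::function<void(const T&)>> callbacks;
};

template <typename T>
class Future {
 public:
  explicit Future(std::shared_ptr<State<T>> state) : state_(std::move(state)) {}

  bool isReady() const { return state_->ready; }

  // Only valid once the future is ready.
  const T& get() const {
    assert(state_->ready);
    return state_->value;
  }

  // Registers `f` to run when the value arrives and returns a future for
  // the result of `f`. The result type U must be given explicitly.
  template <typename U>
  Future<U> then(std::function<U(const T&)> f) const;

 private:
  std::shared_ptr<State<T>> state_;
};

template <typename T>
class Promise {
 public:
  Promise() : state_(std::make_shared<State<T>>()) {}

  Future<T> future() const { return Future<T>(state_); }

  // Fulfils the promise and runs every continuation registered so far.
  void set(T value) {
    state_->value = std::move(value);
    state_->ready = true;
    for (auto& callback : state_->callbacks) callback(state_->value);
    state_->callbacks.clear();
  }

 private:
  std::shared_ptr<State<T>> state_;
};

template <typename T>
template <typename U>
Future<U> Future<T>::then(std::function<U(const T&)> f) const {
  auto next = std::make_shared<Promise<U>>();
  auto run = [next, f](const T& v) { next->set(f(v)); };
  if (state_->ready) {
    run(state_->value);  // Value already there: run the continuation now.
  } else {
    state_->callbacks.push_back(run);  // Otherwise defer until set().
  }
  return next->future();
}
```

In this sketch the caller gets a future back immediately, and the continuation only runs when some other piece of code calls set() on the promise, which is the same shape as the poll-then-accept chain above.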


Single-machine resource isolation in mesos

A brief introduction to the Mesos Containerizer

http://mesos.apache.org/documentation/latest/mesos-containerizer/

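The default mesos containerizer currently supports the following kinds of resource isolation: shared filesystem, namespaces, cgroups, posix, and network port isolation.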


mesos: Segfault in net::getIP

After I upgraded mesos to 0.22.1 today, the slave dumped core as soon as it started. I loaded the core into the debugger and took a look:

Program terminated with signal 11, Segmentation fault.
#0 0x00007f639867c77e in free () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f639867c77e in free () from /lib64/libc.so.6
#1 0x00007f63986c25d0 in freeaddrinfo () from /lib64/libc.so.6
#2 0x00007f6399deeafa in net::getIP (hostname="<redacted>", family=2)
   at ./3rdparty/stout/include/stout/net.hpp:201
#3 0x00007f6399e1f273 in process::initialize (delegate=Unhandled dwarf
   expression opcode 0xf3) at src/process.cpp:837
#4 0x000000000042342f in main ()

