Resource Isolation on a Single Host in Mesos

A Brief Introduction to the Mesos Containerizer

http://mesos.apache.org/documentation/latest/mesos-containerizer/

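The default Mesos containerizer currently supports five kinds of resource isolation: shared filesystem, namespaces, cgroups, posix, and network port isolation. Each is registered as an isolator under a name: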

  // Create a MesosContainerizerProcess using isolators and a launcher.
  hashmap<string, Try<Isolator*> (*)(const Flags&)> creators;

  creators["posix/cpu"]   = &PosixCpuIsolatorProcess::create;
  creators["posix/mem"]   = &PosixMemIsolatorProcess::create;
  creators["posix/disk"]  = &PosixDiskIsolatorProcess::create;
#ifdef __linux__
  creators["cgroups/cpu"] = &CgroupsCpushareIsolatorProcess::create;
  creators["cgroups/mem"] = &CgroupsMemIsolatorProcess::create;
  creators["cgroups/perf_event"] = &CgroupsPerfEventIsolatorProcess::create;
  creators["filesystem/shared"] = &SharedFilesystemIsolatorProcess::create;
  creators["namespaces/pid"] = &NamespacesPidIsolatorProcess::create;
#endif // __linux__
#ifdef WITH_NETWORK_ISOLATOR
  creators["network/port_mapping"] = &PortMappingIsolatorProcess::create;
#endif
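
To make the registry above concrete, here is a minimal, self-contained sketch of how a comma-separated list of isolator names (the kind of value passed to the slave's `--isolation` flag) could be resolved against such a map. The `Isolator` struct, the `Creator` signature, and `createIsolators` are hypothetical simplifications for illustration; the real creators take `Flags` and return `Try<Isolator*>`.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a Mesos isolator; only the name is kept.
struct Isolator {
  std::string name;
};

// Simplified factory signature (assumption: no Flags, no Try<>).
using Creator = Isolator (*)();

Isolator createPosixCpu() { return Isolator{"posix/cpu"}; }
Isolator createPosixMem() { return Isolator{"posix/mem"}; }
Isolator createCgroupsCpu() { return Isolator{"cgroups/cpu"}; }

// Splits a comma-separated list such as "cgroups/cpu,posix/mem", looks each
// name up in the registry, and instantiates the matching isolators.
// Names that are not registered are appended to `unknown`.
std::vector<Isolator> createIsolators(const std::string& isolation,
                                      std::vector<std::string>* unknown) {
  static const std::map<std::string, Creator> creators = {
      {"posix/cpu", &createPosixCpu},
      {"posix/mem", &createPosixMem},
      {"cgroups/cpu", &createCgroupsCpu},
  };

  std::vector<Isolator> result;
  std::istringstream in(isolation);
  std::string name;
  while (std::getline(in, name, ',')) {
    auto it = creators.find(name);
    if (it == creators.end()) {
      unknown->push_back(name);
      continue;
    }
    result.push_back(it->second());
  }
  return result;
}
```

A name that was compiled out (for example a Linux-only isolator on another platform) would simply be missing from the map, which is why the lookup has to handle unknown names.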

cgroups

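For resource isolation on a single host, cgroups are probably the most mature and standard solution. They are a Linux kernel feature (not available on other Unix systems) and are enabled by default in most distribution kernels.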

Background reading on cgroups: http://pipul.sinaapp.com/2013/05/control-groups/

There are many cgroup subsystems, and Mesos uses only some of them: cpu, memory, and perf_event.

  • Set cpu.shares to enforce a soft CPU limit
  • Set cpu.cfs_period_us and cpu.cfs_quota_us to enforce a hard CPU limit
  • Set memory.memsw.limit_in_bytes / memory.limit_in_bytes to enforce a hard memory limit

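The cpuacct subsystem reports the CPU time consumed by the current control group, split into user-mode and kernel-mode time, from which CPU utilization can be derived. On the memory side, memory.memsw.limit_in_bytes limits memory plus swap, whereas memory.limit_in_bytes limits memory only.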
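
As a rough illustration of how a CPU allocation could be turned into the cgroup values listed above, here is a minimal sketch. The helper `cpuLimitsFor` and the `CpuLimits` struct are hypothetical, and the constants (1024 shares per CPU, a 100 ms CFS period, and the lower bounds) are assumptions based on Linux kernel defaults, not values taken from the Mesos source.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result type: the three values to write into the cgroup.
struct CpuLimits {
  std::uint64_t shares;     // cpu.shares        (soft limit, relative weight)
  std::uint64_t period_us;  // cpu.cfs_period_us (length of one CFS period)
  std::uint64_t quota_us;   // cpu.cfs_quota_us  (CPU time allowed per period)
};

// Converts a fractional CPU count into cgroup values.
// All constants below are assumptions made for this sketch.
CpuLimits cpuLimitsFor(double cpus) {
  const std::uint64_t kSharesPerCpu = 1024;  // kernel default for cpu.shares
  const std::uint64_t kMinShares = 2;        // smallest value the kernel accepts
  const std::uint64_t kPeriodUs = 100000;    // kernel default CFS period (100 ms)
  const std::uint64_t kMinQuotaUs = 1000;    // smallest quota the kernel accepts (1 ms)

  CpuLimits limits;
  limits.shares = std::max(
      static_cast<std::uint64_t>(cpus * kSharesPerCpu), kMinShares);
  limits.period_us = kPeriodUs;
  limits.quota_us = std::max(
      static_cast<std::uint64_t>(cpus * kPeriodUs), kMinQuotaUs);
  return limits;
}
```

With these assumptions, half a CPU maps to 512 shares and a quota of 50000 µs per 100000 µs period. The shares only matter under contention (soft limit), while the quota throttles the group even on an idle machine (hard limit).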

Shared filesystem

ContainerInfo specifies Volumes which map parts of the shared
filesystem (host_path) into the container’s view of the filesystem
(container_path), as read-write or read-only. The host_path can be
absolute, in which case it will make the filesystem subtree rooted at
host_path also accessible under container_path for each container.
If host_path is relative then it is considered as a directory
relative to the executor’s work directory. The directory will be
created and permissions copied from the corresponding directory (which
must exist) in the shared filesystem.
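
The absolute-versus-relative rule for host_path quoted above can be sketched as a small path-resolution helper. `resolveHostPath` is a hypothetical function written for illustration, not part of Mesos, and it covers only the path lookup, not directory creation, permission copying, or mounting.

```cpp
#include <string>

// Returns the host directory that would be made visible at container_path.
// An absolute host_path is used as-is; a relative one is interpreted
// relative to the executor's work directory.
std::string resolveHostPath(const std::string& hostPath,
                            const std::string& workDir) {
  if (!hostPath.empty() && hostPath[0] == '/') {
    return hostPath;  // absolute: shared subtree rooted at host_path
  }
  if (!workDir.empty() && workDir.back() == '/') {
    return workDir + hostPath;  // avoid a doubled slash
  }
  return workDir + "/" + hostPath;  // relative: private to this executor
}
```

The practical difference is that an absolute host_path exposes the same host subtree to every container, while a relative one gives each executor its own directory.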

namespaces

http://lwn.net/Articles/531114/

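If the host runs Linux, the kernel's namespace mechanisms can be used as well. The namespaces Mesos currently supports are NEWNET | NEWPID | NEWNS. These flags are passed to clone() when the executor process is created, and child processes inherit the resulting namespaces by default.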
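
A minimal sketch of how the clone() flags could be assembled from the enabled isolators follows. The function `cloneFlagsFor` is hypothetical, and the mapping from isolator name to namespace flag is an assumption made for illustration; only the `CLONE_NEW*` constants come from the Linux headers. The sketch computes the flag mask and does not call clone() itself.

```cpp
#include <sched.h>  // CLONE_NEWNS, CLONE_NEWPID, CLONE_NEWNET (Linux only)

#include <string>
#include <vector>

// Returns the bitwise OR of the namespace flags requested by the given
// isolators. The name-to-flag mapping is an assumption for this sketch.
int cloneFlagsFor(const std::vector<std::string>& isolators) {
  int flags = 0;
  for (const std::string& name : isolators) {
    if (name == "namespaces/pid") {
      flags |= CLONE_NEWPID;  // new PID namespace
    } else if (name == "filesystem/shared") {
      flags |= CLONE_NEWNS;   // new mount namespace
    } else if (name == "network/port_mapping") {
      flags |= CLONE_NEWNET;  // new network namespace
    }
    // Other isolators (cgroups/*, posix/*) need no namespace.
  }
  return flags;
}
```

The resulting mask would be OR-ed into the flags argument of clone() when the executor is forked, so everything the executor spawns later stays inside the same namespaces.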

Network

posix

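Resource isolation implemented through POSIX APIs, including the cpu, memory, and disk isolators.

This is not isolation in the strict sense, however: it is neither absolute nor real-time. The principle is simple: runtime data is sampled periodically. Take disk isolation as an example: mesos-slave periodically runs du to measure a container's disk usage, and as soon as a sample exceeds the quota, the container is killed.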
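
The sampling approach can be sketched with a small simulation. `firstQuotaViolation` is a hypothetical helper: it takes a series of usage samples (as a periodic du run might produce) and reports the first one that exceeds the quota. It is an illustration of the idea, not the Mesos implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first periodic usage sample that exceeds the
// quota, or -1 if no sample does. Each element of `samples` stands for the
// disk usage (in bytes) observed at one polling interval.
int firstQuotaViolation(const std::vector<std::uint64_t>& samples,
                        std::uint64_t quotaBytes) {
  for (std::size_t i = 0; i < samples.size(); ++i) {
    if (samples[i] > quotaBytes) {
      return static_cast<int>(i);  // the container would be killed here
    }
  }
  return -1;
}
```

Because only the sampled values are checked, a container that briefly overshoots the quota between two samples and shrinks again is never noticed, and one that overshoots just after a sample keeps running until the next one.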

1 comment on "Resource Isolation on a Single Host in Mesos"

  • fangdong

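    A single host can use only one containerizer.
    containerizer->launch() starts the executor, and tasks are then sent to the executor.

    TaskInfo
    1. If it has a command, mesos-executor is used.
    2. If it has no command but has an ExecutorInfo, your own executor is used.

    mesos-executor allows only one task per container.
    With a custom executor, each task can get its own container, or multiple tasks can run in the same container.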
