# The Origins and Development of the VC Dimension

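The VC dimension is a foundational concept in machine learning. It gives a solid theoretical basis for the learnability of many machine learning methods. Yet engineers in particular may already run SVMs, LR, and deep learning models in production without understanding the VC dimension.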

## A Bit of History

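In 1943, McCulloch and Pitts proposed the artificial neural network. They analyzed networks of idealized artificial neurons and showed how such networks can carry out simple logical operations.

In 1957, Frank Rosenblatt, an experimental psychologist at Cornell University, simulated on an IBM 704 computer a neural network model of his own invention, which he called the "Perceptron". Both neural networks and support vector machines trace back to the Perceptron.

In 1962, Rosenblatt published the book Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms.

In 1969, Minsky and Papert, another professor at MIT, co-authored the book Perceptrons: An Introduction to Computational Geometry. In it, they proved that a single-layer neural network cannot solve the XOR (exclusive or) problem.

In 1971, V. Vapnik and A. Chervonenkis introduced the concept of the VC dimension in the paper "On the uniform convergence of relative frequencies of events to their probabilities".

In 1974, V. Vapnik proposed the principle of structural risk minimization.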

# Understanding Waiting Times Between Events with the Poisson and Exponential Distributions

A webhook POSTs to our database each time a particular event occurs on our website. We receive about two of these requests per minute. I was mindlessly monitoring the log files one day and noticed it had been roughly 90 seconds since our database had been hit by this request. Before worrying, though, I wondered how rare that observation is. What is the likelihood of waiting longer than 1.5 minutes for the next request?

This is a probability problem that can be solved with an understanding of Poisson processes and the exponential distribution. A Poisson process is any process where independent events occur at a constant, known rate, e.g. babies are born at a hospital at a rate of three per hour, or calls come into a call center at a rate of 10 per minute. The exponential distribution is the probability distribution that models the waiting times between these events, e.g. the times between calls at the call center are exponentially distributed. To model Poisson processes and exponential distributions, we need to know two things: a time unit and a rate.
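As a rough sketch of the webhook question, assume the requests really do follow a Poisson process with a rate of two per minute. The waiting time T is then exponentially distributed, and the chance of waiting longer than t minutes is P(T > t) = exp(-rate * t). The function name `prob_wait_longer_than` is made up for this illustration.

```python
import math


def prob_wait_longer_than(t, rate):
    """Return P(T > t) for an exponential waiting time.

    `rate` is the average number of events per time unit and `t` is
    measured in the same time unit (minutes in this example).
    """
    return math.exp(-rate * t)


# About two requests per minute, 1.5 minutes of silence.
p = prob_wait_longer_than(1.5, 2.0)
print(f"P(wait > 1.5 min) = {p:.4f}")  # prints P(wait > 1.5 min) = 0.0498
```

Under these assumptions, a gap of 90 seconds or more has roughly a 5% chance of occurring.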

# Comparing Common Activation Functions

1. What is an activation function
2. Why use one
3. Which ones exist
4. Comparing sigmoid, ReLU, and softmax
5. How to choose
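For orientation, the three functions named in the outline can be written in a few lines. This is a minimal sketch using their standard textbook definitions in plain Python rather than any particular framework's API. Subtracting the maximum inside `softmax` is a common numerical-stability trick and does not change the result.

```python
import math


def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive inputs through unchanged and clamp negatives to zero."""
    return max(0.0, x)


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # shift by the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))              # 0.5
print(relu(-2.0), relu(3.0))     # 0.0 3.0
print(softmax([1.0, 2.0, 3.0]))  # three probabilities, largest last
```

Sigmoid and ReLU act on one value at a time, while softmax acts on a whole vector of scores.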


# Namespaces Series, Part 4: User Namespaces

User namespaces were introduced as early as Linux 3.5 and are considered stable starting with Linux 4.3.

# 1. Introduction

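User namespaces landed in the mainline kernel only recently. Their main purpose is to isolate security-related identifiers and attributes, such as user IDs and group IDs (see credentials(7)), the root directory, keys (see keyctl(2)), and capabilities. From the point of view of the kernel implementation, however, a user namespace simply provides a uid/gid mapping mechanism. Capabilities are closely related to user namespaces, but they are not a user namespace concept: they are a per-process attribute, and they predate user namespaces by a long time.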

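1. We start with the basic usage of user namespaces
2. Then we look at how user namespaces combine with capabilities to provide security isolation for containers
3. What problem do user namespaces solve?
4. Kernel implementation
5. Interaction with other namespaces, and compatibility issues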
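To make the uid/gid mapping mechanism from the introduction concrete, here is a small sketch that translates an ID inside a user namespace to the corresponding ID outside it. It assumes the three-column line format used by `/proc/[pid]/uid_map` (first ID inside the namespace, first ID outside, length of the range). The sample mapping and the helper names `parse_id_map` and `to_outside_id` are hypothetical. The code only models the lookup and does not touch the kernel.

```python
def parse_id_map(text):
    """Parse uid_map-style text into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.strip().splitlines():
        inside, outside, length = (int(field) for field in line.split())
        entries.append((inside, outside, length))
    return entries


def to_outside_id(inside_id, entries):
    """Translate an ID seen inside the namespace to the ID outside it.

    Returns None when the ID falls in no mapped range.
    """
    for inside, outside, length in entries:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None


# Hypothetical mapping: root inside the namespace is uid 1000 outside,
# and uids 1..65536 inside map to 100000..165535 outside.
SAMPLE_UID_MAP = """\
0 1000 1
1 100000 65536
"""

entries = parse_id_map(SAMPLE_UID_MAP)
print(to_outside_id(0, entries))      # 1000
print(to_outside_id(5, entries))      # 100004
print(to_outside_id(70000, entries))  # None (unmapped)
```

With a mapping like this, a process can appear as uid 0 inside the namespace while being an ordinary unprivileged user outside it.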

# Reconciling High Server Utilization and Sub-millisecond Quality-of-Service

In this paper, we analyze the challenges of maintaining high QoS for low-latency workloads when sharing servers with other workloads.

The additional workloads can interfere with resources such as processing cores, cache space, memory or I/O bandwidth.

The goal of this work is to investigate whether workload colocation and good quality-of-service for latency-critical services are fundamentally incompatible in modern systems, or whether we can instead reconcile the two.

# Heracles: Improving Resource Efficiency at Scale

## 1. Introduction

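Heracles keeps latency-critical (LC) services within their SLOs while using as much idle capacity as possible to run best-effort (BE) jobs. It combines real-time monitoring with offline analysis to detect sources of interference automatically and, at the right moment, adjusts the isolation mechanisms to keep that interference from arising.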

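1. Combining multiple isolation mechanisms is the key to reaching high utilization without violating the SLOs of LC services
2. Splitting the interference problem into several independent subproblems greatly reduces the complexity of dynamic control
3. Why a real-time monitoring program running on every machine is necessary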

# CPI2: CPU performance isolation for shared compute clusters

This paper describes CPI2, a system that builds on the useful properties of CPI measures to automate all of the following:

1. observe the run-time performance of hundreds to thousands of tasks belonging to the same job, and learn to distinguish normal performance from outliers;
2. identify performance interference within a few minutes by detecting such outliers;
3. determine which antagonist applications are the likely cause with an online cross-correlation analysis;
4. (if desired) ameliorate the bad behavior by throttling or migrating the antagonists.
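The first three steps can be pictured with a toy sketch. This is not the paper's algorithm. It assumes an outlier is a task whose CPI exceeds the job's mean by more than two standard deviations, and it uses a plain Pearson correlation between the victim's CPI and each suspect's CPU usage as a stand-in for the online cross-correlation analysis. All task names, sample values, and function names are hypothetical.

```python
from statistics import mean, pstdev


def find_outliers(cpi_by_task, k=2.0):
    """Return tasks whose CPI is more than k standard deviations above the mean."""
    values = list(cpi_by_task.values())
    threshold = mean(values) + k * pstdev(values)
    return [task for task, cpi in cpi_by_task.items() if cpi > threshold]


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def rank_suspects(victim_cpi, usage_by_suspect):
    """Order suspects by how strongly their CPU usage tracks the victim's CPI."""
    scores = {name: pearson(victim_cpi, usage)
              for name, usage in usage_by_suspect.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical CPI samples for ten tasks of one job; task-9 is misbehaving.
cpi_by_task = {f"task-{i}": cpi for i, cpi in enumerate(
    [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 3.0])}
outliers = find_outliers(cpi_by_task)

# Hypothetical time series: the victim's CPI and two co-located suspects' CPU usage.
victim_cpi = [1.0, 1.1, 3.0, 2.9, 1.0, 1.1]
usage_by_suspect = {
    "batch-a": [0.1, 0.1, 0.9, 0.8, 0.1, 0.2],
    "batch-b": [0.5, 0.5, 0.5, 0.4, 0.5, 0.5],
}
ranking = rank_suspects(victim_cpi, usage_by_suspect)
print(outliers, ranking)
```

In this made-up data, `batch-a` is busy exactly when the victim's CPI spikes, so it ranks first as the likely antagonist. Throttling or migrating it would be the fourth step.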

