pillyshi: September 2019

The problem setting of machine learning

I revisited the problem setting of machine learning, so here are some notes.

$(\Omega, \mathcal{F}, P)$ : the underlying probability space
$(\mathcal{X}, \mathcal{F}_{\mathcal{X}})$ : the measurable space of inputs
$(\mathcal{Y}, \mathcal{F}_{\mathcal{Y}})$ : the measurable space of outputs
$\mathcal{Z} = \mathcal{X} \times \mathcal{Y}$
$m \in \mathbb{N}$ : the sample size
$X_i: \Omega \to \mathcal{X}, \forall i = 1, \ldots, m$
$Y_i: \Omega \to \mathcal{Y}, \forall i = 1, \ldots, m$
$Z_i = (X_i, Y_i): \omega \mapsto (X_i(\omega), Y_i(\omega)), \forall i = 1, \ldots, m$
$S_m: \omega \mapsto (Z_1(\omega), \ldots, Z_m(\omega))$ : the sample

i.i.d.
$Z_1, \ldots, Z_m$ are mutually independent
$P \circ Z_1^{-1} = \cdots = P \circ Z_m^{-1}$

Let $P_{Z} = P \circ Z_1^{-1}$.
$D_m = P \circ S_m^{-1}$
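To make the setup concrete, here is a minimal sketch of drawing a sample $S_m$ from $D_m$. The distribution $P_Z$ in it is a toy assumption of mine (uniform $x$ on $[-1, 1]$, a threshold label flipped with probability 0.1), and the names `draw_z` and `draw_sample` are hypothetical helpers, not part of the setting itself.

```python
import random

def draw_z(rng):
    """Draw one z = (x, y) from a toy P_Z (an assumption for illustration)."""
    x = rng.uniform(-1.0, 1.0)           # x ~ Uniform[-1, 1]
    clean = 1 if x >= 0.0 else 0         # noiseless label 1[x >= 0]
    y = clean if rng.random() >= 0.1 else 1 - clean  # flip with prob. 0.1
    return (x, y)

def draw_sample(m, rng):
    """Draw S_m = (z_1, ..., z_m): m independent draws from the same P_Z."""
    return [draw_z(rng) for _ in range(m)]

rng = random.Random(0)                   # fixed seed for reproducibility
sample = draw_sample(5, rng)
```

Each call to `draw_z` uses fresh randomness from the same generator, which is how the sketch mimics the i.i.d. assumption.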

This completes the basic setup.

$\mathcal{H}$ : hypothesis set
$\ell: \mathcal{H} \times \mathcal{Z} \to [0, \infty)$ : loss function

generalization error

$\begin{aligned} L(h) := \mathbb{E}_{P_Z} \left[\ell(h, z)\right] \end{aligned}$
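Since $L(h)$ is an expectation under $P_Z$, it can be approximated by averaging the loss over many fresh draws. The following is a rough sketch under assumptions of mine: the same toy $P_Z$ as a stand-in distribution, threshold hypotheses $h_t(x) = 1[x \geq t]$, and an indicator loss. For this toy case the exact value is $0.1 + 0.4|t|$, which gives something to compare against.

```python
import random

def draw_z(rng):
    """Toy P_Z (assumption): x ~ Uniform[-1, 1], y = 1[x >= 0] flipped w.p. 0.1."""
    x = rng.uniform(-1.0, 1.0)
    clean = 1 if x >= 0.0 else 0
    y = clean if rng.random() >= 0.1 else 1 - clean
    return (x, y)

def zero_one_loss(t, z):
    """Indicator loss of the threshold hypothesis h_t(x) = 1[x >= t]."""
    x, y = z
    pred = 1 if x >= t else 0
    return 0.0 if pred == y else 1.0

def estimate_risk(t, n, rng):
    """Monte Carlo estimate of L(h_t) = E_{P_Z}[loss(h_t, z)] from n draws."""
    return sum(zero_one_loss(t, draw_z(rng)) for _ in range(n)) / n

def exact_risk(t):
    """Closed form for this toy P_Z and |t| <= 1.

    h_t disagrees with the clean label on a region of probability |t| / 2,
    so L(h_t) = 0.9 * |t| / 2 + 0.1 * (1 - |t| / 2) = 0.1 + 0.4 * |t|.
    """
    return 0.1 + 0.4 * abs(t)

rng = random.Random(0)
estimate = estimate_risk(0.5, 100_000, rng)
```

In a real problem $P_Z$ is unknown, so only the average over the observed sample is available. The closed form here exists only because the distribution was made up.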

Here I introduce the concept of Agnostic PAC learning. Roughly speaking, it goes as follows.

$\mathcal{H}$ is Agnostic PAC learnable if there exist a sample complexity function $m_\mathcal{H}: (0, 1)^2 \to \mathbb{N}$ and a learning algorithm $\mathcal{A}_\mathcal{H}: S_m \mapsto B \subset \mathcal{H}$ that satisfy the following property:

For all $\epsilon, \delta \in (0, 1)$

$\begin{aligned} m \geq m_\mathcal{H}(\epsilon, \delta) \Rightarrow \forall h \in \mathcal{A}_{\mathcal{H}}(S_m), L(h) \leq \inf_{h' \in \mathcal{H}} L(h') + \epsilon \end{aligned}$

with probability at least $1 - \delta$
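As a sanity check of the definition, here is a small sketch for the simplest case I know of: a finite $\mathcal{H}$ with a loss bounded in $[0, 1]$. There, a standard Hoeffding plus union bound argument shows that empirical risk minimization returns a hypothesis whose risk is within $\epsilon$ of the best in $\mathcal{H}$, with probability at least $1 - \delta$, once $m \geq \lceil 2 \log(2|\mathcal{H}| / \delta) / \epsilon^2 \rceil$. The toy $P_Z$, the grid of thresholds, and all function names below are my own assumptions for illustration.

```python
import math
import random

def sample_complexity(num_hypotheses, eps, delta):
    """Upper bound on m_H(eps, delta) for a finite class and a loss in [0, 1]."""
    return math.ceil(2.0 * math.log(2.0 * num_hypotheses / delta) / eps ** 2)

def draw_z(rng):
    """Toy P_Z (assumption): x ~ Uniform[-1, 1], y = 1[x >= 0] flipped w.p. 0.1."""
    x = rng.uniform(-1.0, 1.0)
    clean = 1 if x >= 0.0 else 0
    y = clean if rng.random() >= 0.1 else 1 - clean
    return (x, y)

def empirical_risk(t, sample):
    """Average indicator loss of h_t(x) = 1[x >= t] on the sample."""
    errors = sum(1 for x, y in sample if (1 if x >= t else 0) != y)
    return errors / len(sample)

def erm(thresholds, sample):
    """Empirical risk minimization over the finite class {h_t}."""
    return min(thresholds, key=lambda t: empirical_risk(t, sample))

def exact_risk(t):
    """True risk of h_t under the toy P_Z (valid for |t| <= 1)."""
    return 0.1 + 0.4 * abs(t)

thresholds = [i / 10 for i in range(-10, 11)]   # 21 hypotheses
eps, delta = 0.1, 0.05
m = sample_complexity(len(thresholds), eps, delta)

rng = random.Random(0)
sample = [draw_z(rng) for _ in range(m)]
t_hat = erm(thresholds, sample)

# Excess risk relative to the best hypothesis in the class.
excess = exact_risk(t_hat) - min(exact_risk(t) for t in thresholds)
```

Here the algorithm returns a single hypothesis, which corresponds to the case where the set $B$ is a singleton. The bound is loose, so in practice the excess risk tends to be well below $\epsilon$ at this sample size.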

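First, we start by fixing $\ell$. Fixing $L$ instead is also fine. The usual squared error and the like should not be used here (though you can use them); what goes here is the quantity you really want to minimize. I think an indicator function is a good choice.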
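A tiny made-up example of why the choice matters: a predictor can have a smaller squared error and still make more mistakes. The scores and labels below are hypothetical numbers chosen only to show this, and thresholding scores at 0.5 is my assumption.

```python
def zero_one(score, y):
    """Indicator loss: 1 if the thresholded prediction differs from y, else 0."""
    pred = 1 if score >= 0.5 else 0
    return 0.0 if pred == y else 1.0

def squared(score, y):
    """Squared error between the raw score and the label."""
    return (score - y) ** 2

def mean_loss(loss, scores, labels):
    """Average of a pointwise loss over paired scores and labels."""
    return sum(loss(s, y) for s, y in zip(scores, labels)) / len(labels)

labels = [1, 1, 0, 0]
scores_a = [0.55, 0.55, 0.45, 0.45]   # timid scores, but every decision is right
scores_b = [1.0, 1.0, 0.0, 0.6]       # confident scores, one wrong decision

a_sq, a_01 = mean_loss(squared, scores_a, labels), mean_loss(zero_one, scores_a, labels)
b_sq, b_01 = mean_loss(squared, scores_b, labels), mean_loss(zero_one, scores_b, labels)
```

Predictor B wins on squared error while predictor A wins on the indicator loss, so the ranking of hypotheses depends on which $\ell$ is fixed at the start.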

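Given $\mathcal{H}$ and $\ell$, I think it is possible to start checking whether $\mathcal{H}$ is Agnostic PAC learnable, but I don't think that is what one usually does. Personally, I think $\mathcal{H}$ should be considered together with the algorithm. I think it is better to fix the algorithm as well, and then check whether it is an Agnostic PAC learner.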
