2018.09.19 / ゼロから作る Deep Learning 2

    Posted on 2018/09/19

    Reading notes on ゼロから作るDeep Learning 2 (the natural language processing volume). Incidentally, I haven't read the first book yet…… :-|

    Vectors and dot products

    The overall picture of neural network inference

    • Bias: a constant that is unaffected by the values of the neurons in the previous layer
    • The transformation done by a fully connected layer is linear; the activation function is what adds a nonlinear effect (so the book says)
      • Linear
      • Nonlinear
        • Things that behave in a curvy way, for example
        • Things that behave like the ReLU function, for example
        • http://rishida.hatenablog.com/entry/2014/02/25/110643

          The difference from the perceptron is that a continuous, nonlinear sigmoid function is used instead of a step function.
          Because the function is continuous, it is differentiable with respect to the parameters, which enables fast learning.
          Also, a neural network becomes a universal function approximator only when the nonlinear region of the sigmoid function is used.

        • “ニューラルネットワークとパーセプトロン - Sideswipe”
          http://kazoo04.hatenablog.com/entry/agi-ac-15
    • The sigmoid function: the exp(-x) in the denominator is apparently handy when differentiating 🤔 (see the sketch after this list)
    • Activation: the output of an activation function
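
    A quick check of why that exp(-x) is handy: differentiating the sigmoid gives the tidy closed form σ'(x) = σ(x)(1 − σ(x)). This is a minimal sketch of my own, not code from the book:

    %python3
    import numpy as np

    def sigmoid(x):
      return 1 / (1 + np.exp(-x))

    # Central finite difference vs. the closed form sigma * (1 - sigma)
    x = np.linspace(-5, 5, 11)
    eps = 1e-6
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    closed = sigmoid(x) * (1 - sigmoid(x))
    print(np.allclose(numeric, closed))  # True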

    Terminology

    Loss functions

    Differentiation and gradients

    • Gradient: the derivatives with respect to each element of a vector (or matrix or tensor; in mathematics, apparently only vectors count) collected into one object (see the sketch below)
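
    To make that concrete, here is a small numerical-gradient sketch; numerical_gradient is my own helper name, not something from the book:

    %python3
    import numpy as np

    def numerical_gradient(f, x, eps=1e-4):
      # Perturb each element of x in turn and take a central difference
      grad = np.zeros_like(x)
      it = np.nditer(x, flags=['multi_index'])
      while not it.finished:
        idx = it.multi_index
        orig = x[idx]
        x[idx] = orig + eps
        fx1 = f(x)
        x[idx] = orig - eps
        fx2 = f(x)
        grad[idx] = (fx1 - fx2) / (2 * eps)
        x[idx] = orig
        it.iternext()
      return grad

    # The gradient of f(x) = sum(x^2) should be 2x
    x = np.array([1.0, 2.0, 3.0])
    print(numerical_gradient(lambda v: np.sum(v ** 2), x))  # ≈ [2. 4. 6.]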

    The chain rule

    • Backpropagation
      • A neural network amounts to several functions chained together
      • Using the chain rule (the rule that turns the derivative of a composite function into a product), the gradients can be computed efficiently
      • If each function's local derivative can be computed, the derivative of the whole is just the product of them (see the sketch after this list)
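
    A tiny illustration of that product, my own example rather than the book's: for y = sigmoid(2x), dy/dx is the product of the two local derivatives:

    %python3
    import numpy as np

    def sigmoid(x):
      return 1 / (1 + np.exp(-x))

    x = 0.5
    a = 2 * x                  # inner function; local derivative da/dx = 2
    y = sigmoid(a)             # outer function; local derivative dy/da = y * (1 - y)
    dy_dx = y * (1 - y) * 2    # chain rule: multiply the local derivatives

    eps = 1e-6
    numeric = (sigmoid(2 * (x + eps)) - sigmoid(2 * (x - eps))) / (2 * eps)
    print(np.isclose(dy_dx, numeric))  # True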

    Python version

    %sh
    python3 --version
    Python 3.5.2
    

    Required modules

    %sh
    pip3 install numpy matplotlib seaborn
    Collecting numpy
      Downloading https://files.pythonhosted.org/packages/0a/fa/afc1dc818589c9fd36a53f78999f2b5bd88bd5b167eb7d87fb56b565c185/numpy-1.15.1-cp35-cp35m-manylinux1_x86_64.whl (13.8MB)
    Collecting matplotlib
      Downloading https://files.pythonhosted.org/packages/7b/ca/8b55a66b7ce426329ab16419a7eee4eb35b5a3fbe0d002434b339a4a7b09/matplotlib-3.0.0-cp35-cp35m-manylinux1_x86_64.whl (12.8MB)
    Collecting seaborn
      Downloading https://files.pythonhosted.org/packages/a8/76/220ba4420459d9c4c9c9587c6ce607bf56c25b3d3d2de62056efe482dadc/seaborn-0.9.0-py3-none-any.whl (208kB)
    Collecting pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 (from matplotlib)
      Downloading https://files.pythonhosted.org/packages/42/47/e6d51aef3d0393f7d343592d63a73beee2a8d3d69c22b053e252c6cfacd5/pyparsing-2.2.1-py2.py3-none-any.whl (57kB)
    Collecting python-dateutil>=2.1 (from matplotlib)
      Using cached https://files.pythonhosted.org/packages/cf/f5/af2b09c957ace60dcfac112b669c45c8c97e32f94aa8b56da4c6d1682825/python_dateutil-2.7.3-py2.py3-none-any.whl
    Collecting kiwisolver>=1.0.1 (from matplotlib)
      Downloading https://files.pythonhosted.org/packages/7e/31/d6fedd4fb2c94755cd101191e581af30e1650ccce7a35bddb7930fed6574/kiwisolver-1.0.1-cp35-cp35m-manylinux1_x86_64.whl (949kB)
    Collecting cycler>=0.10 (from matplotlib)
      Using cached https://files.pythonhosted.org/packages/f7/d2/e07d3ebb2bd7af696440ce7e754c59dd546ffe1bbe732c8ab68b9c834e61/cycler-0.10.0-py2.py3-none-any.whl
    Collecting pandas>=0.15.2 (from seaborn)
      Downloading https://files.pythonhosted.org/packages/5d/d4/6e9c56a561f1d27407bf29318ca43f36ccaa289271b805a30034eb3a8ec4/pandas-0.23.4-cp35-cp35m-manylinux1_x86_64.whl (8.7MB)
    Collecting scipy>=0.14.0 (from seaborn)
      Downloading https://files.pythonhosted.org/packages/cd/32/5196b64476bd41d596a8aba43506e2403e019c90e1a3dfc21d51b83db5a6/scipy-1.1.0-cp35-cp35m-manylinux1_x86_64.whl (33.1MB)
    Collecting six>=1.5 (from python-dateutil>=2.1->matplotlib)
      Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
    Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python3/dist-packages (from kiwisolver>=1.0.1->matplotlib)
    Collecting pytz>=2011k (from pandas>=0.15.2->seaborn)
      Using cached https://files.pythonhosted.org/packages/30/4e/27c34b62430286c6d59177a0842ed90dc789ce5d1ed740887653b898779a/pytz-2018.5-py2.py3-none-any.whl
    Installing collected packages: numpy, pyparsing, six, python-dateutil, kiwisolver, cycler, matplotlib, pytz, pandas, scipy, seaborn
    Successfully installed cycler-0.10.0 kiwisolver-1.0.1 matplotlib-3.0.0 numpy-1.15.1 pandas-0.23.4 pyparsing-2.2.1 python-dateutil-2.7.3 pytz-2018.5 scipy-1.1.0 seaborn-0.9.0 six-1.11.0
    You are using pip version 8.1.1, however version 18.0 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    

    Clone the sample repository

    %sh
    git clone https://github.com/oreilly-japan/deep-learning-from-scratch-2
    Cloning into 'deep-learning-from-scratch-2'...
    

    Import numpy

    %python3
    import numpy as np

    Implementing the Sigmoid layer

    %python3
    class Sigmoid:
      def __init__(self):
        self.params, self.grads = [], []
        self.out = None
      
      def forward(self, x):
        out = 1 / (1 + np.exp(-x))
        self.out = out
        return out
      
      def backward(self, dout):
        # Upstream gradient times the sigmoid's local derivative y * (1 - y)
        dx = dout * (1.0 - self.out) * self.out
        return dx

    In the backward method, the output saved during forward is plugged into the sigmoid's derivative, and the upstream gradient is multiplied by it
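
    A quick usage check of my own (assumes the cells above have run in the same session):

    %python3
    layer = Sigmoid()
    y = layer.forward(np.array([[-1.0, 0.0, 1.0]]))
    dx = layer.backward(np.ones_like(y))  # upstream gradient of all ones
    print(y)   # [[0.2689... 0.5 0.7310...]]
    print(dx)  # same as y * (1 - y)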

    Implementing the Affine layer

    %python3
    class Affine:
      def __init__(self, W, b):
        self.params = [W, b]
        self.grads = [np.zeros_like(W), np.zeros_like(b)]
        self.x = None
      
      def forward(self, x):
        W, b = self.params
        out = np.dot(x, W) + b
        self.x = x
        return out
      
      def backward(self, dout):
        W, b = self.params
        dx = np.dot(dout, W.T)
        dW = np.dot(self.x.T, dout)
        db = np.sum(dout, axis=0)
        
        # Overwrite in place (ellipsis indexing) so self.grads keeps aliasing
        # the same arrays the optimizer reads
        self.grads[0][...] = dW
        self.grads[1][...] = db
        return dx

    I don't fully follow yet, but forward takes the matrix product of x and W and adds the bias; backward sends dout back through the transposed matrices to get dx, dW, and db
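
    A shape sanity check of my own, using the Affine class above (the sizes are arbitrary):

    %python3
    np.random.seed(0)
    layer = Affine(np.random.randn(2, 3), np.zeros(3))
    x = np.random.randn(5, 2)               # batch of 5 samples, 2 features each
    out = layer.forward(x)                  # -> (5, 3)
    dx = layer.backward(np.ones_like(out))  # -> (5, 2)
    print(out.shape, dx.shape)              # (5, 3) (5, 2)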

    SGD (Stochastic Gradient Descent)

    %python3
    class SGD:
      def __init__(self, lr=0.01):
        self.lr = lr
      
      def update(self, params, grads):
        # Step each parameter against its gradient, scaled by the learning rate
        for i in range(len(params)):
          params[i] -= self.lr * grads[i]

    Easier to picture as code than as the update formula
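
    A hypothetical one-step example (the numbers are mine, just to watch the update move the parameters against the gradient):

    %python3
    params = [np.array([1.0, -2.0])]
    grads = [np.array([0.5, 0.5])]
    SGD(lr=0.1).update(params, grads)
    print(params[0])  # [ 0.95 -2.05]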

    Add the sample code to the import path

    %python3
    import sys
    sys.path.append('./deep-learning-from-scratch-2')

    Enable Seaborn

    %python3
    import seaborn as sns
    
    sns.set()

    Visualize the dataset we'll use

    %python3
    from dataset import spiral
    import matplotlib.pyplot as plt
    
    x, t = spiral.load_data()
    print('x', x.shape)
    print('t', t.shape)
    
    N = 100
    CLS_NUM = 3
    markers = ['o', 'x', '^']
    plt.figure(figsize=(6,6))
    for i in range(CLS_NUM):
      plt.scatter(x[i*N:(i+1)*N, 0], x[i*N:(i+1)*N, 1], s=40, marker=markers[i])
    
    z.show(plt, fmt='svg')
    x (300, 2)
    t (300, 3)
    

    Spiral

    Implementing the SoftmaxWithLoss layer

    %python3
    def softmax(x):
        if x.ndim == 2:
            x = x - x.max(axis=1, keepdims=True)  # shift by the row max for numerical stability
            x = np.exp(x)
            x /= x.sum(axis=1, keepdims=True)
        elif x.ndim == 1:
            x = x - np.max(x)
            x = np.exp(x) / np.sum(np.exp(x))
    
        return x
    
    def cross_entropy_error(y, t):
        if y.ndim == 1:
            t = t.reshape(1, t.size)
            y = y.reshape(1, y.size)
            
        if t.size == y.size:
            t = t.argmax(axis=1)  # convert one-hot labels to class indices
                 
        batch_size = y.shape[0]
    
        return -np.sum(np.log(y[np.arange(batch_size), t] + 1e-7)) / batch_size  # 1e-7 guards against log(0)
      
    class SoftmaxWithLoss:
      def __init__(self):
        self.params, self.grads = [], []
        self.y = None
        self.t = None
        
      def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        
        if self.t.size == self.y.size:
          self.t = self.t.argmax(axis=1)
        
        loss = cross_entropy_error(self.y, self.t)
        return loss
      
      def backward(self, dout=1):
        batch_size = self.t.shape[0]
        
        dx = self.y.copy()
        dx[np.arange(batch_size), self.t] -= 1  # gradient of softmax + cross-entropy: y - t
        dx *= dout
        dx = dx / batch_size
        
        return dx
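
    The backward pass boils down to (y − t) / batch_size. A small check of my own against a finite difference of the loss:

    %python3
    np.random.seed(1)
    layer = SoftmaxWithLoss()
    x = np.random.randn(4, 3)
    t = np.array([0, 2, 1, 2])  # class indices

    loss = layer.forward(x, t)
    dx = layer.backward()

    # Nudge one input element and compare with the analytic gradient
    eps = 1e-5
    x2 = x.copy()
    x2[0, 0] += eps
    numeric = (SoftmaxWithLoss().forward(x2, t) - loss) / eps
    print(np.isclose(dx[0, 0], numeric, atol=1e-4))  # True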

    Building the neural network

    %python3
    class TwoLayerNet:
      def __init__(self, input_size, hidden_size, output_size):
        I, H, O = input_size, hidden_size, output_size
        
        # Initialize the weights and biases
        W1 = 0.01 * np.random.randn(I, H)
        b1 = np.zeros(H)
        W2 = 0.01 * np.random.randn(H, O)
        b2 = np.zeros(O)
        
        # Create the layers
        self.layers = [
            Affine(W1, b1),
            Sigmoid(),
            Affine(W2, b2)
        ]
        self.loss_layer = SoftmaxWithLoss()
        
        # Collect all the weights and gradients into lists
        self.params, self.grads = [], []
        for layer in self.layers:
          self.params += layer.params
          self.grads += layer.grads
      
      def predict(self, x):
        for layer in self.layers:
          x = layer.forward(x)
        return x
      
      def forward(self, x, t):
        score = self.predict(x)
        loss = self.loss_layer.forward(score, t)
        return loss
      
      def backward(self, dout=1):
        dout = self.loss_layer.backward(dout)
        for layer in reversed(self.layers):
          dout = layer.backward(dout)
        return dout
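
    A quick shape check of my own before training (assumes the classes above are defined in this session):

    %python3
    model = TwoLayerNet(input_size=2, hidden_size=4, output_size=3)
    print(model.predict(np.random.randn(5, 2)).shape)  # (5, 3)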

    Run the training

    %python3
    # 1. Set the hyperparameters
    max_epoch = 300
    batch_size = 30
    hidden_size = 10
    learning_rate = 1.0
    
    # 2. Load the data
    x, t = spiral.load_data()
    model = TwoLayerNet(input_size=2, hidden_size=hidden_size, output_size=3)
    optimizer = SGD(lr=learning_rate)
    
    # Variables
    data_size = len(x)
    max_iters = data_size // batch_size
    total_loss = 0
    loss_count = 0
    loss_list = []
    
    for epoch in range(max_epoch):
      idx = np.random.permutation(data_size)  # reshuffle the data every epoch
      x = x[idx]
      t = t[idx]
      
      for iters in range(max_iters):
        batch_x = x[iters*batch_size:(iters+1)*batch_size]
        batch_t = t[iters*batch_size:(iters+1)*batch_size]
        
        # Compute the gradients and update the parameters
        loss = model.forward(batch_x, batch_t)
        model.backward()
        optimizer.update(model.params, model.grads)
        
        total_loss += loss
        loss_count += 1
        
        # Check the training progress
        if (iters+1) % 10 == 0:
          avg_loss = total_loss / loss_count
          print('epoch %d | iters %d / %d | loss %.2f' % (epoch+1, iters + 1, max_iters, avg_loss))
          loss_list.append(avg_loss)
          total_loss, loss_count = 0, 0
    
    print('done')
    epoch 1 | iters 10 / 10 | loss 1.13
    epoch 2 | iters 10 / 10 | loss 1.13
    epoch 3 | iters 10 / 10 | loss 1.12
    epoch 4 | iters 10 / 10 | loss 1.12
    epoch 5 | iters 10 / 10 | loss 1.11
    epoch 6 | iters 10 / 10 | loss 1.14
    epoch 7 | iters 10 / 10 | loss 1.16
    epoch 8 | iters 10 / 10 | loss 1.11
    epoch 9 | iters 10 / 10 | loss 1.12
    epoch 10 | iters 10 / 10 | loss 1.13
    epoch 11 | iters 10 / 10 | loss 1.12
    epoch 12 | iters 10 / 10 | loss 1.11
    epoch 13 | iters 10 / 10 | loss 1.09
    epoch 14 | iters 10 / 10 | loss 1.08
    epoch 15 | iters 10 / 10 | loss 1.04
    epoch 16 | iters 10 / 10 | loss 1.03
    epoch 17 | iters 10 / 10 | loss 0.96
    epoch 18 | iters 10 / 10 | loss 0.92
    epoch 19 | iters 10 / 10 | loss 0.92
    epoch 20 | iters 10 / 10 | loss 0.87
    epoch 21 | iters 10 / 10 | loss 0.85
    epoch 22 | iters 10 / 10 | loss 0.82
    epoch 23 | iters 10 / 10 | loss 0.79
    epoch 24 | iters 10 / 10 | loss 0.78
    epoch 25 | iters 10 / 10 | loss 0.82
    epoch 26 | iters 10 / 10 | loss 0.78
    epoch 27 | iters 10 / 10 | loss 0.76
    epoch 28 | iters 10 / 10 | loss 0.76
    epoch 29 | iters 10 / 10 | loss 0.78
    epoch 30 | iters 10 / 10 | loss 0.75
    epoch 31 | iters 10 / 10 | loss 0.78
    epoch 32 | iters 10 / 10 | loss 0.77
    epoch 33 | iters 10 / 10 | loss 0.77
    epoch 34 | iters 10 / 10 | loss 0.78
    epoch 35 | iters 10 / 10 | loss 0.75
    epoch 36 | iters 10 / 10 | loss 0.74
    epoch 37 | iters 10 / 10 | loss 0.76
    epoch 38 | iters 10 / 10 | loss 0.76
    epoch 39 | iters 10 / 10 | loss 0.73
    epoch 40 | iters 10 / 10 | loss 0.75
    epoch 41 | iters 10 / 10 | loss 0.76
    epoch 42 | iters 10 / 10 | loss 0.76
    epoch 43 | iters 10 / 10 | loss 0.76
    epoch 44 | iters 10 / 10 | loss 0.74
    epoch 45 | iters 10 / 10 | loss 0.75
    epoch 46 | iters 10 / 10 | loss 0.73
    epoch 47 | iters 10 / 10 | loss 0.72
    epoch 48 | iters 10 / 10 | loss 0.73
    epoch 49 | iters 10 / 10 | loss 0.72
    epoch 50 | iters 10 / 10 | loss 0.72
    epoch 51 | iters 10 / 10 | loss 0.72
    epoch 52 | iters 10 / 10 | loss 0.72
    epoch 53 | iters 10 / 10 | loss 0.74
    epoch 54 | iters 10 / 10 | loss 0.74
    epoch 55 | iters 10 / 10 | loss 0.72
    epoch 56 | iters 10 / 10 | loss 0.72
    epoch 57 | iters 10 / 10 | loss 0.71
    epoch 58 | iters 10 / 10 | loss 0.70
    epoch 59 | iters 10 / 10 | loss 0.72
    epoch 60 | iters 10 / 10 | loss 0.70
    epoch 61 | iters 10 / 10 | loss 0.71
    epoch 62 | iters 10 / 10 | loss 0.72
    epoch 63 | iters 10 / 10 | loss 0.70
    epoch 64 | iters 10 / 10 | loss 0.71
    epoch 65 | iters 10 / 10 | loss 0.73
    epoch 66 | iters 10 / 10 | loss 0.70
    epoch 67 | iters 10 / 10 | loss 0.71
    epoch 68 | iters 10 / 10 | loss 0.69
    epoch 69 | iters 10 / 10 | loss 0.70
    epoch 70 | iters 10 / 10 | loss 0.71
    epoch 71 | iters 10 / 10 | loss 0.68
    epoch 72 | iters 10 / 10 | loss 0.69
    epoch 73 | iters 10 / 10 | loss 0.67
    epoch 74 | iters 10 / 10 | loss 0.68
    epoch 75 | iters 10 / 10 | loss 0.67
    epoch 76 | iters 10 / 10 | loss 0.66
    epoch 77 | iters 10 / 10 | loss 0.69
    epoch 78 | iters 10 / 10 | loss 0.64
    epoch 79 | iters 10 / 10 | loss 0.68
    epoch 80 | iters 10 / 10 | loss 0.64
    epoch 81 | iters 10 / 10 | loss 0.64
    epoch 82 | iters 10 / 10 | loss 0.66
    epoch 83 | iters 10 / 10 | loss 0.62
    epoch 84 | iters 10 / 10 | loss 0.62
    epoch 85 | iters 10 / 10 | loss 0.61
    epoch 86 | iters 10 / 10 | loss 0.60
    epoch 87 | iters 10 / 10 | loss 0.60
    epoch 88 | iters 10 / 10 | loss 0.61
    epoch 89 | iters 10 / 10 | loss 0.59
    epoch 90 | iters 10 / 10 | loss 0.58
    epoch 91 | iters 10 / 10 | loss 0.56
    epoch 92 | iters 10 / 10 | loss 0.56
    epoch 93 | iters 10 / 10 | loss 0.54
    epoch 94 | iters 10 / 10 | loss 0.53
    epoch 95 | iters 10 / 10 | loss 0.53
    epoch 96 | iters 10 / 10 | loss 0.52
    epoch 97 | iters 10 / 10 | loss 0.51
    epoch 98 | iters 10 / 10 | loss 0.50
    epoch 99 | iters 10 / 10 | loss 0.48
    epoch 100 | iters 10 / 10 | loss 0.48
    epoch 101 | iters 10 / 10 | loss 0.46
    epoch 102 | iters 10 / 10 | loss 0.45
    epoch 103 | iters 10 / 10 | loss 0.45
    epoch 104 | iters 10 / 10 | loss 0.44
    epoch 105 | iters 10 / 10 | loss 0.44
    epoch 106 | iters 10 / 10 | loss 0.41
    epoch 107 | iters 10 / 10 | loss 0.40
    epoch 108 | iters 10 / 10 | loss 0.41
    epoch 109 | iters 10 / 10 | loss 0.40
    epoch 110 | iters 10 / 10 | loss 0.40
    epoch 111 | iters 10 / 10 | loss 0.38
    epoch 112 | iters 10 / 10 | loss 0.38
    epoch 113 | iters 10 / 10 | loss 0.36
    epoch 114 | iters 10 / 10 | loss 0.37
    epoch 115 | iters 10 / 10 | loss 0.35
    epoch 116 | iters 10 / 10 | loss 0.34
    epoch 117 | iters 10 / 10 | loss 0.34
    epoch 118 | iters 10 / 10 | loss 0.34
    epoch 119 | iters 10 / 10 | loss 0.33
    epoch 120 | iters 10 / 10 | loss 0.34
    epoch 121 | iters 10 / 10 | loss 0.32
    epoch 122 | iters 10 / 10 | loss 0.32
    epoch 123 | iters 10 / 10 | loss 0.31
    epoch 124 | iters 10 / 10 | loss 0.31
    epoch 125 | iters 10 / 10 | loss 0.30
    epoch 126 | iters 10 / 10 | loss 0.30
    epoch 127 | iters 10 / 10 | loss 0.28
    epoch 128 | iters 10 / 10 | loss 0.28
    epoch 129 | iters 10 / 10 | loss 0.28
    epoch 130 | iters 10 / 10 | loss 0.28
    epoch 131 | iters 10 / 10 | loss 0.27
    epoch 132 | iters 10 / 10 | loss 0.27
    epoch 133 | iters 10 / 10 | loss 0.27
    epoch 134 | iters 10 / 10 | loss 0.27
    epoch 135 | iters 10 / 10 | loss 0.27
    epoch 136 | iters 10 / 10 | loss 0.26
    epoch 137 | iters 10 / 10 | loss 0.26
    epoch 138 | iters 10 / 10 | loss 0.26
    epoch 139 | iters 10 / 10 | loss 0.25
    epoch 140 | iters 10 / 10 | loss 0.24
    epoch 141 | iters 10 / 10 | loss 0.24
    epoch 142 | iters 10 / 10 | loss 0.25
    epoch 143 | iters 10 / 10 | loss 0.24
    epoch 144 | iters 10 / 10 | loss 0.24
    epoch 145 | iters 10 / 10 | loss 0.23
    epoch 146 | iters 10 / 10 | loss 0.24
    epoch 147 | iters 10 / 10 | loss 0.23
    epoch 148 | iters 10 / 10 | loss 0.23
    epoch 149 | iters 10 / 10 | loss 0.22
    epoch 150 | iters 10 / 10 | loss 0.22
    epoch 151 | iters 10 / 10 | loss 0.22
    epoch 152 | iters 10 / 10 | loss 0.22
    epoch 153 | iters 10 / 10 | loss 0.22
    epoch 154 | iters 10 / 10 | loss 0.22
    epoch 155 | iters 10 / 10 | loss 0.22
    epoch 156 | iters 10 / 10 | loss 0.21
    epoch 157 | iters 10 / 10 | loss 0.21
    epoch 158 | iters 10 / 10 | loss 0.20
    epoch 159 | iters 10 / 10 | loss 0.21
    epoch 160 | iters 10 / 10 | loss 0.20
    epoch 161 | iters 10 / 10 | loss 0.20
    epoch 162 | iters 10 / 10 | loss 0.20
    epoch 163 | iters 10 / 10 | loss 0.21
    epoch 164 | iters 10 / 10 | loss 0.20
    epoch 165 | iters 10 / 10 | loss 0.20
    epoch 166 | iters 10 / 10 | loss 0.19
    epoch 167 | iters 10 / 10 | loss 0.19
    epoch 168 | iters 10 / 10 | loss 0.19
    epoch 169 | iters 10 / 10 | loss 0.19
    epoch 170 | iters 10 / 10 | loss 0.19
    epoch 171 | iters 10 / 10 | loss 0.19
    epoch 172 | iters 10 / 10 | loss 0.18
    epoch 173 | iters 10 / 10 | loss 0.18
    epoch 174 | iters 10 / 10 | loss 0.18
    epoch 175 | iters 10 / 10 | loss 0.18
    epoch 176 | iters 10 / 10 | loss 0.18
    epoch 177 | iters 10 / 10 | loss 0.18
    epoch 178 | iters 10 / 10 | loss 0.18
    epoch 179 | iters 10 / 10 | loss 0.17
    epoch 180 | iters 10 / 10 | loss 0.17
    epoch 181 | iters 10 / 10 | loss 0.18
    epoch 182 | iters 10 / 10 | loss 0.17
    epoch 183 | iters 10 / 10 | loss 0.18
    epoch 184 | iters 10 / 10 | loss 0.17
    epoch 185 | iters 10 / 10 | loss 0.17
    epoch 186 | iters 10 / 10 | loss 0.18
    epoch 187 | iters 10 / 10 | loss 0.17
    epoch 188 | iters 10 / 10 | loss 0.17
    epoch 189 | iters 10 / 10 | loss 0.17
    epoch 190 | iters 10 / 10 | loss 0.17
    epoch 191 | iters 10 / 10 | loss 0.16
    epoch 192 | iters 10 / 10 | loss 0.17
    epoch 193 | iters 10 / 10 | loss 0.16
    epoch 194 | iters 10 / 10 | loss 0.16
    epoch 195 | iters 10 / 10 | loss 0.16
    epoch 196 | iters 10 / 10 | loss 0.16
    epoch 197 | iters 10 / 10 | loss 0.16
    epoch 198 | iters 10 / 10 | loss 0.15
    epoch 199 | iters 10 / 10 | loss 0.16
    epoch 200 | iters 10 / 10 | loss 0.16
    epoch 201 | iters 10 / 10 | loss 0.15
    epoch 202 | iters 10 / 10 | loss 0.16
    epoch 203 | iters 10 / 10 | loss 0.16
    epoch 204 | iters 10 / 10 | loss 0.15
    epoch 205 | iters 10 / 10 | loss 0.16
    epoch 206 | iters 10 / 10 | loss 0.15
    epoch 207 | iters 10 / 10 | loss 0.15
    epoch 208 | iters 10 / 10 | loss 0.15
    epoch 209 | iters 10 / 10 | loss 0.15
    epoch 210 | iters 10 / 10 | loss 0.15
    epoch 211 | iters 10 / 10 | loss 0.15
    epoch 212 | iters 10 / 10 | loss 0.15
    epoch 213 | iters 10 / 10 | loss 0.15
    epoch 214 | iters 10 / 10 | loss 0.15
    epoch 215 | iters 10 / 10 | loss 0.15
    epoch 216 | iters 10 / 10 | loss 0.14
    epoch 217 | iters 10 / 10 | loss 0.14
    epoch 218 | iters 10 / 10 | loss 0.15
    epoch 219 | iters 10 / 10 | loss 0.14
    epoch 220 | iters 10 / 10 | loss 0.14
    epoch 221 | iters 10 / 10 | loss 0.14
    epoch 222 | iters 10 / 10 | loss 0.14
    epoch 223 | iters 10 / 10 | loss 0.14
    epoch 224 | iters 10 / 10 | loss 0.14
    epoch 225 | iters 10 / 10 | loss 0.14
    epoch 226 | iters 10 / 10 | loss 0.14
    epoch 227 | iters 10 / 10 | loss 0.14
    epoch 228 | iters 10 / 10 | loss 0.14
    epoch 229 | iters 10 / 10 | loss 0.13
    epoch 230 | iters 10 / 10 | loss 0.14
    epoch 231 | iters 10 / 10 | loss 0.13
    epoch 232 | iters 10 / 10 | loss 0.14
    epoch 233 | iters 10 / 10 | loss 0.13
    epoch 234 | iters 10 / 10 | loss 0.13
    epoch 235 | iters 10 / 10 | loss 0.13
    epoch 236 | iters 10 / 10 | loss 0.13
    epoch 237 | iters 10 / 10 | loss 0.14
    epoch 238 | iters 10 / 10 | loss 0.13
    epoch 239 | iters 10 / 10 | loss 0.13
    epoch 240 | iters 10 / 10 | loss 0.14
    epoch 241 | iters 10 / 10 | loss 0.13
    epoch 242 | iters 10 / 10 | loss 0.13
    epoch 243 | iters 10 / 10 | loss 0.13
    epoch 244 | iters 10 / 10 | loss 0.13
    epoch 245 | iters 10 / 10 | loss 0.13
    epoch 246 | iters 10 / 10 | loss 0.13
    epoch 247 | iters 10 / 10 | loss 0.13
    epoch 248 | iters 10 / 10 | loss 0.13
    epoch 249 | iters 10 / 10 | loss 0.13
    epoch 250 | iters 10 / 10 | loss 0.13
    epoch 251 | iters 10 / 10 | loss 0.13
    epoch 252 | iters 10 / 10 | loss 0.12
    epoch 253 | iters 10 / 10 | loss 0.12
    epoch 254 | iters 10 / 10 | loss 0.12
    epoch 255 | iters 10 / 10 | loss 0.12
    epoch 256 | iters 10 / 10 | loss 0.12
    epoch 257 | iters 10 / 10 | loss 0.12
    epoch 258 | iters 10 / 10 | loss 0.12
    epoch 259 | iters 10 / 10 | loss 0.13
    epoch 260 | iters 10 / 10 | loss 0.12
    epoch 261 | iters 10 / 10 | loss 0.13
    epoch 262 | iters 10 / 10 | loss 0.12
    epoch 263 | iters 10 / 10 | loss 0.12
    epoch 264 | iters 10 / 10 | loss 0.13
    epoch 265 | iters 10 / 10 | loss 0.12
    epoch 266 | iters 10 / 10 | loss 0.12
    epoch 267 | iters 10 / 10 | loss 0.12
    epoch 268 | iters 10 / 10 | loss 0.12
    epoch 269 | iters 10 / 10 | loss 0.11
    epoch 270 | iters 10 / 10 | loss 0.12
    epoch 271 | iters 10 / 10 | loss 0.12
    epoch 272 | iters 10 / 10 | loss 0.12
    epoch 273 | iters 10 / 10 | loss 0.12
    epoch 274 | iters 10 / 10 | loss 0.12
    epoch 275 | iters 10 / 10 | loss 0.11
    epoch 276 | iters 10 / 10 | loss 0.12
    epoch 277 | iters 10 / 10 | loss 0.12
    epoch 278 | iters 10 / 10 | loss 0.11
    epoch 279 | iters 10 / 10 | loss 0.11
    epoch 280 | iters 10 / 10 | loss 0.11
    epoch 281 | iters 10 / 10 | loss 0.11
    epoch 282 | iters 10 / 10 | loss 0.12
    epoch 283 | iters 10 / 10 | loss 0.11
    epoch 284 | iters 10 / 10 | loss 0.11
    epoch 285 | iters 10 / 10 | loss 0.11
    epoch 286 | iters 10 / 10 | loss 0.11
    epoch 287 | iters 10 / 10 | loss 0.11
    epoch 288 | iters 10 / 10 | loss 0.12
    epoch 289 | iters 10 / 10 | loss 0.11
    epoch 290 | iters 10 / 10 | loss 0.11
    epoch 291 | iters 10 / 10 | loss 0.11
    epoch 292 | iters 10 / 10 | loss 0.11
    epoch 293 | iters 10 / 10 | loss 0.11
    epoch 294 | iters 10 / 10 | loss 0.11
    epoch 295 | iters 10 / 10 | loss 0.12
    epoch 296 | iters 10 / 10 | loss 0.11
    epoch 297 | iters 10 / 10 | loss 0.12
    epoch 298 | iters 10 / 10 | loss 0.11
    epoch 299 | iters 10 / 10 | loss 0.11
    epoch 300 | iters 10 / 10 | loss 0.11
    done
    

    Plot the loss values

    %python3
    plt.figure(figsize=(6,4))
    plt.plot(loss_list)
    z.show(plt, fmt='svg')

    Plot the decision regions

    %python3
    plt.figure(figsize=(6,4))
    
    h = 0.001
    x_min, x_max = x[:, 0].min() - .1, x[:, 0].max() + .1
    y_min, y_max = x[:, 1].min() - .1, x[:, 1].max() + .1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    X = np.c_[xx.ravel(), yy.ravel()]  # every grid point as an (x, y) pair
    score = model.predict(X)
    predict_cls = np.argmax(score, axis=1)
    Z = predict_cls.reshape(xx.shape)
    plt.contourf(xx, yy, Z)
    plt.axis('off')
    
    # Data points
    x, t = spiral.load_data()
    N = 100
    CLS_NUM = 3
    markers = ['o', 'x', '^']
    for i in range(CLS_NUM):
      plt.scatter(x[i*N:(i+1)*N, 0], x[i*N:(i+1)*N, 1], s=40, marker=markers[i])
    
    z.show(plt, fmt='svg')