ゼロから作る Deep Learning 2 / 2018.09.19

These are my reading notes for ゼロから作るDeep Learning 2 (the natural language processing volume). By the way, I still haven't read the first book…… :-|

Vectors and dot products

The big picture of neural network inference

  • Bias; a constant that is not influenced by the values of the neurons in the previous layer
  • The transformation by a fully connected layer is linear; the activation function is what adds a nonlinear effect (according to the book)
    • Linear
    • Nonlinear
      • functions that curve, for instance
      • or functions that behave like the ReLU function
      • http://rishida.hatenablog.com/entry/2014/02/25/110643

        The difference from the perceptron is that a continuous, nonlinear sigmoid function is used instead of a step function.
        Because the function is continuous, it is differentiable with respect to the parameters, which enables fast learning.
        Also, a neural network becomes a universal function approximator only when the nonlinear region of the sigmoid is used.

      • “ニューラルネットワークとパーセプトロン - Sideswipe”
        http://kazoo04.hatenablog.com/entry/agi-ac-15
  • sigmoid function; the exp(-x) in the denominator is apparently handy when differentiating 🤔 (see the sketch after this list)
  • Activation; the output of an activation function
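
To see the linear vs. nonlinear distinction concretely, here is a minimal sketch of my own (not from the book) comparing sigmoid and ReLU on a few points:

%python3
import numpy as np

# Sigmoid squashes any input smoothly into (0, 1).
def sigmoid(x):
  return 1 / (1 + np.exp(-x))

# ReLU is piecewise linear, but nonlinear as a whole.
def relu(x):
  return np.maximum(0, x)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(sigmoid(x))  # approx. [0.12 0.27 0.5 0.73 0.88]
print(relu(x))     # [0. 0. 0. 1. 2.]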

Terminology

Loss function

Derivatives and gradients

  • Gradient; the derivatives with respect to each element of a vector (or matrix or tensor; in mathematics, apparently only vectors count) gathered into one object. A toy numerical version follows below.
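
As a quick sanity check of the definition, a toy numerical gradient of my own (the helper name is made up; the book's repo has its own version):

%python3
import numpy as np

# Central-difference numerical gradient of f at x.
def numerical_gradient(f, x, h=1e-4):
  grad = np.zeros_like(x)
  for i in range(x.size):
    tmp = x[i]
    x[i] = tmp + h
    fxh1 = f(x)
    x[i] = tmp - h
    fxh2 = f(x)
    grad[i] = (fxh1 - fxh2) / (2 * h)
    x[i] = tmp
  return grad

# For L = x0**2 + x1**2, the gradient is (2*x0, 2*x1).
print(numerical_gradient(lambda v: np.sum(v ** 2), np.array([3.0, 4.0])))  # ~[6. 8.]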

Chain rule

  • Backpropagation
    • A neural network takes the form of multiple functions chained together
    • Using the chain rule (the rule that turns the derivative of a composite function into a product), gradients can be computed efficiently
    • If each function's local derivative is available, the derivative of the whole is obtained as a product (see the sketch after this list)
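
A toy illustration of my own with z = (x + y)**2, split into t = x + y and z = t**2:

%python3
x, y = 2.0, 3.0
t = x + y             # forward: t = 5.0
z = t ** 2            # forward: z = 25.0
dz_dt = 2 * t         # local derivative of the outer function
dt_dx = 1.0           # local derivative of the inner function
print(dz_dt * dt_dx)  # 10.0: dz/dx obtained as a product of local derivatives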

Python version

%sh
python3 --version
Python 3.5.2

Required modules

%sh
pip3 install numpy matplotlib seaborn
Collecting numpy
Collecting matplotlib
Collecting seaborn
Collecting pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 (from matplotlib)
Collecting python-dateutil>=2.1 (from matplotlib)
Collecting kiwisolver>=1.0.1 (from matplotlib)
Collecting cycler>=0.10 (from matplotlib)
Collecting pandas>=0.15.2 (from seaborn)
Collecting scipy>=0.14.0 (from seaborn)
Collecting six>=1.5 (from python-dateutil>=2.1->matplotlib)
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python3/dist-packages (from kiwisolver>=1.0.1->matplotlib)
Collecting pytz>=2011k (from pandas>=0.15.2->seaborn)
Installing collected packages: numpy, pyparsing, six, python-dateutil, kiwisolver, cycler, matplotlib, pytz, pandas, scipy, seaborn
Successfully installed cycler-0.10.0 kiwisolver-1.0.1 matplotlib-3.0.0 numpy-1.15.1 pandas-0.23.4 pyparsing-2.2.1 python-dateutil-2.7.3 pytz-2018.5 scipy-1.1.0 seaborn-0.9.0 six-1.11.0
(download URLs and pip upgrade notice elided)

Fetch the sample repository

%sh
git clone https://github.com/oreilly-japan/deep-learning-from-scratch-2
Cloning into 'deep-learning-from-scratch-2'...

Import numpy

%python3
import numpy as np

Implementing the Sigmoid layer

%python3
class Sigmoid:
  def __init__(self):
    self.params, self.grads = [], []  # no learnable parameters
    self.out = None

  def forward(self, x):
    out = 1 / (1 + np.exp(-x))
    self.out = out  # keep the output for use in backward
    return out

  def backward(self, dout):
    # sigmoid'(x) = y * (1 - y), written in terms of the stored output y
    dx = dout * (1.0 - self.out) * self.out
    return dx

The backward method multiplies the upstream gradient dout by the sigmoid's local derivative, which can be written as y * (1 - y) using the output y stored during forward.
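
To convince myself, a quick numerical check of my own (central difference against the analytic backward):

%python3
layer = Sigmoid()
x = np.array([[0.5, -1.0]])
layer.forward(x)
analytic = layer.backward(np.ones_like(x))

# Central-difference estimate of d(sigmoid)/dx at the same points.
h = 1e-4
numeric = (1 / (1 + np.exp(-(x + h))) - 1 / (1 + np.exp(-(x - h)))) / (2 * h)
print(np.max(np.abs(analytic - numeric)))  # ~1e-9, so the two agree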

Implementing the Affine layer

%python3
class Affine:
  def __init__(self, W, b):
    self.params = [W, b]
    self.grads = [np.zeros_like(W), np.zeros_like(b)]
    self.x = None

  def forward(self, x):
    W, b = self.params
    out = np.dot(x, W) + b  # affine transformation: xW + b
    self.x = x              # keep the input for use in backward
    return out

  def backward(self, dout):
    W, b = self.params
    dx = np.dot(dout, W.T)       # gradient w.r.t. the input
    dW = np.dot(self.x.T, dout)  # gradient w.r.t. the weights
    db = np.sum(dout, axis=0)    # gradient w.r.t. the bias

    # write in place ([...]) so references held by the optimizer stay valid
    self.grads[0][...] = dW
    self.grads[1][...] = db
    return dx

It multiplies the input by the weight matrix and adds the bias; np.dot on 2-D arrays is a matrix product rather than a plain dot product, so forward computes the affine transformation xW + b.
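
Shapes make this easier to follow; a sketch of my own with made-up sizes (batch of 4 samples, 3 inputs, 2 outputs):

%python3
W = np.random.randn(3, 2)
b = np.zeros(2)
affine = Affine(W, b)
out = affine.forward(np.random.randn(4, 3))
print(out.shape)                       # (4, 2)
dx = affine.backward(np.ones((4, 2)))
print(dx.shape)                        # (4, 3): gradient has the input's shape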

SGD (Stochastic Gradient Descent)

%python3
class SGD:
  def __init__(self, lr=0.01):
    self.lr = lr

  def update(self, params, grads):
    # step each parameter against its gradient: param -= lr * grad
    for i in range(len(params)):
      params[i] -= self.lr * grads[i]

Easier to picture than the formula. A one-step toy update follows.
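
A one-step toy update of my own to see it move:

%python3
params = [np.array([1.0, 2.0])]
grads = [np.array([0.5, 0.5])]
SGD(lr=0.1).update(params, grads)
print(params[0])  # [0.95 1.95]: each element moved by lr * grad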

Load the library

%python3
import sys
sys.path.append('./deep-learning-from-scratch-2')

Enable Seaborn

%python3
import seaborn as sns

sns.set()

Visualize the dataset we will use

%python3
from dataset import spiral
import matplotlib.pyplot as plt

x, t = spiral.load_data()
print('x', x.shape)
print('t', t.shape)

N = 100
CLS_NUM = 3
markers = ['o', 'x', '^']
plt.figure(figsize=(6,6))
for i in range(CLS_NUM):
  plt.scatter(x[i*N:(i+1)*N, 0], x[i*N:(i+1)*N, 1], s=40, marker=markers[i])

z.show(plt, fmt='svg')
x (300, 2)
t (300, 3)

(spiral scatter plot)

Implementing the SoftmaxWithLoss layer

%python3
def softmax(x):
    if x.ndim == 2:
        x = x - x.max(axis=1, keepdims=True)
        x = np.exp(x)
        x /= x.sum(axis=1, keepdims=True)
    elif x.ndim == 1:
        x = x - np.max(x)
        x = np.exp(x) / np.sum(np.exp(x))

    return x

def cross_entropy_error(y, t):
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)

    # if labels are one-hot, convert them to class indices
    if t.size == y.size:
        t = t.argmax(axis=1)

    batch_size = y.shape[0]

    # 1e-7 guards against log(0)
    return -np.sum(np.log(y[np.arange(batch_size), t] + 1e-7)) / batch_size
  
class SoftmaxWithLoss:
  def __init__(self):
    self.params, self.grads = [], []
    self.y = None
    self.t = None
    
  def forward(self, x, t):
    self.t = t
    self.y = softmax(x)
    
    if self.t.size == self.y.size:
      self.t = self.t.argmax(axis=1)
    
    loss = cross_entropy_error(self.y, self.t)
    return loss
  
  def backward(self, dout=1):
    batch_size = self.t.shape[0]

    # gradient of cross-entropy w.r.t. the softmax input: (y - t) / batch_size
    dx = self.y.copy()
    dx[np.arange(batch_size), self.t] -= 1
    dx *= dout
    dx = dx / batch_size

    return dx
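
A small sanity check of my own: the probabilities sum to 1, and backward returns (y - t) / batch_size:

%python3
layer = SoftmaxWithLoss()
x = np.array([[2.0, 1.0, 0.1]])
t = np.array([[1, 0, 0]])   # one-hot label for class 0
print(layer.forward(x, t))  # cross-entropy loss, ~0.42
print(layer.y.sum(axis=1))  # [1.]
print(layer.backward())     # y with 1 subtracted at the true class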

Building the neural network

%python3
class TwoLayerNet:
  def __init__(self, input_size, hidden_size, output_size):
    I, H, O = input_size, hidden_size, output_size
    
    # Initialize weights and biases
    W1 = 0.01 * np.random.randn(I, H)
    b1 = np.zeros(H)
    W2 = 0.01 * np.random.randn(H, O)
    b2 = np.zeros(O)
    
    # Create the layers
    self.layers = [
        Affine(W1, b1),
        Sigmoid(),
        Affine(W2, b2)
    ]
    self.loss_layer = SoftmaxWithLoss()
    
    # Gather all weights and gradients into lists
    self.params, self.grads = [], []
    for layer in self.layers:
      self.params += layer.params
      self.grads += layer.grads
  
  def predict(self, x):
    for layer in self.layers:
      x = layer.forward(x)
    return x
  
  def forward(self, x, t):
    score = self.predict(x)
    loss = self.loss_layer.forward(score, t)
    return loss
  
  def backward(self, dout=1):
    dout = self.loss_layer.backward(dout)
    for layer in reversed(self.layers):
      dout = layer.backward(dout)
    return dout
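
A quick shape-level smoke test of my own before training:

%python3
net = TwoLayerNet(input_size=2, hidden_size=4, output_size=3)
print(net.predict(np.random.randn(5, 2)).shape)  # (5, 3): 3 class scores per sample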

Run the training

%python3
# 1. Set hyperparameters
max_epoch = 300
batch_size = 30
hidden_size = 10
learning_rate = 1.0

# 2. Load the data
x, t = spiral.load_data()
model = TwoLayerNet(input_size=2, hidden_size=hidden_size, output_size=3)
optimizer = SGD(lr=learning_rate)

# Variables
data_size = len(x)
max_iters = data_size // batch_size
total_loss = 0
loss_count = 0
loss_list = []

for epoch in range(max_epoch):
  idx = np.random.permutation(data_size)
  x = x[idx]
  t = t[idx]
  
  for iters in range(max_iters):
    batch_x = x[iters*batch_size:(iters+1)*batch_size]
    batch_t = t[iters*batch_size:(iters+1)*batch_size]
    
    # Compute gradients and update parameters
    loss = model.forward(batch_x, batch_t)
    model.backward()
    optimizer.update(model.params, model.grads)
    
    total_loss += loss
    loss_count += 1
    
    # Check learning progress
    if (iters+1) % 10 == 0:
      avg_loss = total_loss / loss_count
      print('epoch %d | iters %d / %d | loss %.2f' % (epoch+1, iters + 1, max_iters, avg_loss))
      loss_list.append(avg_loss)
      total_loss, loss_count = 0, 0

print('done')
epoch 1 | iters 10 / 10 | loss 1.13
epoch 2 | iters 10 / 10 | loss 1.13
epoch 3 | iters 10 / 10 | loss 1.12
epoch 4 | iters 10 / 10 | loss 1.12
epoch 5 | iters 10 / 10 | loss 1.11
epoch 6 | iters 10 / 10 | loss 1.14
epoch 7 | iters 10 / 10 | loss 1.16
epoch 8 | iters 10 / 10 | loss 1.11
epoch 9 | iters 10 / 10 | loss 1.12
epoch 10 | iters 10 / 10 | loss 1.13
(epochs 11-297 elided; the average loss declines steadily from about 1.1 to 0.12)
epoch 298 | iters 10 / 10 | loss 0.11
epoch 299 | iters 10 / 10 | loss 0.11
epoch 300 | iters 10 / 10 | loss 0.11
done

Plot the loss values

%python3
plt.figure(figsize=(6,4))
plt.plot(loss_list)
z.show(plt, fmt='svg')

Plot the decision boundaries

%python3
plt.figure(figsize=(6,4))

h = 0.001
x_min, x_max = x[:, 0].min() - .1, x[:, 0].max()+.1
y_min, y_max = x[:, 1].min() - .1, x[:, 1].max()+.1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
X = np.c_[xx.ravel(), yy.ravel()]
score = model.predict(X)
predict_cls = np.argmax(score, axis=1)
Z = predict_cls.reshape(xx.shape)
plt.contourf(xx, yy, Z)
plt.axis('off')

# Data points
x, t = spiral.load_data()
N = 100
CLS_NUM = 3
markers = ['o', 'x', '^']
for i in range(CLS_NUM):
  plt.scatter(x[i*N:(i+1)*N, 0], x[i*N:(i+1)*N, 1], s=40, marker=markers[i])

z.show(plt, fmt='svg')

It seems to be predicting well :-)

Contour plot reference

“python で等高線を描くなら meshgrid して contour”
http://ailaby.com/contour/