Deep Learning Practice Series Notes

Loss Functions
L1 loss: based on the absolute value of the difference between the model's predicted value and the label's actual value.
Squared error (L2 loss): mean squared error (MSE) is the average of the squared losses over all examples.
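As a quick illustration of both losses, a minimal NumPy sketch (the values are made up):

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])  # labels (illustrative values)
y_pred = np.array([1.1, 1.8, 3.3])  # model predictions (illustrative values)

l1_loss = np.mean(np.abs(y_true - y_pred))  # mean absolute error
mse = np.mean((y_true - y_pred) ** 2)       # mean squared error

print("L1: %.4f, MSE: %.4f" % (l1_loss, mse))  # L1: 0.2000, MSE: 0.0467
```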
Gradient Descent
The gradient is a vector: it has both a direction and a magnitude.
Gradient descent explores along the direction of the negative gradient.
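As a one-line sketch of the update rule, with learning rate $\eta$ (notation assumed, not from the original):

$$w \leftarrow w - \eta \, \nabla_w L(w)$$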
Hyperparameters
Hyperparameter: a parameter set before the learning process begins, rather than one obtained through training.
Typical hyperparameters: the learning rate, and the number of hidden layers in a neural network.
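As a minimal sketch, the hyperparameters used later in this note are simply constants fixed before training starts:

```python
# Hyperparameters: chosen before training, not learned from data
learning_rate = 0.05  # step size for gradient descent
train_epochs = 10     # number of passes over the training set
```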
Steps
- Prepare the data
- Build the model
- Train the model
- Make predictions
Generating an artificial dataset
```python
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

np.random.seed(5)

# y = 2x + 1 plus Gaussian noise
x_data = np.linspace(-1, 1, 100)
y_data = 2 * x_data + 1.0 + np.random.randn(*x_data.shape) * 0.4
```
numpy.random.randn(d0, d1, …, dn) returns one or more samples from the standard normal distribution.
Prefixing an argument with * or ** unpacks it; a single * unpacks a tuple into individual positional arguments.
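This is why `np.random.randn(*x_data.shape)` above works; a minimal sketch:

```python
import numpy as np

shape = (3, 2)               # a tuple describing the desired shape
a = np.random.randn(*shape)  # * unpacks the tuple: same as np.random.randn(3, 2)
print(a.shape)               # (3, 2)
```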
Plotting the data

```python
# Scatter the noisy samples and overlay the true line y = 2x + 1
plt.scatter(x_data, y_data)
plt.plot(x_data, 2 * x_data + 1.0, color='red', linewidth=3)
```
Building the model

```python
# Placeholders for features and labels (TensorFlow 1.x graph mode)
x = tf.placeholder("float", name="x")
y = tf.placeholder("float", name="y")

# Linear model: y = w * x + b
def model(x, w, b):
    return tf.multiply(x, w) + b

w = tf.Variable(1.0, name="w0")  # weight, initialized to 1.0
b = tf.Variable(0.0, name="b0")  # bias, initialized to 0.0

pred = model(x, w, b)
```
Training the model

```python
train_epochs = 10
learning_rate = 0.05

# MSE loss between labels and predictions
loss_function = tf.reduce_mean(tf.square(y - pred))

# Gradient descent optimizer that minimizes the loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for epoch in range(train_epochs):
    for xs, ys in zip(x_data, y_data):  # one sample per step (SGD)
        _, loss = sess.run([optimizer, loss_function], feed_dict={x: xs, y: ys})
    # Snapshot of the parameters at the end of each epoch
    b0temp = b.eval(session=sess)
    w0temp = w.eval(session=sess)

print("w:", sess.run(w))
print("b:", sess.run(b))

plt.scatter(x_data, y_data, label='Original data')
plt.plot(x_data, x_data * sess.run(w) + sess.run(b),
         label='Fitted line', color='r', linewidth=3)
plt.legend(loc=2)
```
Common loss functions: mean squared error (MSE) and cross-entropy.
To define the optimizer, initialize a GradientDescentOptimizer with the learning rate and set the optimization objective to minimizing the loss.
Running the training loop yields:

```
w: 1.9822965
b: 1.0420128
```
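As an aside on the cross-entropy loss mentioned above (it is not used in this example), a minimal NumPy sketch for a single classification example with illustrative values:

```python
import numpy as np

y_true = np.array([0.0, 1.0, 0.0])  # one-hot label
y_prob = np.array([0.2, 0.7, 0.1])  # predicted class probabilities

# Cross-entropy: -sum over classes of label * log(predicted probability)
cross_entropy = -np.sum(y_true * np.log(y_prob))
print("cross-entropy: %.4f" % cross_entropy)  # 0.3567
```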
Making predictions

```python
x_test = 3.21

predict = sess.run(pred, feed_dict={x: x_test})
print("Predicted value: %f" % predict)

target = 2 * x_test + 1.0
print("Target value: %f" % target)
```
Alternatively:

```python
x_test = 3.21
predict = sess.run(w) * x_test + sess.run(b)
print("Predicted value: %f" % predict)
```
```
Predicted value: 7.405184
Target value: 7.420000
```
Displaying the loss values

```python
step = 0
loss_list = []
display_step = 2  # print the loss every 2 steps

for epoch in range(train_epochs):
    for xs, ys in zip(x_data, y_data):
        _, loss = sess.run([optimizer, loss_function], feed_dict={x: xs, y: ys})
        loss_list.append(loss)
        step = step + 1
        if step % display_step == 0:
            print("Train Epoch:", "%02d" % (epoch + 1),
                  "Step: %03d" % (step), "loss=", "{:.9f}".format(loss))

plt.plot(loss_list, 'r+')
```
```
Train Epoch: 05 Step: 408 loss= 0.125508696
Train Epoch: 05 Step: 410 loss= 0.036273275
Train Epoch: 05 Step: 412 loss= 0.000716237
Train Epoch: 05 Step: 414 loss= 0.097748078
Train Epoch: 05 Step: 416 loss= 0.026035903
Train Epoch: 05 Step: 418 loss= 0.633028984
Train Epoch: 05 Step: 420 loss= 0.084138028
Train Epoch: 05 Step: 422 loss= 0.088319123
Train Epoch: 05 Step: 424 loss= 0.002654018
Train Epoch: 05 Step: 426 loss= 0.116265893
Train Epoch: 05 Step: 428 loss= 0.018808722
Train Epoch: 05 Step: 430 loss= 0.000472802
```
Inspecting outlier loss values

```python
[x for x in loss_list if x > 1]
```
This prints the outlier points:

```
[1.0133754, 1.2284044, 1.0088208, 1.2116321, 2.3539772, 2.3148305,
 1.3175836, 1.0387748, 1.5018207, 1.547514, 1.5514, 1.5517284,
 1.5517554, 1.551758, 1.551758, 1.551758, 1.551758]
```
Stochastic Gradient Descent
In gradient descent, the batch is the total number of examples used to compute the gradient in a single iteration. A batch can be quite large.
Stochastic gradient descent (SGD) uses only a single example per iteration (a batch size of 1). Stochastic means that the one example making up each batch is chosen at random.
Mini-batch stochastic gradient descent (mini-batch SGD) is a compromise between full-batch iteration and SGD; a mini-batch typically contains 10 to 1,000 randomly chosen examples. A sketch follows below.
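A minimal NumPy sketch of mini-batch SGD on the same linear model (the batch size and epoch count here are illustrative assumptions, not from the original):

```python
import numpy as np

np.random.seed(5)
x_data = np.linspace(-1, 1, 100)
y_data = 2 * x_data + 1.0 + np.random.randn(*x_data.shape) * 0.4

w, b = 1.0, 0.0
learning_rate = 0.05
batch_size = 10  # illustrative mini-batch size

for epoch in range(10):
    idx = np.random.permutation(len(x_data))  # reshuffle each epoch
    for start in range(0, len(x_data), batch_size):
        batch = idx[start:start + batch_size]
        xs, ys = x_data[batch], y_data[batch]
        pred = w * xs + b
        # Gradients of the MSE loss w.r.t. w and b, averaged over the batch
        grad_w = np.mean(2 * (pred - ys) * xs)
        grad_b = np.mean(2 * (pred - ys))
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print("w: %.4f, b: %.4f" % (w, b))  # should approach w ≈ 2, b ≈ 1
```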
Full code
https://github.com/hubojing/DeepLearningCode-TensorFlow/blob/master/Simple linear regression.py