
Tensorflow2.0

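If you already have a deep learning background, you can skip straight to class3.

class1

Introduction

The three schools of artificial intelligence

Behaviorism: based on cybernetics; builds perception-action control systems. (Cybernetics, e.g. adaptive control systems for balancing, walking, and obstacle avoidance.)
Symbolism: based on arithmetic and logic expressions; a problem is first described as an expression, and the expression is then solved. (Problems that can be described with formulas; implements rational thinking, e.g. expert systems.)
Connectionism: bionics; imitates the way neurons are connected. (Imitates the connections between neurons in the brain; implements intuitive thinking, e.g. neural networks.)

An example of behaviorism: to make a robot stand on one foot, it senses the direction in which it is about to fall and controls the movement of its two arms to keep its body balanced. This constitutes a perception-action control system.

Tensor creation

TensorFlow data types

Creating a tensor

The number of comma-separated values inside the shape parentheses is the number of dimensions of the tensor. For example, a shape with a single value, such as (2,), denotes a one-dimensional tensor.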

import tensorflow as tf

a=tf.constant([[3,6,4],[11,56,2]],dtype=tf.int64)
print(a)
print(a.shape)
print(a.dtype)
"""
tf.Tensor(
[[ 3  6  4]
 [11 56  2]], shape=(2, 3), dtype=int64)
(2, 3)
<dtype: 'int64'>
"""

Sometimes the data comes as a NumPy array
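As a minimal sketch of that case, a NumPy array can be turned into a tensor with tf.convert_to_tensor. The array contents here are made up for illustration.

```python
import numpy as np
import tensorflow as tf

a = np.arange(0, 5)                           # NumPy array [0 1 2 3 4]
b = tf.convert_to_tensor(a, dtype=tf.int64)   # same values, now a tf.Tensor
print(a)
print(b)
```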

Special tensors can also be created quickly with dedicated functions
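The functions meant here are presumably tf.zeros, tf.ones and tf.fill (an assumption, since the original illustration is not available). A short sketch with arbitrary shapes:

```python
import tensorflow as tf

a = tf.zeros([2, 3])      # 2x3 tensor filled with 0
b = tf.ones(4)            # 1-D tensor of four 1s
c = tf.fill([2, 2], 9)    # 2x2 tensor where every element is 9
print(a)
print(b)
print(c)
```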

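There are also several functions for generating random numbers

For example

Generating uniformly distributed random numbers (note that the interval is closed on the left and open on the right, [minval, maxval))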
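A small sketch of three random generators in tf.random. The shapes, mean, and stddev values are arbitrary choices for illustration.

```python
import tensorflow as tf

d = tf.random.normal([2, 2], mean=0.5, stddev=1)            # normal distribution
e = tf.random.truncated_normal([2, 2], mean=0.5, stddev=1)  # values beyond mean +/- 2*stddev are re-drawn
f = tf.random.uniform([2, 2], minval=0, maxval=1)           # uniform on [0, 1)
print(d)
print(e)
print(f)
```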

Common tf functions, part 1

Understanding axis

Example
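To make axis concrete, here is a sketch on a made-up 2x3 tensor: axis=0 reduces along the first dimension (down the rows, one result per column), axis=1 reduces along the second dimension (across the columns, one result per row), and omitting axis reduces over all elements.

```python
import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [2., 2., 3.]])
total_mean = tf.reduce_mean(x)        # mean over all six elements
row_sum = tf.reduce_sum(x, axis=1)    # one sum per row
col_sum = tf.reduce_sum(x, axis=0)    # one sum per column
print(total_mean)
print(row_sum)
print(col_sum)
```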

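Trainable variables

Math operations in TensorFlow

Element-wise arithmetic (add, subtract, multiply, divide)

Example

Square, power, and square root

Matrix multiplication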
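The operations listed above can be sketched as follows. The tensors are small made-up examples; element-wise operations need operands of matching shape, while tf.matmul follows the usual matrix shape rule.

```python
import tensorflow as tf

a = tf.ones([1, 3])
b = tf.fill([1, 3], 3.0)
print(tf.add(a, b), tf.subtract(a, b), tf.multiply(a, b), tf.divide(b, a))  # element-wise

c = tf.fill([1, 2], 3.0)
print(tf.square(c), tf.pow(c, 3), tf.sqrt(c))   # square, cube, square root

m = tf.ones([3, 2])
n = tf.fill([2, 3], 3.0)
product = tf.matmul(m, n)                       # (3, 2) x (2, 3) -> (3, 3)
print(product)
```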

Feeding data into TensorFlow

A concrete example
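A minimal sketch of pairing features with labels using tf.data.Dataset.from_tensor_slices, the same call the iris code later in this post relies on. The feature and label values are invented.

```python
import tensorflow as tf

features = tf.constant([12, 23, 10, 17])
labels = tf.constant([0, 1, 1, 0])
dataset = tf.data.Dataset.from_tensor_slices((features, labels))

# Each element of the dataset is one (feature, label) pair
pairs = [(int(f), int(l)) for f, l in dataset]
print(pairs)
```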

Common tf functions, part 2

TensorFlow provides a one-hot function

Example
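A short sketch of tf.one_hot on three made-up labels with three classes:

```python
import tensorflow as tf

labels = tf.constant([1, 0, 2])
one_hot = tf.one_hot(labels, depth=3)   # depth = number of classes
print(one_hot)
```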

Making the output follow a probability distribution

tf.nn.softmax
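A minimal sketch: tf.nn.softmax maps arbitrary scores to positive values that sum to 1. The input scores are arbitrary.

```python
import tensorflow as tf

y = tf.constant([1.01, 2.01, -0.66])
y_pro = tf.nn.softmax(y)
print(y_pro)                  # three probabilities
print(tf.reduce_sum(y_pro))   # they sum to 1
```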

Updating parameters with assign_sub
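A sketch of one gradient-descent step with assign_sub, under the assumption of a toy loss (w + 1)^2 and a learning rate of 0.2; both are chosen only for illustration. The variable must be a tf.Variable for assign_sub to work.

```python
import tensorflow as tf

w = tf.Variable(5.0)
lr = 0.2

with tf.GradientTape() as tape:
    loss = tf.square(w + 1)      # toy loss

grad = tape.gradient(loss, w)    # d(loss)/dw = 2 * (w + 1)
w.assign_sub(lr * grad)          # in place: w = w - lr * grad
print(grad.numpy(), w.numpy())
```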

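Iris classification

# -*- coding: UTF-8 -*-
# Use the iris dataset to implement forward and backward propagation and visualize the loss curve

# Import the required modules
import tensorflow as tf
from sklearn import datasets
from matplotlib import pyplot as plt
import numpy as np

# Load the data: input features and labels
x_data = datasets.load_iris().data
y_data = datasets.load_iris().target

# Shuffle the data (the raw data is ordered by class; leaving it unshuffled hurts accuracy)
# seed: an integer random seed; once set, the same random numbers are generated on every run (for teaching, so that every student gets the same result)
np.random.seed(116)  # use the same seed so that input features and labels stay paired
np.random.shuffle(x_data)
np.random.seed(116)
np.random.shuffle(y_data)
tf.random.set_seed(116)

# Split the shuffled dataset into a training set (first 120 rows) and a test set (last 30 rows)
x_train = x_data[:-30]
y_train = y_data[:-30]
x_test = x_data[-30:]
y_test = y_data[-30:]

# Convert the dtype of x; otherwise the matrix multiplication below fails because the dtypes do not match
x_train = tf.cast(x_train, tf.float32)
x_test = tf.cast(x_test, tf.float32)

# from_tensor_slices pairs each input feature row with its label; batch(32) then splits the dataset into batches of 32 samples
train_db = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

# Create the network parameters: 4 input features, so the input layer has 4 nodes; 3 classes, so the output layer has 3 neurons
# tf.Variable() marks the parameters as trainable
# A seed makes the random initial values identical on every run (for teaching, so everyone gets the same result; omit the seed in real use)
w1 = tf.Variable(tf.random.truncated_normal([4, 3], stddev=0.1, seed=1))
b1 = tf.Variable(tf.random.truncated_normal([3], stddev=0.1, seed=1))

lr = 0.1  # learning rate
train_loss_results = []  # loss of every epoch, used later to plot the loss curve
test_acc = []  # accuracy of every epoch, used later to plot the acc curve
epoch = 500  # train for 500 epochs
loss_all = 0  # each epoch has 4 steps; loss_all accumulates the 4 losses produced by those steps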

# Training
for epoch in range(epoch):  # dataset-level loop: each epoch goes through the dataset once
    for step, (x_train, y_train) in enumerate(train_db):  # batch-level loop: each step processes one batch
        with tf.GradientTape() as tape:  # the with block records what is needed to compute gradients
            y = tf.matmul(x_train, w1) + b1  # multiply-add of the network
            y = tf.nn.softmax(y)  # make the output y a probability distribution (same scale as the one-hot labels, so they can be subtracted to get the loss)
            y_ = tf.one_hot(y_train, depth=3)  # convert the labels to one-hot, convenient for computing loss and accuracy
            loss = tf.reduce_mean(tf.square(y_ - y))  # mean squared error loss: mse = mean((y_ - y)^2)
            loss_all += loss.numpy()  # accumulate the loss of each step so a per-epoch average can be computed
        # Compute the gradient of loss with respect to each parameter
        grads = tape.gradient(loss, [w1, b1])

        # Gradient update: w1 = w1 - lr * w1_grad    b = b - lr * b_grad
        w1.assign_sub(lr * grads[0])  # update w1 in place
        b1.assign_sub(lr * grads[1])  # update b1 in place

    # Print the loss at the end of every epoch
    print("Epoch {}, loss: {}".format(epoch, loss_all/4))
    train_loss_results.append(loss_all / 4)  # record the average of the 4 step losses
    loss_all = 0  # reset loss_all for the next epoch

    # Testing
    # total_correct is the number of correctly predicted samples, total_number the total number of test samples; both start at 0
    total_correct, total_number = 0, 0
    for x_test, y_test in test_db:
        # Predict with the updated parameters
        y = tf.matmul(x_test, w1) + b1
        y = tf.nn.softmax(y)
        pred = tf.argmax(y, axis=1)  # index of the largest value in y, i.e. the predicted class
        # Cast pred to the dtype of y_test
        pred = tf.cast(pred, dtype=y_test.dtype)
        # correct is 1 for a correct prediction and 0 otherwise; cast the bool result to int
        correct = tf.cast(tf.equal(pred, y_test), dtype=tf.int32)
        # Sum the correct predictions within the batch
        correct = tf.reduce_sum(correct)
        # Sum the correct predictions over all batches
        total_correct += int(correct)
        # total_number is the total number of test samples, i.e. the number of rows of x_test; shape[0] returns the number of rows
        total_number += x_test.shape[0]
    # Overall accuracy is total_correct / total_number
    acc = total_correct / total_number
    test_acc.append(acc)
    print("Test_acc:", acc)
    print("--------------------------")

# Plot the loss curve
plt.title('Loss Function Curve')  # figure title
plt.xlabel('Epoch')  # x-axis label
plt.ylabel('Loss')  # y-axis label
plt.plot(train_loss_results, label="$Loss$")  # plot the values of train_loss_results point by point and connect them; the legend entry is Loss
plt.legend()  # draw the legend
plt.show()  # show the figure

# Plot the accuracy curve
plt.title('Acc Curve')  # figure title
plt.xlabel('Epoch')  # x-axis label
plt.ylabel('Acc')  # y-axis label
plt.plot(test_acc, label="$Accuracy$")  # plot the values of test_acc point by point and connect them; the legend entry is Accuracy
plt.legend()
plt.show()

class2

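Prerequisites

Here x and y are both two-dimensional arrays. The number of rows is determined by the first argument 1:3:1, giving 2 rows, and the number of columns by the second argument 2:4:0.5, giving 4 columns. The layout rule is: x changes from row to row (along the column direction) following the first argument, and y changes from column to column (along the row direction) following the second argument.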
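The description above matches what np.mgrid returns for these two slices, so a minimal sketch (assuming np.mgrid is the function being discussed) is:

```python
import numpy as np

# First slice 1:3:1 -> values 1, 2 (2 rows); second slice 2:4:0.5 -> 2, 2.5, 3, 3.5 (4 columns)
x, y = np.mgrid[1:3:1, 2:4:0.5]
print(x)   # each row holds one value from the first slice
print(y)   # each row repeats the values of the second slice
```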

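Building a neural network with native TensorFlow code

class3

The standard recipe for building a network with Sequential

The standard recipe using the TensorFlow API tf.keras

sparse_categorical_accuracy is the usual choice of metric, because the labels are integer class indices and the output is a probability distribution.

loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

from_logits=False means the network's predictions have already been turned into a probability distribution (for example by softmax). If the network outputs its raw scores directly, use from_logits=True.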
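A small sketch of what the flag changes, using made-up labels and scores: feeding raw scores with from_logits=True should give the same loss as feeding softmax probabilities with from_logits=False, up to a tiny numerical difference.

```python
import tensorflow as tf

labels = tf.constant([1, 2])
logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 0.5, 3.0]])   # raw scores, no softmax applied
probs = tf.nn.softmax(logits)             # the same scores as probabilities

loss_from_probs = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)(labels, probs)
loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(labels, logits)
print(float(loss_from_probs), float(loss_from_logits))
```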

Only one of validation_data and validation_split is needed. Below is an example of iris classification built with keras.

import tensorflow as tf
from sklearn import datasets
import numpy as np


X = datasets.load_iris().data
Y = datasets.load_iris().target

np.random.seed(13)
np.random.shuffle(X)
np.random.seed(13)
np.random.shuffle(Y)
np.random.seed(13)

model = tf.keras.models.Sequential([tf.keras.layers.Dense(units=3, activation='softmax',
                                                          input_shape=(4,),kernel_regularizer=tf.keras.regularizers.l2())])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1), loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

model.fit(X,Y,epochs=500,batch_size=32,validation_split=0.2,validation_freq=20)

model.summary()

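You can design the hidden layers in the middle of the network yourself, as long as the final output layer has three neurons. Adding one more layer in the middle, as below, also works.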

import tensorflow as tf
from sklearn import datasets
import numpy as np


X = datasets.load_iris().data
Y = datasets.load_iris().target

np.random.seed(13)
np.random.shuffle(X)
np.random.seed(13)
np.random.shuffle(Y)
np.random.seed(13)

model = tf.keras.models.Sequential([tf.keras.layers.Dense(units=5, activation='relu',
                                                          input_shape=(4,),kernel_regularizer=tf.keras.regularizers.l2()),
                                    tf.keras.layers.Dense(units=3, activation='softmax',

                                                          kernel_regularizer=tf.keras.regularizers.l2())
                                    ])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1), loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

model.fit(X,Y,epochs=500,batch_size=32,validation_split=0.2,validation_freq=20)

model.summary()

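The standard recipe for building a network with a class

Sequential can build sequential networks in which the output of one layer is the input of the next, but it cannot express non-sequential structures such as skip connections. In that case a class such as MyModel can be used to encapsulate the network structure.

In self.d1, d1 is the name of this layer.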

import tensorflow as tf
import numpy as np
from sklearn import datasets
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Dropout

x_train=datasets.load_iris().data
y_train=datasets.load_iris().target

np.random.seed(33)
np.random.shuffle(x_train)
np.random.seed(33)
np.random.shuffle(y_train)
np.random.seed(33)


class IrisModel(Model):
    def __init__(self):
        super(IrisModel, self).__init__()
        self.d1=tf.keras.layers.Dense(3, activation='softmax',)

    def call(self,x):
        y=self.d1(x)
        return y


model=IrisModel()
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])
model.fit(x_train,y_train,batch_size=32,epochs=500,validation_split=0.3,validation_freq=50)
model.summary()
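The IrisModel above is still purely sequential. As a hedged sketch of the skip connection that Sequential cannot express, the hypothetical class below (SkipModel is an illustrative name, not part of the original notes) adds the input back onto a hidden layer's output inside call.

```python
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense

class SkipModel(Model):  # hypothetical name, for illustration only
    def __init__(self):
        super(SkipModel, self).__init__()
        self.d1 = Dense(4, activation='relu')     # same width as the input so the two can be added
        self.d2 = Dense(3, activation='softmax')

    def call(self, x):
        h = self.d1(x)
        h = h + x            # skip connection: the input bypasses d1 and is added to its output
        return self.d2(h)

model = SkipModel()
out = model(tf.ones([2, 4]))  # two dummy samples with 4 features each
print(out.shape)
```

The design point is that call is ordinary Python, so any data flow can be written there, not just a straight chain of layers.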

The MNIST dataset

Visualizing the data

import tensorflow as tf
from matplotlib import pyplot as plt

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

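# Visualize the first element of the training input features
plt.imshow(x_train[0], cmap='gray')  # draw as a grayscale image
plt.show()

# Print the first element of the training input features
print("x_train[0]:\n", x_train[0])
# Print the first element of the training labels
print("y_train[0]:\n", y_train[0])

# Print the shape of the whole training input features
print("x_train.shape:\n", x_train.shape)
# Print the shape of the whole training labels
print("y_train.shape:\n", y_train.shape)
# Print the shape of the whole test input features
print("x_test.shape:\n", x_test.shape)
# Print the shape of the whole test labels
print("y_test.shape:\n", y_test.shape)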

Training the model

import tensorflow as tf
from tensorflow.keras import datasets, layers, models


(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model=tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

model.fit(x_train,y_train,epochs=5, batch_size=128, validation_data=(x_test,y_test),
          validation_freq=1)

model.summary()

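Because the activation of the last layer is softmax, the output is already a probability distribution, so the loss uses from_logits=False. If the last layer does not apply softmax and outputs unnormalized scores instead, use from_logits=True. The version below uses relu on the last layer together with from_logits=True; this runs, but relu clips negative scores to zero, so an output layer without an activation is the more common choice.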

import tensorflow as tf
from tensorflow.keras import datasets, layers, models


(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model=tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='relu')
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['sparse_categorical_accuracy'])

model.fit(x_train,y_train,epochs=5, batch_size=128, validation_data=(x_test,y_test),
          validation_freq=1)

model.summary()

Implementation with a class

import tensorflow as tf
from tensorflow.keras import layers,datasets,Model

(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0

class MinistModel(Model):
    def __init__(self):
        super(MinistModel, self).__init__()
        self.fc1 = layers.Flatten()
        self.fc2 = layers.Dense(128, activation='relu')
        self.fc3 = layers.Dense(10, activation='softmax')

    def call(self,x):
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        return x

model = MinistModel()
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=128,validation_data=(x_test, y_test),validation_freq=1)
model.summary()

The Fashion MNIST dataset

Implementation with Sequential

import tensorflow as tf
from tensorflow.keras import datasets, layers
from tensorflow.keras import Model

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0

model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128,activation='relu'),
    layers.Dense(10, activation='relu')
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
              , metrics=['sparse_categorical_accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_data=(x_test, y_test),validation_freq=1)
model.summary()

Implementation with a class

import tensorflow as tf
from tensorflow.keras import layers, Model

class FashionMNIST(Model):
    def __init__(self):
        super(FashionMNIST, self).__init__()
        self.layer1 = layers.Flatten()
        self.layer2 = layers.Dense(128, activation='relu')
        self.layer3 = layers.Dense(10, activation='softmax')

    def call(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        return x



(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0

model = FashionMNIST()
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),  # layer3 already applies softmax
              metrics=['sparse_categorical_accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=5, validation_data=(x_test, y_test),validation_freq=1)
model.summary()

class4

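Building your own dataset

Dataset download extraction code: mocm

All the training so far used datasets that ship with tensorflow. Their features are well behaved, so it is easy to get good results. What should you do if you want to train on your own dataset?

This class extends the earlier code to cover that case.

Recall how the data was loaded in the earlier class3.3 code.

First, write a function generateds that reads the data.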

import numpy as np
import tensorflow as tf
import os
from PIL import Image

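def generateds(path,txt):
    # Read all lines of the label file; the with block closes it again (otherwise: Too many open files)
    with open(txt,'r') as f:
        contexts=f.readlines()
    x,y_=[],[]
    for context in contexts:
        values=context.split()  # values[0] is used as the image file name, values[1] as the label
        img_path=os.path.join(path,values[0])  # portable replacement for joining with '\\'
        img=Image.open(img_path)
        img = np.array(img.convert('L'))   # do not forget to convert the image to 8-bit grayscale
        img = img / 255.
        x.append(img)
        y_.append(values[1])
        # print(type(values[1]))
        print('loading : ' + context)

    x=np.array(x)
    y_=np.array(y_)
    y_=y_.astype(np.int64)     # convert the labels from str to int
    return x,y_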

if __name__=='__main__':
    path=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_train_jpg_60000'
    txt=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_train_jpg_60000.txt'
    x,y=generateds(path,txt)
    print(x.shape)
    print(y.shape)

Then add the training steps from before.

import numpy as np
import tensorflow as tf
import os
from PIL import Image
from tensorflow import keras

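def generateds(path,txt):
    # Read all lines of the label file; the with block closes it again (otherwise: Too many open files)
    with open(txt,'r') as f:
        contexts=f.readlines()
    x,y_=[],[]
    for context in contexts:
        values=context.split()  # values[0] is used as the image file name, values[1] as the label
        img_path=os.path.join(path,values[0])  # portable replacement for joining with '\\'
        img=Image.open(img_path)
        img = np.array(img.convert('L'))   # do not forget to convert the image to 8-bit grayscale
        img = img / 255.
        x.append(img)
        y_.append(values[1])
        # print(type(values[1]))
        print('loading : ' + context)

    x=np.array(x)
    y_=np.array(y_)
    y_=y_.astype(np.int64)     # convert the labels from str to int
    return x,y_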

if __name__=='__main__':
    train_path=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_train_jpg_60000'
    train_label=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_train_jpg_60000.txt'
    train_save_path=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_x_train.npy'
    train_label_save_path=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_y_train.npy'

    test_path=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_test_jpg_10000'
    test_label=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_test_jpg_10000.txt'
    test_save_path=r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_x_test.npy'
    test_label_save_path = r'D:\code\python\TF2.0\class4\MNIST_FC\mnist_image_label\mnist_y_test.npy'

    if os.path.exists(train_save_path) and os.path.exists(train_label_save_path) and os.path.exists(
            test_save_path) and os.path.exists(test_label_save_path):
        print('-------------Load Datasets-----------------')
        x_train=np.load(train_save_path)
        print(x_train.shape)
        y_train=np.load(train_label_save_path)
        print(y_train.shape)
        x_test=np.load(test_save_path)
        print(x_test.shape)
        y_test=np.load(test_label_save_path)
        print(y_test.shape)
    else:
        print('-------------Generate Datasets-----------------')
        x_train,y_train=generateds(train_path,train_label)
        x_test,y_test=generateds(test_path,test_label)
        print('-------------Save Datasets-----------------')

        np.save(train_save_path,x_train)
        np.save(train_label_save_path,y_train)
        np.save(test_save_path,x_test)
        np.save(test_label_save_path,y_test)


    model=tf.keras.models.Sequential([
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=['sparse_categorical_accuracy'])
    model.fit(x_train,y_train,epochs=5,validation_data=(x_test,y_test),batch_size=32,validation_freq=1)
    model.summary()

Data augmentation

import tensorflow as tf
from tensorflow.keras import datasets, layers
from tensorflow.keras import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator


image_gen_train=ImageDataGenerator(
                        rescale =1./1.,
                        rotation_range = 45,
                        width_shift_range =.15 ,
                        height_shift_range =.15,
                        horizontal_flip =False,
                        zoom_range =0.5 )

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0
x_train = x_train.reshape(x_train.shape[0],28,28,1)
image_gen_train.fit(x_train)

model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128,activation='relu'),
    layers.Dense(10, activation='relu')
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
               ,metrics=['sparse_categorical_accuracy'])
model.fit(image_gen_train.flow(x_train,y_train,batch_size=32), epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1)
model.summary()

Data augmentation can improve a model's generalization on small datasets.

Resuming training from a checkpoint

import tensorflow as tf
from tensorflow.keras import datasets, layers
from tensorflow.keras import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0
x_train = x_train.reshape(x_train.shape[0],28,28,1)


model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128,activation='relu'),
    layers.Dense(10, activation='relu')
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
               ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/mnist.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

You will see that the previously saved model is loaded and that training continues from it.

Extracting parameters

import tensorflow as tf
from tensorflow.keras import datasets, layers
from tensorflow.keras import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
np.set_printoptions(threshold=np.inf)


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0
x_train = x_train.reshape(x_train.shape[0],28,28,1)


model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128,activation='relu'),
    layers.Dense(10, activation='relu')
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
               ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/mnist.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()
print(model.trainable_variables)
file=open('weights.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()

Visualizing acc and loss

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import datasets, layers
from tensorflow.keras import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
np.set_printoptions(threshold=np.inf)


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test/255.0
x_train = x_train.reshape(x_train.shape[0],28,28,1)


model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(128,activation='relu'),
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
               ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/mnist.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()
print(model.trainable_variables)
file=open('weights.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()
# Plotting
acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(acc,label='train acc')
plt.plot(val_acc,label='validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.subplot(1,2,2)
plt.plot(loss,label='train loss')
plt.plot(val_loss,label='val loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

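Application: recognizing what is in an image

Program

img_arr=img_arr/255.0

Preprocessing is needed because the training data shows white digits on a black background, whereas the images to be predicted show black digits on a white background. The input data must be processed so that it matches the characteristics of the training data.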

for i in range(28):
    for j in range(28):
        if img_arr[i][j] < 200:
            img_arr[i][j] = 255
        else:
            img_arr[i][j] = 0

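The code above turns the input image into a high-contrast image containing only pure black and pure white: pixels with a value below 200 become 255 (white) and all others become 0 (black), which also inverts the colors to match the training data.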
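The same thresholding can be written without the double loop. This is a sketch using np.where on a made-up 2x2 array standing in for the 28x28 image; the threshold 200 is the one from the loop above.

```python
import numpy as np

# Toy stand-in for img_arr: bright background pixels and dark stroke pixels
img_arr = np.array([[250, 30],
                    [199, 200]], dtype=np.uint8)

# Pixels below 200 become 255 (white stroke), everything else becomes 0 (black background)
binary = np.where(img_arr < 200, 255, 0).astype(np.uint8)
print(binary)
```

The vectorized form produces the same values as the nested loop and avoids 784 Python-level iterations per image.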

x_predict = img_arr[tf.newaxis, ...]

Because a neural network takes its input one batch at a time, an extra (batch) dimension has to be added in front.

import os

from PIL import Image
import numpy as np
import tensorflow as tf
from matplotlib import pyplot as plt

model_path=r'D:\code\python\TF2.0\checkpoints\mnist.ckpt'
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')])

model.load_weights(model_path)


x_test_path=r'D:\code\python\TF2.0\class4\MNIST_FC\test'
path=os.listdir(x_test_path)
for img_path in path:
    img_path_=os.path.join(x_test_path,img_path)
    img = Image.open(img_path_)

    img=img.resize((28,28),Image.LANCZOS)  # Image.ANTIALIAS was an alias of LANCZOS and was removed in Pillow 10
    img_arr=np.array(img.convert('L'))


    for i in range(28):
        for j in range(28):
            if img_arr[i][j] < 200:
                img_arr[i][j] = 255
            else:
                img_arr[i][j] = 0

    img_arr = img_arr / 255.0
    x_pred=img_arr[tf.newaxis,...]
    prediction=model.predict(x_pred)

    pre=tf.argmax(prediction,axis=1)
    print('\n')
    tf.print(pre)

class5

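How convolution is computed

Real images, however, usually have three channels, so there are even more parameters.

First use convolution (conv) to extract features, then apply fully connected layers (some YOLO versions even drop the fully connected layers entirely and use only conv).

The number of channels of the convolution kernel must match the number of channels of the input feature map.

When the input has three channels

The detailed calculation

For a three-channel input feature map

The animation below makes this easier to understand.

convSobel

Receptive field

What is the difference between passing through two 3×3 convolutions and passing through one 5×5 convolution? Both give the same 5×5 receptive field, but the parameter counts and the amounts of computation differ. When x, the side length of the input image, is greater than 10, two stacked 3×3 convolutions need less computation than one 5×5 convolution, so they are the better choice.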
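The threshold of 10 can be checked by counting multiply-adds. The sketch below assumes a single-channel x by x input, stride 1, and no padding; the function names are illustrative.

```python
def cost_one_5x5(x):
    # Output is (x - 4) x (x - 4); each output value needs 25 multiply-adds
    return 25 * (x - 4) ** 2

def cost_two_3x3(x):
    # First layer: (x - 2)^2 outputs, second layer: (x - 4)^2 outputs, 9 multiply-adds each
    return 9 * (x - 2) ** 2 + 9 * (x - 4) ** 2

for x in (8, 10, 12, 32):
    print(x, cost_one_5x5(x), cost_two_3x3(x))
```

Under these assumptions the two costs are equal at x = 10, and the pair of 3×3 kernels is cheaper for any larger input. The parameter counts are 25 for one 5×5 kernel versus 18 for two 3×3 kernels.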

Zero padding

If you want the feature map to keep its size after convolution, pad the input feature map with zeros.

Formula
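As a sketch of the output-size rule TensorFlow uses for its two padding modes (the helper name conv_output_size is hypothetical): with 'same' the output side is ceil(input / stride), and with 'valid' it is ceil((input - kernel + 1) / stride).

```python
import math

def conv_output_size(in_size, kernel_size, stride, padding):
    """Side length of the output feature map for TensorFlow-style padding."""
    if padding == 'same':     # zero padding: size depends only on the stride
        return math.ceil(in_size / stride)
    if padding == 'valid':    # no padding: the kernel must fit entirely inside the input
        return math.ceil((in_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")

print(conv_output_size(5, 3, 1, 'same'))    # 5
print(conv_output_size(5, 3, 1, 'valid'))   # 3
print(conv_output_size(32, 5, 1, 'valid'))  # 28
```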

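Describing a convolutional layer in TF

Batch normalization

Neural networks fit zero-mean data better, but as the number of layers grows the data drifts away from zero mean; standardization pulls it back. It is usually placed between the convolution and the activation function.

However, with only this simple standardization, the data all falls in the nearly linear middle section of the sigmoid function and the nonlinearity is lost. Two trainable parameters (a scale and a shift) are therefore introduced to preserve the network's nonlinear expressive power.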
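The two steps, standardize and then scale and shift, can be sketched in NumPy. The batch here is random toy data with one row per sample; gamma and beta are the customary names for the two trainable parameters, fixed at their usual initial values 1 and 0, and eps is a small constant assumed for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4))   # toy batch: 8 samples, 4 features

eps = 1e-5
mean = x.mean(axis=0)                       # per-feature mean over the batch
var = x.var(axis=0)                         # per-feature variance over the batch
x_hat = (x - mean) / np.sqrt(var + eps)     # standardized: zero mean, unit variance

gamma = np.ones(4)                          # scale, trainable in a real network
beta = np.zeros(4)                          # shift, trainable in a real network
y = gamma * x_hat + beta
print(y.mean(axis=0), y.std(axis=0))
```

In a convolutional layer the statistics are computed per channel rather than per feature, but the arithmetic is the same.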

In TensorFlow this is the BatchNormalization layer (tf.keras.layers.BatchNormalization).

Pooling

Pooling functions in TF
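A short sketch of the two Keras pooling layers on a made-up 4x4 single-channel input, assuming the usual 2x2 window with stride 2:

```python
import tensorflow as tf

# Values 0..15 laid out as a 4x4 image; shape is (batch, height, width, channels)
x = tf.reshape(tf.range(16, dtype=tf.float32), [1, 4, 4, 1])

max_pool = tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
avg_pool = tf.keras.layers.AveragePooling2D(pool_size=(2, 2), strides=2, padding='same')

y_max = tf.squeeze(max_pool(x))   # largest value of each 2x2 window
y_avg = tf.squeeze(avg_pool(x))   # mean value of each 2x2 window
print(y_max)
print(y_avg)
```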

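dropout

During training, in each iteration and for a given layer, some of the neurons are chosen at random and temporarily dropped, and then the training and optimization step for that iteration is carried out. In the next iteration another random set of neurons is hidden, and so on until training ends. Because the neurons are dropped at random, every mini-batch trains a different network.

The input x is propagated forward through the modified network, and the resulting loss is propagated backward through the same modified network. After a mini-batch of training samples (the batch_size is set by you) has gone through this process, the parameters (w, b) of the neurons that were not dropped are updated by stochastic gradient descent.
Then repeat the following:
1. Restore the dropped neurons (the dropped neurons keep their previous values, while those that were not dropped have been updated); this is why every mini-batch trains a different network.
2. Randomly choose a subset of half the hidden neurons and temporarily remove them (keeping a backup of their parameters).
3. For a mini-batch of training samples, run forward propagation, backpropagate the loss, and update the parameters (w, b) by stochastic gradient descent (only the parameters of the neurons that were not removed are updated; the removed neurons keep the values they had before removal).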
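The masking step of the procedure above can be sketched in NumPy. Two details here are assumptions not stated in the text: the mask is drawn independently for every activation, and the kept activations are rescaled by 1 / (1 - rate) ("inverted dropout", which is how the Keras Dropout layer behaves) so that the expected output stays the same.

```python
import numpy as np

rng = np.random.default_rng(0)
activations = np.ones((4, 6))      # toy hidden-layer outputs: 4 samples, 6 neurons
rate = 0.5                         # fraction of activations to drop

mask = rng.random(activations.shape) >= rate      # True = keep; a fresh mask is drawn for every mini-batch
dropped = activations * mask / (1.0 - rate)       # zero the dropped units, rescale the kept ones
print(mask.astype(int))
print(dropped)
```

At inference time no mask is applied; because of the rescaling during training, the activations need no further correction.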

Convolutional neural networks

The Cifar10 dataset

Visualizing some samples

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(threshold=np.inf)

data_cifar10=tf.keras.datasets.cifar10.load_data()
(x_train, y_train), (x_test, y_test) =data_cifar10

print(x_train.shape)
print(y_train.shape)
plt.imshow(x_train[0])
plt.show()

Example of building a convolutional neural network

One convolutional layer and two fully connected layers

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout,MaxPooling2D,Flatten,Conv2D,BatchNormalization,Activation
from tensorflow.keras import Model
import os
import numpy as np
# np.set_printoptions(threshold=np.inf)


class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.conv1 = Conv2D(6, (5,5), padding='same')
        self.bn1 = BatchNormalization()
        self.a1= Activation('relu')
        self.pool1 = MaxPooling2D(pool_size=(2,2),strides=2,padding='same')
        self.d1=Dropout(0.2)

        self.flatten1 = Flatten()
        self.f1=Dense(128,activation='relu')
        self.d2=Dropout(0.2)
        self.f2=Dense(10,activation='softmax')

    def call(self,x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.a1(x)
        x = self.pool1(x)
        x = self.d1(x)

        x = self.flatten1(x)
        x = self.f1(x)
        x =self.d2(x)
        y= self.f2(x)
        return y


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train,x_test = x_train/255.0,x_test/255.0


model = Baseline()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="baseline.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True, verbose=1)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

file=open('weights.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()

train_acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(loss,label='train_loss')
plt.plot(val_loss,label='val_loss')
plt.title('model loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_acc,label='train_acc')
plt.plot(val_acc,label='val_acc')
plt.title('model acc')
plt.legend()
plt.show()

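If you are following Professor Cao's course, you may run into a problem here: the accuracy stays at 0.1, because 30-series and 40-series graphics cards no longer work with the old software versions. See this article for the fix.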

LeNet

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout,MaxPooling2D,Flatten,Conv2D,BatchNormalization,Activation
from tensorflow.keras import Model
import os
import numpy as np

# np.set_printoptions(threshold=np.inf)


class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.conv1 = Conv2D(6, (5,5), activation='sigmoid')
        self.pool1 = MaxPooling2D(pool_size=(2,2),strides=2)
        self.conv2 = Conv2D(16, (5,5), activation='sigmoid')
        self.pool2 = MaxPooling2D(pool_size=(2,2),strides=2)

        self.flatten1 = Flatten()
        self.f1=Dense(120,activation='sigmoid')
        self.f2=Dense(84,activation='sigmoid')
        self.f3=Dense(10,activation='softmax')

    def call(self,x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)

        x = self.flatten1(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train,x_test = x_train/255.0,x_test/255.0


model = Baseline()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/lenet.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True, verbose=1)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

file=open('./checkpoints/weights_lenet.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()

train_acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(loss,label='train_loss')
plt.plot(val_loss,label='val_loss')
plt.title('model loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_acc,label='train_acc')
plt.plot(val_acc,label='val_acc')
plt.title('model acc')
plt.legend()
plt.show()

AlexNet

8 layers in total

Network architecture

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout,MaxPooling2D,Flatten,Conv2D,BatchNormalization,Activation,Flatten
from tensorflow.keras import Model
import os
import numpy as np

# np.set_printoptions(threshold=np.inf)


class AlexNet(Model):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv1 = Conv2D(96, kernel_size=(3,3), padding='valid')
        self.bn1 = BatchNormalization()
        self.a1= Activation('relu')
        self.pool1 = MaxPooling2D(pool_size=(3,3),strides=2)

        self.conv2 = Conv2D(256, (3,3), padding='valid')
        self.bn2 = BatchNormalization()
        self.a2= Activation('relu')
        self.pool2 = MaxPooling2D(pool_size=(3,3),strides=2)

        self.conv3 = Conv2D(384, (3,3), padding='same',activation='relu')

        self.conv4 = Conv2D(384, (3,3), padding='same',activation='relu')

        self.conv5 = Conv2D(256, (3,3), padding='same',activation='relu')
        self.pool5 = MaxPooling2D(pool_size=(3,3),strides=2)

        self.flatten1 = Flatten()
        self.f1=Dense(2048,activation='relu')
        self.d1=Dropout(0.5)
        self.f2=Dense(2048,activation='relu')
        self.d2=Dropout(0.5)
        self.f3=Dense(10,activation='softmax')

    def call(self,x):
        x=self.conv1(x)
        x=self.bn1(x)
        x=self.a1(x)
        x=self.pool1(x)

        x=self.conv2(x)
        x=self.bn2(x)
        x=self.a2(x)
        x=self.pool2(x)

        x=self.conv3(x)

        x=self.conv4(x)

        x=self.conv5(x)
        x=self.pool5(x)

        x=self.flatten1(x)
        x=self.f1(x)
        x=self.d1(x)
        x=self.f2(x)
        x=self.d2(x)
        y=self.f3(x)
        return y


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train,x_test = x_train/255.0,x_test/255.0


model = AlexNet()
model.compile(optimizer='Adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/Alex.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True, verbose=1)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

file=open('./checkpoints/weights_alex.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()

train_acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(loss,label='train_loss')
plt.plot(val_loss,label='val_loss')
plt.title('model loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_acc,label='train_acc')
plt.plot(val_acc,label='val_acc')
plt.title('model acc')
plt.legend()
plt.show()

VGGNet

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout,MaxPooling2D,Flatten,Conv2D,BatchNormalization,Activation
from tensorflow.keras import Model
import os
import numpy as np

# np.set_printoptions(threshold=np.inf)


class VGG16(Model):
    def __init__(self):
        super(VGG16, self).__init__()
        self.conv1 = Conv2D(64, (3, 3),padding='same')
        self.bn1 = BatchNormalization()
        self.a1= Activation('relu')

        self.conv2 = Conv2D(64, (3, 3),padding='same')
        self.bn2 = BatchNormalization()
        self.a2= Activation('relu')
        self.pool2 = MaxPooling2D((2, 2), strides=2,padding='same')
        self.d2=Dropout(0.2)

        self.conv3 = Conv2D(128, (3, 3),padding='same')
        self.bn3= BatchNormalization()
        self.a3= Activation('relu')

        self.conv4 = Conv2D(128, (3, 3),padding='same')
        self.bn4=BatchNormalization()
        self.a4= Activation('relu')
        self.p4= MaxPooling2D((2, 2), strides=2,padding='same')
        self.d4= Dropout(0.2)

        self.conv5 = Conv2D(256, (3, 3),padding='same')
        self.bn5=BatchNormalization()
        self.a5= Activation('relu')

        self.conv6= Conv2D(256, (3, 3),padding='same')
        self.bn6= BatchNormalization()
        self.a6= Activation('relu')

        self.conv7= Conv2D(256, (3, 3),padding='same')
        self.bn7= BatchNormalization()
        self.a7= Activation('relu')
        self.p7= MaxPooling2D((2, 2), strides=2,padding='same')
        self.d7= Dropout(0.2)

        self.conv8= Conv2D(512, (3, 3),padding='same')
        self.bn8=BatchNormalization()
        self.a8= Activation('relu')

        self.conv9= Conv2D(512, (3, 3),padding='same')
        self.bn9= BatchNormalization()
        self.a9= Activation('relu')

        self.conv10= Conv2D(512, (3, 3),padding='same')
        self.bn10= BatchNormalization()
        self.a10= Activation('relu')
        self.p10= MaxPooling2D((2, 2), strides=2,padding='same')
        self.d10= Dropout(0.2)

        self.conv11= Conv2D(512, (3, 3),padding='same')
        self.bn11= BatchNormalization()
        self.a11= Activation('relu')

        self.conv12= Conv2D(512, (3, 3),padding='same')
        self.bn12= BatchNormalization()
        self.a12= Activation('relu')

        self.conv13= Conv2D(512, (3, 3),padding='same')
        self.bn13= BatchNormalization()
        self.a13= Activation('relu')
        self.p13= MaxPooling2D((2, 2), strides=2,padding='same')
        self.d13= Dropout(0.2)

        self.flatten= Flatten()
        self.fc14= Dense(512,activation='relu')
        self.d14=Dropout(0.2)
        self.fc15= Dense(512,activation='relu')
        self.d15= Dropout(0.2)
        self.fc16=Dense(10,activation='softmax')


    def call(self,x):
        x = self.conv1(x)
        x=self.bn1(x)
        x=self.a1(x)

        x=self.conv2(x)
        x=self.bn2(x)
        x=self.a2(x)
        x=self.pool2(x)
        x=self.d2(x)

        x=self.conv3(x)
        x=self.bn3(x)
        x=self.a3(x)

        x=self.conv4(x)
        x=self.bn4(x)
        x=self.a4(x)
        x=self.p4(x)
        x=self.d4(x)

        x=self.conv5(x)
        x=self.bn5(x)
        x=self.a5(x)

        x=self.conv6(x)
        x=self.bn6(x)
        x=self.a6(x)

        x=self.conv7(x)
        x=self.bn7(x)
        x=self.a7(x)
        x=self.p7(x)
        x=self.d7(x)

        x=self.conv8(x)
        x=self.bn8(x)
        x=self.a8(x)

        x=self.conv9(x)
        x=self.bn9(x)
        x=self.a9(x)

        x=self.conv10(x)
        x=self.bn10(x)
        x=self.a10(x)
        x=self.p10(x)
        x=self.d10(x)

        x=self.conv11(x)
        x=self.bn11(x)
        x=self.a11(x)

        x=self.conv12(x)
        x=self.bn12(x)
        x=self.a12(x)

        x=self.conv13(x)
        x=self.bn13(x)
        x=self.a13(x)
        x=self.p13(x)
        x=self.d13(x)

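        x=self.flatten(x)
        x=self.fc14(x)
        x=self.d14(x)   # apply the dropout layers that __init__ defines but the original call() skipped
        x=self.fc15(x)
        x=self.d15(x)
        y=self.fc16(x)
        return y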


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train,x_test = x_train/255.0,x_test/255.0


model = VGG16()
model.compile(optimizer='Adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/VGG16.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True, verbose=1)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

file=open('./checkpoints/weights_VGG16.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()

train_acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(loss,label='train_loss')
plt.plot(val_loss,label='val_loss')
plt.title('model loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_acc,label='train_acc')
plt.plot(val_acc,label='val_acc')
plt.title('model acc')
plt.legend()
plt.show()

Inception

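InceptionNet introduces the Inception block: it uses convolution kernels of different sizes within the same layer, which improves the model's perceptive ability, and it uses batch normalization, which alleviates vanishing gradients.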

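Its core is the basic unit, the Inception block. GoogLeNet (Inception v1) and the later versions of InceptionNet, such as v2/v3/v4, are all built from Inception blocks. An Inception block uses kernels of several sizes in the same layer, so it can extract features at different scales. A 1×1 convolution kernel acts on every pixel of the input feature map; by setting the number of 1×1 kernels lower than the depth of the input feature map, the depth of the output feature map is reduced. This acts as dimensionality reduction and cuts both the number of parameters and the amount of computation.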
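The saving from the 1×1 reduction can be checked with a little arithmetic. The sketch below counts the trainable parameters of a 5×5 convolution applied directly and of the same convolution placed behind a 1×1 convolution. The depths 192, 32 and 16 are hypothetical numbers chosen for illustration, not values from these notes.

```python
def conv_params(kernel_size, in_depth, filters, use_bias=True):
    """Number of trainable parameters in a 2-D convolution layer."""
    weights = kernel_size * kernel_size * in_depth * filters
    return weights + (filters if use_bias else 0)


in_depth = 192  # hypothetical depth of the input feature map
filters = 32    # hypothetical number of 5x5 kernels
reduced = 16    # hypothetical number of 1x1 kernels (fewer than in_depth)

# 5x5 convolution applied directly to the input
direct = conv_params(5, in_depth, filters)

# 1x1 convolution first (depth 192 -> 16), then the 5x5 convolution
bottleneck = conv_params(1, in_depth, reduced) + conv_params(5, reduced, filters)

print("direct 5x5:      ", direct)
print("1x1 then 5x5:    ", bottleneck)
print("reduction factor:", round(direct / bottleneck, 1))
```

With these assumed depths the two-step version needs roughly a tenth of the parameters, which is the effect the paragraph above describes.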

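An Inception block has four branches; the detailed structure is shown in the upper-right corner of the figure above.

The block contains a lot of repeated code, so a ConvBNRelu class is written to make the code more readable.

With the Inception block in hand, a slimmed-down InceptionNet can be built.

The network has 10 layers. The first layer is a 3×3 convolution, followed by four Inception blocks connected in sequence. Every two Inception blocks form one block. In each block, the first Inception block uses a convolution stride of 2 and the second uses a stride of 1. The first Inception block therefore halves the size of the output feature map, so the output depth is increased to keep the amount of information carried through feature extraction roughly constant.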

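block_0 uses 16 kernels per branch; after the four branches, the output depth is 4 × 16 = 64. block_1 uses twice as many kernels as block_0 (self.channel *= 2 in the code), i.e. 32; after the four branches the output depth is 4 × 32 = 128. These 128 channels are fed into global average pooling and then into a fully connected layer with 10 classes.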
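The depth and size bookkeeping above can be traced without building the model. This is a minimal sketch that assumes a 32×32 input (CIFAR-10) and 'same' padding, where the output size is ceil(input size / stride). The function name `inception10_trace` is made up for this illustration.

```python
import math


def inception10_trace(input_size=32, init_ch=16, block_n=2, branches=4):
    """Return (block index, stride, feature-map size, output depth) per Inception block."""
    trace = []
    size, ch = input_size, init_ch
    for block in range(block_n):
        for stride in (2, 1):  # first Inception block downsamples, second keeps the size
            size = math.ceil(size / stride)  # 'same' padding: out = ceil(in / stride)
            trace.append((block, stride, size, branches * ch))
        ch *= 2  # each new block doubles the kernels per branch
    return trace


trace = inception10_trace()
for block, stride, size, depth in trace:
    print(f"block_{block} stride={stride}: {size}x{size}x{depth}")
```

The last entry has depth 128, which is the number of channels that reach global average pooling.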

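First, a brief look at global average pooling. Previously the feature maps had to be flattened before the fully connected layer; with global average pooling this step is no longer needed.
Given a batch of feature maps of size [B, C, H, W], global average pooling produces size [B, C, 1, 1].
In other words, global average pooling averages all pixel values of each channel, giving a new 1 × 1 map per channel.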
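The operation is a mean over the two spatial axes. The NumPy sketch below follows the [B, C, H, W] layout used in the text, with made-up random data. Note that Keras's GlobalAveragePooling2D uses the channels-last layout [B, H, W, C] by default and returns a tensor of shape [B, C].

```python
import numpy as np

rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(2, 128, 8, 8))  # [B, C, H, W], hypothetical data

# Average every channel over its H x W pixels
pooled = feature_maps.mean(axis=(2, 3), keepdims=True)  # [B, C, 1, 1]

# Dropping the two singleton axes gives the input of the final Dense layer
flat = pooled.reshape(pooled.shape[0], -1)  # [B, C]

print(pooled.shape, flat.shape)
```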

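Because the network is fairly large, batch_size is raised to 1024, so that more data is fed to the network at a time during training, making full use of the GPU and speeding up training. (The listing in this section keeps batch_size=32, the setting used for the comparison table after the ResNet code.)

A GPU load of about 70-80% is generally reasonable. Note: batch_size can be increased when there is a lot of data, but should not be set too large when there is little data. With a small dataset, a small batch_size splits each epoch into many batches and every batch triggers a gradient update, whereas a large batch_size leaves only a few updates per epoch.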
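The trade-off can be made concrete by counting gradient updates per epoch. The sketch assumes the 50,000 training images of CIFAR-10 and that the last, partial batch is kept. The helper name `updates_per_epoch` is made up for this illustration.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of batches, and therefore gradient updates, in one epoch."""
    return math.ceil(num_samples / batch_size)


num_train = 50000  # size of the CIFAR-10 training set
steps = {bs: updates_per_epoch(num_train, bs) for bs in (32, 128, 1024)}

for bs, n in steps.items():
    print(f"batch_size={bs:5d} -> {n} updates per epoch")
```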

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout,MaxPooling2D,Flatten,Conv2D,BatchNormalization,\
    Activation,GlobalAveragePooling2D
from tensorflow.keras import Model, Sequential
import os
import numpy as np

# np.set_printoptions(threshold=np.inf)


class ConvBNRelu(Model):
    def __init__(self,filters,kernel_size,stride=1):
        super(ConvBNRelu, self).__init__()
        self.model = Sequential([
            Conv2D(filters, kernel_size,strides=stride, padding='same'),
            BatchNormalization(),
            Activation('relu'),
        ])

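    def call(self,x,training=None):
        # Forward the training flag: BatchNormalization uses batch statistics during fit()
        # and its moving averages at inference (training=False was hard-coded before)
        y=self.model(x,training=training)
        return y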


class Inception(Model):
    def __init__(self,init_ch,stride):
        super(Inception, self).__init__()
        self.c1=ConvBNRelu(init_ch,1,stride=stride)  # the stride sits on the 1x1 convs so Inception10 can control the feature-map size
        self.c2_1=ConvBNRelu(init_ch,1,stride=stride)
        self.c2_2=ConvBNRelu(init_ch,3)
        self.c3_1=ConvBNRelu(init_ch,1,stride=stride)
        self.c3_2=ConvBNRelu(init_ch,5)
        self.c4_1=MaxPooling2D((3,3),strides=(1,1),padding='same')
        self.c4_2=ConvBNRelu(init_ch,1,stride=stride)

    def call(self,x):
        x1=self.c1(x)
        x2=self.c2_1(x)
        x2=self.c2_2(x2)
        x3=self.c3_1(x)
        x3=self.c3_2(x3)
        x4=self.c4_1(x)
        x4=self.c4_2(x4)
        y=tf.concat([x1,x2,x3,x4],axis=3)
        return y


class Inception10(Model):
    def __init__(self,num_classes,block_n,init_ch=16):
        super(Inception10,self).__init__()
        self.channel=init_ch   # number of kernels per branch
        self.blocks=Sequential()
        self.c1=ConvBNRelu(16,3)  # the very first layer
        for i in range(block_n):
            for j in range(2):
                if j%2==0:
                    block=Inception(self.channel,stride=2)
                else:
                    block=Inception(self.channel,stride=1)
                self.blocks.add(block)
            self.channel*=2     # stride=2 shrinks the feature map, so more channels are used to carry the information
        self.pool=GlobalAveragePooling2D()
        self.dense=Dense(num_classes,activation='softmax')

    def call(self,x):
        x=self.c1(x)
        x = self.blocks(x)
        x = self.pool(x)
        x = self.dense(x)
        return x

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train,x_test = x_train/255.0,x_test/255.0


model = Inception10(10,block_n=2)
model.compile(optimizer='Adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/Inception10_gai2.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True, verbose=1)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

file=open('./checkpoints/weights_Inception10_gai2.txt','w')
for v in model.trainable_variables:
    file.write(str(v.name)+'\n')
    file.write(str(v.shape)+'\n')
    file.write(str(v.numpy())+'\n')
file.close()

train_acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(loss,label='train_loss')
plt.plot(val_loss,label='val_loss')
plt.title('model loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_acc,label='train_acc')
plt.plot(val_acc,label='val_acc')
plt.title('model acc')
plt.legend()
plt.show()

ResNet

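ResNet proposed residual skip connections between layers, which bring in information from earlier layers, alleviate vanishing gradients, and make it possible to increase the number of layers.

Simply stacking more layers degrades the network, to the point that later features lose the original form of the earlier features.

A skip connection carries the earlier features directly to the later layers, so the output H(x) contains both the nonlinear output F(x) of the stacked convolutions and the identity mapping x, which skips those two convolution layers; the two are added element-wise. This effectively alleviates the degradation caused by stacking layers, allowing networks to go deeper.

A ResNet block comes in two forms. In the first, drawn with a solid line in the figure below, the two stacked convolutions do not change the dimensions of the feature map (the number of feature maps, height, width, and depth are all the same), so F(x) can be added to x directly. In the second, drawn with a dashed line, the two stacked convolutions change the dimensions, so a 1×1 convolution is needed to adjust the dimensions of x so that W(x) matches F(x).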

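If the dimensions before and after the stacked convolutions differ (residual=True in the code), a 1×1 convolution adjusts the size or depth of the input feature map inputs, and the output of the stacked convolutions is added to the shortcut computed in the if statement, passed through the activation, and output. If the dimensions are the same, the output of the stacked convolutions is added directly to inputs, passed through the activation, and output. The yellow box below is one block; the orange boxes contain the two kinds of block.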
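The two cases can be sketched in NumPy. This is an illustration under simplifying assumptions, not the ResNetBlock class itself: F(x) is replaced by random stand-in arrays, batch normalization is left out, and the 1×1 convolution is written as subsampling followed by a channel-mixing matrix product. The helper names `conv1x1` and `residual_add` are made up for this sketch.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def conv1x1(x, w, stride=1):
    """1x1 convolution on a channels-last batch: subsample, then mix channels."""
    return x[:, ::stride, ::stride, :] @ w


def residual_add(inputs, fx, w_proj=None, stride=1):
    """Return relu(F(x) + x), projecting x with a 1x1 convolution when needed."""
    shortcut = inputs if w_proj is None else conv1x1(inputs, w_proj, stride)
    return relu(fx + shortcut)


rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8, 8, 64))  # [B, H, W, C], hypothetical input

# Case 1 (solid line): F(x) keeps the shape of x, so the two are added directly
fx_same = rng.normal(size=(1, 8, 8, 64))  # stand-in for the stacked convolutions
out_same = residual_add(x, fx_same)

# Case 2 (dashed line): F(x) halves the size and doubles the depth, so x is projected first
fx_down = rng.normal(size=(1, 4, 4, 128))
w_proj = rng.normal(size=(64, 128))  # hypothetical 1x1 kernel weights
out_down = residual_add(x, fx_down, w_proj, stride=2)

print(out_same.shape, out_down.shape)
```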

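ResNet18 has 8 ResNet blocks, each with two convolution layers; together with the first convolution layer and the final fully connected layer, this gives 18 layers. To speed up convergence, batch_size is set to 128 (the listing in this section uses batch_size=32).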
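The layer count and the per-block settings follow from the nested loop in the ResNet18 constructor. The sketch below replays that loop in plain Python for block_list=[2,2,2,2] and init_channels=64; the function name `resnet_plan` is made up for this illustration.

```python
def resnet_plan(block_list, init_channels):
    """Return (filters, stride, uses_1x1_shortcut) for every ResNet block."""
    plan = []
    channels = init_channels
    for stage, n_blocks in enumerate(block_list):
        for i in range(n_blocks):
            # The first block of every stage except stage 0 downsamples
            downsample = stage != 0 and i == 0
            plan.append((channels, 2 if downsample else 1, downsample))
        channels *= 2  # the next stage doubles the number of filters
    return plan


plan = resnet_plan([2, 2, 2, 2], 64)
for filters, stride, shortcut in plan:
    print(f"filters={filters:3d} stride={stride} 1x1 shortcut={shortcut}")

conv_layers = 2 * len(plan)          # two stacked convolutions per block
total_layers = 1 + conv_layers + 1   # first convolution + blocks + Dense
print("layers:", total_layers)
```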

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout,MaxPooling2D,Flatten,Conv2D,BatchNormalization,\
    Activation,GlobalAveragePooling2D
from tensorflow.keras import Model, Sequential
import os
import numpy as np

# np.set_printoptions(threshold=np.inf)


class ResNetBlock(Model):
    def __init__(self, filters, stride,residual):
        super(ResNetBlock, self).__init__()
        self.filters = filters
        self.strides = stride
        self.residual = residual

        self.conv1 = Conv2D(self.filters, 3, strides=self.strides,padding='same',use_bias=False)
        self.bn1 = BatchNormalization()
        self.a1= Activation('relu')

        self.conv2 = Conv2D(self.filters, 3, strides=1,padding='same',use_bias=False)
        self.bn2 = BatchNormalization()

        if residual:
            self.conv3 = Conv2D(self.filters, 1, strides=self.strides,padding='same',use_bias=False)
            self.bn3 = BatchNormalization()
        self.a2 = Activation('relu')

    def call(self, inputs,*args, **kwargs):
        resi = inputs
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = self.a1(x)

        x = self.conv2(x)
        x = self.bn2(x)

        if self.residual:
            y=self.conv3(inputs)    # 1x1 convolution so the shortcut matches the shape of F(x)
            y=self.bn3(y)
            resi=y

        return self.a2(x+resi)


class ResNet18(Model):
    def __init__(self,block_list,init_channels,num_classes):
        super(ResNet18, self).__init__()
        self.conv1 = Conv2D(64,3,strides=1,padding='same',use_bias=False)
        self.bn1 = BatchNormalization()
        self.a1 = Activation('relu')

        self.ll=len(block_list)
        self.blocks=Sequential()
        for block_i in range(self.ll):
            for layer in range(block_list[block_i]):
                if block_i !=0 and layer==0:
                    self.blocks.add(ResNetBlock(init_channels,2,residual=True))
                else:
                    self.blocks.add(ResNetBlock(init_channels,1,residual=False))
            init_channels=init_channels*2
        self.p1=GlobalAveragePooling2D()
        self.f1=Dense(num_classes,activation='softmax',kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs,*args, **kwargs):
        x = self.conv1(inputs)
        x=self.bn1(x)
        x = self.a1(x)
        x=self.blocks(x)
        x=self.p1(x)
        x=self.f1(x)
        return x





(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train,x_test = x_train/255.0,x_test/255.0


model = ResNet18([2,2,2,2],64,num_classes=10)
model.compile(optimizer='Adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
              ,metrics=['sparse_categorical_accuracy'])

checkpoint_save_path="./checkpoints/ResNet18.ckpt"
if os.path.exists(checkpoint_save_path+'.index'):
    model.load_weights(checkpoint_save_path)
    print("---------------------Loaded model---------------")

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True
                                              ,save_best_only=True, verbose=1)


history=model.fit(x_train,y_train,batch_size=32, epochs=5, validation_data=(x_test, y_test)
          ,validation_freq=1,callbacks=[cp_callback])
model.summary()

# Write the weights to a text file (the original file name was missing the dot before "txt")
with open('./checkpoints/weights_ResNet18.txt','w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name)+'\n')
        file.write(str(v.shape)+'\n')
        file.write(str(v.numpy())+'\n')

train_acc=history.history['sparse_categorical_accuracy']
val_acc=history.history['val_sparse_categorical_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

plt.subplot(1,2,1)
plt.plot(loss,label='train_loss')
plt.plot(val_loss,label='val_loss')
plt.title('model loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_acc,label='train_acc')
plt.plot(val_acc,label='val_acc')
plt.title('model acc')
plt.legend()
plt.show()

The models can be compared by their performance on CIFAR-10 with epochs=5 and batch_size=32:

Model	train_acc	val_acc
baseline	0.5469	0.4969
LeNet	0.4435	0.4407
AlexNet	0.6545	0.5186
VGGNet	0.7525	0.7039
Inception10	0.7927	0.7444
ResNet18	0.8688	0.7946

Summary of the classic convolutional networks

class6

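Recurrent kernel

Some data is sequential in time, so what comes next can be predicted from what came before.

Given the sentence "鱼离不开_" ("fish cannot live without _"), you would probably say "water" without thinking, because you remember the four preceding characters and can infer that "water" is the most likely continuation.

Once the dimension of the input \(x_t\), the dimension of the output \(y_t\), and the number of memory cells in the recurrent kernel are fixed, the dimensions of the three parameter matrices are fixed as well.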
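The dependency can be written down directly. The sketch below assumes the usual convention \(h_t = \tanh(x_t W_{xh} + h_{t-1} W_{hh} + b_h)\) and \(y_t = \mathrm{softmax}(h_t W_{hy} + b_y)\); the bias terms and the function name `simple_rnn_shapes` are additions for this illustration. The sizes 5, 3 and 5 match the letter-prediction examples later in this section.

```python
from math import prod


def simple_rnn_shapes(input_dim, units, output_dim):
    """Shapes of the trainable parameters of one recurrent layer plus its output layer."""
    return {
        "W_xh": (input_dim, units),   # input -> memory cells
        "W_hh": (units, units),       # previous state -> memory cells
        "b_h": (units,),
        "W_hy": (units, output_dim),  # memory cells -> output
        "b_y": (output_dim,),
    }


shapes = simple_rnn_shapes(input_dim=5, units=3, output_dim=5)
for name, shape in shapes.items():
    print(name, shape)

rnn_params = sum(prod(shapes[k]) for k in ("W_xh", "W_hh", "b_h"))
dense_params = sum(prod(shapes[k]) for k in ("W_hy", "b_y"))
print("recurrent layer:", rnn_params, "output layer:", dense_params)
```

These counts should match what model.summary() reports for SimpleRNN(3) followed by Dense(5) on 5-dimensional inputs.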

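Each recurrent kernel forms one recurrent computation layer.

Describing a recurrent computation layer in TF

The difference between return_sequences=False and True is as follows (Boolean: whether to return only the last output of the output sequence or the full sequence. Default: False.)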
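The difference shows up in the output shape. The NumPy function below is a hand-written stand-in for a simple recurrent layer, not the Keras implementation; the weights and inputs are random placeholders and the name `simple_rnn` is made up for this sketch.

```python
import numpy as np


def simple_rnn(x, w_xh, w_hh, b_h, return_sequences=False):
    """Run a tanh recurrent layer over x of shape (samples, time steps, features)."""
    batch, steps, _ = x.shape
    h = np.zeros((batch, w_hh.shape[0]))  # memory cells start at 0
    outputs = []
    for t in range(steps):
        h = np.tanh(x[:, t, :] @ w_xh + h @ w_hh + b_h)
        outputs.append(h)
    # Full sequence of states, or only the state after the last time step
    return np.stack(outputs, axis=1) if return_sequences else h


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 5))   # 2 samples, 4 time steps, 5 features
w_xh = rng.normal(size=(5, 3))   # hypothetical weights for 3 memory cells
w_hh = rng.normal(size=(3, 3))
b_h = np.zeros(3)

last = simple_rnn(x, w_xh, w_hh, b_h)
full = simple_rnn(x, w_xh, w_hh, b_h, return_sequences=True)
print(last.shape, full.shape)
```

The full sequence is what a following recurrent layer needs as input, while a Dense output layer usually takes only the last state.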

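The input samples must have a specific shape: [number of samples, number of time steps, number of features per time step].

Recurrent computation process 1

There are 3 memory cells, and \(W_{xh},W_{hh},W_{hy}\) are trained parameters. Passing through the tanh activation gives the state information \(h_t\) at the current time step.

The state stored in the memory cells is refreshed to [-0.9, 0.8, 0.7]. The output \(y_t\) is then obtained by passing the extracted temporal information through a fully connected layer for recognition and prediction; this is the output layer of the whole network.

The model considers the output to be c with 91% probability. The code implementation follows.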
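The output step can be sketched as a matrix product followed by softmax. The state [-0.9, 0.8, 0.7] is the one quoted above; the trained matrix \(W_{hy}\) is not reproduced in these notes, so the weights below are random placeholders and the resulting probabilities will not match the 91% figure.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()


h_t = np.array([-0.9, 0.8, 0.7])  # state of the 3 memory cells, from the text

rng = np.random.default_rng(0)
w_hy = rng.normal(size=(3, 5))    # hypothetical weights, NOT the trained ones
b_y = np.zeros(5)                 # hypothetical bias

y_t = softmax(h_t @ w_hy + b_y)   # probabilities over the letters a..e
pred = "abcde"[int(np.argmax(y_t))]

print(np.round(y_t, 3), "->", pred)
```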

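Letter prediction: onehot_1pre1

Use an RNN to take one letter as input and predict the next letter (one-hot encoding).

SimpleRNN(3),  # the number of memory cells can be tuned freely
Dense(5, activation='softmax')  # one fully connected layer computes the output layer yt; it maps to the one-hot encoding to find the most probable letter, so units=5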
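Before these two layers can be trained, the letters have to be turned into arrays. The sketch below is one way to build the one-hot inputs and integer labels for the one-letter case, assuming the same cyclic alphabet "abcde" used in the other listings; it is not the original listing for this example.

```python
import numpy as np

letters = "abcde"
word_id = {ch: i for i, ch in enumerate(letters)}

# One-hot vectors are the rows of the identity matrix
eye = np.eye(len(letters), dtype=np.float32)

# Inputs: one letter per sample, shaped (samples, time steps, features) = (5, 1, 5)
x_train = eye[[word_id[ch] for ch in letters]].reshape(len(letters), 1, len(letters))

# Labels: integer id of the next letter, wrapping from e back to a
y_train = np.array([word_id[letters[(i + 1) % len(letters)]] for i in range(len(letters))])

print(x_train.shape, y_train)
```

Integer labels are used because the listings in this section train with SparseCategoricalCrossentropy.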

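Code

Recurrent computation process 2

Previously one letter was the input and the next letter was predicted. Now unroll the recurrent kernel over time steps and feed in several consecutive letters to predict the next one. Three memory cells are still used, initialized to 0, and a set of trained parameter matrices is used to walk through the forward pass of the recurrent computation.

Input b to update the memory cells

The parameter matrices \(W_{xh}\) and \(W_{hh}\) and the bias \(b_h\) used in these four time steps are exactly the same. The output prediction is produced by a fully connected layer.

The probability of a is 70%, so the prediction is correct.

Letter prediction: onehot_4pre1

Few changes are needed.

The code is as follows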

import os

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Dropout
import numpy as np
import matplotlib.pyplot as plt


inputs_word='abcde'
word_id={'a':0,'b':1,'c':2,'d':3,'e':4}
id_onehot={0:[1.,0.,0.,0.,0.],1:[0.,1.,0.,0.,0.],2:[0.,0.,1.,0.,0.],3:[0.,0.,0.,1.,0.],4:[0.,0.,0.,0.,1.]}

x_train=[
    [id_onehot[word_id['a']],id_onehot[word_id['b']],id_onehot[word_id['c']],id_onehot[word_id['d']]],
    [id_onehot[word_id['b']],id_onehot[word_id['c']],id_onehot[word_id['d']],id_onehot[word_id['e']]],
    [id_onehot[word_id['c']],id_onehot[word_id['d']],id_onehot[word_id['e']],id_onehot[word_id['a']]],
    [id_onehot[word_id['d']],id_onehot[word_id['e']],id_onehot[word_id['a']],id_onehot[word_id['b']]],
    [id_onehot[word_id['e']],id_onehot[word_id['a']],id_onehot[word_id['b']],id_onehot[word_id['c']]]
]
# y_train=[id_onehot[word_id['e']],id_onehot[word_id['a']],id_onehot[word_id['b']],id_onehot[word_id['c']],id_onehot[word_id['d']]]  # wrong: the sparse loss expects integer labels
y_train=[word_id['e'],word_id['a'],word_id['b'],word_id['c'],word_id['d']]
# x_train=np.array(x_train)
# y_train=np.array(y_train)
# print(x_train.shape)  # (5, 4, 5)
# print(y_train.shape)  # (5,)
np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)


x_train=np.reshape(x_train,[len(x_train),4,5])
y_train=np.array(y_train)


model=Sequential([
    SimpleRNN(3),
    Dense(units=5,activation='softmax')
])

model.compile(tf.keras.optimizers.Adam(0.01),loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_path="./checkpoint/rnn_ont4pre1.ckpt"
if os.path.exists(checkpoint_path+'.index'):
    print('---------------------load  model-------------------------')
    model.load_weights(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,monitor='loss',
                                                 save_best_only=True,save_weights_only=True)

history=model.fit(x_train,y_train,batch_size=32,epochs=100,callbacks=[cp_callback])
model.summary()

acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

input_s=int(input("Enter the number "))
for i in range(input_s):
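    x=input('Enter a string of 4 letters (a-e): ')
    x_pre=[id_onehot[word_id[a]] for a in x]
    x_pre=np.reshape(x_pre,(1,4,5))
    y_pred=model.predict(x_pre)
    y_pred=np.argmax(y_pred,axis=1)
    y=int(y_pred[0])  # take the single element; int() on a 1-element array is deprecated in recent NumPy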
    print(inputs_word[y])
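The five hand-written training rows in this listing follow a sliding-window pattern over the cyclic alphabet. The sketch below generates the same windows and labels programmatically; the helper name `make_windows` is made up for this illustration and is not part of the original code.

```python
import numpy as np


def make_windows(letters, window):
    """Slide a window over a cyclic alphabet; the label is the letter after the window."""
    n = len(letters)
    xs, ys = [], []
    for start in range(n):
        xs.append([letters[(start + k) % n] for k in range(window)])
        ys.append(letters[(start + window) % n])
    return xs, ys


letters = "abcde"
word_id = {ch: i for i, ch in enumerate(letters)}
xs, ys = make_windows(letters, 4)

ids = np.array([[word_id[c] for c in w] for w in xs])  # (samples, time steps)
x = np.eye(len(letters), dtype=np.float32)[ids]        # one-hot: (samples, time steps, features)
y = np.array([word_id[c] for c in ys])                 # integer labels

for w, target in zip(xs, ys):
    print("".join(w), "->", target)
print(x.shape, y)
```

Generating the windows this way scales to longer alphabets without writing each row by hand.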

Embedding encoding

Use an RNN to take one letter as input and predict the next letter (Embedding encoding).
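Conceptually, an embedding layer is a trainable lookup table: each integer id selects one row. The NumPy sketch below illustrates this with a random table (vocabulary size 5, embedding dimension 3) and shows that the lookup gives the same result as multiplying a one-hot vector by the table. It illustrates the idea only and is not the Keras Embedding implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 5, 3

# Hypothetical embedding table; in a real layer these values are learned
table = rng.normal(size=(vocab_size, embed_dim))

ids = np.array([[0], [1], [2], [3], [4]])  # (samples, time steps) of letter ids

embedded = table[ids]                      # lookup: (samples, time steps, embed_dim)
one_hot = np.eye(vocab_size)[ids]          # (samples, time steps, vocab_size)

print(embedded.shape)
print(np.allclose(one_hot @ table, embedded))
```

This is why the input to the embedding model has shape (samples, time steps) with integer ids, while the one-hot model needs a third axis of size 5.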

The full code is as follows

import os

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Dropout,Embedding
import numpy as np
import matplotlib.pyplot as plt


inputs_word='abcde'
word_id={'a':0,'b':1,'c':2,'d':3,'e':4}

x_train=[word_id['a'],word_id['b'],word_id['c'],word_id['d'],word_id['e']]
y_train=[word_id['b'],word_id['c'],word_id['d'],word_id['e'],word_id['a']]

np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)


x_train=np.reshape(x_train,[len(x_train),1])
y_train=np.array(y_train)


model=Sequential([
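    Embedding(5,3),   # alternatives: (5,2),(5,5),(5,4). The first number is the vocabulary size, i.e. the largest input index + 1. The second can be chosen freely and is the embedding dimension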
    SimpleRNN(3),
    Dense(units=5,activation='softmax')
])

model.compile(tf.keras.optimizers.Adam(0.01),loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_path="./checkpoint/rnn_Embed1pre1.ckpt"
if os.path.exists(checkpoint_path+'.index'):
    print('---------------------load  model-------------------------')
    model.load_weights(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,monitor='loss',
                                                 save_best_only=True,save_weights_only=True)

history=model.fit(x_train,y_train,batch_size=32,epochs=100,callbacks=[cp_callback])
model.summary()

acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

input_s=int(input("Enter the number "))
for i in range(input_s):
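    al=input('Enter one letter (a-e): ')
    x=word_id[al]
    x=np.reshape(x,(1,1))
    y=model.predict(x)
    y=np.argmax(y,axis=1)
    y=int(y[0])  # take the single element; int() on a 1-element array is deprecated in recent NumPy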
    print(al+'--->'+inputs_word[y])

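Predicting 4pre1 with Embedding

Use an RNN to take four consecutive letters as input and predict the next letter (Embedding encoding).

The range of the data is enlarged.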