首页 > 其他分享 >机器学习 -> Machine Learning (II)

机器学习 -> Machine Learning (II)

时间:2023-08-29 14:45:29浏览次数:41  
标签:layers II transformer make Machine train Learning import history

这次来学习深度学习吧!

1 训练前

1.1 神经元与神经网络

神经元是神经网络的基本单位, 模拟了生物神经元的工作机制. 每个神经元接受一组输入, 将这些输入与其权重相乘, 然后对所有的乘积求和, 并加上一个偏置. 最后, 将得到的结果传递给激活函数.

神经网络由多个神经元组成, 这些神经元按层次结构排列: 输入层, 一个或多个隐藏层和输出层. 神经网络可以学习从输入到输出的映射, 这通常涉及大量的数据和迭代训练.

1.2 激活函数

是一个非线性函数, 用于确定神经元的输出. 由于激活函数的非线性特性, 多层神经网络能够捕捉并学习非线性关系. 常见的激活函数包括:

  1. ReLU
  2. Sigmoid
  3. Softmax
  4. Tanh

1.3 层

输入层是神经网络的第一层, 负责接收特征数据. 它不进行任何计算, 只是传递数据.

输出层是神经网络的最后一层, 负责生成最终的预测或分类结果. 输出层的神经元数量取决于任务类型.

在输入层和输出层之间的层称为隐藏层. 这些层的神经元执行计算, 并使用激活函数来确定其输出. 神经网络可以有多个隐藏层, 这使得网络能够学习更复杂的表示.

全连接层 (dense) 指的是每一个神经元与前一层的所有神经元都连接.

1.4 损失函数与优化器

也称为代价函数或目标函数, 损失函数量化模型预测的结果与实际标签之间的差异. 它为模型提供了一个明确的反馈, 告诉模型它的预测有多差. 常见的损失函数包括:

  1. 均方误差
  2. 交叉熵
  3. Hinge Loss

一旦我们有了损失函数来度量模型的误差, 我们就需要一种方法来调整模型的权重和偏置, 以减少这种误差. 这就是优化器的作用, 优化器使用损失函数的梯度来决定如何更新模型的参数. 常见的优化器包括:

  1. 梯度下降
  2. 随机梯度下降
  3. Momentum
  4. Adam
  5. RMSProp
  6. Adagrad

1.5 批次, 轮次与迭代

批次 (Batch) 是数据集的一个子集, 用于在模型上进行一次前向和后向传播.

轮次 (Epoch) 是整个数据集完整地通过神经网络一次的过程.

迭代 (Step / Iteration) 通常是指模型在一个批次上的一次前向和后向传播.

如果数据集有 1000 个样本, 且选择的批次大小为 100, 则每个轮次需要 10 次迭代.

这些需要手动设定并可能影响模型的训练效率和最终性能的参数称为超参数.

2 训练中

2.1 过拟合

如图, 像这种损失函数变化缓慢甚至不减反增的情况表明出现了过拟合, 通常采用提前终止的方法.

early_stopping = callbacks.EarlyStopping(
    min_delta=0.001, # minimium amount of change to count as an improvement
    patience=5, # how many epochs to wait before stopping
    restore_best_weights=True,
)

2.2 随机失活与批标准化

Dropout 是一种正则化技巧, 其在训练过程中随机 "丢弃" 神经元, 从而减少过拟合.

layers.Dropout(rate=0.3)

Batch Normalization 用于规范化前一层的输出, 使其具有近似的均值和方差, 以便提高训练的稳定性, 这可以应对数据复杂收敛值过大的情况.

layers.BatchNormalization()

3 实例

3.1 二分类

Hotel Cancellations
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer

hotel = pd.read_csv('../input/dl-course-data/hotel.csv')

X = hotel.copy()
y = X.pop('is_canceled')

X['arrival_date_month'] = \
    X['arrival_date_month'].map(
        {'January':1, 'February': 2, 'March':3,
         'April':4, 'May':5, 'June':6, 'July':7,
         'August':8, 'September':9, 'October':10,
         'November':11, 'December':12}
    )

features_num = [
    "lead_time", "arrival_date_week_number",
    "arrival_date_day_of_month", "stays_in_weekend_nights",
    "stays_in_week_nights", "adults", "children", "babies",
    "is_repeated_guest", "previous_cancellations",
    "previous_bookings_not_canceled", "required_car_parking_spaces",
    "total_of_special_requests", "adr",
]
features_cat = [
    "hotel", "arrival_date_month", "meal",
    "market_segment", "distribution_channel",
    "reserved_room_type", "deposit_type", "customer_type",
]

transformer_num = make_pipeline(
    SimpleImputer(strategy="constant"), # there are a few missing values
    StandardScaler(),
)
transformer_cat = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="NA"),
    OneHotEncoder(handle_unknown='ignore'),
)

preprocessor = make_column_transformer(
    (transformer_num, features_num),
    (transformer_cat, features_cat),
)

# stratify - make sure classes are evenlly represented across splits
X_train, X_valid, y_train, y_valid = \
    train_test_split(X, y, stratify=y, train_size=0.75)

X_train = preprocessor.fit_transform(X_train)
X_valid = preprocessor.transform(X_valid)

input_shape = [X_train.shape[1]]

构建层

model = keras.Sequential([
    layers.BatchNormalization(input_shape=input_shape),
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(rate=0.3),
    layers.Dense(256, activation='relu'),    
    layers.BatchNormalization(),
    layers.Dropout(rate=0.3),
    layers.Dense(1, activation='sigmoid'),
])

构建损失函数与优化器

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

训练

early_stopping = keras.callbacks.EarlyStopping(
    patience=5,
    min_delta=0.001,
    restore_best_weights=True,
)
history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=512,
    epochs=200,
    callbacks=[early_stopping],
)
history_df = pd.DataFrame(history.history)
history_df.loc[:, ['loss', 'val_loss']].plot(title="Cross-entropy")
history_df.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot(title="Accuracy")
完整代码
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer

hotel = pd.read_csv('../input/dl-course-data/hotel.csv')

X = hotel.copy()
y = X.pop('is_canceled')

X['arrival_date_month'] = \
    X['arrival_date_month'].map(
        {'January':1, 'February': 2, 'March':3,
         'April':4, 'May':5, 'June':6, 'July':7,
         'August':8, 'September':9, 'October':10,
         'November':11, 'December':12}
    )

features_num = [
    "lead_time", "arrival_date_week_number",
    "arrival_date_day_of_month", "stays_in_weekend_nights",
    "stays_in_week_nights", "adults", "children", "babies",
    "is_repeated_guest", "previous_cancellations",
    "previous_bookings_not_canceled", "required_car_parking_spaces",
    "total_of_special_requests", "adr",
]
features_cat = [
    "hotel", "arrival_date_month", "meal",
    "market_segment", "distribution_channel",
    "reserved_room_type", "deposit_type", "customer_type",
]

transformer_num = make_pipeline(
    SimpleImputer(strategy="constant"), # there are a few missing values
    StandardScaler(),
)
transformer_cat = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="NA"),
    OneHotEncoder(handle_unknown='ignore'),
)

preprocessor = make_column_transformer(
    (transformer_num, features_num),
    (transformer_cat, features_cat),
)

# stratify - make sure classes are evenlly represented across splits
X_train, X_valid, y_train, y_valid = \
    train_test_split(X, y, stratify=y, train_size=0.75)

X_train = preprocessor.fit_transform(X_train)
X_valid = preprocessor.transform(X_valid)

input_shape = [X_train.shape[1]]

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.BatchNormalization(input_shape=input_shape),
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(rate=0.3),
    layers.Dense(256, activation='relu'),    
    layers.BatchNormalization(),
    layers.Dropout(rate=0.3),
    layers.Dense(1, activation='sigmoid'),
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

early_stopping = keras.callbacks.EarlyStopping(
    patience=5,
    min_delta=0.001,
    restore_best_weights=True,
)
history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=512,
    epochs=200,
    callbacks=[early_stopping],
)

history_df = pd.DataFrame(history.history)
history_df.loc[:, ['loss', 'val_loss']].plot(title="Cross-entropy")
history_df.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot(title="Accuracy")

标签:layers,II,transformer,make,Machine,train,Learning,import,history
From: https://www.cnblogs.com/Arcticus/p/17664483.html

相关文章

  • jts learning
    JTS简介JTS提供了一套操作几何向量的java类库。早期版本com.vividsolutions,已废弃不在维护。现在版本com.locationtech.jts由eclipse开源基金会托管使用说明入门指导GIS开发入门指导jts-core核心库使用说明jts-core核心类库使用说明具体结合示例代码详细介绍JTS下面核心......
  • IPQ4019 IPQ4029 IPQ6010|IIOT|5G and WiFi 6:Application in Business and Industry
    5GandWiFi6:Application inBusinessandIndustryIntroductionAstheworldhurtlestowardsaneraofunprecedenteddigitaltransformation,twotechnologiesstandattheforefront,poisedtoreshapethelandscapeofbusinessandindustry:5GandWiFi6.Th......
  • Proj CDeepFuzz Paper Reading: Deepxplore: Automated whitebox testing of deep lea
    Abstract背景:现有的深度学习测试在很⼤程度上依赖于⼿动标记的数据,因此通常⽆法暴露罕⻅输⼊的错误⾏为。本文:DeepXploreTask:awhite-boxframeworktotestDLModels方法:neuroncoveragedifferentialtestingwithmultipleDLsystems(models)joint-optimizationpro......
  • hdu:Machine Schedule(二分图匹配)
    ProblemDescriptionAsweallknow,machineschedulingisaveryclassicalproblemincomputerscienceandhasbeenstudiedforaverylonghistory.Schedulingproblemsdifferwidelyinthenatureoftheconstraintsthatmustbesatisfiedandthetypeof......
  • 剑指 Offer 10- II. 青蛙跳台阶问题(简单)
    题目:classSolution{public:intnumWays(intn){vector<int>dp(n+1,1);for(inti=2;i<=n;i++){dp[i]=(dp[i-1]+dp[i-2])%1000000007;}returndp[n];}};......
  • iic和spi简记
    IIC通信协议两线式串行总线,多用于主控制器和从器件间的主从通信,在小数据量场合使用,有传输距离短,任意时刻只能有一个主机等特性。  SDA(Serialdata)数据线,D代表Data也就是数据,SendData也就是用来传输数据的SCL(Serialclockline)时钟线,C代表Clock也就是时钟也就是......
  • 剑指Offer 32 - III. 从上到下打印二叉树
    题目链接:剑指Offer32-III.从上到下打印二叉树题目描述:请实现一个函数按照之字形顺序打印二叉树,即第一行按照从左到右的顺序打印,第二层按照从右到左的顺序打印,第三行再按照从左到右的顺序打印,其他行以此类推。解法思路:本题在一题的基础上,区分打印方向,加一个bool型的方向变......
  • 剑指Offer 32 - II. 从上到下打印二叉树 II
    题目链接:剑指Offer32-II.从上到下打印二叉树II题目描述:从上到下按层打印二叉树,同一层的节点按从左到右的顺序打印,每一层打印到一行。解法思路:本题与上题的唯一区别就是在输出的时候,要将同一层的数输出在一行,这就意味着你需要知道哪些数是在一行的;这里巧妙的利用求队列......
  • 机器学习 -> Machine Learning (I)
    1机器学习概述1.1定义及应用领域机器学习是一种让计算机通过经验学习并对输入数据做出决策或预测的方法.它是人工智能的一个重要分支,已广泛应用于各种领域,如自然语言处理,计算机视觉,推荐系统,医疗诊断,金融风险预测等.1.2机器学习与人工智能,深度学习的关系人......
  • 剑指 Offer 68 - II. 二叉树的最近公共祖先(简单)
    题目:classSolution{public:TreeNode*lowestCommonAncestor(TreeNode*root,TreeNode*p,TreeNode*q){if(root==p||root==q||root==nullptr)returnroot;//如果当前节点为空或者当前节点即为其中某个指定节点TreeNode*left=lowestCommo......