MLPClassifier: hidden layers do not include the input and output layers

Date: 2023-09-08 14:03:05


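A multilayer perceptron (MLP) is a kind of artificial neural network (ANN), and the two names are often used loosely for the same thing. Between the input layer and the output layer it can have any number of hidden layers. The simplest MLP has a single hidden layer, which gives a three-layer structure.

[Figure: a three-layer MLP with an input layer, one hidden layer, and an output layer]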






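With an input vector X, the hidden layer computes f(W1X+b1), where W1 is the weight matrix (also called the connection coefficients), b1 is the bias, and f is an activation function, commonly the sigmoid or the tanh function:

sigmoid(a) = 1 / (1 + exp(-a))

tanh(a) = (exp(a) - exp(-a)) / (exp(a) + exp(-a))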
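The hidden-layer computation f(W1X+b1) can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes (3 inputs, 4 hidden units), the random weights, and the helper name `hidden_layer` are assumptions made for the example, not something taken from the original post or from scikit-learn.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))


def hidden_layer(X, W1, b1, f=np.tanh):
    """Return f(W1 X + b1) for a single input vector X (hypothetical helper)."""
    return f(W1 @ X + b1)


# Assumed sizes for illustration: 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
X = rng.normal(size=3)        # input vector
W1 = rng.normal(size=(4, 3))  # weights: one row per hidden unit
b1 = np.zeros(4)              # biases: one per hidden unit

h_tanh = hidden_layer(X, W1, b1, f=np.tanh)
h_sig = hidden_layer(X, W1, b1, f=sigmoid)
print(h_tanh.shape, h_sig.shape)
```

Both calls return one activation per hidden unit. The sigmoid output lies between 0 and 1, while the tanh output lies between -1 and 1.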





class sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08, n_iter_no_change=10)

Multi-layer Perceptron classifier.

This model optimizes the log-loss function using LBFGS or stochastic gradient descent.

New in version 0.18.


hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
The ith element represents the number of neurons in the ith hidden layer.
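To see that hidden_layer_sizes counts only the hidden layers, one can fit a small model and inspect the fitted attributes. The sketch below assumes scikit-learn and NumPy are installed. The toy data (60 samples, 4 features, 3 classes) is made up for illustration, and max_iter is kept small because only the layer shapes matter here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical toy data: 60 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = np.arange(60) % 3

# Two hidden layers with 5 and 3 units; input and output sizes are NOT listed.
clf = MLPClassifier(hidden_layer_sizes=(5, 3), max_iter=50, random_state=0)
clf.fit(X, y)  # may emit a ConvergenceWarning; irrelevant for this check

print(clf.n_layers_)                  # 4 = input + 2 hidden + output
print([w.shape for w in clf.coefs_])  # [(4, 5), (5, 3), (3, 3)]
```

The input width (4) and the output width (3) are inferred from X and y during fit, which is why the tuple has length n_layers - 2.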
activation : {‘identity’, ‘logistic’, ‘tanh’, ‘relu’}, default ‘relu’
Activation function for the hidden layer.
• ‘identity’, no-op activation, useful to implement linear bottleneck, returns f(x) = x
• ‘logistic’, the logistic sigmoid function, returns f(x) = 1 / (1 + exp(-x)).
• ‘tanh’, the hyperbolic tan function, returns f(x) = tanh(x).
• ‘relu’, the rectified linear unit function, returns f(x) = max(0, x)
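The four activation formulas listed above can be written out directly. This is a plain-Python sketch for scalar inputs, intended only to make the formulas concrete. The dictionary name `ACTIVATIONS` is an assumption for the example and is not scikit-learn's internal code.

```python
import math


def identity(x):
    """No-op activation: f(x) = x."""
    return x


def logistic(x):
    """Logistic sigmoid: f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent: f(x) = tanh(x)."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)


# Hypothetical lookup table keyed by the option names used by `activation`.
ACTIVATIONS = {
    "identity": identity,
    "logistic": logistic,
    "tanh": tanh,
    "relu": relu,
}

for name, f in ACTIVATIONS.items():
    print(name, [round(f(x), 4) for x in (-2.0, 0.0, 2.0)])
```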
solver : {‘lbfgs’, ‘sgd’, ‘adam’}, default ‘adam’
The solver for weight optimization.
• ‘lbfgs’ is an optimizer in the family of quasi-Newton methods.
• ‘sgd’ refers to stochastic gradient descent.
• ‘adam’ refers to a stochastic gradient-based optimizer proposed by Diederik Kingma and Jimmy Ba.
Note: The default solver ‘adam’ works pretty well on relatively large datasets (with thousands of training samples or more) in terms of both training time and validation score. For small datasets, however, ‘lbfgs’ can converge faster and perform better.
alpha : float, optional, default 0.0001
L2 penalty (regularization term) parameter.
batch_size : int, optional, default ‘auto’
Size of minibatches for stochastic optimizers. If the solver is ‘lbfgs’, the classifier will not use minibatch. When set to “auto”, batch_size=min(200, n_samples)
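The "auto" rule can be illustrated with a small calculation. The helper names below are hypothetical, and the minibatch count assumes that a final, smaller batch is still processed when n_samples is not a multiple of the batch size.

```python
def auto_batch_size(n_samples):
    """Batch size under batch_size='auto': min(200, n_samples)."""
    return min(200, n_samples)


def minibatches_per_epoch(n_samples, batch_size):
    """Number of minibatches in one pass, counting a final partial batch."""
    return -(-n_samples // batch_size)  # ceiling division


for n in (150, 450, 1000):
    bs = auto_batch_size(n)
    print(n, bs, minibatches_per_epoch(n, bs))
```

With 150 samples the whole training set forms a single batch, with 450 samples there are three batches (200, 200, 50), and with 1000 samples there are five batches of 200.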
learning_rate : {‘constant’, ‘invscaling’, ‘adaptive’}, default ‘constant’
Learning rate schedule for weight updates.
• ‘constant’ is a constant learning rate given by ‘learning_rate_init’.
• ‘invscaling’ gradually decreases the learning rate learning_rate_ at each time step ‘t’ using an inverse scaling exponent of ‘power_t’. effective_learning_rate = learning_rate_init / pow(t, power_t)
• ‘adaptive’ keeps the learning rate constant to ‘learning_rate_init’ as long as training loss keeps decreasing. Each time two consecutive epochs fail to decrease training loss by at least tol, or fail to increase validation score by at least tol if ‘early_stopping’ is on, the current learning rate is divided by 5.
The learning_rate schedule is only used when solver='sgd'.
learning_rate_init : double, optional, default 0.001
The initial learning rate used. It controls the step-size in updating the weights. Only used when solver=’sgd’ or ‘adam’.
power_t : double, optional, default 0.5
The exponent for inverse scaling learning rate. It is used in updating effective learning rate when the learning_rate is set to ‘invscaling’. Only used when solver=’sgd’.
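The 'invscaling' and 'adaptive' schedules described above can be simulated in plain Python. This is a simplified sketch of the rules as worded in the parameter descriptions, not scikit-learn's internal implementation. The function names and the sample loss sequence are assumptions for illustration, and only the training-loss branch of 'adaptive' is modelled (no early stopping).

```python
def invscaling_rate(learning_rate_init, t, power_t=0.5):
    """effective_learning_rate = learning_rate_init / pow(t, power_t)."""
    return learning_rate_init / pow(t, power_t)


def adaptive_rates(losses, learning_rate_init=0.001, tol=1e-4):
    """Learning rate after each epoch under the 'adaptive' rule.

    The rate is divided by 5 each time two consecutive epochs fail to
    decrease the best training loss seen so far by at least tol.
    """
    lr = learning_rate_init
    best = float("inf")
    stalled = 0
    rates = []
    for loss in losses:
        if loss > best - tol:
            stalled += 1   # no sufficient improvement this epoch
        else:
            stalled = 0
        best = min(best, loss)
        if stalled >= 2:
            lr /= 5.0
            stalled = 0
        rates.append(lr)
    return rates


print([invscaling_rate(0.001, t) for t in (1, 4, 100)])

# Made-up per-epoch training losses with two plateaus.
rates = adaptive_rates([1.0, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4])
print(rates)
```

With power_t=0.5 the 'invscaling' rate halves by step 4 and drops to a tenth by step 100. In the 'adaptive' simulation the rate falls from 0.001 to 0.0002 after the first plateau and to 0.00004 after the second.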




neural networks 神经网络

activation function 激活函数

hyperbolic tangent 双曲正切函数

bias units 偏置项

activation 激活值

forward propagation 前向传播

feedforward neural network 前馈神经网络 (following the Chinese translation of Mitchell's "Machine Learning")

From: https://blog.51cto.com/u_11908275/7409488


