def batch_norm(layer, **kwargs):
    """
    Apply batch normalization to an existing layer. This is a convenience
    function modifying an existing layer to include batch normalization: It
    will steal the layer's nonlinearity if there is one (effectively
    introducing the normalization right before the nonlinearity), remove
    the layer's bias if there is one (because it would be redundant), and add
    a :class:`BatchNormLayer` and :class:`NonlinearityLayer` on top.

    Parameters
    ----------
    layer : A :class:`Layer` instance
        The layer to apply the normalization to; note that it will be
        irreversibly modified as specified above
    **kwargs
        Any additional keyword arguments are passed on to the
        :class:`BatchNormLayer` constructor.

    Returns
    -------
    BatchNormLayer or NonlinearityLayer instance
        A batch normalization layer stacked on the given modified `layer`, or
        a nonlinearity layer stacked on top of both if `layer` was nonlinear.

    Examples
    --------
    Just wrap any layer into a :func:`batch_norm` call on creating it:

    >>> from lasagne.layers import InputLayer, DenseLayer, batch_norm
    >>> from lasagne.nonlinearities import tanh
    >>> l1 = InputLayer((64, 768))
    >>> l2 = batch_norm(DenseLayer(l1, num_units=500, nonlinearity=tanh))

    This introduces batch normalization right before its nonlinearity:

    >>> from lasagne.layers import get_all_layers
    >>> [l.__class__.__name__ for l in get_all_layers(l2)]
    ['InputLayer', 'DenseLayer', 'BatchNormLayer', 'NonlinearityLayer']
    """
    nonlinearity = getattr(layer, 'nonlinearity', None)
    if nonlinearity is not None:
        layer.nonlinearity = lasagne.nonlinearities.identity
    if hasattr(layer, 'b') and layer.b is not None:
        del layer.params[layer.b]
        layer.b = None
    layer = BatchNormLayer(layer, **kwargs)
    if nonlinearity is not None:
        layer = L.NonlinearityLayer(layer, nonlinearity)
    return layer
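For context, here is a minimal sketch of how the batch_norm function above would be used to wrap every hidden layer of a small MLP. The layer sizes are made up for illustration, and the standard Lasagne API is assumed; this is not the actual rllab network code.

import lasagne.layers as L
from lasagne.nonlinearities import tanh

l_in = L.InputLayer((None, 8))           # e.g. an 8-dimensional observation
l_hid = l_in
for size in (32, 32):
    # batch_norm strips the DenseLayer's bias and nonlinearity, inserts a
    # BatchNormLayer, and re-applies tanh on top of the normalized output
    l_hid = batch_norm(L.DenseLayer(l_hid, num_units=size, nonlinearity=tanh))
l_out = L.DenseLayer(l_hid, num_units=2, nonlinearity=None)  # linear output head

print([l.__class__.__name__ for l in L.get_all_layers(l_out)])
# ['InputLayer', 'DenseLayer', 'BatchNormLayer', 'NonlinearityLayer',
#  'DenseLayer', 'BatchNormLayer', 'NonlinearityLayer', 'DenseLayer']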
Source code:
https://gitee.com/devilmaycry812839668/rllab/blob/master/rllab/core/lasagne_layers.py
=================================================
This is how batch_norm is used in rllab, a classic reinforcement learning framework. Notice that when a linear layer is batch-normalized, its output is not passed through the nonlinearity first; instead, batch normalization is applied to the pre-activation, and the nonlinearity comes afterwards. Also note that the linear layer is used without its bias parameter b, because batch norm makes it redundant. Concretely, the recommended way to batch-normalize tanh(w*x + b) is:
tanh( batch_norm( w*x ) )
rather than:
batch_norm( tanh( w*x + b ) )
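To make the ordering concrete, here is a small NumPy illustration (not rllab code; a bare batch norm without the learned gamma/beta parameters, just for intuition):

import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(64, 768).astype(np.float32)              # a batch of 64 inputs
W = (0.01 * rng.randn(768, 500)).astype(np.float32)    # weights of the linear layer
b = np.zeros(500, dtype=np.float32)

def batch_norm_np(z, eps=1e-4):
    # normalize each unit over the batch dimension (gamma=1, beta=0 omitted)
    return (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)

# recommended ordering: normalize the linear pre-activation, then apply tanh;
# the bias b is dropped because the mean subtraction would cancel it anyway
h_recommended = np.tanh(batch_norm_np(x @ W))

# the ordering the text argues against: nonlinearity first, then normalization
h_discouraged = batch_norm_np(np.tanh(x @ W + b))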
The docstring says the same thing: batch_norm "will steal the layer's nonlinearity if there is one (effectively introducing the normalization right before the nonlinearity), remove the layer's bias if there is one (because it would be redundant)". In other words, it introduces batch normalization right before the nonlinearity.
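You can also check the "steal the nonlinearity / remove the bias" behaviour directly. Again a sketch assuming the standard Lasagne API and the batch_norm function shown above:

import lasagne.layers as L
from lasagne.nonlinearities import tanh, identity

l_in = L.InputLayer((64, 768))
dense = L.DenseLayer(l_in, num_units=500, nonlinearity=tanh)
top = batch_norm(dense)

# the original DenseLayer is modified in place:
assert dense.nonlinearity is identity          # its tanh was "stolen"
assert dense.b is None                         # its bias was removed as redundant
assert isinstance(top, L.NonlinearityLayer)    # tanh is re-applied on top of the BatchNormLayer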
=================================================