The aim is to add noise to the network's parameters during training. For example:
with torch.no_grad():
    for param in model.parameters():
        param.add_(torch.randn(param.size()) * 0.1)
However, doing so sabotages backpropagation in PyTorch:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation.
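The error surfaces when the in-place edit happens between the forward and the backward pass: autograd has saved references to the original parameter tensors and detects that they changed by the time backward() runs. A minimal sketch of that failure mode (the two-layer model and tensor sizes are only illustrative):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
x = torch.randn(16, 4)

loss = model(x).sum()                # forward pass; autograd saves the tensors it needs

with torch.no_grad():                # in-place noise injected after the forward pass
    for param in model.parameters():
        param.add_(torch.randn(param.size()) * 0.1)

loss.backward()                      # raises the RuntimeError above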
One possible workaround is to write a custom layer, just like the implementation of Dropout in PyTorch (a sketch of such a noise layer follows the class listing below). The source code of Dropout in PyTorch is as follows:
class Dropout(_DropoutNd):
    r"""During training, randomly zeroes some of the elements of the input
    tensor with probability :attr:`p` using samples from a Bernoulli
    distribution. Each channel will be zeroed out independently on every forward
    call.

    This has proven to be an effective technique for regularization and
    preventing the co-adaptation of neurons as described in the paper
    `Improving neural networks by preventing co-adaptation of feature
    detectors`_ .

    Furthermore, the outputs are scaled by a factor of :math:`\frac{1}{1-p}` during
    training. This means that during evaluation the module simply computes an
    identity function.

    Args:
        p: probability of an element to be zeroed. Default: 0.5
        inplace: If set to ``True``, will do this operation in-place.
            Default: ``False``

    Shape:
        - Input: :math:`(*)`. Input can be of any shape
        - Output: :math:`(*)`. Output is of the same shape as input

    Examples::

        >>> m = nn.Dropout(p=0.2)
        >>> input = torch.randn(20, 16)
        >>> output = m(input)

    .. _Improving neural networks by preventing co-adaptation of feature
        detectors: https://arxiv.org/abs/1207.0580
    """

    def forward(self, input: Tensor) -> Tensor:
        return F.dropout(input, self.p, self.training, self.inplace)
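Following the same pattern, a noise layer can be written as an ordinary nn.Module that perturbs its input (the activations) rather than editing the weights in place, so the noise becomes part of the computation graph. This is only a sketch; the name GaussianNoise and the sigma argument are my own, not part of PyTorch:

import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Adds zero-mean Gaussian noise to the input, but only in training mode."""

    def __init__(self, sigma: float = 0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        if self.training and self.sigma != 0:
            # randn_like draws noise with the same shape, dtype and device as the input
            return input + torch.randn_like(input) * self.sigma
        return input

Like Dropout, such a layer is a no-op in eval() mode and can simply be dropped into a model, e.g. nn.Sequential(nn.Linear(4, 8), GaussianNoise(0.1), nn.ReLU(), nn.Linear(8, 2)). Dropout.forward itself delegates to F.dropout, whose source is: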
# Activation functions
def dropout(input, p=0.5, training=True, inplace=False):
    # type: (Tensor, float, bool, bool) -> Tensor
    r"""
    During training, randomly zeroes some of the elements of the input
    tensor with probability :attr:`p` using samples from a Bernoulli
    distribution.

    See :class:`~torch.nn.Dropout` for details.

    Args:
        p: probability of an element to be zeroed. Default: 0.5
        training: apply dropout if is ``True``. Default: ``True``
        inplace: If set to ``True``, will do this operation in-place.
            Default: ``False``
    """
    if not torch.jit.is_scripting():
        if type(input) is not Tensor and has_torch_function((input,)):
            return handle_torch_function(
                dropout, (input,), input, p=p, training=training, inplace=inplace)
    if p < 0. or p > 1.:
        raise ValueError("dropout probability has to be between 0 and 1, "
                         "but got {}".format(p))
    return (_VF.dropout_(input, p, training)
            if inplace
            else _VF.dropout(input, p, training))
However, the remaining problem is to find the _VF module: