• [Implementing a Convolutional Neural Network in Python] The Backward Pass of the Conv2D Convolutional Layer


    Code source: https://github.com/eriklindernoren/ML-From-Scratch

    Implementation of the Conv2D convolutional layer (with stride and padding): https://www.cnblogs.com/xiximayou/p/12706576.html

    Implementation of activation functions (sigmoid, softmax, tanh, relu, leakyrelu, elu, selu, softplus): https://www.cnblogs.com/xiximayou/p/12713081.html

    Definitions of loss functions (mean squared error, cross-entropy): https://www.cnblogs.com/xiximayou/p/12713198.html

    Implementation of optimizers (SGD, Nesterov, Adagrad, Adadelta, RMSprop, Adam): https://www.cnblogs.com/xiximayou/p/12713594.html

    In this section we continue working through the backward pass of the convolutional layer, following the code.

    Only the forward-pass and backward-pass code of Conv2D is shown here:

    def forward_pass(self, X, training=True):
        batch_size, channels, height, width = X.shape
        self.layer_input = X
        # Turn image shape into column shape
        # (enables dot product between input and weights)
        self.X_col = image_to_column(X, self.filter_shape, stride=self.stride, output_shape=self.padding)
        # Turn weights into column shape
        self.W_col = self.W.reshape((self.n_filters, -1))
        # Calculate output
        output = self.W_col.dot(self.X_col) + self.w0
        # Reshape into (n_filters, out_height, out_width, batch_size)
        output = output.reshape(self.output_shape() + (batch_size, ))
        # Redistribute axes so that batch size comes first
        return output.transpose(3, 0, 1, 2)

    def backward_pass(self, accum_grad):
        # Reshape accumulated gradient into column shape
        accum_grad = accum_grad.transpose(1, 2, 3, 0).reshape(self.n_filters, -1)

        if self.trainable:
            # Take dot product between column shaped accum. gradient and column shaped
            # layer input to determine the gradient at the layer with respect to layer weights
            grad_w = accum_grad.dot(self.X_col.T).reshape(self.W.shape)
            # The gradient with respect to bias terms is the sum, similarly to in the Dense layer
            grad_w0 = np.sum(accum_grad, axis=1, keepdims=True)

            # Update the layer's weights
            self.W = self.W_opt.update(self.W, grad_w)
            self.w0 = self.w0_opt.update(self.w0, grad_w0)

        # Recalculate the gradient which will be propagated back to prev. layer
        accum_grad = self.W_col.T.dot(accum_grad)
        # Reshape from column shape to image shape
        accum_grad = column_to_image(accum_grad,
                                     self.layer_input.shape,
                                     self.filter_shape,
                                     stride=self.stride,
                                     output_shape=self.padding)

        return accum_grad
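
    To make the shape bookkeeping concrete, here is a minimal self-contained sketch: a toy 4x4 input, one 3x3 all-ones filter, stride 1, no padding. The patch extraction below is a naive loop standing in for the repo's image_to_column (whose column ordering across a batch differs, though for batch size 1 it coincides):

    import numpy as np

    # Toy setup: batch=1, channels=1, 4x4 input, one 3x3 filter, stride 1, no padding.
    X = np.arange(16, dtype=float).reshape(1, 1, 4, 4)
    W = np.ones((1, 1, 3, 3))               # (n_filters, channels, fh, fw)
    out_h = out_w = 4 - 3 + 1               # = 2

    # Forward: unfold each 3x3 patch into one column (naive image_to_column).
    X_col = np.empty((9, out_h * out_w))    # (channels*fh*fw, out_h*out_w*batch)
    for i in range(out_h):
        for j in range(out_w):
            X_col[:, i * out_w + j] = X[0, 0, i:i+3, j:j+3].ravel()
    W_col = W.reshape(1, -1)                # (n_filters, channels*fh*fw)
    output = W_col.dot(X_col)               # convolution as one matrix product
    print(output.reshape(out_h, out_w))     # [[45. 54.] [81. 90.]]

    # Backward: pretend the upstream gradient is all ones and check grad_w.
    accum_grad = np.ones((1, out_h * out_w))
    grad_w = accum_grad.dot(X_col.T).reshape(W.shape)
    grad_w_direct = np.zeros_like(W)
    for i in range(out_h):
        for j in range(out_w):              # dL/dW[a,b] = sum over positions of X[i+a, j+b]
            grad_w_direct[0, 0] += X[0, 0, i:i+3, j:j+3]
    print(np.allclose(grad_w, grad_w_direct))   # True

    # Gradient w.r.t. the input, still in column shape; column_to_image must
    # then scatter-add these columns back to their image positions.
    grad_X_col = W_col.T.dot(accum_grad)    # (channels*fh*fw, out_h*out_w)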

    The network that drives these layers is defined in neural_network.py:

    def train_on_batch(self, X, y):
        """ Single gradient update over one batch of samples """
        y_pred = self._forward_pass(X)
        loss = np.mean(self.loss_function.loss(y, y_pred))
        acc = self.loss_function.acc(y, y_pred)
        # Calculate the gradient of the loss function wrt y_pred
        loss_grad = self.loss_function.gradient(y, y_pred)
        # Backpropagate. Update weights
        self._backward_pass(loss_grad=loss_grad)

        return loss, acc
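
    For context, train_on_batch would be driven by an outer mini-batch loop along these lines. This is a hedged sketch: model stands for an already-assembled NeuralNetwork instance, X_train/y_train for your data, and the batching helper below is my own stand-in, not the repo's batch iterator:

    import numpy as np

    def iterate_minibatches(X, y, batch_size=32):
        # Plain sequential batching; shuffling is omitted for brevity.
        for start in range(0, X.shape[0], batch_size):
            yield X[start:start + batch_size], y[start:start + batch_size]

    # model, X_train, y_train are assumed to already exist (see note above).
    for epoch in range(10):
        losses = []
        for X_batch, y_batch in iterate_minibatches(X_train, y_train):
            loss, acc = model.train_on_batch(X_batch, y_batch)
            losses.append(loss)
        print("epoch %d, mean loss %.4f" % (epoch, np.mean(losses)))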

    We also need to look at self._forward_pass and self._backward_pass:

    def _forward_pass(self, X, training=True):
        """ Calculate the output of the NN """
        layer_output = X
        for layer in self.layers:
            layer_output = layer.forward_pass(layer_output, training)

        return layer_output

    def _backward_pass(self, loss_grad):
        """ Propagate the gradient 'backwards' and update the weights in each layer """
        for layer in reversed(self.layers):
            loss_grad = layer.backward_pass(loss_grad)

    As we can see, the forward pass computes the output of every layer in self.layers, covering convolution, pooling, activation, normalization, and so on; the backward pass then walks the layers from back to front, updating each layer's gradient. Take a network consisting of a convolutional layer, a fully connected layer, and a loss function as an example. Once the forward pass completes, the first gradient obtained is that of the loss function. That gradient is fed into the fully connected layer, which computes its own gradient and passes it on to the convolutional layer; at that point the convolutional layer's backward_pass() method is called. Inside backward_pass(), if self.trainable is set, the gradients with respect to the weights W and the bias w0 are computed, the optimizers W_opt and w0_opt update the parameters, and the gradient with respect to the previous layer is then computed.
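
    The contract between layers is just these two methods. Here is a minimal toy illustration (the Scale layer below is my own stand-in, not something from the repo) of how _backward_pass threads the gradient through reversed(self.layers):

    import numpy as np

    class Scale:
        """Toy layer: y = s * x. Only mirrors the repo's layer interface."""
        def __init__(self, s):
            self.s = s
        def forward_pass(self, X, training=True):
            return self.s * X
        def backward_pass(self, accum_grad):
            # dL/dx = dL/dy * dy/dx = accum_grad * s
            return accum_grad * self.s

    layers = [Scale(2.0), Scale(3.0)]       # overall: y = 3 * (2 * x) = 6x

    x = np.array([1.0, 2.0])
    out = x
    for layer in layers:                    # like _forward_pass
        out = layer.forward_pass(out)

    loss_grad = np.ones_like(out)           # pretend dL/dy = 1
    for layer in reversed(layers):          # like _backward_pass
        loss_grad = layer.backward_pass(loss_grad)

    print(loss_grad)                        # [6. 6.] = dL/dx for y = 6x

    Finally, the backward pass ends with a call to the column_to_image() method: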

    def column_to_image(cols, images_shape, filter_shape, stride, output_shape='same'):
        batch_size, channels, height, width = images_shape
        pad_h, pad_w = determine_padding(filter_shape, output_shape)
        height_padded = height + np.sum(pad_h)
        width_padded = width + np.sum(pad_w)
        # Must be zero-initialized: np.add.at below accumulates overlapping
        # patch gradients into this array (np.empty would add to garbage values)
        images_padded = np.zeros((batch_size, channels, height_padded, width_padded))
    
        # Calculate the indices where the dot products are applied between weights
        # and the image
        k, i, j = get_im2col_indices(images_shape, filter_shape, (pad_h, pad_w), stride)
    
        cols = cols.reshape(channels * np.prod(filter_shape), -1, batch_size)
        cols = cols.transpose(2, 0, 1)
        # Add column content to the images at the indices
        np.add.at(images_padded, (slice(None), k, i, j), cols)
    
        # Return image without padding
        return images_padded[:, :, pad_h[0]:height+pad_h[0], pad_w[0]:width+pad_w[0]]

    This method reverses the reshaping that image_to_column() performed earlier to make the convolution easy to compute, restoring the columns to the padded image format (images_padded).
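
    One detail worth calling out is np.add.at: neighbouring patches overlap, so gradient contributions flowing back to the same pixel must be accumulated, and ordinary fancy-index assignment silently drops duplicate writes. A small standalone demonstration:

    import numpy as np

    idx = np.array([0, 1, 1, 2])    # index 1 appears twice (overlapping patches)

    a = np.zeros(4)
    a[idx] += 1.0                   # buffered: the duplicate write is lost
    print(a)                        # [1. 1. 1. 0.]

    b = np.zeros(4)
    np.add.at(b, idx, 1.0)          # unbuffered: duplicates accumulate
    print(b)                        # [1. 2. 1. 0.]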

    All the shape juggling during these computations can be a real headache, and you keep running into numpy functions of every variety that call for a trip to the documentation. It is enough to understand the overall process; doing so deepens your grasp of the underlying ideas.

  • Original article: https://www.cnblogs.com/xiximayou/p/12713930.html