
Convolutional Neural Network

Hubel and Wiesel (1962) experiment -> inspiration for CNNs: single neurons in the visual cortex respond to edges at specific orientations (e.g. 45 degrees)

filter kernel -> a small matrix convolved over patches of the image (typically 3x3, 5x5, or 7x7; smaller kernels generally work better); produces a feature map

CNN -> multiple layers of kernels (1st layer computes on the input image; subsequent layers compute on the feature maps generated by the previous layer)

stride -> number of pixels the kernel moves between computations on the same layer (larger stride = less overlap between adjacent patches)

(max) pooling kernel -> looks at a patch of the feature map and returns the (maximum) value in that patch (has no learnable parameters)

usually the number of feature maps is doubled after a pooling layer, e.g. (28x28)x128 -> pool -> (14x14)x128 -> next conv layer -> (14x14)x256
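The convolution and max-pooling steps above can be sketched in plain NumPy (a minimal illustration, single channel, stride 1, 'valid' borders; the edge-detector kernel values are my own example, not from the notes):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image ('valid' mode, stride 1) and return the feature map."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the maximum of each size x size patch."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = fmap[y * size:(y + 1) * size, x * size:(x + 1) * size].max()
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])  # a vertical-edge detector
fmap = conv2d_valid(image, edge_kernel)  # (6-3+1, 6-3+1) = (4, 4)
pooled = max_pool(fmap)                  # halved in each dimension -> (2, 2)
```

Note how the feature map shrinks to (n - k + 1) per side under 'valid' convolution, and pooling halves it again.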

Number of weights required per conv layer = (k1 x k1) x c1 x c2 for the 1st layer, where (k1, k1) is the filter kernel dimension, c1 is the number of channels in the input, and c2 is the number of feature maps in the 1st layer. For the 2nd layer: (k2 x k2) x c2 x c3, where c3 is the number of feature maps in the 2nd layer.
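The weight-count formula as a small helper (the bias term is my addition; the notes count only the kernel weights, but frameworks also learn one bias per feature map):

```python
def conv_params(k, c_in, c_out, bias=True):
    """Learnable parameters in a conv layer: (k*k) weights per input channel
    per output feature map, plus (optionally) one bias per feature map."""
    return k * k * c_in * c_out + (c_out if bias else 0)

# 1st layer: 3x3 kernels on an RGB image (c1=3) producing 64 feature maps
first = conv_params(3, 3, 64)      # 3*3*3*64 + 64 = 1792
# 2nd layer: 3x3 kernels on those 64 maps producing 128 maps
second = conv_params(3, 64, 128)   # 3*3*64*128 + 128 = 73856
```

Note the count is independent of the image size: weights are shared across all spatial positions.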

conv2d -> padding: 'same' adds 0's at the borders so the output has the same spatial dimensions as the input; 'valid' does the convolution on actual pixels only -> returns a smaller dimension relative to the image

technique: take a small subset of the train data and try to overfit the model on it (reach ~100% train accuracy) to verify that the model is expressive enough to learn the data

Deconvolutional layers (a misnomer; really transposed convolutions): upsampling an image using this layer (tf.layers.conv2d_transpose, tf.nn.conv2d_transpose)
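A minimal NumPy sketch of what a transposed convolution does (my own illustration, not the TF implementation): each input pixel scatters a kernel-sized patch into a larger output, with overlaps summed, so a stride-2 transposed conv roughly doubles the resolution.

```python
import numpy as np

def conv2d_transpose(x, kernel, stride=2):
    """Transposed convolution ('valid'-style, no output cropping): each input
    pixel contributes x[y, x] * kernel to a kernel-sized window of the output,
    with windows placed `stride` pixels apart."""
    ih, iw = x.shape
    kh, kw = kernel.shape
    out = np.zeros(((ih - 1) * stride + kh, (iw - 1) * stride + kw))
    for y in range(ih):
        for xi in range(iw):
            out[y * stride:y * stride + kh, xi * stride:xi * stride + kw] += x[y, xi] * kernel
    return out

small = np.ones((4, 4))
up = conv2d_transpose(small, np.ones((2, 2)), stride=2)  # upsampled to (8, 8)
```

In a real layer the kernel values are learned, so the network learns how to fill in the upsampled pixels rather than just copying them.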

Transfer Learning:

using pretrained networks as a starting point for a new task (reusing a subset of the layers), e.g. VGG (Visual Geometry Group) networks (224x224 input -> 1000 classes)

classification (what) & localization (where): CNNs work great for classification since they are invariant to location; to predict the location, use the earlier layers (which contain locality info) for the final output

can be used to identify a class not in the 1000 pretrained classes

can be used on inputs of a different size, e.g. 64x64 (feasibility depends on the first layer's filter size)

Regularization:

Dropout-based regularization is great for image classification applications. (Warning: not to be used on data without redundancy; image data has a lot of redundancy, e.g. identifying a face from a partial view is quite easy.)
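Dropout itself is simple to sketch (this is the "inverted dropout" variant used by modern frameworks; the rate and seed here are arbitrary illustration values):

```python
import numpy as np

def dropout(activations, rate=0.5, training=True, seed=0):
    """Inverted dropout: during training, zero a random `rate` fraction of
    units and scale survivors by 1/(1-rate) so the expected activation is
    unchanged; at test time the layer is a no-op."""
    if not training:
        return activations
    keep = 1.0 - rate
    rng = np.random.default_rng(seed)
    mask = rng.random(activations.shape) < keep
    return activations * mask / keep

acts = np.ones(1000)
dropped = dropout(acts, rate=0.5)   # each unit is either 0.0 or scaled to 2.0
```

Because each unit can vanish at any step, the network is forced to spread the signal across redundant units, which is exactly why dropout relies on the data (and features) being redundant.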