Convolutional Neural Network
============================
Hubel and Wiesel (1962) experiments -> inspiration for CNNs
    a single neuron detects edges oriented at 45 degrees
filter kernel -> a small matrix convolved over patches of the image
    (typically 3x3, 5x5, or 7x7; smaller tends to work better)
    returns a feature map
CNN -> multiple layers of kernels (the 1st layer computes on the input image,
    subsequent layers compute on the feature maps generated by the previous
    layer)
strides -> number of pixels the kernel shifts between successive
    computations on the same layer (a stride smaller than the kernel size
    means neighbouring patches overlap)
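The kernel/stride mechanics above can be sketched in plain NumPy (a toy single-channel version; `conv2d_valid` and the edge kernel are illustrative, not library code):

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` with the given stride ('valid' style:
    the kernel only visits positions fully inside the image)."""
    k = kernel.shape[0]
    n = image.shape[0]
    out = (n - k) // stride + 1
    fmap = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            fmap[i, j] = np.sum(patch * kernel)   # dot product with the patch
    return fmap

# A 3x3 vertical-edge kernel applied to an 8x8 image:
image = np.zeros((8, 8))
image[:, 4:] = 1.0                # right half bright -> a vertical edge
edge_kernel = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]])
fmap = conv2d_valid(image, edge_kernel, stride=1)
print(fmap.shape)                 # (6, 6): (8 - 3)//1 + 1 = 6
```

The feature map responds strongly where the edge sits under the kernel, which is the "single neuron detects an oriented edge" idea from Hubel and Wiesel.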
(max) pooling kernel -> looks at a patch of the image and
    returns the (maximum) value in that patch
    (doesn't have any learnable parameters)
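A minimal max-pooling sketch in the same NumPy style (`max_pool` is a made-up helper name):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """2x2 max pooling: keep only the largest value in each patch.
    There are no learnable parameters -- it is a fixed reduction."""
    n = fmap.shape[0]
    out = (n - size) // stride + 1
    pooled = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            pooled[i, j] = fmap[i*stride:i*stride+size,
                                j*stride:j*stride+size].max()
    return pooled

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(fmap))   # halves each spatial dimension: 4x4 -> 2x2
```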
usually the number of feature maps is doubled after a pooling layer is
    computed: maps of (n,n) -> (m,m), e.g. (28x28)x128 -> pool ->
    (14x14)x128 -> conv -> (14x14)x256
No. of weights required per layer = (k1 x k1) x c1 x c2 in the 1st layer
    ((k1, k1) is the dimension of the filter kernel, c1 the number of
    channels in the input, c2 the number of feature maps in the 1st layer)
    for the 2nd layer: (k2 x k2) x c2 x c3 (c3 = number of feature maps
    in the 2nd layer)
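The weight-count formula is plain arithmetic and easy to check (the layer sizes below are made-up examples; biases, one per output map, are not counted in the note above):

```python
def conv_weights(k, c_in, c_out):
    """Weights in a conv layer: (k x k) kernel per (input, output) map pair."""
    return k * k * c_in * c_out

# 1st layer: 3x3 kernels, RGB input (c1 = 3), 64 feature maps (c2 = 64)
print(conv_weights(3, 3, 64))    # 1728
# 2nd layer: 3x3 kernels, 64 -> 128 feature maps
print(conv_weights(3, 64, 128))  # 73728
```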
conv2d -> padding 'same' adds 0's at the borders so the output
    dimensions match the input image size
    'valid' does the convolution on actual pixels alone -> returns
    a smaller dimension relative to the image
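The output-size rules for 'same' vs 'valid' can be written down directly (following the usual TensorFlow convention; `conv_out_size` is an illustrative helper, not a library call):

```python
import math

def conv_out_size(n, k, stride=1, padding="valid"):
    """Spatial output size of a conv layer on an (n x n) input
    with a (k x k) kernel."""
    if padding == "same":
        return math.ceil(n / stride)      # zero-padded borders
    return (n - k) // stride + 1          # 'valid': actual pixels only

print(conv_out_size(28, 3, padding="same"))            # 28
print(conv_out_size(28, 3, padding="valid"))           # 26
print(conv_out_size(28, 3, stride=2, padding="same"))  # 14
```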
technique: use a smaller subset of the train/test data and try to overfit
    the model (reach ~100% on train to verify that the model is expressive
    enough to learn the data)
Deconvolutional Layers (a misnomer; really transposed convolutions):
    upsampling an image using this layer
    (tf.layers.conv2d_transpose, tf.nn.conv2d_transpose)
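A rough NumPy sketch of the upsampling step inside a stride-2 transposed convolution (zero insertion only; a real conv2d_transpose layer then convolves this with a learned kernel to fill the gaps -- the helper name here is invented):

```python
import numpy as np

def upsample_zero_insert(x, stride=2):
    """Insert (stride - 1) zeros between neighbouring pixels,
    growing an (n x n) map to (n*stride x n*stride)."""
    n = x.shape[0]
    up = np.zeros((n * stride, n * stride))
    up[::stride, ::stride] = x    # originals land on a sparse grid
    return up

x = np.ones((2, 2))
up = upsample_zero_insert(x)
print(up.shape)   # (4, 4)
```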
Transfer Learning:
==================
using pretrained networks as a starting point for a task (reusing a subset
    of layers)
    e.g. VGG (Visual Geometry Group) networks (224x224 input -> 1000 classes)
    -> classification (what) & localization (where)
CNNs work great for classification (since the output is largely invariant
    to location); to predict the location, use the earlier layers (which
    contain locality info) for the final output
can be used to identify a class not in the 1000 pretrained classes
can be used with a different input size, e.g. 64x64 (depends on the first
    layer's filter size)
Regularization:
===============
Dropout-based regularization is great for image classification applications.
    (Warning: not to be used on data without redundancy; image data has a lot
    of redundancy, e.g. identifying a face from a partial view is quite easy)
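A sketch of (inverted) dropout in NumPy, the variant most frameworks use at training time (the function name and seed are illustrative):

```python
import numpy as np

def dropout(x, rate=0.5, rng=None):
    """Inverted dropout: zero out a random `rate` fraction of activations,
    then rescale survivors by 1/(1 - rate) so the expected sum is unchanged.
    Applied only during training; at test time the layer is the identity."""
    if rng is None:
        rng = np.random.default_rng(0)   # fixed seed for reproducibility
    mask = rng.random(x.shape) >= rate   # True = keep this activation
    return x * mask / (1.0 - rate)

acts = np.ones((4, 4))
dropped = dropout(acts, rate=0.5)
print(dropped)   # each entry is either 0.0 or 2.0
```

Forcing the network to classify with random activations missing only works because redundant inputs (like images) still carry enough signal, which is the warning above.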