diff --git a/LinearAlgebra.ipynb b/FirstSaturday/LinearAlgebra.ipynb
similarity index 100%
rename from LinearAlgebra.ipynb
rename to FirstSaturday/LinearAlgebra.ipynb
diff --git a/LinearAlgebra.md b/FirstSaturday/LinearAlgebra.md
similarity index 100%
rename from LinearAlgebra.md
rename to FirstSaturday/LinearAlgebra.md
diff --git a/ComputationalGraphs.ipynb b/FirstSunday/ComputationalGraphs.ipynb
similarity index 100%
rename from ComputationalGraphs.ipynb
rename to FirstSunday/ComputationalGraphs.ipynb
diff --git a/linear_regression_tf_lowlevel.py b/FirstSunday/linear_regression_tf_lowlevel.py
similarity index 100%
rename from linear_regression_tf_lowlevel.py
rename to FirstSunday/linear_regression_tf_lowlevel.py
diff --git a/SecondSaturday/Notes.md b/SecondSaturday/Notes.md
new file mode 100644
index 0000000..556719f
--- /dev/null
+++ b/SecondSaturday/Notes.md
@@ -0,0 +1,74 @@
+1. flowchart application -> iOS mobile, find symbols through gestures.
+2. matrimony portal profiles -> finding matching profiles.
+3. swipe input keyboard for Indian languages.
+4. MNIST hand-written digit database -> build an application for recognizing full phone numbers (10 digits).
+5. live cricket feed -> generating highlights of the match.
+6. designing a chatbot for getting cricket scores.
+
+
+# general approach to machine-learning
+1. model, objective and learning algo
+2. ML is a technique for learning from examples
+    -> recommending a smart-phone to a friend: price, brand, cam quality, screen size, processing speed. ==> model
+    -> objective (why is he buying the phone) should be reducible to a single number.
+    -> learning algo
+    binary features -> camera 0|1
+                    -> screen small|large
+                    -> battery good|bad
+                    -> memory high|low
+                    -> processing fast|slow
+                    -> audio good|bad
+
+prior probability -> probability of an event occurring without any knowledge of conditions
+
+P(A)*P(B|A) = P(B)*P(A|B) = P(A,B)
+
+P(+ve|[x1,x2...xn]) = P([x1,x2...xn]|+ve)*P(+ve)/P([x1,x2...xn])
+    = P(x1|+ve)*P(x2|+ve)*...*P(xn|+ve)*P(+ve)/P([x1,x2...xn])   (assuming the features are independent given the class)
+    = Pi (i=1 to n) P(xi|+ve) * P(+ve) (class prior) / Sum (C=+ve,-ve) P([x1,x2...xn],C)
+    = Pi (i=1 to n) P(xi|+ve) * P(+ve) (class prior) / ( Pi (i=1 to n) P(xi|+ve)*P(+ve) + Pi (i=1 to n) P(xi|-ve)*P(-ve) )
+
+P(X) = Sum (y=y1...yn) P(X,Y=y)
+
+W2 = P(+ve|xi=1) (human approach) (Naive Bayes: naively assuming that all features are independent)
+
+
+Regression: output contains real values -> (predicting the position of joints of a body given an image of a person)
+Classification: output belongs to a discrete set of classes
+    -> predicting the posture of a person (sitting, walking, standing, running) given an image of a person
+(Numerical/Categorical)
+
+Representation Learning: embedding
+
+Deep learning is all about hierarchical representation learning.
+
+Metric Learning: distance (of facial features) / similarity (of fashion apparel) / relevance (of search documents)
+Structured Output (Models): corrects dependent outputs based on the outputs higher up in the hierarchy.
+
+Types of input:
+Bag of features, bag of words: (finding whether a feature is present or not without caring where the feature occurs in the input)
+eg: Using unsupervised learning to convert the input to a given set of classes (clusters) and using them as a bag of features.
+
+Spatial data (sequential data): if there are local dependencies use a CNN (convolutional NN); if there are near-past dependencies in the data use an RNN (recurrent NN, e.g. LSTM)
+eg: stock market temporal data / speech data / image data
+
+Non-Parametric models: k-NN (k-nearest neighbor), Decision Trees, Random Forests (no fixed set of parameters; the model grows with the data)
+    -> assume little about the data, so can be inaccurate when data is scarce
+Parametric Models: based on a fixed set of parameters
+    -> can be more accurate when the assumed form fits, because the parameters are learned from the data
+
+
+Types of Learning:
+    supervised learning -> labeled data
+    unsupervised learning -> unlabeled data
+        exercise: *take 3s from MNIST data *create a GMM model with them and *cluster them with 5/3/10 gaussians.
+                  *take all images and cluster them to 10 gaussians.
+    semi-supervised learning -> combination of supervised and unsupervised models
+
+    Auto Encoder:
+        finding a low dimensional representation of high dimensional data.
+        eg. from an image of 200x200 pixels create a 128-dimensional fingerprint of the image.
+        exercise: use the 128 dimensional data to reconstruct the 200x200 image (using the inverse of the model).
+
+    Reinforcement learning:
+        eg: playing chess -> using the final result of the game to assign weights/scores to the moves that were made up to the final result, and training the model to predict based on those scores.
diff --git a/SecondSaturday/Workshop.md b/SecondSaturday/Workshop.md
new file mode 100644
index 0000000..2cbf17b
--- /dev/null
+++ b/SecondSaturday/Workshop.md
@@ -0,0 +1,28 @@
+# swipe input keyboard for Indian languages.
+given a gesture made on the keyboard and the language chosen (-> keyboard layout),
+predict the word that matches closest.
+
+
+## Input
+gesture data -> polygon (shape, size, corners, path), (time, pauses)?, spatial data with word-character correlation.
+weighted vocabulary, corpus for the language, history of gesture-word mappings/corrections for the user.
+language, keyboard layout
+
+## Output
+  Predict the word
+
+## Model
+  Structured Output / HMM / CNN?
+
+
+# MNIST hand-written digit database -> build an application for recognizing full phone numbers (10 digits).
+
+## Input
+MNIST digit database; generated 10-digit images with random positioning, orientation and scale of individual digit images sampled randomly from the MNIST database.
+
+## Output
+  predict the phone number
+
+## Model
+  regression model to identify the points where the image has to be split, and pass the
+  split images to an MNIST digit recognizer to identify each digit.
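
Note (not part of the patch): the Naive Bayes posterior written out in SecondSaturday/Notes.md can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the workshop's implementation: the function names `fit_naive_bayes` and `posterior`, the toy phone data, and the Laplace smoothing term `alpha` are all hypothetical additions for illustration.

```python
from math import prod


def fit_naive_bayes(samples, labels, alpha=1.0):
    """Estimate the class priors P(C) and the likelihoods P(x_i = 1 | C)
    from binary feature vectors."""
    classes = sorted(set(labels))
    n_features = len(samples[0])
    priors, likelihoods = {}, {}
    for c in classes:
        rows = [x for x, y in zip(samples, labels) if y == c]
        priors[c] = len(rows) / len(samples)
        # Laplace smoothing (alpha) is an assumption added here; it keeps
        # the estimated probabilities away from exactly 0 and 1.
        likelihoods[c] = [
            (sum(r[i] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for i in range(n_features)
        ]
    return priors, likelihoods


def posterior(x, priors, likelihoods):
    """Return P(C | x) for every class C using the naive factorization
    P(x, C) = P(C) * product over i of P(x_i | C)."""
    joint = {
        c: priors[c]
        * prod(p if xi else 1 - p for xi, p in zip(x, likelihoods[c]))
        for c in priors
    }
    evidence = sum(joint.values())  # P(x) = sum over C of P(x, C)
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical toy data: [camera present, screen large, battery good]
samples = [[1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
labels = ["+ve", "+ve", "-ve", "-ve", "+ve", "-ve"]

priors, likelihoods = fit_naive_bayes(samples, labels)
post = posterior([1, 1, 1], priors, likelihoods)
print(post)
```

With this toy data, a phone that has all three features gets a posterior of about 0.9 for `+ve`. The denominator is computed by summing the joint over both classes, which mirrors the last line of the derivation in the notes.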