# ILLINOIS CS 446 - 103117.1 (9 pages)



## Lecture 18: Convolutional Neural Networks

*CS446 Machine Learning, Fall 2017. Lecturer: Sanmi Koyejo. Scribe: Yayi Ning. October 31st, 2017.*

### Introduction

This lecture includes a recap of feed-forward neural networks and introduces convolutional neural networks.

### Recap: Neural Networks

A typical two-hidden-layer feed-forward neural network is also called a multilayer perceptron (McCullock, 2012).

Number of layers = number of hidden layers + 1 output layer.

The final prediction function is

$$f(x) = u\big(W_l\, g(W_{l-1}\, g(\cdots g(W_1^T x)))\big) \tag{1}$$

where $g$ is the activation function and $u$ is the output-layer function. There is also a bias term $b_i$ at every layer; adding bias terms can be very useful in implementation.

The loss function is

$$L = \sum_{i=1}^{n} \ell\big(y_i, f(x_i)\big) \tag{2}$$

An overview of common choices when constructing a neural network:

| $u$ (output nonlinearity) | $\ell$ (loss function) | Problem type |
| --- | --- | --- |
| Sigmoid | Log loss | Binary classification |
| Softmax | Cross-entropy | Multiclass |
| Linear | Hinge loss | Binary classification |
| Sigmoid | Square loss | Multiclass / multilabel ($\{0,1\}^k$) |
| Linear | Square loss | Regression ($\mathbb{R}^k$) |

**Feature mapping.** For simplicity, take the output nonlinearity $u$ to be linear. Then

$$f(x) = u\big(W_l^T\, \phi(x)\big) \tag{3}$$

$$\phi(x) = g\big(W_{l-1}\, g(\cdots g(W_1^T x))\big) \tag{4}$$

where $\phi$ is a feature representation: it passes the input through the nonlinear layers so that the final prediction is linear in $\phi(x)$.

**Regularization.** L1 regularization:

$$w^* = \operatorname*{argmin}_w \sum_{i=1}^{n} \ell\big(y_i, f_W(x_i)\big) + \lambda \lVert W \rVert_1 \tag{5}$$

L2 regularization:

$$w^* = \operatorname*{argmin}_w \sum_{i=1}^{n} \ell\big(y_i, f_W(x_i)\big) + \lambda \lVert W \rVert_2^2 \tag{6}$$

### Optimization: Stochastic Gradient Descent (SGD)

**Q:** Does SGD find the optimal model, i.e., the model that actually minimizes the loss,

$$w^* = \operatorname*{argmin}_w \sum_{i=1}^{n} \ell\big(y_i, f(x_i)\big), \tag{7}$$

where $f$ is a neural network?

**A:** In general, no. The neural-network objective is not convex, even with a linear activation function, and this nonconvexity does not depend on the choice of loss function or activation function. For example, with

$$f(x) = u\big(w_2\, g(w_1\, g(w_0^T x))\big), \quad w = (w_2, w_1, w_0),$$

the objective is not convex in $w$.

**Exercise:** show that $f(w_1, w_2) = w_1 w_2$ is not convex.

The reason behind this is that the composition of two convex functions is not necessarily convex. If $a$ and $b$ are two convex functions, then in
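The prediction function (1) and loss (2) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture: the layer widths, the random weights, and the sigmoid/log-loss pairing (the binary-classification row of the table) are assumptions made for the sketch.

```python
# Sketch of prediction function (1) and loss (2) for a two-hidden-layer
# feed-forward network with sigmoid activation g and sigmoid output u.
# Dimensions and weights below are illustrative, not from the lecture.
import numpy as np

rng = np.random.default_rng(0)

def g(z):
    """Activation function: sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

d, h1, h2 = 4, 8, 8                  # input dim and hidden widths (illustrative)
W1 = rng.normal(size=(d, h1))        # first hidden layer
W2 = rng.normal(size=(h1, h2))       # second hidden layer
W3 = rng.normal(size=(h2, 1))        # output layer
b1, b2, b3 = np.zeros(h1), np.zeros(h2), np.zeros(1)  # bias at every layer

def f(x):
    # f(x) = u(W3^T g(W2^T g(W1^T x))), with u = sigmoid (binary classification)
    return g(W3.T @ g(W2.T @ g(W1.T @ x + b1) + b2) + b3)

def log_loss(y, p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

X = rng.normal(size=(5, d))          # five example inputs
y = rng.integers(0, 2, size=5)       # binary labels
L = sum(log_loss(y[i], f(X[i])[0]) for i in range(len(X)))  # loss (2)
print(float(L))
```

Swapping `g`, the output row of the table, or `log_loss` for another (loss, nonlinearity) pair from the table changes the problem type without changing the structure of (1) and (2).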
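The exercise above can be checked numerically. This is a minimal sketch: convexity requires $f\big(\tfrac{p+q}{2}\big) \le \tfrac{f(p)+f(q)}{2}$ for all points $p, q$, so a single counterexample pair (chosen here for illustration; the specific points are not from the notes) is enough to disprove it.

```python
# f(w1, w2) = w1 * w2 is not convex: exhibit two points whose midpoint
# value exceeds the average of the endpoint values.

def f(w1, w2):
    return w1 * w2

p = (1.0, -1.0)
q = (-1.0, 1.0)
mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)  # midpoint = (0, 0)

lhs = f(*mid)               # value at the midpoint: 0.0
rhs = (f(*p) + f(*q)) / 2   # average of endpoint values: -1.0
print(lhs, rhs, lhs <= rhs) # 0.0 -1.0 False -> convexity violated
```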
order to make the composition $(a \circ b)(x)$ convex, additional properties are needed, such as the outer function being monotone nondecreasing.

This nonconvexity implies that SGD gives us a local optimum. In practice, a local optimum is often good enough for the final prediction; we are still trying to understand the underlying mechanics of why this works so well. In any case, evaluation is most important when we construct a model: we need a good sense of how accurately our model predicts.

### Convolutional Neural Networks (CNN): Introduction

We have already seen classification models. But what should we do when we want to detect a particular pattern, such as a shape in an image? For example, suppose our data looks like the figure below.

[Figure: a data vector containing the target pattern.]

Assume we know the pattern we are looking for is a rectangle. How could we build a model for this?

**Possible solution 1.** Build the weight vector by listing rectangles at all possible locations.

[Figure: weight vectors enumerating the rectangle at every possible location.]

The prediction function thresholds the maximum response $\max(z)$, where $z$ collects the responses of the position-specific weight vectors:

- $h(x) = 1$ when $\max(z) \geq$ threshold;
- $h(x) = 0$ otherwise.

**An alternative solution.** Instead, keep just one weight vector (a filter) and shift it around the data vector, as shown in the figure below. This approach is spatially efficient.

[Figure: a single filter shifted across the data vector.]

**Q:** What if we do not know $W$ ahead of time?
**A:** We can use backpropagation to learn $W$.

**Q:** What if we have multiple patterns to search for?
**A:** We can use multiple filters.

**Q:** What if patterns are compositions (combinations of parts)?
**A:** We can use multiple layers.

This is the basic idea of convolution.

### Overview of Convolutional Neural Networks

Convolutional neural networks are constructed similarly to ordinary neural networks, but they are composed of convolutional layers. CNNs are designed specifically for image prediction and are inspired by how the human brain recognizes images: a CNN processes the image in smaller pieces and evaluates them with filters. CNNs are in fact easier to train than ordinary neural networks, since they have fewer parameters.
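The "one filter shifted around the data vector" idea can be sketched as a 1-D cross-correlation. This is an illustrative toy, not the lecture's code: the signal, the filter values, and the threshold are assumptions. The final check compares the number of filter positions against the spatial-arrangement formula $W_2 = (W - F + 2P)/S + 1$ given later in the notes, with $P = 0$.

```python
# Slide one filter w across a data vector x; z[i] is the response at
# position i. The pattern here is a "block of ones" (a 1-D rectangle).
import numpy as np

def slide_filter(x, w, stride=1):
    """Return z with z[i] = dot(w, x[i : i + len(w)]) at each valid position."""
    F = len(w)
    return np.array([x[i:i + F] @ w for i in range(0, len(x) - F + 1, stride)])

x = np.array([0., 0., 1., 1., 1., 0., 0., 1., 0.])  # data vector
w = np.array([1., 1., 1.])                          # filter for the pattern

z = slide_filter(x, w)
threshold = 3.0
h = 1 if z.max() >= threshold else 0  # 1 iff the pattern occurs somewhere
print(z, h)

# Number of positions matches W2 = (W - F + 2P)/S + 1 with P = 0, S = 1.
W_, F_, S_, P_ = len(x), len(w), 1, 0
assert len(z) == (W_ - F_ + 2 * P_) // S_ + 1
```

The same filter is reused at every position, which is exactly why this is spatially efficient compared with enumerating one weight vector per location.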
[Figure: the left side shows an ordinary 3-layer neural network; the right side shows a 3-layer convolutional neural network. Instead of 2-D hidden layers, the CNN has volumes of hidden convolutional layers. Figure source: pacocp github (2016).]

The figure below gives an overview of the CNN pipeline (pacocp github, 2016).

[Figure: CNN pipeline of convolution, pooling, and fully connected layers.]

Usually our input image is 3-D. If an image is $32 \times 32 \times 3$, its height and width are 32 by 32, and 3 is the number of RGB color channels. Assume we want to construct 10 filters, for example to recognize the digits 0 to 9. Then our first convolutional layer will have dimension $32 \times 32 \times 10$. Pooling shrinks the spatial dimensions, for example to $16 \times 16 \times 10$. Fully connected layers then compute the class scores, in this case how likely the image is to be each digit 0 through 9.

Let

- $W$ = input volume size
- $F$ = filter size
- $S$ = stride
- $P$ = number of zero paddings

The following formula tells how well our spatial arrangement of hyperparameters fits:

$$W_2 = \frac{W - F + 2P}{S} + 1 \tag{8}$$

If $W_2$ turns out not to be an integer, then our hyperparameters probably will not give us a good fit.

Next class will continue on convolutional neural networks.

### Bibliography

- McCullock, J. (2012). Introduction: The XOR problem.
- pacocp github (2016). Convolutional neural network (CNN).
