ILLINOIS CS 446 - 110217.1 (7 pages)


Pages: 7
School: University of Illinois - Urbana
Course: CS 446 - Machine Learning

CS446: Machine Learning, Fall 2017
Lecture 20: CNN, RNN and LSTM
Lecturer: Sanmi Koyejo
Scribe: Wei Qian, Nov 2nd, 2017

Announcement

- Exam 2 is next Thursday (11/09/17).
- Short review/recap next Tuesday (11/07/17).
- The exam is not explicitly cumulative and will include lectures from 10/3/17 to 10/31/17.
- The course final project is released and will be due on 12/19/17.

Agenda for Today

- Recap CNN
- CNN (continued): pooling layer, output layer
- Demo
- Introduction to Recurrent Neural Networks (RNN)

Agenda for Next Tuesday

- RNN (continued), with LSTM
- Start Unsupervised Learning
- Short Exam 2 Review

CNN Operation and Output Tensor

If we do convolution on a $W_1 \times H_1 \times D_1$ input tensor using $K$ filters of size $F \times F \times D_1$, with padding $P$ and stride $S$,

Figure 1: Convolution Operation

the resulting output tensor will have shape $W_2 \times H_2 \times D_2$, where

$$W_2 = \frac{W_1 - F + 2P}{S} + 1, \qquad H_2 = \frac{H_1 - F + 2P}{S} + 1, \qquad D_2 = K$$

Model Hyper-parameters

As we can see here, $K$, $F$, $P$, $S$ and sometimes the filter depth $D$ are all hyper-parameters, so

- for a small network, we do a hyper-parameter search (cross validation, Bayesian optimization);
- for a large network that can take days or weeks to train, we just start with others' published parameters and tweak them based on our own task.

CNN Pooling and Output Layer

Recall that in a CNN we also have pooling and output layers, besides the convolution layers described above.

Figure 2: Convolutional Neural Network Architecture

Pooling Layer

The main idea of the pooling layer is to capture location and scaling invariance of the input data.

- Max Pooling: max of the values in the block
- Average Pooling: average of the values in the block

Figure 3: Example of Max Pooling

Similar to convolution, we can control the following parameters for the pooling layer:

- Pooling function (max, average, etc.)
- Filter size $F$
- Stride $S$, which is usually the same as the filter size $F$
- Padding $P$

If we consider the 1D pattern matching example from last lecture, one prediction function is to take the max of the final output vector Z, and that would give us 1 regardless where
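
The output-shape formula and the pooling operations in these notes can be checked with a small sketch. This is an illustration only and not code from the lecture: the function names `conv_output_shape` and `pool_2d`, the example input sizes, and the choice to reject settings where the stride does not divide evenly are all assumptions made here (many libraries round down instead). Pooling is shown without padding, on a plain list-of-lists matrix.

```python
def conv_output_shape(w1, h1, k, f, p, s):
    """Return (W2, H2, D2) for K filters of size F x F x D1, padding P, stride S.

    The input depth D1 does not appear because the filter depth matches it,
    so the output depth is simply K.
    """
    # Assumption: reject settings where the filter does not tile the padded input.
    if (w1 - f + 2 * p) % s != 0 or (h1 - f + 2 * p) % s != 0:
        raise ValueError("stride does not evenly divide W1 - F + 2P or H1 - F + 2P")
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k


def pool_2d(x, f, s, reduce_fn=max):
    """Apply reduce_fn to each F x F block of matrix x, moving with stride S."""
    rows = (len(x) - f) // s + 1
    cols = (len(x[0]) - f) // s + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Collect the values of the F x F block whose top-left corner is (i*s, j*s).
            block = [x[i * s + a][j * s + b] for a in range(f) for b in range(f)]
            row.append(reduce_fn(block))
        out.append(row)
    return out


def average(block):
    """Average of the values in a block, for average pooling."""
    return sum(block) / len(block)


# Hypothetical example values, chosen only for illustration.
x = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]

print(conv_output_shape(32, 32, k=10, f=5, p=2, s=1))  # (32, 32, 10)
print(pool_2d(x, f=2, s=2))                            # [[6, 8], [3, 4]]
print(pool_2d(x, f=2, s=2, reduce_fn=average))         # [[3.75, 5.25], [2.0, 2.0]]
```

With F = S = 2, each output entry summarizes one non-overlapping 2 x 2 block, which matches the common setting noted above where the stride equals the filter size.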


