
**Slide 1: Title**
95-891 Introduction to Artificial Intelligence, Session 6: Artificial Neural Networks
David Steier (steier@andrew.cmu.edu), September 12, 2024

**Slide 2: Agenda**
- What are artificial neural networks?
- Convolutional neural networks (CNNs)
- Gradient descent and backpropagation
- Transfer learning
- Recurrent neural networks (RNNs); LSTMs (Long Short-Term Memories)
- Reminders: HW 2 due Thursday, Sep. 12; quiz Tuesday, Sep. 17
- Appendices: Backpropagation in action; Generative Adversarial Networks (GANs); Variational Autoencoders

**Slide 3: Traditional Machine Learning vs. Deep Learning**
G. Kesari, "The Real Reason behind all the Craze for Deep Learning," Sep. 10, 2018, https://towardsdatascience.com/decoding-deep-learning-a-big-lie-or-the-next-big-thing-b924298f26d4

**Slide 4: Why Neural Networks?**
- Neurons produce outputs within milliseconds of receiving inputs.
- Brains are networks of 85 billion neurons, each of which has thousands of interconnections through synapses.
- Hebbian learning is a theory that brains learn to produce new outputs by changing the strengths of the connections between neurons: "neurons that fire together wire together."
E. A. Weaver and H. H. Doyle, "How Does the Brain Work?", August 11, 2019, https://www.dana.org/article/how-does-the-brain-work

**Slide 5: An Artificial Neuron: Perceptron (Rosenblatt, 1957)**
S. Raschka, "Single-Layer Neural Networks and Gradient Descent," May 24, 2015, https://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#adaptive-linear-neurons-and-the-delta-rule

**Slide 6: Artificial Neurons: Activation Functions**
S. Raschka, "Single-Layer Neural Networks and Gradient Descent," May 24, 2015, https://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#adaptive-linear-neurons-and-the-delta-rule

**Slide 7: Multi-Layered Artificial Neural Networks**
Layers: input layer, hidden layer 1,
hidden layer 2, output layer.
- Neurons are interconnected, so the outputs of some neurons become the inputs of others.
- Neurons arranged in layers can recognize complex combinations of features: the "deep" in deep learning.
Goodfellow, Bengio, and Courville, Deep Learning, 2016, http://www.deeplearningbook.org/contents/intro.html, p. 6

**Slide 8: Classifying Images with Artificial Neural Networks**
Goodfellow, Bengio, and Courville, Deep Learning, 2016, http://www.deeplearningbook.org/contents/intro.html, p. 6

**Slide 9: CNNs and RNNs**
- Convolutional neural networks: images and other data where you need to look at some context.
- Recurrent neural networks: sequential data where time plays an important role.
C. C. Chatterjee, "Basics of the Classic CNN," July 31, 2019, https://towardsdatascience.com/basics-of-the-classic-cnn-a3dce1225add

**Slide 10: Types of Layers in CNNs**
- Convolutional: computes the output of neurons connected to a small input region.
- Pooling: element-wise downsampling.
- Fully connected: computes class scores.
Peng, M., et al., "Dual Temporal Scale Convolutional Neural Network for Micro-Expression Recognition," Frontiers in Psychology, 13 October 2017, https://www.frontiersin.org/articles/10.3389/fpsyg.2017.01745/full

**Slide 11: Convolution**
Convolution applies a convolution filter to the input to produce a feature map.
https://towardsdatascience.com/applied-deep-learning-part-4-convolutional-neural-networks-584bc134c1e2

**Slide 12: Convolutional Layer**
We slide the filter matrix (green) over the original image (blue) by 1 pixel at a time (also called the "stride"), and for every position we compute the element-wise multiplication of the two matrices and add the products to get a single integer, which forms one element of the output matrix (pink). We perform multiple convolutions on an input, each using a different filter and resulting in a distinct feature map. We then stack all these feature maps together, and that
becomes the final output of the convolution layer.
[Figure: 5x5x3 filter, 32x32x3 input, 32x32x1 feature map per filter, 10 filters.]
https://towardsdatascience.com/applied-deep-learning-part-4-convolutional-neural-networks-584bc134c1e2

**Slide 13: Pooling Layer**
- Pooling enables us to reduce the number of parameters, which both shortens training time and combats overfitting.
- Pooling layers downsample each feature map independently, reducing the height and width while keeping the depth intact.
https://towardsdatascience.com/applied-deep-learning-part-4-convolutional-neural-networks-584bc134c1e2

**Slide 14:** https://playground.tensorflow.org

**Slide 15: Exercise**
- Go to https://playground.tensorflow.org. Set the data set to the spiral on the lower left and the noise to 10.
- Design a network that gets the test loss to converge to 0.012 or less.
- How many epochs does it take to converge to that test loss?

**Slide 16: Backpropagation: How Neural Nets Learn**
Example: predicted CAR, actual PERSON.
1. Detect the error as the difference between predicted and actual output.
2. Propagate the error backwards to each neuron.
3. Calculate the gradient of the error with respect to each weight.
4. Adjust the weights.
Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986a). "Learning representations by back-propagating errors," Nature 323 (6088): 533-536, http://www.cs.utoronto.ca/~hinton/absps/naturebp.pdf

**Slide 17: Gradient Descent for a Single Weight**
Cost function.
https://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#adaptive-linear-neurons-and-the-delta-rule

**Slide 18: Partial Derivatives**
https://www.khanacademy.org/math/multivariable-calculus/multivariable-derivatives/partial-derivative-and-gradient-articles/a/introduction-to-partial-derivatives

**Slide 19: Gradient Descent for Multiple Weights**
We have various parameters
to optimize, e.g., the weights and biases in all layers. Consider a single linear unit (gradient descent for a linear unit).
https://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#adaptive-linear-neurons-and-the-delta-rule

**Slide 20: Batch Gradient Descent Algorithm for a Linear Unit**
https://sebastianraschka.com/Articles/2015_singlelayer_neurons.html#adaptive-linear-neurons-and-the-delta-rule

**Stochastic Gradient Descent**
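The batch and stochastic variants can be sketched in plain Python for a single linear unit. This is an illustrative sketch, not code from the slides: the function names `batch_gd` and `sgd`, the toy data, the epoch counts, and the learning rate are all invented for this example.

```python
# Delta rule for a single linear unit: y_hat = w*x + b, squared-error loss.
# Batch gradient descent averages the gradient over all examples per update;
# stochastic gradient descent updates after every individual example.

def batch_gd(data, lr=0.1, epochs=500):
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        gw = sum((w * x + b - y) * x for x, y in data) / n
        gb = sum((w * x + b - y) for x, y in data) / n
        w -= lr * gw          # one update per pass over the whole data set
        b -= lr * gb
    return w, b

def sgd(data, lr=0.1, epochs=200):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:     # one update per example
            err = w * x + b - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# Toy data generated from y = 2x + 1 (illustrative only)
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
print(batch_gd(data))  # both should approach w = 2, b = 1
print(sgd(data))
```

Batch gradient descent takes one smooth step per full pass over the data; stochastic gradient descent takes noisier but much cheaper steps, which is why it (and mini-batch compromises) dominate deep-learning practice.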

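Returning to the convolution and pooling layers described above, the sliding-filter computation can be sketched in plain Python. The image, filter values, and function names here are invented for illustration; real CNN layers also handle input depth, larger strides, and padding.

```python
# 2D convolution (valid padding, stride 1) followed by 2x2 max pooling:
# slide the filter over the image, multiply element-wise, and sum to get
# each element of the feature map; pooling then halves height and width.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2x2(fmap):
    return [[max(fmap[i][j], fmap[i][j+1], fmap[i+1][j], fmap[i+1][j+1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

image = [[1, 0, 1, 0, 1],          # toy 5x5 checkerboard "image"
         [0, 1, 0, 1, 0],
         [1, 0, 1, 0, 1],
         [0, 1, 0, 1, 0],
         [1, 0, 1, 0, 1]]
diag = [[1, 0], [0, 1]]            # toy 2x2 filter; fires on diagonals
fmap = conv2d(image, diag)         # 4x4 feature map
pooled = max_pool2x2(fmap)         # 2x2 after downsampling
print(pooled)                      # [[2, 2], [2, 2]]
```

Stacking the feature maps from several different filters along the depth axis gives the full output of a convolutional layer, exactly as slide 12 describes.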
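For the appendix topic "Backpropagation in action," here is a minimal sketch of the chain rule through a one-hidden-layer network, with a finite-difference check of one gradient. All parameter values, the input, and the function names are invented for illustration.

```python
import math

# One hidden layer with sigmoid activations, one linear output, squared
# error. Backpropagation computes dLoss/dWeight by the chain rule,
# working backwards from the output error (steps 1-4 on slide 16).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, b1, w2, b2):
    h = [sigmoid(w * x + b) for w, b in zip(w1, b1)]  # hidden activations
    y = sum(v * a for v, a in zip(w2, h)) + b2        # linear output
    return h, y

def backprop(x, target, w1, b1, w2, b2):
    h, y = forward(x, w1, b1, w2, b2)
    dy = y - target                        # dLoss/dy for loss = (y-t)^2/2
    grad_w2 = [dy * a for a in h]          # chain rule into output weights
    grad_w1 = [dy * v * a * (1 - a) * x    # ...and through the sigmoids
               for v, a in zip(w2, h)]
    return grad_w1, grad_w2

# Fixed toy parameters (illustrative only)
w1, b1 = [0.5, -0.3], [0.1, 0.2]
w2, b2 = [0.7, -0.4], 0.05
g1, g2 = backprop(1.5, 1.0, w1, b1, w2, b2)

# Sanity-check one analytic gradient against a numerical gradient
def loss(w1_):
    _, y = forward(1.5, w1_, b1, w2, b2)
    return (y - 1.0) ** 2 / 2

eps = 1e-6
numeric = (loss([w1[0] + eps, w1[1]]) - loss(w1)) / eps
assert abs(numeric - g1[0]) < 1e-5
```

A gradient-descent step would then subtract a learning rate times each gradient from the corresponding weight; the finite-difference check is the standard way to verify a backpropagation implementation.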