5 Understanding of Neural Networks

Bogdan M. Wilamowski
Auburn University

5.1 Introduction
5.2 The Neuron
5.3 Should We Use Neurons with Bipolar or Unipolar Activation Functions
5.4 Feedforward Neural Networks
References

5.1 Introduction

The fascination with artificial neural networks started in the middle of the previous century. The first artificial neurons were proposed by McCulloch and Pitts [MP43], who showed the power of threshold logic. Later, Hebb [H49] introduced his learning rules. A decade later, Rosenblatt [R58] introduced the perceptron concept. In the early 1960s, Widrow and Hoff [WH60] developed intelligent systems such as ADALINE and MADALINE. Nilsson [N65], in his book Learning Machines, summarized many developments of that time. The publication of the Minsky and Papert [MP69] book, with some discouraging results, stopped the fascination with artificial neural networks for some time, and achievements in the mathematical foundation of the backpropagation algorithm by Werbos [W74] went unnoticed. The current rapid growth in the area of neural networks started with the work on Hopfield's [H82] recurrent network, Kohonen's [K90] unsupervised training algorithms, and a description of the backpropagation algorithm by Rumelhart et al. [RHW86]. Neural networks are now used to solve many engineering, medical, and business problems [WK00,WB01,B07,CCBC07,KTP07,KT07,MFP07,FP08,JM08,W09]. Descriptions of neural network technology can be found in many textbooks [W89,Z92,H99,W96].

5.2 The Neuron

A biological neuron is a complicated structure that receives trains of pulses on hundreds of excitatory and inhibitory inputs. Those incoming pulses are summed with different weights and averaged over the time period [WPJ96]. If the summed value is higher than a threshold, the neuron itself generates a pulse, which is sent to neighboring neurons. Because incoming pulses are summed over time, the neuron generates a pulse train with a higher frequency for higher positive excitation. In other words, if the value of the summed weighted inputs is higher, the neuron generates pulses more frequently. At the same time, each neuron is characterized by nonexcitability for a certain time after the firing pulse. This so-called refractory period can be more accurately described as a phenomenon where, after excitation, the threshold value increases to a very high value and then decreases gradually with a certain time constant. The refractory period sets soft upper limits on the frequency of the output pulse train. In the biological neuron, information is sent in the form of frequency-modulated pulse trains.

[Figure 5.1: Examples of logical operations using McCulloch-Pitts neurons. Panels: OR, AND, NOT, MEMORY.]

The description of neuron action leads to a very complex neuron model, which is not practical. McCulloch and Pitts [MP43] showed that even with a very simple neuron model, it is possible to build logic and memory circuits. Examples of McCulloch-Pitts neurons realizing OR, AND, NOT, and MEMORY operations are shown in Figure 5.1.

Furthermore, these simple neurons with thresholds are usually more powerful than typical logic gates used in computers (Figure 5.1). Note that the structure of OR and AND gates can be identical. With the same structure, other logic functions can be realized, as shown in Figure 5.2.

The McCulloch-Pitts neuron model (Figure 5.3a) assumes that incoming and outgoing signals may have only binary values, 0 and 1. If the incoming signals, summed through positive or negative weights, have a value equal to or larger than the threshold, then the neuron output is set to 1. Otherwise, it is set to 0:

$$
out = \begin{cases} 1 & \text{if } net \ge T \\ 0 & \text{if } net < T \end{cases} \tag{5.1}
$$

where T is the threshold and net is the weighted sum of all incoming signals (Figure 5.3).

[Figure 5.2: The same neuron structure and the same weights, but a threshold change results in different logical functions. With inputs A, B, C and all weights equal to +1: T = 0.5 gives A + B + C, T = 1.5 gives AB + BC + CA, and T = 2.5 gives ABC.]

[Figure 5.3: Threshold implementation with an additional weight and constant input with +1 value: (a) neuron with threshold T = t, where $net = \sum_{i=1}^{n} w_i x_i$, and (b) modified neuron with threshold T = 0 and additional weight $w_{n+1} = -t$, where $net = \sum_{i=1}^{n} w_i x_i + w_{n+1}$.]

[Figure 5.4: ADALINE and MADALINE perceptron architectures.]

The perceptron model has a similar structure (Figure 5.3b). Its input signals, the weights, and the thresholds can have any positive or negative values. Usually, instead of using a variable threshold, one additional constant input with a negative or positive weight is added to each neuron, as Figure 5.3 shows. Single-layer perceptrons are successfully used to solve many pattern classification problems. The best-known perceptron architectures are ADALINE and MADALINE [WH60], shown in Figure 5.4.

Perceptrons use hard threshold activation functions, which for unipolar neurons are given by

$$
o = f_{uni}(net) = \frac{\operatorname{sgn}(net) + 1}{2} = \begin{cases} 1 & \text{if } net \ge 0 \\ 0 & \text{if } net < 0 \end{cases} \tag{5.2}
$$

and for bipolar neurons by

$$
o = f_{bip}(net) = \operatorname{sgn}(net) = \begin{cases} 1 & \text{if } net \ge 0 \\ -1 & \text{if } net < 0 \end{cases} \tag{5.3}
$$

For these types of neurons, most of the known training algorithms are able to adjust weights only in single-layer networks. Multilayer neural networks (as shown in Figure 5.8) usually use soft activation functions, either unipolar

$$
o = f_{uni}(net) = \frac{1}{1 + \exp(-net)} \tag{5.4}
$$

or bipolar

$$
o = f_{bip}(net) = \tanh(0.5\,net) = \frac{2}{1 + \exp(-net)} - 1 \tag{5.5}
$$

These soft activation functions allow for the gradient-based training of multilayer networks. Soft activation functions make a neural network transparent for training [WT93]. In other words, changes in weight values always produce changes on the network outputs. This would not be possible when hard activation functions are used. Typical activation functions are shown in Figure 5.5.

[Figure 5.5: Typical activation functions: hard in the upper row and soft in the lower row. The soft panels include a gain k: $f_{uni}(net) = 1/(1 + \exp(-k\,net))$ with derivative $f'_{uni} = k\,o\,(1 - o)$, and $f_{bip}(net) = 2 f_{uni}(net) - 1 = \tanh(k\,net/2)$ with derivative $f'_{bip} = k\,(1 - o^2)/2$.]

Note that even neuron models with continuous activation functions are far from an actual biological …
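The threshold rule of Equation 5.1, the threshold change of Figure 5.2, and the extra-weight form of Figure 5.3b can be checked with a short script. The following is only a minimal sketch in Python, not code from the chapter: the function names (`threshold_neuron`, `bias_neuron`, `f_uni_soft`, and so on) are hypothetical, and the soft functions assume a gain of 1, as in Equations 5.4 and 5.5.

```python
import math
from itertools import product


def threshold_neuron(inputs, weights, threshold):
    """Eq. (5.1): output 1 if the weighted sum net >= T, otherwise 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0


def bias_neuron(inputs, weights, threshold):
    """Figure 5.3b form: threshold fixed at 0, with an extra weight
    w_{n+1} = -t attached to a constant +1 input."""
    return threshold_neuron(list(inputs) + [1], list(weights) + [-threshold], 0)


def f_uni_hard(net):
    """Eq. (5.2): unipolar hard threshold."""
    return 1 if net >= 0 else 0


def f_bip_hard(net):
    """Eq. (5.3): bipolar hard threshold."""
    return 1 if net >= 0 else -1


def f_uni_soft(net):
    """Eq. (5.4): unipolar soft activation (gain assumed to be 1)."""
    return 1.0 / (1.0 + math.exp(-net))


def f_bip_soft(net):
    """Eq. (5.5): bipolar soft activation (gain assumed to be 1)."""
    return math.tanh(0.5 * net)


if __name__ == "__main__":
    weights = [1, 1, 1]  # same structure and weights for every gate
    gates = {0.5: "A+B+C", 1.5: "AB+BC+CA", 2.5: "ABC"}
    for a, b, c in product((0, 1), repeat=3):
        outputs = [threshold_neuron((a, b, c), weights, t) for t in gates]
        print((a, b, c), dict(zip(gates.values(), outputs)))
```

Only the threshold differs between the three gates, which is the point of Figure 5.2. `bias_neuron` gives the same outputs as `threshold_neuron` for every input, because net >= t is equivalent to net - t >= 0.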