Networks of Artificial Neurons, Single Layer Perceptrons
Neural Computation : Lecture 3
© John A. Bullinaria, 2014

1. Networks of McCulloch-Pitts Neurons
2. Single Layer Feed-Forward Neural Networks: The Perceptron
3. Implementing Logic Gates with McCulloch-Pitts Neurons
4. Finding Weights Analytically
5. Limitations of Simple Perceptrons
6. Introduction to More Complex Neural Networks
7. General Procedure for Building Neural Networks

L3-2  Networks of McCulloch-Pitts Neurons

One neuron can't do much on its own. Usually we will have many neurons labelled by indices k, i, j, and activation flows between them via synapses with strengths w_ki, w_ij:

    in_ki = out_k w_ki        out_i = step( Σ_{k=1..n} in_ki − θ_i )        in_ij = out_i w_ij

[Figure: neuron i receives inputs in_1i, in_2i, …, in_ni, applies its threshold θ_i, and sends its output out_i along synapse ij with weight w_ij to neuron j.]

L3-3  The Need for a Systematic Notation

It takes some care to keep track of all the neural activations and connection weights in a network without introducing confusion or ambiguity. There are numerous complications, for example:

- Each neuron has input and output activations, but the outputs from one neuron provide the inputs to others.
- The network has input and output neurons that usually need special treatment.
- Most networks will be built up of layers of neurons, and it makes sense to number the neurons in each layer separately, but then the weights and activations for different layers must be distinguished.
- The letters used for labels/subscripts are arbitrary, and limited in number.

Frequently, the most convenient notation for a new neural network won't match that of other networks you have seen before.

L3-4  A Convenient Approach to Notation

To start with, we shall adopt the following conventions for our notation:

- The labels used to distinguish neurons within a layer (e.g., i = 1, 2, 3, …) are arbitrary, but it helps avoid confusion if they are consistently different for different layers.
- Reserve the labels "in_i" and "out_i" for the network input and output activations, and use other notation for activations of other neurons, e.g. "hid_i".
- Network input neurons don't process information – they just supply the input activations to the network. When counting the layers of neurons within a network, the input layer is not counted; only the processing neurons are.
- Weights between different layers need to be distinguished by using different letters (e.g., "w_ij", "W_ij", "v_ij") or some kind of subscript or superscript (e.g., w_ij(1), w_ij(2), w_ij(3)).
- The term "artificial neuron" is often replaced by "processing unit", or just "unit".

L3-5  The Perceptron

Any number of McCulloch-Pitts neurons can be connected together in any way we like. The arrangement with one input layer of McCulloch-Pitts neurons feeding forward to one output layer of McCulloch-Pitts neurons is known as a Perceptron:

    out_j = step( Σ_{i=1..n} in_i w_ij − θ_j )

[Figure: n input units 1, 2, …, n fully connected by weights w_ij to m output units 1, 2, …, m with thresholds θ_1, θ_2, …, θ_m.]

Already this is a powerful computational device. Later we shall see variations of it that make it even more powerful.

L3-6  Implementing Logic Gates with M-P Neurons

It is possible to use McCulloch-Pitts neurons to implement the basic logic gates. All we need to do is find the appropriate connection weights and neuron thresholds to produce the right outputs for each set of inputs.

We shall first see explicitly that it is easy to construct simple networks that perform NOT, AND, and OR. It is then a well known result from logic that we can construct any logical function from these three basic operations. This constitutes an existence proof.

The resulting networks, however, will usually have a much more complex architecture than a simple Perceptron, because they require more than a single layer of neurons. We generally want to avoid decomposing complex problems into simple logic gates, by finding weights and thresholds that work directly in a Perceptron-style architecture.

L3-7  Implementation of Logical NOT, AND, and OR

In each case, we have inputs in_i and outputs out, and need to determine appropriate weights and thresholds.
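The gate constructions described here can be sketched directly in Python. The particular weights and thresholds below are one workable choice (any values satisfying the inequalities derived on the following slides would do), and all function names are made up for this example:

```python
def step(x):
    """Threshold activation: fires (returns 1) when x >= 0, else 0."""
    return 1 if x >= 0 else 0

def mp_unit(inputs, weights, theta):
    """A McCulloch-Pitts unit: out = step(sum_i w_i * in_i - theta)."""
    return step(sum(w * x for w, x in zip(weights, inputs)) - theta)

# One workable choice of weights and thresholds for each gate.
def gate_not(a):
    return mp_unit([a], [-1.0], -0.5)

def gate_and(a, b):
    return mp_unit([a, b], [1.0, 1.0], 1.5)

def gate_or(a, b):
    return mp_unit([a, b], [1.0, 1.0], 0.5)

# Print the full truth tables to confirm the behaviour.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, gate_and(a, b), gate_or(a, b))
```

Note that the same `mp_unit` function also computes one output of a Perceptron: a full Perceptron layer is just several such units sharing the same inputs.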
It is easy to find solutions by inspection:

    NOT               AND                   OR
    in   out          in1  in2  out         in1  in2  out
    0    1            0    0    0           0    0    0
    1    0            0    1    0           0    1    1
                      1    0    0           1    0    1
                      1    1    1           1    1    1

    Thresholds ⇒  −0.5          1.5                  0.5
    Weights    ⇒  −1            1, 1                 1, 1

L3-8  The Need to Find Weights Analytically

Constructing simple networks by hand (by trial and error) is one thing. But what about harder problems? For example, what about:

    XOR
    in1  in2  out
    0    0    0
    0    1    1
    1    0    1
    1    1    0
    ???

How long do we keep looking for a solution? We need to be able to calculate appropriate parameters rather than searching for solutions by trial and error.

Each training pattern produces a linear inequality for the output in terms of the inputs and the network parameters. These can be used to compute the weights and thresholds.

L3-9  Finding Weights Analytically for the AND Network

We have two weights w1 and w2 and the threshold θ, and for each training pattern we need to satisfy:

    out = step( w1 in1 + w2 in2 − θ )

So the training data lead to four inequalities:

    in1  in2  out
    0    0    0    ⇒   w1·0 + w2·0 − θ < 0    ⇒   θ > 0
    0    1    0    ⇒   w1·0 + w2·1 − θ < 0    ⇒   w2 < θ
    1    0    0    ⇒   w1·1 + w2·0 − θ < 0    ⇒   w1 < θ
    1    1    1    ⇒   w1·1 + w2·1 − θ ≥ 0    ⇒   w1 + w2 ≥ θ

It is easy to see that there are an infinite number of solutions. Similarly, there are an infinite number of solutions for the NOT and OR networks.

L3-10  Limitations of Simple Perceptrons

We can follow the same procedure for the XOR network:

    in1  in2  out
    0    0    0    ⇒   w1·0 + w2·0 − θ < 0    ⇒   θ > 0
    0    1    1    ⇒   w1·0 + w2·1 − θ ≥ 0    ⇒   w2 ≥ θ
    1    0    1    ⇒   w1·1 + w2·0 − θ ≥ 0    ⇒   w1 ≥ θ
    1    1    0    ⇒   w1·1 + w2·1 − θ < 0    ⇒   w1 + w2 < θ

Clearly the second and third inequalities are incompatible with the fourth: together they give w1 + w2 ≥ 2θ, and since θ > 0 that contradicts w1 + w2 < θ. So there is in fact no solution. We need more complex networks, e.g. ones that combine together many simple networks, or use different activation/thresholding/transfer functions.

It then becomes much more difficult to determine all the weights and thresholds by hand. Next lecture we shall see how a neural network can learn these parameters. First, let us consider what these more complex networks might involve.

L3-11  ANN Architectures/Structures/Topologies

Mathematically, ANNs can be represented as weighted directed graphs. For our purposes, we can simply think
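The inequality argument for XOR can also be checked numerically, and the suggested remedy of combining several simple networks demonstrated. A brief Python sketch follows; the grid search is purely illustrative, since the inequalities already rule out all real-valued parameters, and all names are made up for this example:

```python
from itertools import product

def step(x):
    """Threshold activation: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def mp_unit(inputs, weights, theta):
    """A McCulloch-Pitts unit: out = step(sum_i w_i * in_i - theta)."""
    return step(sum(w * x for w, x in zip(weights, inputs)) - theta)

XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Brute-force search over a grid of parameters: no single M-P unit
# reproduces XOR, consistent with the inequality argument above.
grid = [i / 2 for i in range(-8, 9)]   # -4.0, -3.5, ..., 4.0
single_layer_solutions = [
    (w1, w2, t)
    for w1, w2, t in product(grid, grid, grid)
    if all(mp_unit([a, b], [w1, w2], t) == out
           for (a, b), out in XOR_TABLE.items())
]
print(len(single_layer_solutions))   # 0

# Two layers suffice: XOR(a, b) = AND(OR(a, b), NAND(a, b)).
def xor(a, b):
    h1 = mp_unit([a, b], [1.0, 1.0], 0.5)      # OR unit
    h2 = mp_unit([a, b], [-1.0, -1.0], -1.5)   # NAND unit
    return mp_unit([h1, h2], [1.0, 1.0], 1.5)  # AND of the two

print(all(xor(a, b) == out for (a, b), out in XOR_TABLE.items()))  # True
```

The two-layer construction is exactly the "combine together many simple networks" remedy: a hidden layer of two M-P units feeding one output unit.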