A Basic Introduction to Feedforward Backpropagation Neural Networks

David Leverington, Associate Professor of Geosciences
(Retrieved 10/14/2014 from http://www.webpages.ttu.edu/dleverin/neural_network/neural_networks.html)

The Feedforward Backpropagation Neural Network Algorithm

Although the long-term goal of the neural-network community remains the design of autonomous machine intelligence, the main modern application of artificial neural networks is in the field of pattern recognition (e.g., Joshi et al., 1997). In the subfield of data classification, neural-network methods have been found to be useful alternatives to statistical techniques such as those which involve regression analysis or probability density estimation (e.g., Holmström et al., 1997). The potential utility of neural networks in the classification of multisource satellite-imagery databases has been recognized for well over a decade, and today neural networks are an established tool in the field of remote sensing.

The most widely applied neural network algorithm in image classification remains the feedforward backpropagation algorithm. This web page is devoted to explaining the basic nature of this classification routine.

1 Neural Network Basics

Neural networks are members of a family of computational architectures inspired by biological brains (e.g., McClelland et al., 1986; Luger and Stubblefield, 1993). Such architectures are commonly called "connectionist systems", and are composed of interconnected and interacting components called nodes or neurons (these terms are generally considered synonyms in connectionist terminology, and are used interchangeably here). Neural networks are characterized by a lack of explicit representation of knowledge; there are no symbols or values that directly correspond to classes of interest. Rather, knowledge is implicitly represented in the patterns of interaction between network components (Luger and Stubblefield, 1993).
A graphical depiction of a typical feedforward neural network is given in Figure 1. The term "feedforward" indicates that the network has links that extend in only one direction. Except during training, there are no backward links in a feedforward network; all links proceed from input nodes toward output nodes.

Figure 1: A typical feedforward neural network.

Individual nodes in a neural network emulate biological neurons by taking input data and performing simple operations on the data, selectively passing the results on to other neurons (Figure 2). The output of each node is called its "activation" (the terms "node values" and "activations" are used interchangeably here). Weight values are associated with each vector and node in the network, and these values constrain how input data (e.g., satellite image values) are related to output data (e.g., land-cover classes). Weight values associated with individual nodes are also known as biases. Weight values are determined by the iterative flow of training data through the network (i.e., weight values are established during a training phase in which the network learns how to identify particular classes by their typical input-data characteristics). A more formal description of the foundations of multilayer, feedforward, backpropagation neural networks is given in Section 5.

Once trained, the neural network can be applied toward the classification of new data. Classifications are performed by trained networks through 1) the activation of network input nodes by relevant data sources [these data sources must directly match those used in the training of the network], 2) the forward flow of this data through the network, and 3) the ultimate activation of the output nodes. The pattern of activation of the network's output nodes determines the outcome of each pixel's classification.
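The forward flow just described can be sketched in a few lines of code. The sigmoid activation function and the specific weight and bias values below are illustrative assumptions chosen for this sketch (they are not trained values and are not taken from the figures); only the overall steps are from the text: each node forms a weighted sum of its inputs plus a bias, passes the sum through an activation function, and sends the result forward to the next layer.

```python
import math

def sigmoid(x):
    """A common activation function: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus a bias,
    passed through the activation function."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

def forward_pass(inputs, hidden_layer, output_layer):
    """Propagate input activations forward through one hidden layer to the
    output layer. Each layer is a list of (weights, bias) pairs, one per node."""
    hidden = [node_output(inputs, w, b) for w, b in hidden_layer]
    return [node_output(hidden, w, b) for w, b in output_layer]

# Hypothetical, untrained weights for a 2-input, 2-hidden-node, 1-output network:
hidden_layer = [([0.5, -0.6], 0.1), ([0.3, 0.8], -0.2)]
output_layer = [([1.0, -1.0], 0.0)]
activations = forward_pass([0.9, 0.4], hidden_layer, output_layer)
print(activations)
```

In a trained classifier the input list would hold one pixel's data values and the output activations would be compared across classes; here the network simply demonstrates the one-directional flow from input nodes toward output nodes.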
Useful summaries of fundamental neural network principles are given by Rumelhart et al. (1986), McClelland and Rumelhart (1988), Rich and Knight (1991), Winston (1991), Anzai (1992), Luger and Stubblefield (1993), Gallant (1993), and Richards and Jia (2005). Parts of this web page draw on these summaries. A brief historical account of the development of connectionist theories is given in Gallant (1993).

Figure 2: Schematic comparison between a biological neuron and an artificial neuron (after Winston, 1991; Rich and Knight, 1991). For the biological neuron, electrical signals from other neurons are conveyed to the cell body by dendrites; resultant electrical signals are sent along the axon to be distributed to other neurons. The operation of the artificial neuron is analogous to (though much simpler than) the operation of the biological neuron: activations from other neurons are summed at the neuron and passed through an activation function, after which the value is sent to other neurons.

2 McCulloch-Pitts Networks

Neural computing began with the development of the McCulloch-Pitts network in the 1940s (McCulloch and Pitts, 1943; Luger and Stubblefield, 1993). These simple connectionist networks, shown in Figure 3, are standalone "decision machines" that take a set of inputs, multiply these inputs by associated weights, and output a value based on the sum of these products. Input values (also known as input activations) are thus related to output values (output activations) by simple mathematical operations involving weights associated with network links. McCulloch-Pitts networks are strictly binary; they take as input, and produce as output, only 0s or 1s. These 1s and 0s can be thought of as excitatory and inhibitory entities, respectively (Luger and Stubblefield, 1993).
If the sum of the products of the inputs and their respective weights is greater than or equal to 0, the output node returns a 1 (otherwise, a 0 is returned). The value of 0 is thus a threshold that must be equalled or exceeded if the output of the system is to be 1. The above rule, which governs the manner in which an output node maps input values to output values, is known as an activation function (meaning that this function determines the activation, or output value, of the node).
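The sum-and-threshold rule above can be implemented directly. The particular weights below, including a constant input of 1 whose weight acts to shift the threshold, are a hypothetical choice that makes the unit compute logical AND; only the rule itself (output 1 when the weighted sum is greater than or equal to 0, otherwise 0) comes from the text.

```python
def mcculloch_pitts(inputs, weights):
    """McCulloch-Pitts unit: returns 1 if the weighted sum of the
    binary inputs is greater than or equal to 0, and 0 otherwise."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= 0 else 0

# Hypothetical weights realizing logical AND. The third input is fixed at 1;
# its weight of -2 means the sum reaches 0 only when both real inputs are 1.
for x1 in (0, 1):
    for x2 in (0, 1):
        out = mcculloch_pitts([x1, x2, 1], [1, 1, -2])
        print(x1, "AND", x2, "->", out)
```

Changing the fixed-input weight to -1 would instead yield logical OR, which illustrates how the choice of weights alone determines the decision such a unit makes.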