Artificial Neural Networks


Architecture and Training


A neural network is composed of individual, locally connected units termed neurons. Typically each neuron weights the signals arriving on its input connections, sums them, and transforms this weighted sum with a non-linear function. This function is often termed the activation function, in analogy with the biological neuron.
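The computation of a single neuron can be sketched in a few lines of Python. This is only an illustration: the function name `neuron`, the choice of `tanh` as activation function, and the `bias` term are assumptions made for the example and are not prescribed by the text above.

```python
import math

def neuron(inputs, weights, bias=0.0, activation=math.tanh):
    """One artificial neuron (illustrative sketch).

    Each input is multiplied by the weight of its connection, the products
    are summed (plus an assumed bias term), and the weighted sum is passed
    through a non-linear activation function.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)

# Two inputs whose weighted contributions cancel: 0.5*1.0 + (-0.25)*2.0 = 0
output = neuron([1.0, 2.0], [0.5, -0.25])
```

Swapping `math.tanh` for another smooth, non-linear function changes the shape of the neuron's response but not the structure of the computation.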

Connecting several layers of neurons in succession makes the network far more expressive than a single layer. One can show that a network with only two layers of adaptive weights suffices to approximate any continuous function on a bounded input range to arbitrary accuracy, given 'enough' neurons. This is just theory, however, and one should note that it only guarantees the existence of such a network solution - not the ability to actually find it! That requires some kind of learning procedure.
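As a small illustration of what two layers of weights can do, the sketch below wires up a network by hand so that it computes the exclusive-or (XOR) of two binary inputs, something no single neuron of this kind can do. The weight values, the sigmoid activation, and the function names are all assumptions chosen for the example; nothing is learned here, the solution is simply written down, which is exactly the "existence" half of the statement above.

```python
import math

def sigmoid(z):
    """Smooth, differentiable squashing function with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def two_layer_net(x1, x2):
    """Hand-wired network with two layers of weights (illustrative values)."""
    # First layer of weights: two hidden neurons.
    h_or = sigmoid(20.0 * x1 + 20.0 * x2 - 10.0)     # roughly "x1 OR x2"
    h_nand = sigmoid(-20.0 * x1 - 20.0 * x2 + 30.0)  # roughly "NOT (x1 AND x2)"
    # Second layer of weights: one output neuron combining the hidden ones.
    return sigmoid(20.0 * h_or + 20.0 * h_nand - 30.0)  # roughly "AND"

# Round the outputs to compare them with the XOR truth table.
xor_table = [round(two_layer_net(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Finding such weights automatically, rather than by hand, is the job of the learning procedure.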

Changing the network parameters is termed network training. This procedure is also referred to as weight adaptation, or in short 'learning', since that is really what's going on. One can think of learning as attempting to store data in a way that allows generalisation, that is, sensible answers for inputs that were not seen during training.

Training of a network can be done with most standard non-linear optimisation algorithms, such as gradient descent or BFGS [2]. To understand this, picture the network parameters as latitude and longitude in a large rolling landscape (in practice one with insanely many dimensions, one per parameter), where the altitude represents how far the output is from the desired answer. The optimisation algorithm then strolls along the surface trying to find as low a valley as possible.
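The landscape picture can be made concrete with a minimal gradient descent sketch. The landscape used here, a simple bowl with its lowest point at (1, -2), is an assumption chosen so that the result is easy to check; the names `altitude`, `slope`, and `gradient_descent`, as well as the step size and number of steps, are likewise illustrative. A real network's error surface has far more dimensions and many valleys.

```python
def altitude(w1, w2):
    """Toy error landscape (assumed for illustration): a bowl with minimum at (1, -2)."""
    return (w1 - 1.0) ** 2 + (w2 + 2.0) ** 2

def slope(w1, w2):
    """Gradient of the toy landscape: the direction of steepest ascent."""
    return 2.0 * (w1 - 1.0), 2.0 * (w2 + 2.0)

def gradient_descent(w1, w2, step_size=0.1, n_steps=200):
    """Repeatedly take a small step downhill, against the gradient."""
    for _ in range(n_steps):
        g1, g2 = slope(w1, w2)
        w1 -= step_size * g1
        w2 -= step_size * g2
    return w1, w2

# Start somewhere on the hillside and walk down into the valley.
w1_final, w2_final = gradient_descent(5.0, 5.0)
```

On a bowl like this the stroll always ends at the bottom. On a genuinely rolling landscape the same procedure only finds the valley it happens to start above, which is why the text speaks of "as low a valley as possible".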

A very neat feature developed in this context is 'error backpropagation'. This solves the problem of assigning the blame for a bad prediction to individual neurons (also known as the credit assignment problem). Neurons are very local creatures, remember, but using differentiable non-linearities means that we can use the chain rule [3] to determine how much each weight contributed to the final error. In the landscape analogy this corresponds to computing how steep the terrain around the current location is.
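To see the chain rule at work, consider the smallest network that still needs it: one input, one hidden `tanh` neuron, and one linear output neuron. The sketch below is a hedged illustration, not a general implementation; the network shape, the squared-error measure, and all function names are assumptions made for the example. It computes the slope of the error with respect to both weights by backpropagation and compares the result with a brute-force numerical estimate.

```python
import math

def forward(x, w1, w2):
    """Tiny network: input -> tanh hidden neuron -> linear output neuron."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y

def error(x, target, w1, w2):
    """Squared error between network output and desired answer (assumed measure)."""
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2

def backprop(x, target, w1, w2):
    """Slope of the error w.r.t. each weight, obtained with the chain rule."""
    h, y = forward(x, w1, w2)
    dE_dy = y - target                     # blame assigned to the output
    dE_dw2 = dE_dy * h                     # share of the output weight
    dE_dh = dE_dy * w2                     # blame passed back to the hidden neuron
    dE_dw1 = dE_dh * (1.0 - h * h) * x     # tanh'(z) = 1 - tanh(z)^2
    return dE_dw1, dE_dw2

def numerical_slope(x, target, w1, w2, eps=1e-6):
    """Central finite differences: nudge each weight and measure the altitude change."""
    g1 = (error(x, target, w1 + eps, w2) - error(x, target, w1 - eps, w2)) / (2 * eps)
    g2 = (error(x, target, w1, w2 + eps) - error(x, target, w1, w2 - eps)) / (2 * eps)
    return g1, g2

analytic = backprop(0.5, 1.0, 0.8, -0.3)
numeric = numerical_slope(0.5, 1.0, 0.8, -0.3)
```

The two results should agree to within the accuracy of the numerical estimate. The difference is cost: the finite-difference check needs extra evaluations of the network for every single weight, whereas backpropagation obtains all slopes from one forward and one backward pass.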