What is an artificial neural network and how does it work?
Artificial Neural Networks
An artificial neural network (ANN), or simply a neural network, is a mathematical or computational model inspired by biological neural networks. It consists of a group of interconnected artificial neurons (nodes) and processes information by passing values along the connections and computing new values at the nodes (the connectionist approach to computation). In many cases, an artificial neural network is an adaptive system that changes its structure based on external or internal information flowing through the network during the learning process.
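To make "passing values along the connections and computing new values at the nodes" concrete, here is a minimal sketch of a forward pass in plain Python. The layer sizes, the weights, and the choice of a sigmoid activation are illustrative assumptions, not something the text above prescribes.

```python
import math

def neuron_output(inputs, weights, bias):
    """Compute one node's value: weighted sum of its inputs, then a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def forward(inputs, layers):
    """Pass values through the network; each layer is a list of (weights, bias) pairs."""
    values = inputs
    for layer in layers:
        values = [neuron_output(values, weights, bias) for weights, bias in layer]
    return values

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    [([0.5, -0.6], 0.1), ([-0.3, 0.8], 0.0)],  # hidden layer
    [([1.0, -1.0], 0.0)],                      # output layer
]
output = forward([1.0, 0.0], layers)
print(output)
```

Each node only needs the values arriving on its own connections, which is why the computation is described as flowing through the network.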
In practice, many neural networks are nonlinear statistical data modeling tools. They can be used to model complex relationships between input data and outputs or to search for patterns in data.
How do artificial neural networks work?
For an artificial neural network to work, it goes through two basic steps. First it is trained on example data, which serves as the raw material and input for the system. Only then can the network be put to use and continue to learn.
Training
Neural networks learn (or are trained) by processing examples, each of which contains a known input and outcome, forming probability-weighted associations between the two, which are stored in the network’s own data structure. Training a neural network on a given example is usually done by determining the difference between the network’s processed output (usually a prediction) and a target output. This difference is the error. The network then adjusts its weighted associations according to a learning rule, using this error value. Successive adjustments cause the neural network to produce outputs that are increasingly similar to the target output. After a sufficient number of these adjustments, training can be terminated based on some criterion. This is a form of supervised learning.
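The training procedure described above can be sketched with a single linear neuron. This is a minimal illustration, assuming the delta rule as the learning rule, a squared-error stopping criterion, and made-up examples drawn from y = 2x + 1; real networks have many more weights and usually rely on backpropagation.

```python
def train(examples, learning_rate=0.1, max_epochs=1000, tolerance=1e-6):
    """Fit prediction = weight * x + bias to (input, target) examples."""
    weight, bias = 0.0, 0.0
    history = []  # total squared error per pass over the examples
    for _ in range(max_epochs):
        total_squared_error = 0.0
        for x, target in examples:
            prediction = weight * x + bias
            error = target - prediction           # difference from the target output
            weight += learning_rate * error * x   # learning rule: adjust using the error
            bias += learning_rate * error
            total_squared_error += error ** 2
        history.append(total_squared_error)
        if total_squared_error < tolerance:       # termination criterion
            break
    return weight, bias, history

# Hypothetical labeled examples: known input, known outcome.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias, history = train(examples)
print(weight, bias, len(history))
```

The values in history shrink as the adjustments accumulate, which mirrors the outputs becoming increasingly similar to the targets.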
Such systems “learn” to perform tasks by looking at examples, typically without being programmed with task-specific rules. For example, in computer vision, they can learn to identify images containing cats by analyzing sample images that have been manually labeled as “with cat” or “without cat” and using the results to identify cats in other images. They do this without any prior knowledge of cats, such as that they have fur, tails, whiskers, and cat-like faces. Instead, they automatically generate identifying features from the examples they process.
Using artificial neural networks
Perhaps the biggest advantage of artificial neural networks is that they can serve as an arbitrary function approximation mechanism that “learns” from observed data. However, using them is not so straightforward, and a relatively good understanding of the underlying theory is essential.
Model selection: the right model depends on the data representation and the application. Overly complex models tend to make learning difficult.
Learning algorithms: there are numerous trade-offs between learning algorithms. Most algorithms will work well with the right hyperparameters for training on a particular fixed dataset. However, selecting and tuning an algorithm for training on unseen data requires a significant amount of experimentation.
Robustness: if the model, cost function, and learning algorithm are chosen appropriately, the resulting ANN can be extremely robust.
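As a rough illustration of why hyperparameters need tuning, the sketch below trains the same single linear neuron with three learning rates. The data, the delta-rule update, and the specific rates are assumptions chosen for the example: a rate that is too small learns slowly, while one that is too large makes the error grow instead of shrink.

```python
def final_error(learning_rate, epochs=50):
    """Train a one-input linear neuron and return the last epoch's squared error."""
    # Hypothetical examples drawn from y = 2x + 1.
    examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
    weight, bias = 0.0, 0.0
    total = 0.0
    for _ in range(epochs):
        total = 0.0
        for x, target in examples:
            error = target - (weight * x + bias)
            weight += learning_rate * error * x
            bias += learning_rate * error
            total += error ** 2
    return total

# Same model and data; only the learning rate differs.
for rate in (0.001, 0.1, 0.5):
    print(rate, final_error(rate))
```

On this toy problem the middle rate ends with the smallest error, but the best value is specific to the data and model, which is why experimentation is needed.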
With proper implementation, ANNs can be used naturally for online learning and large dataset applications. Their simple implementation and the existence of mostly local dependencies expressed in the structure allow for fast, parallel implementation in hardware.