ACTRON TECHNOLOGY
AI Through Software
In ARI's view, it won't be ordinary computer programs, genetic algorithms, or even advanced hardware innovations that achieve the human-level, super-intelligent AI systems portrayed in science fiction. Instead, ARI's self-learning, contemplative neural systems will drive all forms of synthetic intelligence, whether they are embedded within household appliances or within large computational clusters that relentlessly churn out Nobel-caliber discoveries.
A Neural Network (NN) can be defined as an interconnection of simple processing elements whose functionality is based on the biological neuron. A biological neuron is a unique piece of equipment that carries information, or a bit of knowledge, and transfers it to other neurons in a chain of networks. The artificial neuron imitates these functions and their unique process of learning. Basically, a biological neuron has three types of components, called dendrites, soma and axon. Dendrites are the sensitive part of the neuron that receive signals from other neurons. The soma sums the incoming signals, and the result is transmitted to other cells through the axon.
The simple neuron introduced by McCulloch and Pitts in the 1940s consists of an input layer, an activation function, and an output layer. The input layer receives input signals from the external environment (or from other neurons). The activation function is the neuron's internal state, which sums the input signals and computes the response. The signals are then transmitted to the output layer. The input layer, activation function and output layer of the artificial neuron play roles similar to those of the dendrites, soma and axon in the biological neuron.

McCulloch-Pitts Neuron Model
Learning in Neural Network
Assume we have n input units, X1,…,Xn, with input signals x1,…,xn. When the network receives the signals (xi) from the input units (Xi), the net input to output unit Yj (y_inj) is calculated by summing the weighted input signals. In matrix terms, this is the product of the input vector and column j of the weight matrix, as shown in the equation below.
y\_in_j = \sum_{i=1}^{n} x_i w_{ij}
where wij is the connection weight between input unit Xi and output unit Yj.
The network output (yj) is calculated using the activation function f(x), so that yj = f(x), where x is y_inj. The weights computed during training are stored and become the information, or knowledge, used in future applications.
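As a minimal sketch of the net input and activation described above, the following Python computes y_inj as a weighted sum and passes it through a binary step function. The function names, the threshold parameter theta and the AND-gate weights are illustrative assumptions, not taken from the original text.

```python
def net_input(x, w_j):
    """Weighted sum y_in_j = sum over i of x_i * w_ij for one output unit j."""
    return sum(x_i * w_ij for x_i, w_ij in zip(x, w_j))


def step(y_in, theta=0.0):
    """Binary step activation: output 1 when the net input reaches theta."""
    return 1 if y_in >= theta else 0


def neuron_output(x, w_j, theta=0.0):
    """Output y_j = f(y_in_j) of a single threshold neuron."""
    return step(net_input(x, w_j), theta)


# Assumed example: an AND gate built from one neuron with weights 1, 1
# and threshold 2, evaluated on all four binary input pairs.
and_outputs = [neuron_output([a, b], [1, 1], theta=2)
               for a in (0, 1) for b in (0, 1)]
```

With these assumed weights, only the input pair (1, 1) produces a net input that reaches the threshold, so the neuron behaves as a logical AND.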
NNs can be divided into three architectures, namely single layer, multilayer and competitive layer networks. The number of layers in a net is defined by the number of layers of connection weights between the neurons. A single layer network consists of only one layer of connection weights, whereas a multilayer network consists of more than one layer of connection weights and therefore contains one or more additional layers of units, called hidden layers. Multilayer networks can be used to solve more complicated problems than single layer networks. Both of these networks are also called feedforward networks, because the signal flows from the input units to the output units in a forward direction. The competitive layer network, for example the recurrent network, is a feedback network in which there are closed-loop signals from a unit back to itself.
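The difference between a single layer and a multilayer feedforward network can be sketched as follows. This is only an illustration under stated assumptions: the sigmoid activation, the omission of bias terms, and the particular weight values are chosen for the example and do not come from the original text.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def layer_forward(x, w, f):
    """One layer of connection weights; w[i][j] links input i to unit j."""
    n_out = len(w[0])
    return [f(sum(x[i] * w[i][j] for i in range(len(x))))
            for j in range(n_out)]


def feedforward(x, layers, f=sigmoid):
    """Pass the signal forward through each layer of weights in turn."""
    signal = x
    for w in layers:
        signal = layer_forward(signal, w, f)
    return signal


# One layer of weights: 2 inputs -> 1 output.
single_layer = [[[0.5], [-0.5]]]
# Two layers of weights: 2 inputs -> 2 hidden units -> 1 output.
multilayer = [[[1.0, -1.0], [1.0, -1.0]],
              [[1.0], [1.0]]]

y_single = feedforward([1.0, 1.0], single_layer)
y_multi = feedforward([1.0, 1.0], multilayer)
```

The same feedforward function handles both cases; the only difference is how many weight matrices the signal passes through on its way from the input units to the output units.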
Learning Mechanisms
NN learning algorithms can be divided into two main groups: supervised (or associative) learning and unsupervised (self-organisation) learning. Many supervised and unsupervised learning NNs have been invented. Some are listed in the NN FAQ (frequently asked questions) and on discussion group web pages, but many others are not.
Supervised Learning
Supervised learning learns from target values, that is, the desired outputs. During training the network tries to match its outputs with the desired target values. This method has two sub-varieties, called auto-associative and hetero-associative. In auto-associative learning the target values are the same as the inputs, whereas in hetero-associative learning the targets are generally different from the inputs.
One of the most commonly used supervised NN models is the backpropagation network, which uses the backpropagation learning algorithm. The backpropagation (or backprop) algorithm is one of the best-known algorithms in neural networks. It was popularized by Rumelhart, Hinton, and Williams in the 1980s as another name for the generalized delta rule. Backpropagation of errors, or the generalized delta rule, is a gradient descent method to minimize the total squared error of the output computed by the net (Fausett, 1994). The introduction of the backprop algorithm overcame a drawback of the earlier NN algorithms of the 1970s, in which a single layer perceptron fails to solve the simple XOR problem.
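The following is a minimal sketch of backpropagation as gradient descent on the total squared error, applied to the XOR problem. The network size (three hidden units), sigmoid activations, learning rate, epoch count, random seed and all function names are assumptions made for illustration; it is not the implementation from Fausett (1994) or any particular library.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def make_net(n_in, n_hidden, rng):
    """Small random weights for one hidden layer and one output unit."""
    return {"w_h": [[rng.uniform(-1, 1) for _ in range(n_hidden)]
                    for _ in range(n_in)],
            "b_h": [rng.uniform(-1, 1) for _ in range(n_hidden)],
            "w_o": [rng.uniform(-1, 1) for _ in range(n_hidden)],
            "b_o": rng.uniform(-1, 1)}


def forward(net, x):
    """Return the hidden activations z and the output y for input x."""
    z = [sigmoid(net["b_h"][j]
                 + sum(x[i] * net["w_h"][i][j] for i in range(len(x))))
         for j in range(len(net["b_h"]))]
    y = sigmoid(net["b_o"] + sum(z[j] * net["w_o"][j] for j in range(len(z))))
    return z, y


def total_squared_error(net, data):
    return 0.5 * sum((t - forward(net, x)[1]) ** 2 for x, t in data)


def backprop_step(net, x, t, lr):
    """One generalized delta rule update for a single training pair."""
    z, y = forward(net, x)
    # Error term at the output unit.
    delta_o = (t - y) * y * (1.0 - y)
    # Error terms propagated back to the hidden units (uses the old w_o).
    delta_h = [delta_o * net["w_o"][j] * z[j] * (1.0 - z[j])
               for j in range(len(z))]
    for j in range(len(z)):
        net["w_o"][j] += lr * delta_o * z[j]
        net["b_h"][j] += lr * delta_h[j]
        for i in range(len(x)):
            net["w_h"][i][j] += lr * delta_h[j] * x[i]
    net["b_o"] += lr * delta_o


XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
net = make_net(2, 3, random.Random(0))
error_before = total_squared_error(net, XOR)
for epoch in range(5000):
    for x, t in XOR:
        backprop_step(net, x, t, lr=0.5)
error_after = total_squared_error(net, XOR)
```

Each weight moves in the direction that reduces the squared error for the current pattern, which is why the hidden layer can learn a representation that a single layer of weights cannot.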
Unsupervised Learning
An unsupervised learning method is not given any target values; the desired output of the network is unknown. During training the network performs some kind of data compression, such as dimensionality reduction or clustering. The network learns the distribution of the patterns and classifies them so that similar patterns are assigned to the same output cluster. The Kohonen network is the best-known example of an unsupervised learning network. According to Sarle (1997), the term Kohonen network refers to three types of networks: Vector Quantization, Self-Organizing Map and Learning Vector Quantization.
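Clustering without target values can be sketched with a simple winner-take-all vector quantization rule, in which only the codebook vector closest to each pattern is moved toward it. The data points, the initial codebook, the learning rate and the function names below are assumptions for illustration; this is a simplified competitive learning rule, not a full Self-Organizing Map.

```python
def nearest(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], x)))


def train_vq(data, codebook, lr=0.1, epochs=50):
    """Winner-take-all update: move only the winning vector toward each pattern."""
    codebook = [list(c) for c in codebook]  # work on a copy
    for _ in range(epochs):
        for x in data:
            k = nearest(codebook, x)
            codebook[k] = [c + lr * (v - c) for c, v in zip(codebook[k], x)]
    return codebook


# Assumed toy data: three patterns near (0, 0) and three near (1, 1).
data = [[0.0, 0.1], [0.1, 0.0], [0.1, 0.1],
        [0.9, 1.0], [1.0, 0.9], [1.0, 1.0]]
codebook = train_vq(data, codebook=[[0.4, 0.4], [0.6, 0.6]])
clusters = [nearest(codebook, x) for x in data]
```

No target values are supplied at any point; the two codebook vectors drift toward the two groups of patterns, and similar patterns end up assigned to the same output cluster.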
Training the Network
Training the network is time consuming. The network usually learns only after several epochs, and the number needed depends on how large the network is; a large network requires more training time than a smaller one. Basically, the network is trained for several epochs and training stops when the maximum number of epochs is reached. Alternatively, a minimum error tolerance is used, so that training stops once the difference between the network output and the known outcome is less than the specified value (see for example Pofahl et al., 1998). We could also stop the training once the network meets some other stopping criterion.
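The two stopping criteria described above, a maximum number of epochs and an error tolerance, can be sketched in one training loop. The single linear unit trained with the delta rule, the toy data (targets equal to twice the input), and the default parameter values are assumptions chosen to keep the example small.

```python
def train(data, lr=0.1, max_epochs=1000, tolerance=1e-4):
    """Delta rule training of one linear unit with two stopping criteria."""
    w = 0.0
    for epoch in range(1, max_epochs + 1):
        for x, t in data:
            w += lr * (t - w * x) * x  # delta rule weight update
        error = sum((t - w * x) ** 2 for x, t in data)
        if error < tolerance:
            # Criterion 1: the error has fallen below the tolerance.
            return w, epoch, "tolerance"
    # Criterion 2: the maximum number of epochs has been reached.
    return w, max_epochs, "max_epochs"


# Assumed toy data in which the target is twice the input.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, epochs_used, reason = train(data)
_, capped_epochs, capped_reason = train(data, max_epochs=1)
```

The first call stops early because the error tolerance is met; the second call is cut off by its one-epoch budget before the tolerance can be reached.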
During training the network might learn too much. This problem is referred to as overfitting. Overfitting is a critical problem in almost all standard NN architectures, and NNs and other machine learning models are prone to it (Lawrence et al., 1997). One of the solutions is early stopping (Sarle, 1995), but this approach needs careful attention because the problem is harder than expected (Lawrence et al., 1997). The choice of stopping criterion is another issue to consider in preventing overfitting (Prechelt, 1998). Hence, to detect overfitting during training, a separate validation set is used to monitor the error rather than the training data set. After every few epochs the network is tested with the validation data. Training is stopped as soon as the error on the validation set is higher than it was the last time it was checked (Prechelt, 1998). The graph below shows that the training should stop at time t, when the validation error starts to increase.
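The early stopping rule described above can be sketched as a function over the validation errors recorded at each check. The function name, the patience parameter (how many consecutive non-improving checks to tolerate) and the sample error values are hypothetical; with patience set to 1 the rule stops as soon as the validation error is higher than at the previous best check.

```python
def early_stopping_epoch(validation_errors, patience=1):
    """Return the index of the check whose weights should be kept.

    Training is treated as stopped once the validation error has failed to
    improve on its best value for `patience` consecutive checks.
    """
    best_epoch, best_error, bad_checks = 0, float("inf"), 0
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error, bad_checks = epoch, error, 0
        else:
            bad_checks += 1
            if bad_checks >= patience:
                break
    return best_epoch


# Hypothetical validation errors recorded after each check during training.
val_errors = [0.90, 0.60, 0.45, 0.40, 0.43, 0.50, 0.62]
stop_at = early_stopping_epoch(val_errors)
```

In practice the network weights would be saved at each new best check so that the weights from the returned index can be restored; a larger patience value makes the rule less sensitive to small fluctuations in the validation error.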

Training and validation curve
Conclusion
Constructing a program for a Neural Network is not a difficult task. Basically, it involves only a few algorithmic steps that are easily followed even by novice practitioners. However, preparing the network for training is a difficult task, since the network deals with a large amount of data. Another problem is deciding when to stop the training. Overtraining can cause memorization, where the network simply memorizes the training patterns and fails to recognize other sets of patterns. Thus, early stopping is recommended to ensure that the network learns to generalize.