In the process of arriving at such a decision three factors need to be considered. Elements selected for change should tend to be those whose output would thereby be affected for a minimum number of other possible inputs. At the same time it should be ascertained that a change in each of the elements in question does indeed contribute significantly towards correcting the final output. Finally, a minimum number of such elements should be used.

It would appear at first that this kind of decision is impossible to achieve if the complexity of the decision apparatus is kept comparable to that of the basic input-output network as mentioned earlier. However, in the method to be described it is felt that a reasonable approximation to these requirements will be achieved without an undue increase in complexity.

It is assumed that in addition to its normal inputs, each element receives a variable input bias which we can call b. The output of every element should then be determined by the sign of the usual weighted sum of its inputs plus this bias quantity. This bias is to be the same for each element of the network. If b = 0 the network will behave as before. However, if b is increased gradually, various elements throughout the network will commence changing from -1 to +1, with one or a few changing at any one time as a rule. If b is decreased, the opposite will occur.
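
As a minimal sketch of this shared-bias rule, consider a single layer of three elements. The names `W`, `x`, and `element_outputs`, the particular weights, and the convention that a sum of exactly zero yields -1 are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def element_outputs(W, x, b):
    """Output of each element: sign of its weighted sum plus the shared bias b."""
    return np.where(W @ x + b > 0, 1, -1)

# Hypothetical weights for three elements with two inputs each.
W = np.array([[-1.0, 0.0],
              [0.0, 0.5],
              [2.0, 0.0]])
x = np.array([1.0, -1.0])  # weighted sums at b = 0 are -1.0, -0.5, 2.0

for b in (0.0, 0.75, 1.5):
    print(b, element_outputs(W, x, b))
```

With these assumed weights, raising b switches the elements to +1 one at a time, beginning with the element whose sum at b = 0 lies nearest zero.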

Now suppose that for a given input the final output ought to be +1 but actually is -1. Assume that b is then raised so high that this final output is corrected. Then commence a gradual decline in b. Various elements may revert to -1, but until the final output does, no weights are changed. When the final output does revert to -1, it is due to some element's having a sum (weighted sum plus bias) which just passed down through zero. This then caused a chain effect of changing elements up to the final element, but presumably the originating element is the only one possessing a zero sum. This can then be the signal for the weights on an element to change: a change of final output from right to wrong accompanied simultaneously by a zero sum in the element itself.

After such a weight change, the final output will be correct once more and the bias can again proceed to fall. Before it reaches zero, this process may occur a number of times throughout the network. When the bias finally stands at zero with the final output correct, the network is ready for the next input. Of course if -1 is desired, the bias will change in the opposite direction.
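
The bias sweep just described can be sketched for the simplest case: one layer of variable-weight elements feeding a fixed majority vote that does not itself receive the bias. Several things here are assumptions made for illustration only: the discrete sweep (`steps`), the function names, and above all the weight-change rule, which moves the offending element's sum at zero bias to a small hypothetical `margin` on the correct side. The text does not specify the size or form of the weight change, and the chain effect through deeper layers is not modeled.

```python
import numpy as np

def element_outputs(W, x, b):
    """Outputs (+1/-1) of the variable-weight elements for shared bias b."""
    return np.where(W @ x + b > 0, 1, -1)

def final_output(W, x, b):
    """Fixed majority vote over the elements (use an odd number of elements)."""
    return 1 if element_outputs(W, x, b).sum() > 0 else -1

def train_on_input(W, x, desired, margin=0.1, steps=1000):
    """Return the bias to zero step by step, changing weights only when the
    final output turns wrong. Returns (new weights, indices of changed elements)."""
    W = W.copy()
    changed = []
    # Push the bias far enough that every element agrees with `desired`.
    b_start = desired * (np.abs(W @ x).max() + 1.0)
    for b in np.linspace(b_start, 0.0, steps):
        while final_output(W, x, b) != desired:
            sums = W @ x + b
            wrong = np.flatnonzero(element_outputs(W, x, b) != desired)
            # The element that just crossed zero is the wrong one nearest zero.
            i = wrong[np.argmin(np.abs(sums[wrong]))]
            # Assumed rule: smallest change putting its zero-bias sum at the margin.
            W[i] += (desired * margin - W[i] @ x) * x / (x @ x)
            changed.append(int(i))
    return W, changed

# Hypothetical five-element network; zero-bias sums are -2, -1, -0.5, 0.3, 1.5.
W = np.array([[-2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, -0.5],
              [0.3, 0.0, 0.0],
              [0.0, -1.5, 0.0]])
x = np.array([1.0, -1.0, 1.0])

W_new, changed = train_on_input(W, x, desired=1)
print(changed, final_output(W_new, x, 0.0))
```

In this example only the element with zero-bias sum -0.5 is altered: it is the one whose reversion tips the majority, and it is also the negative-sum element closest to zero. Calling `train_on_input(W, x, desired=-1)` changes nothing, since the majority is already -1 at zero bias.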

It is possible that extending the weight change process a little past the zero bias level may have beneficial results. This might increase the life expectancy of each learned input-output combination and thereby reduce the total number of errors. The reason is that the method described above can halt the weight correction process while some elements whose outputs are essential to the final output still have sums close to zero, even though the final output is correct. Such sums are easily reversed by subsequent weight changes.

It will be noted that this method conforms to all three considerations mentioned previously. First, by furnishing each element the same bias, and by not changing weights until the final output becomes incorrect with dropping bias, there is a strong tendency to select elements which, with b = 0, would have sums close to zero. But the size of the sum in an element is a good measure of the amount of damage done to an element for other inputs if its current output is to be changed. Second, it is obvious that each element changed has had a demonstrable effect on the final output. Finally, there will be a clear tendency to change only a minimum of elements because changes never occur until the output clearly requires a change.

On the other hand, this method adds little complexity to the network beyond what it already has. Each element requires a bias, an error signal, and the desired final output, these things being uniform for all elements in a network. Some external device must manipulate the bias properly, but this is a simple behavior depending only on an error signal and the desired final output, not on the state of individual elements in the network. What one has, then, is a network consisting of elements which are nearly autonomous as regards their decisions to change weights. Such a scheme appears to be the only way to avoid constructing a central weight-change decision apparatus of great complexity. This rather sophisticated decision is made possible by utilizing the computational capabilities the network already possesses in producing outputs from inputs.

It should be noted here that this varying bias method requires that the variable bias be furnished to just those elements which have variable weights and to no others. Any fixed portion of the network, such as preliminary layers or a final majority function, must operate independently of the variable bias. Otherwise, the final output may go from right to wrong as the bias moves towards zero with no variable-weight element to blame. In such a case the network would be hung up.

Logical Redundancy in the Network