The basic principle of an error correction network is to minimize an error value. This error value is usually computed as the difference (or, more commonly, the squared difference) between the current output pattern and the desired output pattern. An error correction network then adjusts the weights ...
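This principle can be sketched with the delta rule for a single linear unit; the error (target minus output) drives each weight update. This is a minimal illustration, not a full network; the names `predict`, `train_step`, and `lr` are illustrative.

```python
def predict(weights, inputs):
    """Linear unit: weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def train_step(weights, inputs, target, lr=0.1):
    """Adjust each weight to reduce the squared error (target - output)^2.

    The update is proportional to the error and to the input that
    contributed to it (the delta rule)."""
    error = target - predict(weights, inputs)
    return [w + lr * error * x for w, x in zip(weights, inputs)]

# Learn y = 2*x from a few examples: the weight converges toward 2.0.
weights = [0.0]
for _ in range(100):
    for x, y in [([1.0], 2.0), ([2.0], 4.0), ([0.5], 1.0)]:
        weights = train_step(weights, x, y)
```

After repeated passes over the examples, the squared error shrinks and the single weight approaches 2.0, the value that maps each input to its desired output.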
The Bellman equation refers to a way to estimate the value of the current Markov state in reinforcement learning. It states that the value of a state equals the expected immediate reward plus the temporally discounted value of the successor state (which in turn summarizes all future rewards). Once a complete state-value function ...
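The Bellman backup can be sketched on a toy deterministic chain of states, assuming states 0 through 3 with state 3 terminal; the state layout, rewards, and names (`bellman_backup`, `GAMMA`) are illustrative.

```python
GAMMA = 0.9                         # temporal discount factor
REWARD = {0: 0.0, 1: 0.0, 2: 1.0}   # reward received on leaving each state
NEXT = {0: 1, 1: 2, 2: 3}           # deterministic transitions; 3 is terminal

def bellman_backup(values):
    """One application of the Bellman equation:
    V(s) = r(s) + gamma * V(next(s)) for each non-terminal state."""
    updated = {s: REWARD[s] + GAMMA * values[NEXT[s]] for s in NEXT}
    updated[3] = 0.0                # terminal state has zero value
    return updated

# Start from all-zero estimates and iterate to the fixed point.
values = {s: 0.0 for s in range(4)}
for _ in range(50):
    values = bellman_backup(values)
```

On this chain the iteration reaches its fixed point quickly: V(2) = 1, V(1) = 0.9, and V(0) = 0.81, each state's value being the discounted sum of the rewards that follow it.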