Posted on 2022-08-29, 05:07. Authored by M W Mitchell.
TRACA (Temporal Reinforcement learning and Classification Architecture) is a learning system intended for robot navigation problems (Mitchell 2000). One problem in this area is the input-generalisation problem. Input generalisation requires learning a small set of internal states that represent useful abstractions of the much larger set of actual states. As such, the input-generalisation problem is fundamentally similar to the classical problems of classification, concept learning and discrimination. However, on-line robot-learning tasks are evaluated by different criteria from batch classification tasks. Specifically, a small number of trials is desirable to reduce the risk of damage to the agent and/or its environment. This may come at the cost of more computation during learning and slightly lower predictive accuracy. Other requirements are the ability to learn on-line without predefined classes (i.e. classes must be learned during training), an efficient, adaptable representation and minimal parameter tuning.
This paper describes TRACA's generalisation mechanism in detail and evaluates its performance on a number of common classification tasks. The ability of TRACA to use short-term memory to represent hidden state is ignored in this comparison because, in all the following tasks, perceptual aliasing can be overcome by including additional features. On most tasks, TRACA's predictive accuracy is within a few percent of the best-performing systems compared, and TRACA's result is often achieved with less training experience. The experiments also support claims by Holte (Holte 1993) that high predictive accuracy (above 90 percent in these experiments) can easily be achieved on many well-known classification tasks that are often used for evaluating learning systems.