Making more sensors for the NAO. Quite a relaxing pastime :) Thanks to Ho for teaching me how to solder and for helping me crimp.
Also, Alex Churchill (from Sussex, CCNR) and I are thrashing out a Darwinian neurodynamics (DN) algorithm that co-evolves actors and goals in quite an elegant way, using methods from multi-objective optimisation (MOO).
So basically the algorithm consists of two populations: a population of actors (CTRNNs, hierarchical action representations, etc.) and a population of target states, or goals. The fitness of an actor is how well it performs on the targets currently in the target population; MOO methods are used so that a diversity of target competences can be maintained. The fitness of a target state (goal) is the variance, across actors, of the fitness component due to performance on that target.
Behaviour works in generations: the agent tries every actor in a generation, with no sophisticated action-selection biasing as yet. One expects actors that satisfy goals to evolve, as in the sketch below.
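To make that concrete, here is a minimal sketch of one generation in Python. Everything in it is an assumption for illustration: actors and targets are stood in for by flat parameter vectors rather than real CTRNNs or hierarchical action representations, and `performance` is a placeholder for a real evaluation on the robot.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTORS, N_TARGETS, DIM = 20, 10, 8  # illustrative sizes only

# Stand-ins for the real genomes: flat parameter vectors.
actors = rng.normal(size=(N_ACTORS, DIM))
targets = rng.normal(size=(N_TARGETS, DIM))

def performance(actor, target):
    """Placeholder: how well an actor reaches a target state.
    A real run would simulate the controller on the robot."""
    return -float(np.linalg.norm(actor - target))

# One generation: the agent tries every actor on every current target.
scores = np.array([[performance(a, t) for t in targets] for a in actors])

# Actor fitness: performance on the current target population. A proper
# MOO treatment would keep the per-target columns as separate objectives
# so a diversity of target competences is maintained; the mean is just
# a scalar summary for this sketch.
actor_fitness = scores.mean(axis=1)

# Goal fitness: variance, across actors, of the fitness component due
# to that target -- goals that discriminate between actors score highly.
target_fitness = scores.var(axis=0)
```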
A Pareto archive (a memory of optimal goal-actor pairs) is kept, and it is used to reduce the fitness of goals and actors that are too similar to those already in the archive, thus maintaining diversity.
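A sketch of that archive-based diversity pressure, continuing the toy representation above; the similarity radius and penalty factor are made-up placeholder numbers, not anything from the actual model.

```python
# Archive of goal-actor pairs judged Pareto-optimal so far.
archive = []  # list of (target_vector, actor_vector) pairs

def penalise_similarity(fitness, candidate, archived_vectors,
                        radius=0.5, factor=0.5):
    """Cut the fitness of a candidate (a goal or an actor) lying within
    `radius` of any archived vector of the same kind, keeping the
    populations spread out. Radius and factor are arbitrary here."""
    for vec in archived_vectors:
        if np.linalg.norm(candidate - vec) < radius:
            return fitness * factor
    return fitness

# e.g. penalise an actor against the actor halves of the archive:
# new_fitness = penalise_similarity(f, actor, [a for (_, a) in archive])
```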
This simple Darwinian neurodynamic co-evolutionary cognitive architecture seems capable of evolving indefinitely interesting behaviours. It's just a matter of the devil in the details, e.g. how to represent actors and target states (goals) in an open-ended manner?
However, before we get onto this new model, Vera has convinced me that I must first run the higher-order s-m-p group evolution experiments properly, to see what kinds of actions involving two limb movements are evolved to influence a single sensor (measured as mutual information, MI).
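For what it's worth, a histogram estimate of the MI between a motor trace and a sensor trace might look like the following; the binning scheme and bin count are arbitrary choices for illustration, not anything from the actual experiments.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of MI (in bits) between two 1-D signals,
    e.g. one limb's joint-angle trace and a single sensor reading."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal over x bins
    py = pxy.sum(axis=0, keepdims=True)  # marginal over y bins
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```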
