Having reviewed the literature on inverse and forward model learning, I'm inclined to now try to learn inverse and forward models of the sensorimotor (sm) pairs discovered by the MI algorithm.
References below...
http://homepages.inf.ed.ac.uk/svijayak/publications/klanke-JMLR2008.pdf
TRY TO USE PYTHON LWPR LIBRARY
http://wcms.inf.ed.ac.uk/ipab/slmc/research/software-lwpr
http://homepages.inf.ed.ac.uk/svijayak/publications/vijayakumar-NeuCom2005.pdf
http://wcms.inf.ed.ac.uk/ipab/slmc/research/lwpr/lwpr-doc.pdf
Learning Inverse Kinematics
Aaron D’Souza
[Very good paper explaining a local method for learning inverse models. Basically, you learn a mapping from [the change in sensor state between t and t+1] and [the current joint angles] m(t) to [the change in joint angles between t and t+1], by storing data during exploration. While exploring, LWPR is used to approximate this function. Action selection is gradually taken over by LWPR rather than being randomly generated. The input to LWPR for action selection comes from a policy specifying how the sensory state should change given a goal; simply heading directly towards the goal in sensor space is one such policy.]
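To make this concrete, a rough sketch of how the inverse model for one sm pair could be trained incrementally with the Python LWPR bindings is below. The input/output dimensions, the init_D value and the variable names are my own assumptions for illustration, not taken from the paper.

import numpy as np
from lwpr import LWPR

# Inverse model for one sm pair: (change in sensor value, motor angle at t) -> change in motor angle.
inverse_model = LWPR(2, 1)
inverse_model.init_D = 20 * np.identity(2)  # initial receptive field size (assumed value)

def update_inverse_model(pv, pv2, motor_state_t, motor_state_t_p1):
    # One training sample per exploratory action.
    x = np.array([pv2 - pv, motor_state_t])
    y = np.array([motor_state_t_p1 - motor_state_t])
    inverse_model.update(x, y)

def query_inverse_model(desired_sensor_change, motor_state_t):
    # Predicted change in motor angle that should produce the desired sensor change.
    x = np.array([desired_sensor_change, motor_state_t])
    return inverse_model.predict(x)[0]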
So, first things first: how should the data from different sm pairs be stored in a Python data structure? I need to retrieve the data quickly by a key naming the sm pair, so possibly a dictionary?
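One option, sketched below, is a dictionary keyed by the (motor index, sensor index) tuple of each sm pair, holding a list of transition samples; the field names are placeholders, not settled yet.

# Transition data keyed by sm pair: (motor_index, sensor_index) -> list of samples.
sm_data = {}

def store_sample(motor_idx, sensor_idx, command, pv, pv2, motor_t, motor_t_p1):
    key = (motor_idx, sensor_idx)
    sm_data.setdefault(key, []).append({
        'command': command,        # joint angle command given at t
        'sensor_t': pv,            # sensor value at t
        'sensor_t_p1': pv2,        # sensor value at t+1
        'motor_t': motor_t,        # motor angle at t
        'motor_t_p1': motor_t_p1,  # motor angle at t+1
    })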
OK, so I'm now storing the data obtained while running CMA-ES towards 10 random goals for each sm pair.
Each fitness assessment now looks like this...
self.x1 = motor number
self.x2 = sensor number
j[0] = joint angle command given at t
pv = sensor value at t
pv2 = sensor value at t+1
motor_state_t = motor angle at t
motor_state_t_p1 = motor angle at t+1
The above information should be sufficient to construct a range of models by function approximation.
def calcGoalScore(self, j):  # j[0] contains the joint position(s) to be tested.
    # Assumes "from time import sleep" at module level.
    #self.rest()
    #print("testing" + str(j))
    # Get sensor state at t
    sensedAngles = self.get_sensor_values()
    pv = sensedAngles[self.x2]
    # Get motor state at t
    motor_state_t = sensedAngles[self.x1]
    # Set motors and do the action
    self.set_motor_values(j, self.x1)
    sleep(0.3)
    # Get new sensor values at t+1
    sensedAngles = self.get_sensor_values()
    pv2 = sensedAngles[self.x2]
    # Get motor state at t+1
    motor_state_t_p1 = sensedAngles[self.x1]
    # Update the inverse and the forward models here (for single sm contingency pairs)
    self.updateModels(self.x1, self.x2, j[0], pv, pv2, motor_state_t, motor_state_t_p1)
    # Fitness: squared distance between the achieved sensor value and the random goal
    f = pow(pv2 - self.randomGoal, 2)
    #print("Now pos = " + str(pv2) + " Goal = " + str(self.randomGoal) + ": Fitness = " + str(f) + "\n")
    #print(self.x1, self.x2, pv, pv2, f)
    return f
The next step with all this data is to construct, for the first sm pair, an inverse model with LWPR. Once this is done, it should be possible to check whether NAO can achieve a desired goal sensory state by issuing the motor commands that move towards that goal state most rapidly.
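A rough sketch of that check, assuming the trained inverse model from above and robot methods like the get_sensor_values / set_motor_values used in calcGoalScore (the gain and the function name are placeholders):

import numpy as np

def step_towards_goal(robot, inverse_model, goal, x1, x2, gain=0.5):
    # Policy in sensor space: head directly towards the goal, scaled by a gain.
    sensed = robot.get_sensor_values()
    pv = sensed[x2]
    motor_state_t = sensed[x1]
    desired_sensor_change = gain * (goal - pv)
    # Ask the inverse model which motor change should produce that sensor change.
    delta_m = inverse_model.predict(np.array([desired_sensor_change, motor_state_t]))[0]
    robot.set_motor_values([motor_state_t + delta_m], x1)
    return abs(goal - pv)  # remaining distance to the goal in sensor space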
Alternatively, construct a forward model of the sensory consequences of a motor command and search over possible motor trajectories.
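The forward-model alternative could look something like the sketch below: learn sensor(t+1) from (motor command, sensor(t)) with LWPR, then pick the candidate command whose predicted outcome lies closest to the goal. This is only a one-step search over candidate commands, not a full trajectory search, and all names are assumptions.

import numpy as np
from lwpr import LWPR

# Forward model for one sm pair: (motor command, sensor value at t) -> sensor value at t+1.
forward_model = LWPR(2, 1)
forward_model.init_D = 20 * np.identity(2)

def update_forward_model(command, pv, pv2):
    forward_model.update(np.array([command, pv]), np.array([pv2]))

def select_command(pv, goal, candidates):
    # Predict the sensory outcome of each candidate command and choose the one
    # expected to land closest to the goal.
    predictions = [forward_model.predict(np.array([c, pv]))[0] for c in candidates]
    return candidates[int(np.argmin([abs(p - goal) for p in predictions]))]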
Let's imagine that all this worked fine: what would we get? A set of separate pairwise controllers, each capable of achieving, to some extent, desired sensory goal states with motor commands.
It should be possible to measure the prediction error of each type of model and plot it over time. This should be the next step: visualising the extent to which models can be learned for each sm pair and what these models show. We can then move on to coupling models to multiple motors and multiple sensors, or coupling models HIERARCHICALLY.
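A rough sketch of how that could be done for the inverse model, assuming the sm_data dictionary and the LWPR setup sketched above; the prediction is made before each update, so the curve shows one-step-ahead error over training.

import numpy as np
import matplotlib.pyplot as plt
from lwpr import LWPR

def inverse_model_error_curve(samples):
    model = LWPR(2, 1)
    model.init_D = 20 * np.identity(2)
    errors = []
    for s in samples:
        x = np.array([s['sensor_t_p1'] - s['sensor_t'], s['motor_t']])
        y = s['motor_t_p1'] - s['motor_t']
        errors.append(abs(model.predict(x)[0] - y))  # error before seeing this sample
        model.update(x, np.array([y]))
    return errors

errors = inverse_model_error_curve(sm_data[(0, 3)])  # (0, 3) is a hypothetical sm pair key
plt.plot(errors)
plt.xlabel('training sample')
plt.ylabel('absolute prediction error')
plt.show()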