Monday, 11 March 2013


I've started working with Boris on his variant of MCTS (Monte Carlo tree search). For reference, here is pseudocode for one version of MCTS, taken from https://github.com/glesica/mcts-project/blob/master/paper.markdown:

function TREEPOLICY(v)
    while v is nonterminal do
        if v is not fully expanded then
            return EXPAND(v)
        else
            v = BESTCHILD(v, Cp)
    return v

function EXPAND(v)
    a = an untried action, valid at v
    v' = result of applying a to v
    add v' as a new child of v
    return v'

function BESTCHILD(v, c)
    return argmax of the children of v, based on weight (see the text of the linked paper)

function DEFAULTPOLICY(s)
    while s is nonterminal do
        choose a valid action based on s, uniformly at random
        s = result of applying the action to s
    return reward for state s

function BACKUP(v, d)
    while v is not null do
        increment visit count of v
        update value of v based on d
        v = parent of v
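
To check my own understanding, here is a minimal Python sketch of how these five functions might fit together. Several things in it are my assumptions and not part of the pseudocode above: the toy problem (pick `DEPTH` digits from `ACTIONS`, rewarded by the fraction that equal 2), the `Node` class, the `uct_search` driver loop, the UCB1 formula used as the "weight" in `best_child`, and the value `1 / sqrt(2)` for `Cp`, which is a common choice when rewards lie in [0, 1]. It is a single-agent sketch only, so there is no sign flip between players in `backup`.

```python
import math
import random

ACTIONS = (0, 1, 2)   # hypothetical toy problem: choose DEPTH digits in sequence
DEPTH = 4


def is_terminal(state):
    return len(state) == DEPTH


def reward(state):
    # Toy reward in [0, 1]: fraction of the chosen digits that equal 2.
    return state.count(2) / DEPTH


class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state            # tuple of digits chosen so far
        self.parent = parent
        self.action = action          # action that led from parent to this node
        self.children = []
        self.untried = [] if is_terminal(state) else list(ACTIONS)
        self.visits = 0
        self.value = 0.0              # sum of rewards backed up through this node


def tree_policy(v, cp):
    while not is_terminal(v.state):
        if v.untried:                 # v is not fully expanded
            return expand(v)
        v = best_child(v, cp)
    return v


def expand(v):
    a = v.untried.pop()               # an untried action, valid at v
    child = Node(v.state + (a,), parent=v, action=a)
    v.children.append(child)          # without this the tree never grows
    return child


def best_child(v, c):
    # Assumed weight (UCB1): mean reward plus an exploration bonus scaled by c.
    return max(
        v.children,
        key=lambda ch: ch.value / ch.visits
        + c * math.sqrt(2 * math.log(v.visits) / ch.visits),
    )


def default_policy(s):
    # Random rollout from state s to a terminal state.
    while not is_terminal(s):
        s = s + (random.choice(ACTIONS),)
    return reward(s)


def backup(v, d):
    while v is not None:
        v.visits += 1
        v.value += d
        v = v.parent


def uct_search(root_state, iterations=2000, cp=1 / math.sqrt(2)):
    # Assumed driver loop: select/expand, simulate, back up, repeat.
    root = Node(root_state)
    for _ in range(iterations):
        leaf = tree_policy(root, cp)
        backup(leaf, default_policy(leaf.state))
    return best_child(root, 0).action  # c = 0: pick by mean reward only


if __name__ == "__main__":
    print(uct_search(()))
```

Calling `best_child` with `c = 0` at the end drops the exploration bonus, so the returned action is simply the root child with the highest average reward. In this toy problem choosing 2 is always best, so that is the action the search should settle on.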
