Minutes 21.5

(Individual meeting with Kurt)
Last posts
First we looked at my last posts and my results.
It seems that my results for the “selection strategies” are wrong, because it doesn’t make sense that UCT-pairwise and UCT-global differ after 25 samples.
-> (I looked it up in the data: it is at the 28th sample that they differ, and because it is “TLS_RECALL”, sometimes two splits have already been done by then. But I also found a bug in the UCT-pairwise selection ;-) … I will correct the pictures as soon as I have the results again.)

The significance results seem to be good.

There are two ways to add noise to your environment.
1. Add noise to your action (i.e. a noisy engine)
2. Add noise to your reward (i.e. noisy sensors)
It is possible to combine the two as well as to use just one of them.
Kurt: “To begin with, just add (Gaussian-distributed) noise to your X and Y values in your environment (DonutWorld). Use noise rates of e.g. 5% and 10%.”
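
A minimal sketch of the two kinds of noise, to make the idea concrete. It is not the actual DonutWorld code: the function names, the `extent` and `reward_range` parameters, and the reading of a “noise rate” as the standard deviation expressed as a fraction of the value range are all my assumptions.

```python
import random

def add_position_noise(x, y, noise_rate, extent=1.0, rng=random):
    """Perturb an (x, y) position with zero-mean Gaussian noise (noisy engine).

    noise_rate: assumed to be the standard deviation as a fraction of
                `extent`, so 0.05 stands for "5% noise".
    extent:     assumed width of the coordinate range (hypothetical parameter).
    """
    sigma = noise_rate * extent
    return x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)

def add_reward_noise(reward, noise_rate, reward_range=1.0, rng=random):
    """Perturb a reward with zero-mean Gaussian noise (noisy sensors)."""
    sigma = noise_rate * reward_range
    return reward + rng.gauss(0.0, sigma)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    for rate in (0.05, 0.10):
        noisy_x, noisy_y = add_position_noise(0.5, 0.5, rate, rng=rng)
        noisy_r = add_reward_noise(1.0, rate, rng=rng)
        print(rate, noisy_x, noisy_y, noisy_r)
```

Passing a seeded `random.Random` instance keeps the experiments reproducible, which matters when the selection strategies are compared again under noise.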

Reusing knowledge
I told Kurt about my problems and ideas concerning “reusing old knowledge” in TLS, and he agreed that my approaches will not help. He gave me the hint that I could somehow weight the “old knowledge” against the “new knowledge”. I guess this is the only reasonable idea, and I have to figure out how (and where) to apply it to TLS.
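
One possible reading of the weighting hint, as a rough sketch only: treat the old samples as if there were fewer of them by scaling their count with a factor between 0 and 1. The function name and the `old_weight` parameter are hypothetical, and this is not yet tied to any particular place in TLS.

```python
def blend_estimates(old_mean, old_count, new_mean, new_count, old_weight=0.5):
    """Combine an old and a new mean estimate, discounting the old samples.

    old_weight = 1.0 trusts old samples as much as new ones;
    old_weight = 0.0 ignores the old knowledge completely.
    """
    effective_old = old_weight * old_count
    total = effective_old + new_count
    if total == 0:
        return 0.0  # no information at all; arbitrary default
    return (effective_old * old_mean + new_count * new_mean) / total

if __name__ == "__main__":
    # 10 old samples with mean 1.0, 10 new samples with mean 0.0
    for w in (0.0, 0.5, 1.0):
        print(w, blend_estimates(1.0, 10, 0.0, 10, old_weight=w))
```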

– fix axis labels + make clear what “regret” means
– investigate “reuse”
– compute the splitting probabilities for the “multi-dimensional sine function” as well
– add noise to your environments and run the selection strategies experiments again
