Results 1 - 10 of 4,953
Table 2 Average and standard deviation of the best test result during a run and the total cumulative reward during training.
1998
"... In PAGE 5: ... First of all, Table 1 shows that this strategy finds optimal or near-optimal policies in 90% of the cases, whereas the others fail in at least 50% of the cases. The second improvement with MBIE is shown in Table 2. MBIE collects much more reward during training than all other exploration methods, thereby effectively addressing the exploration/exploitation dilemma.... ..."
Table 3: Coefficient p decides the balance between exploration and exploitation. (Pure
2006
Cited by 3
Table 6: Coefficient p decides the balance between exploration and exploitation. (Pure
2006
Table h part of the DILEMMA dictionary
1992
Cited by 5
Table 2: DILEMMA-1 output sample
1992
Cited by 5
Table 1. Categorization of different learning tasks with regard to the stability-plasticity dilemma.
"... In PAGE 1: ... In the past many learning methods have been proposed which also apply to RBF networks. Table 1 aims to bring some order into the terminology, especially from the viewpoint of the stability-plasticity dilemma. Classical RBF learning has dealt with a stationary environment.... ..."