Multi-Armed Recommendation Bandits for Selecting State Machine Policies for Robotic Systems
"... Abstract — We investigate the problem of selecting a statemachine from a library to control a robot. We are particularly interested in this problem when evaluating such state machines on a particular robotics task is expensive. As a motivating example, we consider a problem where a simulated vacuumi ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Abstract — We investigate the problem of selecting a state machine from a library to control a robot. We are particularly interested in this problem when evaluating such state machines on a particular robotics task is expensive. As a motivating example, we consider a problem where a simulated vacuuming robot must select a driving state machine well-suited for a particular (unknown) room layout. By borrowing concepts from collaborative filtering (recommender systems such as Netflix and Amazon.com), we present a multi-armed bandit formulation that incorporates recommendation techniques to efficiently select state machines for individual room layouts. We show that this formulation outperforms the individual approaches (recommendation, multi-armed bandits) as well as the baseline of selecting the ‘average best’ state machine across all rooms.
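The abstract does not spell out the bandit step, so the following is only a rough sketch of how a standard UCB1 loop over the state-machine library might be warm-started with recommendation-style prior estimates. The function names, the evaluate callback, and the pseudo-observation warm start are assumptions for illustration, not the authors' formulation.

    import math

    def ucb1_select(arms, evaluate, n_rounds, prior_means=None):
        # arms: candidate state-machine policies for the current room.
        # evaluate(arm): runs one (expensive) trial and returns a reward in [0, 1].
        # prior_means: optional collaborative-filtering estimates from similar rooms.
        if prior_means is not None:
            # Treat each recommendation estimate as one pseudo-observation.
            means = list(prior_means)
            counts = [1] * len(arms)
        else:
            # Pull every arm once so each confidence bound is defined.
            means = [evaluate(arm) for arm in arms]
            counts = [1] * len(arms)
        t = len(arms)
        while t < n_rounds:
            # Upper confidence bound: empirical mean plus an exploration bonus.
            ucb = [means[i] + math.sqrt(2.0 * math.log(t) / counts[i])
                   for i in range(len(arms))]
            i = max(range(len(arms)), key=ucb.__getitem__)
            r = evaluate(arms[i])
            counts[i] += 1
            means[i] += (r - means[i]) / counts[i]
            t += 1
        # Return the policy with the best empirical mean after the budget is spent.
        return arms[max(range(len(arms)), key=means.__getitem__)]

Under this reading, the recommendation component only shifts where the bandit starts exploring; the trade-off between trusting the prior and re-evaluating on the actual room is what the paper's combined formulation addresses.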
Cold-start Problems in Recommendation Systems via Contextual-bandit Algorithms
"... In this paper, we study a cold-start problem in recom-mendation systems where we have completely new users entered the systems. There is not any interaction or feedback of the new users with the systems previoustly, thus no ratings are available. Trivial approaches are to select ramdom items or the ..."
Abstract
- Add to MetaCart
(Show Context)
In this paper, we study a cold-start problem in recommendation systems where completely new users enter the system. These users have had no prior interaction with or feedback in the system, so no ratings are available. Trivial approaches are to select random items or the most popular ones to recommend to the new users. However, these methods perform poorly in many cases. In this research, we provide a new look at this cold-start problem in recommendation systems. In fact, we cast this cold-start problem as a contextual-bandit problem. No additional information on new users and new items is needed. We consider all the past ratings of previous users as contextual information to be integrated into the recommendation framework. To solve this type of cold-start problem, we propose a new efficient method based on the LinUCB algorithm for contextual-bandit problems. The experiments were conducted on three different publicly available data sets, namely MovieLens, Netflix, and Yahoo!Music. The proposed method was also compared with other state-of-the-art techniques. Experiments showed that our new method significantly improves upon all these methods.
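The abstract names LinUCB as the base algorithm but does not reproduce it. Below is a minimal sketch of the standard disjoint LinUCB step (Li et al., 2010), assuming per-arm feature vectors that have already been built from past ratings of previous users; the function names and the feature construction are illustrative placeholders, not the paper's implementation.

    import numpy as np

    def linucb_init(arm_ids, d):
        # One ridge-regression model per arm: A accumulates x x^T, b accumulates r x.
        A = {a: np.eye(d) for a in arm_ids}
        b = {a: np.zeros(d) for a in arm_ids}
        return A, b

    def linucb_choose(contexts, A, b, alpha=1.0):
        # contexts: arm -> d-dimensional feature vector for the current (new) user.
        # Returns the arm with the highest upper confidence bound on expected reward.
        best_arm, best_ucb = None, -np.inf
        for a, x in contexts.items():
            A_inv = np.linalg.inv(A[a])
            theta = A_inv @ b[a]                     # ridge-regression estimate
            ucb = theta @ x + alpha * np.sqrt(x @ A_inv @ x)
            if ucb > best_ucb:
                best_arm, best_ucb = a, ucb
        return best_arm

    def linucb_update(A, b, arm, x, reward):
        # Update only the chosen arm with the observed reward.
        A[arm] += np.outer(x, x)
        b[arm] += reward * x

The cold-start-specific part of the paper, i.e. how the ratings of previous users are turned into the context vectors for a user with no history, is not detailed in the abstract and is left abstract here as well.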