ABSTRACT
In this paper, we propose a novel Deep Reinforcement Learning framework for news recommendation. ...
Therefore, to address the aforementioned challenges, we propose a Deep Q-Learning based recommendation framework, which can model future reward explicitly.
1 INTRODUCTION
Several groups of methods have been proposed to solve the online personalized news recommendation problem, including content-based methods ...
Therefore, in this paper, we propose a Deep Reinforcement Learning framework that can help to address these three challenges in online personalized news recommendation. First,
Our contributions can be summarized as follows:
• We propose a reinforcement learning framework for online personalized news recommendation. Although we focus on news recommendation, our framework can be generalized to many other recommendation problems.
• We consider user activeness to help improve recommendation accuracy, since it provides information beyond what user click labels alone can offer.
• A more effective exploration method, Dueling Bandit Gradient Descent, is applied, which avoids the drop in recommendation accuracy induced by classical exploration methods, e.g., ϵ-greedy and Upper Confidence Bound.
• Our system has been deployed online in a commercial news recommendation application. Extensive offline and online experiments have shown the superior performance of our methods.
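To make the contrast between the exploration strategies named above concrete, the following is a minimal sketch rather than the system's actual implementation. It assumes the general form of Dueling Bandit Gradient Descent, in which a slightly perturbed copy of the current parameters competes against the current parameters, and the current parameters move toward the copy only when the copy receives better feedback. The names `epsilon_greedy`, `perturb`, `dbgd_step`, `alpha`, `eta`, and `feedback` are hypothetical and introduced only for illustration.

```python
import random


def epsilon_greedy(scores, epsilon, rng):
    """Classical exploration: with probability epsilon recommend a random item.

    The random item may be entirely unrelated to the user's interests,
    which is the source of the short-term accuracy drop.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(scores))
    return max(range(len(scores)), key=scores.__getitem__)


def perturb(weights, alpha, rng):
    """Build candidate parameters by a small relative random perturbation.

    Assumption: each weight is shifted by at most alpha times its magnitude.
    """
    return [w + alpha * rng.uniform(-1.0, 1.0) * w for w in weights]


def dbgd_step(weights, alpha, eta, feedback, rng):
    """One Dueling Bandit Gradient Descent style update (illustrative only).

    feedback(current, candidate) is a hypothetical callback that returns True
    when the candidate's recommendations received better user feedback.
    """
    candidate = perturb(weights, alpha, rng)
    if feedback(weights, candidate):
        # Move a fraction eta of the way toward the winning candidate.
        return [w + eta * (c - w) for w, c in zip(weights, candidate)]
    # Otherwise keep the current parameters unchanged.
    return list(weights)
```

Because the candidate stays close to the current parameters, the items it recommends stay close to what the current model would recommend, whereas ϵ-greedy can surface arbitrary items.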
The rest of the paper is organized as follows. Related work is discussed in Section 2. Then, in Section 3 we present the problem definitions. Our method is introduced in Section 4.
After that, the experimental results are shown in Section 5. Finally, brief conclusions are given in Section 6.