Transfer Learning to Predict Missing Ratings in Recommender Systems
Weike Pan, Nathan Liu, Evan Xiang and Qiang Yang
Data sparsity due to missing ratings is a major challenge for collaborative filtering (CF) techniques in recommender systems. This is especially true for CF domains where the ratings are expressed numerically. We observe that, while numerical ratings may be scarce, more data is often available in the form of binary ratings, particularly when users can easily express their likes and dislikes for certain items. In this paper, we explore how to use binary preference data expressed in the form of like/dislike to reduce the impact of data sparsity on the more expressive numerical ratings. We do this by transferring the rating knowledge of a latent rating space from an auxiliary data source in the form of like/dislike to a target numerical rating matrix. Our solution is to model both the numerical ratings and the like/dislike data in a principled way, using a novel collective matrix tri-factorization (CMTF) framework. In particular, we construct the shared latent space collectively and learn the data-dependent effect separately. A major advantage of the CMTF approach over previous collective matrix factorization (or bi-factorization) methods is that it captures the data-dependent effect while sharing the data-independent knowledge, which increases the overall quality of knowledge transfer. Experimental results demonstrate the effectiveness of CMTF at various sparsity levels compared with several state-of-the-art methods.
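The idea of sharing a latent space while keeping a data-dependent component per matrix can be illustrated with a minimal sketch. It is not the optimization procedure of the paper: it assumes that the shared knowledge is a pair of user and item factor matrices (U, V), that each data source has its own small inner matrix (B for numerical ratings, B_aux for like/dislike), and that plain gradient descent on a masked squared error is an acceptable stand-in for the actual learning algorithm. The function name, the toy data, the rescaling of ratings to [0, 1], and all hyperparameters are hypothetical.

```python
import numpy as np

def cmtf_sketch(R, M, A, N, d=2, lr=0.02, reg=0.01, iters=500, seed=0):
    """Illustrative tri-factorization with shared U, V and separate inner matrices.

    R: target numerical ratings, scaled to [0, 1]; M: 1 where R is observed.
    A: auxiliary like (1) / dislike (0) matrix;   N: 1 where A is observed.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, d))   # shared user factors
    V = 0.1 * rng.standard_normal((m, d))   # shared item factors
    B = np.eye(d)                           # data-dependent part, target
    B_aux = np.eye(d)                       # data-dependent part, auxiliary
    losses = []
    for _ in range(iters):
        # Residuals on observed entries only
        E = M * (U @ B @ V.T - R)
        E_aux = N * (U @ B_aux @ V.T - A)
        losses.append(0.5 * (np.sum(E ** 2) + np.sum(E_aux ** 2)))
        # Shared factors receive gradients from both data sources
        gU = E @ V @ B.T + E_aux @ V @ B_aux.T + reg * U
        gV = E.T @ U @ B + E_aux.T @ U @ B_aux + reg * V
        # Each inner matrix receives gradients from its own data only
        gB = U.T @ E @ V + reg * B
        gB_aux = U.T @ E_aux @ V + reg * B_aux
        U -= lr * gU
        V -= lr * gV
        B -= lr * gB
        B_aux -= lr * gB_aux
    return U, V, B, B_aux, losses

# Toy data (hypothetical): 4 users x 4 items, 0 marks a missing rating
R_raw = np.array([[5, 0, 3, 0],
                  [0, 4, 0, 1],
                  [2, 0, 0, 5],
                  [0, 1, 4, 0]], dtype=float)
M = (R_raw > 0).astype(float)
R = R_raw / 5.0                             # assumption: rescale 1..5 to (0, 1]
A = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
N = np.array([[1, 1, 1, 0],
              [1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)

U, V, B, B_aux, losses = cmtf_sketch(R, M, A, N)
R_hat = 5.0 * (U @ B @ V.T)                 # predicted ratings on the 1..5 scale
```

In this sketch the like/dislike matrix influences the rating predictions only through the shared U and V, while B and B_aux absorb whatever is specific to each kind of feedback.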