Introducing Recommenders
This chapter covers:
•What recommenders are, within Mahout
•A first look at a Recommender in action
•Evaluating accuracy and quality of recommender engines
•Evaluating a recommender on a real data set: GroupLens
Each day we form opinions about things we like, don't like, and don't even care about. It happens unconsciously. You hear a song on the radio and notice it either because it's catchy or because it sounds awful, or maybe you don't notice it at all. The same thing happens with t-shirts, salads, hairstyles, ski resorts, faces, and television shows.
Although people's tastes vary, they do follow patterns. People tend to like things that are similar to other things they like. Because I love bacon-lettuce-and-tomato sandwiches, you can guess that I would enjoy a club sandwich, which is mostly the same sandwich but with turkey. Likewise, people tend to like things that similar people like.
These patterns can be used to predict such likes and dislikes. Recommendation is all about predicting these patterns of taste, and using them to discover new and desirable things you didn’t already know about.
After introducing the idea of recommendation in more depth, this chapter will help you experiment with some Mahout code to run a simple recommender engine and understand how well it works, giving you an immediate feel for how Mahout approaches recommendation.
2.1 What is recommendation?
You picked up this book from the shelf for a reason. Maybe you saw it next to other books you know and find useful, and figured the bookstore had put it there because people who like those books tend to like this one too. Maybe you saw this book on the shelf of a coworker, who you know shares your interest in machine learning, or perhaps he recommended it to you directly.
These are different but valid strategies for discovering new things. To discover items you may like, you could look to what people with similar tastes seem to like. On the other hand, you could figure out what items are like the ones you already like, again by looking to others’ apparent preferences. In fact, these describe the two broadest categories of recommender engine algorithms: “user-based” and “item-based” recommenders, both of which are well-represented within Mahout.
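To make the two strategies concrete, here is a minimal sketch in plain Java. It does not use Mahout's API; the class name TasteSketch, the toy data, and the overlap-count scoring are all assumptions made for illustration. The user-based method scores an unseen item by how much the target user has in common with each user who likes it. The item-based method scores an unseen item by how many users like both it and something the target user already likes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; this is not Mahout code.
class TasteSketch {

    // Toy data (assumed): user ID -> IDs of the items that user likes.
    static Map<Long, Set<Long>> sampleData() {
        Map<Long, Set<Long>> likes = new HashMap<>();
        likes.put(1L, new HashSet<>(Arrays.asList(101L, 102L)));
        likes.put(2L, new HashSet<>(Arrays.asList(101L, 102L, 103L)));
        likes.put(3L, new HashSet<>(Arrays.asList(102L, 104L)));
        likes.put(4L, new HashSet<>(Arrays.asList(105L)));
        return likes;
    }

    // Number of elements the two sets have in common.
    static int overlap(Set<Long> a, Set<Long> b) {
        Set<Long> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    // User-based: look to users with similar tastes and see what else they like.
    static Map<Long, Integer> userBased(Map<Long, Set<Long>> likes, long user) {
        Set<Long> mine = likes.get(user);
        Map<Long, Integer> scores = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> other : likes.entrySet()) {
            if (other.getKey() == user) continue;
            int similarity = overlap(mine, other.getValue());
            if (similarity == 0) continue;          // nothing in common, ignore
            for (long item : other.getValue()) {
                if (!mine.contains(item)) scores.merge(item, similarity, Integer::sum);
            }
        }
        return scores;
    }

    // Item-based: look for items that are like the ones the user already likes,
    // where "like" means "liked by the same users".
    static Map<Long, Integer> itemBased(Map<Long, Set<Long>> likes, long user) {
        Map<Long, Set<Long>> likedBy = new HashMap<>();   // item ID -> user IDs
        for (Map.Entry<Long, Set<Long>> e : likes.entrySet()) {
            for (long item : e.getValue()) {
                likedBy.computeIfAbsent(item, k -> new HashSet<>()).add(e.getKey());
            }
        }
        Set<Long> mine = likes.get(user);
        Map<Long, Integer> scores = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> candidate : likedBy.entrySet()) {
            if (mine.contains(candidate.getKey())) continue;
            int score = 0;
            for (long liked : mine) {
                score += overlap(candidate.getValue(), likedBy.get(liked));
            }
            if (score > 0) scores.put(candidate.getKey(), score);
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> likes = sampleData();
        System.out.println(userBased(likes, 1L));   // prints {103=2, 104=1}
        System.out.println(itemBased(likes, 1L));   // prints {103=2, 104=1}
    }
}
```

On this tiny data set both strategies happen to rank item 103 ahead of item 104 for user 1; on real data they can disagree. Note that the items are nothing more than numeric IDs: neither method needs to know what an item actually is.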
2.2 Collaborative filtering, not content-based recommendation
Strictly speaking, the scenarios above are examples of “collaborative filtering” -- producing recommendations based on, and only based on, knowledge of users’ relationships to items. These techniques require no knowledge of the properties of the items themselves. This is, in a way, an advantage. This recommender framework does not care whether the “items” are books, theme parks, flowers, or even other people, since nothing about their attributes enters into any of the input.
Other approaches are based on the attributes of items, and are generally referred to as “content-based” recommendation techniques. For example, if a friend recommended this book to you because it’s a Manning book, and the friend likes other Manning books, then the friend is engaging in something more like content-based recommendation. The thought is based on an attribute of the books: the publisher. The Mahout recommender framework does not directly implement these techniques, though it offers some ways to inject item attribute information into its computations. As such, it might technically be called a collaborative filtering framework.
There is nothing wrong with content-based techniques; on the contrary, they can work quite well. They are, however, necessarily domain-specific approaches, and would be hard to codify meaningfully into a framework. To build an effective content-based book recommender, one would have to decide which attributes of a book -- page count, author, publisher, color, font -- are meaningful, and to what degree. None of this knowledge translates into any other domain; recommending books this way doesn’t help in recommending pizza toppings.
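The domain-specific nature of content-based techniques shows up quickly in code. The following is a rough sketch, not anything Mahout provides: the class ContentSketch, the attribute names, and especially the weights are invented for illustration. It scores two books as similar according to hand-picked attribute weights.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Mahout code.
class ContentSketch {

    // Hand-picked weights (assumed values): which book attributes matter, and how much.
    static final Map<String, Double> BOOK_WEIGHTS = new LinkedHashMap<>();
    static {
        BOOK_WEIGHTS.put("author", 0.6);
        BOOK_WEIGHTS.put("publisher", 0.3);
        BOOK_WEIGHTS.put("font", 0.1);
    }

    // Builds an attribute map from alternating name/value arguments.
    static Map<String, String> attrs(String... namesAndValues) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            m.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return m;
    }

    // Similarity = sum of the weights of the attributes on which the two items agree.
    static double similarity(Map<String, String> a, Map<String, String> b) {
        double score = 0.0;
        for (Map.Entry<String, Double> w : BOOK_WEIGHTS.entrySet()) {
            String value = a.get(w.getKey());
            if (value != null && value.equals(b.get(w.getKey()))) {
                score += w.getValue();
            }
        }
        return score;
    }

    public static void main(String[] args) {
        Map<String, String> bookA = attrs("author", "Author A", "publisher", "Manning");
        Map<String, String> bookB = attrs("author", "Author B", "publisher", "Manning");
        // The two books share only a publisher, so only that weight counts.
        System.out.println(similarity(bookA, bookB));

        // The book weights say nothing about pizza: even identical toppings score zero.
        Map<String, String> pizza = attrs("topping", "mushroom");
        System.out.println(similarity(pizza, pizza));
    }
}
```

Every choice in BOOK_WEIGHTS is a judgment about books. Moving to another domain means choosing new attributes and new weights from scratch, whereas a collaborative filtering approach would carry over unchanged.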