8/3/2019 Ganging Up of Info Overload
106 Computer
Instead of being starved for information, we find
ourselves overloaded.
Humans have always addressed
the high cost of finding information
by sharing it, inventing
oral traditions, written lan-
guage, and the Web as informa-
tion-sharing tools. The printing press,
broadcast media, and most recently the
Internet have all changed the nature of
the information problem. Information is
no longer scarce. Indeed, there is far too much of it for any one person to review,
let alone organize. Instead of being
starved for information, we find our-
selves overloaded.
When information is abundant, the
knowledge of which information is useful
and valuable matters most. We all use
our network of family, friends, and col-
leagues to recommend movies, books,
cars, and news articles. Collaborative fil-
tering technology automates the process
of sharing opinions on the relevance and
quality of information.
Collaborative filtering is one technique
among many information filtering techniques
that range from unfiltered to personalized
and from effortless to laborious,
as illustrated by the chart shown as
Figure 1.
Libraries or the Web are good exam-
ples of unfiltered information sources.
E-mail directed to one recipient is a
good example of a filtered information
source. A best-seller list requires little
effort for the user, but provides the same
recommendations to all users, so it is in
the upper left of the chart. Filters based
on demographics, such as age, sex, or
marital status, require some effort from
the user in providing the demographics,
and provide some level of personal filtering,
so they are near the middle of the
chart.
Collaborative filtering requires rela-
tively little effort from the user, and pro-
vides individually targeted recommen-
dations, so it is in the upper right of the
chart.
Effort, of course, can be reduced via
automation. While collaborative filtering
is not necessarily effortless, it requires a
relatively small amount of effort on the
part of the user and provides very indi-
vidualized recommendations. The col-
laborative filtering systems that we
discuss here each offer a high degree of
personalization, but each system takes
a different approach to automation,
attempting to find the best trade-off
between the amount of work the users
must put into the system and the per-
ceived value and benefits they receive in
return.
TAPESTRY
Collaborative filtering research began in the early 1990s at Xerox PARC in
response to the overwhelming number of
e-mail messages within PARC, which
numbered far more than could be easily
managed by mailing lists and keyword
filtering.
The Tapestry system enabled users to
add annotations to messages. Two data-
bases stored the incoming stream of doc-
uments and the linked annotation
records. A sophisticated query system
allowed users to browse for messages
based on both their content and annotations.
Users could set up standing filter
queries that would watch the document
stream and annotation records, finding
documents that matched the query at any
time, present or future. For instance, a
user could ask for all messages about collaborative
filtering rated excellent by a
superior. Only when the message was
rated excellent would it be selected and
forwarded to the user.
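
The standing-query behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tapestry's actual query language or storage design: the names `Message`, `StandingQuery`, and `MessageStore`, the annotator name "boss", and the rating label "excellent" are all hypothetical. The point it shows is that a filter is re-checked both when a document arrives and when an annotation is added later, so a message is forwarded only once it has been rated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    msg_id: str
    text: str
    # Maps an annotator's name to the rating label that person attached.
    annotations: Dict[str, str] = field(default_factory=dict)


@dataclass
class StandingQuery:
    owner: str
    matches: Callable[[Message], bool]
    delivered: List[str] = field(default_factory=list)

    def check(self, message: Message) -> None:
        """Forward the message the first time it satisfies the filter."""
        if message.msg_id not in self.delivered and self.matches(message):
            self.delivered.append(message.msg_id)


class MessageStore:
    """Holds documents and annotations and re-runs standing queries."""

    def __init__(self) -> None:
        self.messages: Dict[str, Message] = {}
        self.queries: List[StandingQuery] = []

    def _notify(self, message: Message) -> None:
        for query in self.queries:
            query.check(message)

    def add_query(self, query: StandingQuery) -> None:
        self.queries.append(query)
        # Messages already in the store are checked immediately ("present").
        for message in self.messages.values():
            query.check(message)

    def add_message(self, message: Message) -> None:
        self.messages[message.msg_id] = message
        self._notify(message)

    def annotate(self, msg_id: str, annotator: str, rating: str) -> None:
        message = self.messages[msg_id]
        message.annotations[annotator] = rating
        # A new annotation can make an old message match ("future").
        self._notify(message)


# Usage: messages about collaborative filtering rated excellent by a superior.
store = MessageStore()
query = StandingQuery(
    owner="sally",
    matches=lambda m: "collaborative filtering" in m.text.lower()
    and m.annotations.get("boss") == "excellent",
)
store.add_query(query)
store.add_message(Message("m1", "Notes on collaborative filtering"))
store.annotate("m1", "boss", "excellent")  # only now is m1 forwarded
```

Re-running every query on every change is the simplest possible design; a real system handling a large document stream would presumably need indexing to avoid that cost.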
Tapestry was the first step in automat-
ing recommendation-sharing among
friends and colleagues. It capitalized on
the idea that humans working with computers
could be more effective information
filters than computers or humans
working alone. People understand and
judge information in ways that current
computer systems cannot, largely be-
cause people can more readily determine
quality as well as content. Because users
needed to know whose recommenda-
tions to follow, Tapestry worked best in
a small community of people who already
knew each other.
Ganging Up on Information Overload
Al Borchers, Jon Herlocker, Joseph Konstan, and John Riedl, University of Minnesota
Internet Watch
Editor: Ron Vetter, University of North
Carolina at Wilmington, Mathematical
Sciences Dept., 601 South College Rd.,
Wilmington, NC 28403; voice (910) 962-3671,
fax (910) 962-7107; vetter@cms.uncwil.edu
April 1998 107
USENET AND GROUPLENS
Usenet, one of the earliest and largest
bulletin board systems, was originally a
valuable source of information. But as
the number of users grew, the system
became increasingly overloaded. It
reached the point where most users
found only a few useful articles in a
group filled with dozens or even hundreds
of articles a day.
The GroupLens system, started at the
University of Minnesota in 1992, at-
tempts to make Usenet useful again by
providing personalized predictions on
the quality of the messages. (GroupLens
has undergone dramatic changes, including
being commercialized by Net Perceptions.
This column discusses the GroupLens
Research system.) Because there are
differences among user tastes, the
GroupLens system asks users to rate articles
on a one to five scale. GroupLens
then collects and compares these ratings
to find users sharing similar tastes. If, for
example, you need a prediction for an
unread article, GroupLens would see
how the other users sharing your tastes
had rated the article. If they liked it,
chances are you will too, so GroupLens
gives that article a high ranking.
GroupLens extends the Tapestry model
in several ways. The small Tapestry communities
were limited to reading and evaluating
only a relatively small set of
messages. A large community was needed
to generate recommendations across a
large stream of information, like Usenet
news. GroupLens created a large virtual
community where users could share rec-
ommendations without actually knowing
each other. Surprisingly, the virtual com-
munity of GroupLens allowed personal-
ization at the same time it assured privacy
and anonymity. You did not need to know
the identity of those you correlated with
to gain the benefit of their recommendations, unlike Tapestry where the benefits
came directly from your personal rela-
tionships with recommenders.
Conceptually, GroupLens works by
computing a correlation distance between
each pair of users. For example,
users who are close to user Jane, accord-
ing to the distance function, form a
neighborhood for Jane. GroupLens uses
the opinions of the users in Jane's neighborhood
to form predictions about her
interests. The opinions are weighted
according to how close each member of
the neighborhood is to Jane.
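
The neighborhood idea can be made concrete with a short Python sketch. It assumes the Pearson correlation over co-rated articles as the closeness measure and a correlation-weighted average of mean-centered ratings as the prediction; that is one common formulation of this kind of scheme, and nothing here should be read as the exact GroupLens engine. The `Ratings` layout and the function names `pearson` and `predict` are illustrative assumptions.

```python
from math import sqrt
from typing import Dict, Iterable, Optional

# user -> {article id: rating on the one to five scale}
Ratings = Dict[str, Dict[str, float]]


def mean(values: Iterable[float]) -> float:
    values = list(values)
    return sum(values) / len(values)


def pearson(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Correlation over the articles both users rated; 0.0 if undefined."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = mean(a[i] for i in common)
    mean_b = mean(b[i] for i in common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(ratings: Ratings, user: str, article: str) -> Optional[float]:
    """Predict `user`'s rating of `article` from correlated neighbors.

    Each neighbor contributes how far above or below their own average
    they rated the article, weighted by their correlation with `user`.
    Returns None when no correlated neighbor has rated the article.
    The result is not clamped to the rating scale.
    """
    user_mean = mean(ratings[user].values())
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or article not in other_ratings:
            continue
        weight = pearson(ratings[user], other_ratings)
        if weight == 0.0:
            continue
        other_mean = mean(other_ratings.values())
        numerator += weight * (other_ratings[article] - other_mean)
        denominator += abs(weight)
    if denominator == 0.0:
        return None
    return user_mean + numerator / denominator


# Made-up ratings: Ann tends to agree with Jane, Bob tends to disagree.
ratings = {
    "jane": {"a1": 5, "a2": 1, "a3": 4},
    "ann": {"a1": 5, "a2": 2, "a3": 4, "a4": 5},
    "bob": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}
prediction = predict(ratings, "jane", "a4")
```

Note that a negatively correlated user is still informative in this formulation: Bob's low rating of `a4` pushes Jane's prediction up, because his tastes run opposite to hers.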
GroupLens shows that predictions
from an automated recommender system
can be meaningful to users. Predictions
generated by the GroupLens engine cor-
relate well with user ratings and are more
accurate than average ratings. Highly
rated articles are more likely to be read
and rated, which means that users are
more likely to rate articles so that the sys-
tem can better understand their interests.
RINGO AND VIDEO RECOMMENDER
In the mid-1990s other systems experimented
with variations on the GroupLens
model and algorithm. Upendra Shard-
anand and Pattie Maes developed Ringo,
an e-mail and Web system that recom-
mends music. They compared the Group-
Lens algorithm with others based on
different statistical measures of similarity
and another based on similarities among
the music CDs rather than among users.
Their research verified that predictions
improve as more ratings are collected.
Video Recommender, which makes
recommendations on movies, found a
middle ground on the trade-off between
lots of work and lots of value (the
Tapestry model) and no work and little
value (ratings by movie critics). In
exchange for submitting ratings on a
selected set of movies, the system gener-
ates personalized predictions that are
more accurate than critic recommendations.
Video Recommender's predictions
have a 0.62 correlation coefficient, while
movie critics achieve only a 0.22 correlation
coefficient.
Both Ringo and Video Recommender
show that collaborative filtering can
apply to all media, even domains like
music and movies where computer-based
content analysis is not yet possible. These
systems showed collaborative filtering
allows serendipity where content-based
systems might not. If you've shown interest
only in country-western music, for
[Figure 1. Information retrieval techniques. The vertical dimension (manual to automatic) indicates how difficult it is for
the end user to access the filtered information, while the horizontal dimension (impersonal to personal) indicates the level
of personalization. Techniques plotted: best-seller list, New York Times critic, Consumer Reports, implicit collaborative
filtering, explicit collaborative filtering, Web surfing, Tapestry, library research, word of mouth, and demographics.
Filters based on demographics require some effort from the end user and provide some level of personal filtering,
so would be placed near the middle of the chart. Automated collaborative filtering requires relatively little effort
from the end user and provides individually targeted recommendations, so it would be placed in the upper right corner of the chart.]
example, a content filter would only rec-
ommend more country western. In a col-
laborative recommender system, how-
ever, users whose interests correlate with
yours on country western might lead you
to discover blues albums of interest.
Ringo and Video Recommender also
extend the virtual community to a real
connected community by allowing users
to post comments for others to read and
by revealing e-mail addresses of users
who have volunteered to reveal their
identities. Users wanted to get to know
others who shared their tastes and even
requested a Video Recommender singles
club. The knowledge derived from such
clubs made users more confident in the recommendations they received.
LOTUS
Lotus developed an active collaborative
filtering system that revived the
Tapestry model. Lotus researchers
believed people could always give more
relevant recommendations than any
computed function, so they chose to ask
for more work from the users in
exchange for better predictions about
user interests.
Built in Lotus Notes, the system made
it easy to send pointers to Web pages. Pointers could include hypertext links
and annotations explaining the content,
context, and relevance of the document.
One pointer, for example, might read:
"Sally, you should definitely see this page
on collaborative filtering. Jane." In the
Lotus system, pointers could be sent to
groups or individuals or published for all
to see.
Lotus found a striking division be-
tween those who would provide infor-
mation and those who would use it. In
the system Lotus implemented, one user
was responsible for 80 percent of the pointers. These information mediators
can help ensure the quality of informa-
tion, helping other users grow to trust
their recommendations.
As in Tapestry, in small social work-
groups information mediators may gen-
erate enough value to spend much of
their time mediating information. In
large anonymous groups, information
mediation may require shared work from
a larger community.
GroupLens continues to evolve at the
University of Minnesota and we are
experimenting with new techniques
to help people find information that is of
value to them. We've found that time
spent reading is a fairly accurate measure
of a user's rating for an article. Future
GroupLens systems, then, will use time
measurements to gather implicit ratings
and to build predictions from those rat-
ings. Users can of course immediately see
the benefits of such a system, which
requires little extra work to personalize
their information needs.
Collaborative filtering can also incorporate
agent technology through filtering
robots. Filterbots can automatically
rate new articles as they appear by using
different content analysis algorithms.
The first human raters to see these arti-
cles will already see predictions, person-
alized by their correlations with the
various filterbots. New prediction algo-
rithms will likely help counter the spar-
sity problem, where users have not rated
enough items in common to correlate,
and the scalability problem, where huge
numbers of users and items require overwhelming
computing resources.
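
As a rough illustration of the filterbot idea, the Python sketch below treats each bot as one more "user" whose ratings come from a simple content heuristic and are stored in the same ratings table as human ratings. Everything specific here is an assumption made for the example: the two heuristics (share of quoted lines, article length), their thresholds, and the names `quote_bot`, `length_bot`, and `run_filterbots` are hypothetical and do not describe the GroupLens filterbots themselves.

```python
from typing import Callable, Dict

# user or bot name -> {article id: rating on the one to five scale}
Ratings = Dict[str, Dict[str, float]]


def quote_bot(text: str) -> float:
    """Hypothetical heuristic: the less quoted text, the higher the rating."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return 1.0
    quoted = sum(1 for line in lines if line.lstrip().startswith(">"))
    return float(round(1 + 4 * (1 - quoted / len(lines))))


def length_bot(text: str) -> float:
    """Hypothetical heuristic: shorter articles get higher ratings."""
    words = len(text.split())
    if words <= 200:
        return 5.0
    if words <= 1000:
        return 3.0
    return 1.0


def run_filterbots(
    bots: Dict[str, Callable[[str], float]],
    article_id: str,
    text: str,
    ratings: Ratings,
) -> None:
    """Record each bot's rating as if the bot were an ordinary user."""
    for bot_name, bot in bots.items():
        ratings.setdefault(bot_name, {})[article_id] = bot(text)


# Usage: rate a newly arrived article before any human has seen it.
ratings: Ratings = {}
bots = {"quote_bot": quote_bot, "length_bot": length_bot}
article = "> old text\n> more old text\nA new remark.\nAnother new remark."
run_filterbots(bots, "a1", article, ratings)
```

Because the bots' ratings sit alongside human ratings, a reader whose past ratings happen to correlate with a given bot would get predictions driven by that bot, while a reader who does not correlate with it would effectively ignore it.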
Acknowledgements
The authors gratefully acknowledge
the contributions of GroupLens co-
founder Paul Resnick, Hack Week par-
ticipants Dave Maltz and Brad Miller, all
the members of the GroupLens Research
team, and the support of the National
Science Foundation under grant IRI-
9613960.
Al Borchers is a visiting faculty member
and postdoctoral researcher at the University
of Minnesota. He is developing
collaborative filtering algorithms to
power the next GroupLens system.
Jon Herlocker is a PhD student at the
University of Minnesota, researching
algorithmic issues in collaborative filtering
and ways to measure the effectiveness
of recommender systems.
Joseph Konstan is an assistant professor
of computer science and engineering at
the University of Minnesota. He also
serves as consulting scientist for Net Perceptions,
a company that he cofounded
to commercialize collaborative filtering.
John Riedl is an associate professor of
computer science and engineering at the
University of Minnesota. He is also chief
technical officer of Net Perceptions and
the cocreator of GroupLens.
Contact the authors at
{borchers,herlocke,konstan,riedl}@cs.umn.edu.
Web Recommender Systems
To experiment with recommender
systems, you don't have to wait. There
are already some online. Here are a
few good examples:
http://www.wisewire.com
http://www.amazon.com
http://www.moviefinder.com
http://www.movielens.umn.edu
http://www.cdnow.com
http://www.bignote.com
To Read More about Collaborative Filtering
D. Goldberg et al., "Using Collaborative Filtering to Weave an Information
Tapestry," Comm. ACM, Dec. 1992, pp. 61-70.
P. Resnick et al., "GroupLens: An Open Architecture for Collaborative Filtering of
Netnews," Proc. CSCW 94, ACM Press, New York, 1994, pp. 175-186.
J. Konstan et al., "GroupLens: Collaborative Filtering for Usenet News," Comm.
ACM, Mar. 1997, pp. 77-87.
D.A. Maltz and K. Erlich, "Pointing The Way: Active Collaborative Filtering,"
Proc. CHI 95, ACM Press, New York, 1995, pp. 202-209.
W. Hill et al., "Recommending and Evaluating Choices in a Virtual Community
of Use," Proc. CHI 95, ACM Press, New York, 1995, pp. 194-201.
U. Shardanand and P. Maes, "Social Information Filtering: Algorithms for Automating
Word of Mouth," Proc. CHI 95, ACM Press, New York, 1995, pp. 210-217.