Intelligent Agents
Katia Sycara
The E-Commerce Institute
katia@cs.cmu.edu
www.cs.cmu.edu/~softagents
Teaching assistant: Joe Giampapa
garof@cs.cmu.edu
Internet Agents
• Web search agents
• Information filtering agents
• Off-line delivery agents
• Notification agents
• Service agents
• Web site agents
• Mobile agents
Information Search
• Ways to Find Information
– Browsing: Following hyper-links that seem of interest
– Searching: Sending a query to a search engine such as Lycos
– Categories: Following existing categories such as Yahoo
• Problems
– Browsing takes a lot of time and effort to navigate. Can search be made more efficient?
– Searching makes it difficult to express the user’s intention accurately.
– Search engines are not personalized
Search Engines
Web etiquette guidelines for spiders
• Identify the name of the agent
• Identify the user deploying the agent
• Announce the agent by posting a message to the comp.infosystems.www.providers Usenet newsgroups
• Announce the agent to the Webmasters of the servers the agent will visit
• Provide additional information (using the Referrer field)
• Be accessible to fix problems the agent may cause
• Design the agent so it does not consume a lot of resources (e.g. does not make successive hits on a single server, does not loop, runs at appointed times, etc.)
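A few of the guidelines above can be sketched in code. This is only an illustration under stated assumptions: the agent name, e-mail address, and the `HostThrottle` helper are hypothetical, and the choice of the HTTP `User-Agent`, `From`, and `Referer` headers to carry the identifying information is one reasonable reading of the slide, not something the slide prescribes.

```python
import urllib.request
from urllib.parse import urlsplit


def build_polite_request(url, agent_name, operator_email, referer=None):
    """Build a request that identifies the agent and the person deploying it."""
    headers = {
        "User-Agent": agent_name,   # name of the agent
        "From": operator_email,     # user deploying the agent
    }
    if referer:
        headers["Referer"] = referer  # additional information about the agent
    return urllib.request.Request(url, headers=headers)


class HostThrottle:
    """Track the last hit per server so the agent never hammers one host."""

    def __init__(self, min_delay_seconds):
        self.min_delay = min_delay_seconds
        self.last_hit = {}  # host -> time of the most recent request

    def wait_needed(self, url, now):
        """Return how many seconds to wait before `url` may be fetched."""
        last = self.last_hit.get(urlsplit(url).netloc)
        if last is None:
            return 0.0
        return max(0.0, self.min_delay - (now - last))

    def record_hit(self, url, now):
        """Remember that the host of `url` was contacted at time `now`."""
        self.last_hit[urlsplit(url).netloc] = now
```

The throttle takes the current time as an argument rather than reading a clock, so a scheduler can decide when to sleep and the logic stays easy to check.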
Advantages and Disadvantages of Search Engines
Feature | Advantage | Disadvantage
Keyword query | Ease of use | Lost productivity due to poor precision
Instant response | Increased productivity, if the user knows what he is looking for | Decreased productivity, due to chasing links
Hierarchical subject categories | Increased productivity due to high precision | Low recall in response to user needs
Information discovery via spiders | Reduced user workload | Lack of scalability and bandwidth inefficiency
Limitations of current search engines
• Lack of personalization; this results in low precision of answers
• Lack of scalability:
– the robot must visit not only new links but also old ones to keep them up to date
– the information gathering is centralized
Some solutions to scalability issues:
• use specialized information brokers for building information indices
• use massive replication and caching of popular information
• distribute information gathering by placing gatherers on the provider’s site; information is then ready for analysis as new information comes in, but the provider must implement the software.
Information Filtering Agents
• Information Filtering agents find the content of interest to a user.
• Information Filtering agents could gather information from different sources
• They could filter information based on user’s personal interest
• Filtering agents typically use a fixed number of information sources
• Information filtering agents may use Information Retrieval techniques
– Vector space models, where a document is represented as a vector of attributes
– Tree structure, which represents a hierarchical view of a document
Filtering Agents Attributes
Element | Description
Environment | Internet
Task Skills | Information gathering, filtering, presentation
Knowledge | Web, news in different domains
Communication | HTTP, HTML, indexing protocols
Filtering Agent Architecture
[Figure 3.4: Filtering based on word usage. Axis: word usage frequency. Marked regions: insignificant low-frequency words, insignificant high-frequency words.]
Benefits of Information Filtering Agents
Feature | Advantage | User benefit
Information profile | Easy-to-use, form-based specification | Good for persistent interests
Web page delivery | Information available as a Web page | Browser independent; requires site visits
E-mail delivery | Proactive information delivery | Eliminates site visits; e-mail clutter
Profile filtering | One-to-one “broadcasting” | Reduced information overload
Heterogeneous | Combines heterogeneous information sources | Reduces subscription costs
Functionality of WebMate
• Learning user’s interests for information filtering
– Multiple TF-IDF vectors representation
– Incremental and adaptive Learning
– Compile personal newspaper
• Support for efficiently finding information
– Automatic refinement using Trigger Pairs
– Relevance feedback
_____________________________
Chen, Sycara, “WebMate: A Personal Agent for Browsing and Searching”, Proceedings of the Second International Conference on Autonomous Agents, Minneapolis, MN, May 1998
Profile Representation
• Multiple TF-IDF vectors representation
• How many vectors are used? (A settable parameter; depends on the number of the user’s interests and on computational complexity)
• How many dimensions are used in a vector? (Computational complexity, typical lexicons in a domain)
Learning Algorithm
• Preprocess: Parse HTML page, delete stop words, stemming
• Extract TF-IDF vector of the current interesting document
• If the number of vectors in the profile is less than predefined number, add the vector to the profile
• Otherwise, calculate the cosine similarity between every two TF-IDF vectors in the profile
• Combine the two vectors with the greatest similarity.
• Sort the weights in the new vector in decreasing order and keep the highest several elements
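The update step above can be sketched as follows. This is a minimal illustration, not the WebMate implementation: vectors are plain dictionaries mapping terms to TF-IDF weights, preprocessing is assumed to have happened already, and two points the slide leaves open are filled with labeled assumptions (the new vector joins the pool before the closest pair is merged, and "combine" means adding weights term by term). The names `update_profile`, `max_vectors`, and `max_terms` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def update_profile(profile, doc_vector, max_vectors, max_terms):
    """Add one interesting document's TF-IDF vector to the profile (a list).

    Assumes max_vectors >= 1.
    """
    if len(profile) < max_vectors:
        return profile + [dict(doc_vector)]

    # Assumption: the new vector joins the pool before the closest pair is merged.
    pool = profile + [dict(doc_vector)]
    i, j = max(combinations(range(len(pool)), 2),
               key=lambda pair: cosine(pool[pair[0]], pool[pair[1]]))

    # Assumption: "combine" means adding the weights term by term.
    merged = dict(pool[i])
    for term, w in pool[j].items():
        merged[term] = merged.get(term, 0.0) + w

    # Sort weights in decreasing order and keep only the highest few terms.
    top = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:max_terms]
    rest = [vec for k, vec in enumerate(pool) if k not in (i, j)]
    return rest + [dict(top)]
```

Truncating the merged vector to `max_terms` entries keeps each profile vector small, which is what bounds the cost of the pairwise similarity pass.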
Compile Personal Newspaper
• Automatically spider a list of URLs or construct a query from the profile
• Calculate the similarity and check whether the similarity is greater than some threshold
• Experiments: Accuracy in top 10 is between 50% and 60%; Accuracy in top 20 is about 50%; Accuracy in the whole is about 30%
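The similarity check above might look like the sketch below. It assumes pages have already been fetched and turned into TF-IDF dictionaries, and it scores a page by its best match over all profile vectors, which is an assumption; the slide does not say how the several profile vectors are combined. The function name and the example URLs used with it are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compile_newspaper(profile, candidates, threshold):
    """Return (url, score) pairs whose best profile match exceeds `threshold`."""
    selected = []
    for url, vector in candidates.items():
        # Assumption: a page is scored by its best match over all profile vectors.
        score = max((cosine(vector, p) for p in profile), default=0.0)
        if score > threshold:
            selected.append((url, score))
    # Most similar pages first, as in a ranked front page.
    return sorted(selected, key=lambda item: item[1], reverse=True)
```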
Search Refinement
• Trigger Pairs Based Automated Refinement
– If a word S is significantly¹ correlated with another word T, then (S, T) is considered a “trigger pair”, with S being the trigger and T the triggered word.
• Relevance Feedback
– The context of the search keywords in the “relevant” pages is used to automatically refine the search
• Parallel Search and Rerank
• Similarity-based Query
___________________
¹Significance is measured by mutual information (MI):
$$MI(s,t) = P(s,t)\,\log\frac{P(s,t)}{P(s)\,P(t)}$$
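The mutual information score in the footnote can be estimated from co-occurrence counts. The sketch below assumes the corpus has been cut into windows of text, each represented as a set of words, and estimates each probability as the fraction of windows containing the word or the pair. That estimation scheme and the name `trigger_score` are assumptions made for illustration.

```python
import math


def trigger_score(windows, s, t):
    """MI(s, t) = P(s,t) * log(P(s,t) / (P(s) * P(t))) over text windows."""
    n = len(windows)
    n_s = sum(1 for w in windows if s in w)
    n_t = sum(1 for w in windows if t in w)
    n_st = sum(1 for w in windows if s in w and t in w)
    if n == 0 or n_st == 0:
        # The term P(s,t) * log(...) tends to 0 as P(s,t) goes to 0.
        return 0.0
    p_s, p_t, p_st = n_s / n, n_t / n, n_st / n
    return p_st * math.log(p_st / (p_s * p_t))
```

Pairs whose score exceeds a chosen cutoff would then be kept as trigger pairs; the cutoff itself is not given in the slides.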
Examples of Trigger Pairs
• Broadcast News Corpus: 140M words, Distance between S and T is 500
• Example 1: product << {maker, company, corporation, industry, incorporate, sale, computer, market, business, …}
• Example 2: car << {motor, auto, model, maker, vehicle, for, buick, honda, inventory, assembly, chevrolet, sale, …}
• Example 3: fare << {airline, maxsaver, carrier, discount, air, coach, flight, traveler, continental, unrestrict, ticket, …}
• Example 4: music << {symphony, orchestra, composer, song, concert, tune, concerto, sound, musician, album, …}
Automatic Search Refinement
• The user chooses the domain, and the system automatically expands the query using domain specific triggers or ontology
• The user chooses the intended definition of the ambiguous words, and the system expands the query according to that definition
• For a search with only one keyword, the top several triggers to the keyword are used to expand the search
• For a search with more than 2 keywords, the intersection of the triggers to the keywords is used to expand the search
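The last two rules above can be sketched as a small function. The trigger table is assumed to be a dictionary from a word to its triggered words in rank order; `expand_query` and `top_n` are hypothetical names. The slide leaves the case of exactly two keywords open, so treating it like the several-keyword case is an assumption.

```python
def expand_query(keywords, triggers, top_n=3):
    """Expand a query using ranked trigger lists (a dict: word -> ranked words)."""
    if not keywords:
        return []
    if len(keywords) == 1:
        # One keyword: take its top few triggers.
        extra = triggers.get(keywords[0], [])[:top_n]
    else:
        # Several keywords: keep only triggers shared by every keyword,
        # in the order they appear for the first keyword.
        shared = set(triggers.get(keywords[0], []))
        for word in keywords[1:]:
            shared &= set(triggers.get(word, []))
        extra = [w for w in triggers.get(keywords[0], []) if w in shared]
    return list(keywords) + [w for w in extra if w not in keywords]
```

With the trigger lists from the examples slide, a query for "car" alone would pick up its first few triggers, while "car product" would keep only the triggers the two words have in common.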
Relevance Feedback Algorithm
• The context of the search keywords in the “relevant” pages is used to refine the search
• Given a relevant page, the system looks for the context of the keywords, and calculates the frequency in order to use the top several frequent words to expand the query
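A minimal sketch of this feedback step follows, assuming the relevant page has already been tokenized into a list of words. The context is taken to be a fixed window of words on each side of a keyword occurrence; the window size, the stop-word handling, and the name `feedback_terms` are all assumptions for illustration.

```python
from collections import Counter


def feedback_terms(page_words, keywords, window=5, top_n=3, stop_words=()):
    """Return the most frequent words found near the keywords in a relevant page."""
    keyword_set = set(keywords)
    skip = keyword_set | set(stop_words)
    counts = Counter()
    for i, word in enumerate(page_words):
        if word in keyword_set:
            # Context: up to `window` words on each side of the keyword.
            context = page_words[max(0, i - window): i + window + 1]
            counts.update(w for w in context if w not in skip)
    return [w for w, _ in counts.most_common(top_n)]
```

The returned words would be appended to the original query before it is resubmitted.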
The Query Restart Problem
• Agent A sends query to Agent B.
• Agent B can complete the query in time X, where
X = 1 with probability p.
X = c (c > 1) with probability 1 - p.
Expectation: EX = p + (1 - p) c
• If not done by time 1, should agent A abort and restart, or wait?
• Can restarting reduce expectation? The variance? Both?
• Does it help to repeatedly restart k times?
_______________________
Chalasani, Jha, Shehory, Sycara, “Query Restart Strategies for Web Agents”, Proceedings of Autonomous Agents 98, Minneapolis, MN, May 1998
A Simple Scenario: Single restart
Strategy: restart just after time 1, if not done by then.
Let $X_i$ = completion time of the $i$-th query, $i = 1, 2$.
$X_1, X_2$ are independent, identically distributed.
New completion time is Y:
$$Y = \begin{cases} 1 & \text{if } X_1 = 1, \\ 1 + X_2 & \text{if } X_1 = c. \end{cases}$$
New expectation:
$$EY = p + (1 - p)(1 + EX_2) \quad (X_1, X_2 \text{ independent})$$
$$EY = 1 + p(1 - p) + (1 - p)^2 c$$
If (and only if) $c > 1 + 1/p$, $EY < EX_1$!
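The single-restart scenario is small enough to check by enumerating outcomes. The sketch below lists the possible completion times with their probabilities, with and without a restart, and computes mean and variance exactly; the helper names are hypothetical. It can be used to confirm the threshold c > 1 + 1/p and to explore the variance question raised earlier.

```python
def moments(outcomes):
    """Mean and variance of a discrete distribution given (value, prob) pairs."""
    mean = sum(v * q for v, q in outcomes)
    var = sum(q * (v - mean) ** 2 for v, q in outcomes)
    return mean, var


def no_restart(p, c):
    """X = 1 with probability p, X = c with probability 1 - p."""
    return [(1.0, p), (c, 1.0 - p)]


def single_restart(p, c):
    """Y = 1 if X1 = 1; otherwise Y = 1 + X2, with X2 distributed like X1."""
    return [(1.0, p), (2.0, (1.0 - p) * p), (1.0 + c, (1.0 - p) ** 2)]
```

For example, with p = 0.5 the threshold is c = 3: comparing `moments(no_restart(0.5, c))` with `moments(single_restart(0.5, c))` for c below, at, and above 3 shows the expectation getting worse, staying equal, and improving respectively.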
A Simple Scenario: k Restarts
[Figure: plot with horizontal axis "Number of Restarts k"]
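One way to explore the k-restart question numerically is the recursion below. It rests on assumptions the slide does not state: every restart happens just after time 1 of the current attempt, attempts are independent and identically distributed, and the final attempt is allowed to run to completion. The function name is hypothetical.

```python
def expected_time(p, c, k):
    """Expected completion time when up to k restarts are allowed."""
    expected = p + (1.0 - p) * c  # k = 0: just wait for the first query
    for _ in range(k):
        # Finish by time 1 with probability p; otherwise pay 1 time unit
        # and start over with one fewer restart available.
        expected = p + (1.0 - p) * (1.0 + expected)
    return expected
```

Under these assumptions the recursion has the fixed point 1/p, so the expectation approaches 1/p as k grows.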
Off-Line Delivery Agents
Information filtering agents that deliver personalized information without the need for a direct Internet connection
Off-line Delivery Agents Attributes
Element | Description
Environment | Internet, news feeds
Task skills | Information
Knowledge | Web, news, finance, sports, weather
Communication skills | HTTP, Meta tags, Desktop OS
Benefits of Off-line Delivery Agents
Feature | Advantage | Benefit
Direct delivery | Transparent delivery | User does not need to visit sites
Automatic delivery | Delivery according to user-specified schedule | Avoidance of peak traffic hours
Local viewing | HTML links are locally resolved | Avoids the need to get on-line
Disk management | New information replaces out-of-date information | Relieves user from disk management task
Notification Agents
A notification agent is one that notifies a user of significant events, i.e. a change in the state of information, e.g.
• Content change in a particular Web page
• Search engine additions for specific keyword queries
• User-specified reminders for personal events (e.g. birthdays)
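Detecting a content change in a watched page can be sketched by remembering a digest of the last content seen. This is only an illustration: fetching the page is left out, and the `ChangeMonitor` class and the use of SHA-256 are assumptions, not something the slides specify.

```python
import hashlib


class ChangeMonitor:
    """Remember a digest per watched resource and report when it changes."""

    def __init__(self):
        self.digests = {}  # resource name -> digest of the last seen content

    def check(self, name, content):
        """Return True if `content` (bytes) differs from the last content seen."""
        digest = hashlib.sha256(content).hexdigest()
        previous = self.digests.get(name)
        self.digests[name] = digest
        # The first sighting of a resource is not reported as a change.
        return previous is not None and previous != digest
```

Storing only digests keeps the monitor's memory use independent of page size; a notification would be sent whenever `check` returns True.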
Notification Agent Attributes
Element | Description
Environment | Internet
Task Skills | Monitoring, determining, and notifying change in information
Knowledge | Web
Communication Skills | HTTP, Meta Tag, IDML
Benefits of Notification Agents
Feature | Advantage | Benefit
Monitoring | Monitors for change in information | Reduces user workload
Browserless monitoring | Monitor only header file or body text | Increased network efficiency
Change determination | Machine check of document change | Reduced user workload
Server implementation | Checks each resource for multiple clients | Eliminates bandwidth waste
Notification | Notifies user of changes | Increases site visits
Other Service Agents
• Announcement agents
• Business information monitoring agents
• Classified ads agents: search database of ads
• Direct mail agents: deliver direct mail advertising
• Financial service agents: deliver e-mails with prices or other financial news
• Food and wine agents
• Job agents: virtual recruiters to find appropriate employees
• Entertainment agents: find communities of interests similar to the user and recommend items, such as music, movies etc.
• Shopping agents: comparison shopping for user-specified items
• Site agents: virtual hosts at sites
Shopbots
Advantages:
• Provide a unified interface to different stores, thus mitigating the need to navigate and deal with different interfaces
• Find the best price and availability of a product
Challenges:
• Virtual stores stop agents, since they do not want to be compared on price and availability alone
• User’s trust in a shopbot’s ability to notice sales and promotions
Solutions:
• Cooperative vendor/agent model
• Vendor form learning agent
Collaborative Filtering
A collaborative filtering system makes recommendations based on the preferences of similar users.
People: Yenta, Referral Web
Products: Firefly, Tunes, Syskill & Webert
Readings: Wisewire, Phoaks
Content vs. Collaboration
• Content-based retrieval returns documents that are similar to a query (search) or a user profile (preference)
• Collaborative recommendation retrieves documents liked by others with similar profiles
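Collaborative recommendation can be sketched with a simple user-based scheme. The sketch assumes explicit numeric ratings stored as a dictionary per user, cosine similarity between users, and a similarity-weighted average as the predicted rating. These are common textbook choices and are not claimed to be what any of the systems named above use; the function names are hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    dot = sum(r * b[item] for item, r in a.items() if item in b)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], other_ratings)
        if sim <= 0.0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]
```

A content-based system would instead compare item descriptions with the user's own profile; here no item content is examined at all, only other users' ratings.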
Early Apps
• GroupLens (1994): filtered newsgroups; the news client displays predicted scores, and the user rates articles after reading them
• Phoaks: recommended Web pages; uses frequency-of-mention data within Usenet newsgroups to rate URLs
Getting the Data
Explicit: Firefly (rate, match, recommend)
Implicit: Amazon (purchase, match, recommend)
Priming the Pump: Lifestyle Finder uses demographic data to assign users to market research categories
Over the Shoulder: Letizia uses observed browsing behavior & heuristics to recommend links
Problems in Collaborative Filtering
Incentives & Startup
• Need a critical mass of users/recommenders to make meaningful predictions
• Need mechanisms to maintain participation
Reliability
• Spoofing: will content providers inflate their ratings?
• Technical problems with clustering & similarity measures
Privacy
• Once you share your profile, who else may want it?
Synthetic Agents (e.g. Julia)
Julia is a chatterbot that tries to convince users of its humanlike behavior:
• Repeating user’s input in questions
• Admitting ignorance
• Changing the topic of conversation
• Using conversational statements
• Using humorous statements
• Providing excerpts from Usenet News
• Simulating typing, mimicking a user’s imperfect performance
Possible applications of chatterbots:
• Visiting on-line chatrooms on topics of interest to your company
• Initiating interesting conversations in chatrooms
• Presenting comparison ads against your rivals
• Querying information requests about your products
• Serving as a site guide for finding information
• Serving as a product guide on your site (e.g. demonstrate an automobile)
Intranets
Business applications of intranets:
• Effective communication medium for enterprises
• Create virtual communities within an enterprise
• Automating order tracking and transaction processing
• Marketing support automation
• Customer service and knowledge sharing among customers
• Internal help desk to provide guidance for corporate processes and resources
• Human resources support
Intranet Search Agent Model Attributes
Table 4.1
Element | Description
Environment | Intranet
Task Skills | Indexing document databases, searching, and retrieval
Knowledge | Corporate databases and document formats
Communication Skills | HTTP, SQL, CGI, WAIS
Benefits of Intranet Search Agents
Feature | Advantage | Benefit
Multidatabase search | Client search of all corporate databases | Increased organizational productivity, reduced costs
Search save on servers | Enables sharing of search results within organization | Reduced workload
Multiple-level access control | Allows access of certain fields to authorized users | Corporate security
Proactive notification | Notifies users of change in information | Increased productivity, enhanced corporate communications
Intranet Filtering Agent Attributes
Element | Description
Environment | Intranet
Task Skills | Information organizing, sharing and presentation
Knowledge | Corporate database, workgroup discussions, newsfeeds
Communication | HTTP, HTML, OLAP
Benefits of Intranet Filtering Agents
Feature | Advantage | Benefit
Information profile | Form-based specification of individual workgroup interests | Ideal for persistent but cumbersome for dynamic interests
Notification | Proactive information delivery | Increased site visits and increased productivity by alleviating information search
Profile-based filtering | Relevant information for critical decisions | Increased organizational productivity
Heterogeneous information sources | Combines heterogeneous information sources | Increased productivity and reduced subscription costs through sharing
Drawbacks and extended features
Drawbacks include:
· Separate notification for each user interest, cluttering the mailbox
· Do not incorporate a user model for tracking the user’s actions upon information delivery
Advanced Features
· Recommend an agent for each new user interest topic
· Modify an existing agent, based on the user’s use of agent-recommended information (e.g. specialize an information agent)
· Remove an agent that the user does not use
· Temporally activate an agent based on user interest and disinterest in the agent’s recommendation
Collaboration Agents
The software runs over a network and enables a team to work together and share information. It assists groups in:
· Group scheduling
· Discussion groups
· Resource tracking
· Document management
It could do some simple tasks:
· Save and re-execute shareable queries that search groupware databases
· Perform a script under pre-specified conditions
· Perform a script according to a pre-specified schedule
Example: Lotus Notes
Agent definition
· Agent name with optional comment
· When the agent should run:
– manually
– if new mail has arrived
– if documents have been created, modified, deleted
– at scheduled times, e.g. hourly, daily etc.
· What documents should the agent act on?
– all documents
– all new and modified documents since the last time the agent ran
– all unread documents
– selected documents
· What should the agent do?
– User can enter a LotusScript program that can examine named fields and apply simple conditional logic.
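The parts of an agent definition listed for Lotus Notes can be pictured as a small data structure. The sketch below is written in Python purely for illustration and does not use or imitate the Lotus Notes API: the class and enum names are hypothetical, and a Python callable stands in for the user's LotusScript program.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Trigger(Enum):
    MANUAL = "manually"
    NEW_MAIL = "new mail has arrived"
    DOCUMENT_CHANGED = "documents created, modified, or deleted"
    SCHEDULED = "at scheduled times"


class Selection(Enum):
    ALL = "all documents"
    NEW_AND_MODIFIED = "new and modified since the last run"
    UNREAD = "unread documents"
    SELECTED = "selected documents"


@dataclass
class AgentDefinition:
    name: str
    trigger: Trigger                  # when the agent should run
    selection: Selection              # which documents it acts on
    action: Callable[[dict], None]    # stands in for the user's script
    comment: str = ""

    def run(self, documents):
        """Apply the action to each document (a dict of named fields)."""
        for doc in documents:
            self.action(doc)
```

An action that examines a named field and applies a simple condition, for instance flagging documents whose amount field exceeds a limit, corresponds to the "simple conditional logic" mentioned in the last bullet.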
Process Automation Agents
The goal is to use agents to automate workflow in business applications.
Differences between traditional workflow and agent-based workflow:
· Traditional workflow is centralized; agents offer a distributed infrastructure
· Traditional workflow works only in structured environments; agents could manage workflow during execution
· Traditional workflow pre-specifies paths to take for exception handling; agents can negotiate new tasks and resources dynamically
Attributes of Process Automation Agents
Element | Description
Environment | Intranet
Task Skills | Process scheduling, negotiation, execution, and notification
Knowledge | Business processes, resources management
Communication skills | KQML, KIF, CORBA
Advantages of Process Agents
Feature | Advantage | Benefit
Task scheduling | Schedule user tasks, negotiating with server agents | Alleviates the need for the user to be present to execute a task
Resource management | Dynamically allocate resources for task execution | Reduced workload as the user no longer needs to worry about resource availability
Exception handling | Renegotiate to reschedule in response to execution errors | Reduced workload as this is transparent to user
Proactive notifications | Proactively notify user of task completion | Increased productivity by reducing user need to monitor
Database Agents
Agents that provide enterprise-based support:
· Run scheduled database analyses in the background
· Exception reporting for operations management
· Notify of information changes in a user-specified database object
Database Agents: Enterprise data delivery system
[Figure: enterprise data delivery system. Labels: Desktop, DSS Agent, Server, OLAP Server, VLDB Drivers, Oracle, Informix, SQL Server, ...]
Database Agents Attributes
Element | Description
Environment | Intranet
Task Skills | Data analysis automation, exception reporting, notification of information change
Knowledge | Data warehouse, metadata, RDBMS
Communication Skills | SQL, ODBC, OLE
Database Agent Benefits
Feature | Advantage | Benefit
Automatic data analysis | Automates users’ repetitive data analysis | Reduced workload
Exception reporting | Reports user-defined exceptions in business operations | Faster decision making
Notification alerts | Notifies user of changes in information | Increased productivity
Desired Features of Database Agents
· Exception reporting alerts
· Time or event triggered report execution
· Workflow actions triggered by reports
· Incorporation of learning capability into the database agents
· Incorporation of learning into the OLAP server