Information Search and Retrieval


  • 8/7/2019 Information Search and Retrieval

    1/23

    Information Search and Retrieval


    There are three different techniques in search and resource discovery paradigms:

    Information searching & retrieval

    Electronic directories & catalogs

    Information filtering


    Information search & retrieval

    Information search and retrieval is the process of finding and extracting information according to the specification provided by a user.

    The main purpose of developing this process is to support naive users in areas like electronic shopping and home banking.


    The goals of information search and retrieval:

    To satisfy customers to the maximum extent.

    To reduce cost.

    To execute the requested query quickly. The computer methods used to execute a query are:

    A method for finding exact matches based on keywords.

    A method for finding nearest neighbors.

    Information search and retrieval is used in areas like libraries, where customers concentrate on information-seeking behavior.
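The two query-execution methods named in the goals above can be contrasted with a minimal sketch. The catalog entries and function names are hypothetical, and the `difflib` similarity ratio is only a stand-in for whatever distance measure a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical catalog entries, invented for illustration.
CATALOG = [
    "home banking guide",
    "electronic shopping cart",
    "library catalog search",
    "online banking security",
]


def exact_match(keyword, items):
    """Return the items that contain the keyword as a whole word."""
    keyword = keyword.lower()
    return [item for item in items if keyword in item.lower().split()]


def nearest_neighbors(query, items, k=2):
    """Return the k items most similar to the query string.

    Similarity is the difflib ratio, an assumed stand-in for a real
    distance measure.
    """
    ranked = sorted(
        items,
        key=lambda item: SequenceMatcher(None, query.lower(), item.lower()).ratio(),
        reverse=True,
    )
    return ranked[:k]
```

An exact match returns nothing for a misspelled keyword such as "bankng", whereas a nearest-neighbor lookup still returns the closest entries.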


    Electronic Directories and Catalogs

    Directories and catalogs are used for the following tasks:

    1) Information organizing

    2) Information browsing


    Information Organizing:

    Organizing refers to the way information is arranged so that decisions can be made about how its parts interrelate.

    Organizing information in a static way is useful for some people but a hindrance to others.

    How well an organization of information is understood is highly subjective: what one person finds easy to browse may be difficult for others, depending on their requirements.


    Information Browsing:

    Browsing is defined as a human-guided activity for analyzing the enterprise and identifying the details of the resource space.

    Two major problems occur while browsing: navigation issues and user disorientation.

    These problems can be solved by using a system that supports different representations of similar information.


    Information Filtering:

    The objective of information filtering is to provide access to relevant and valuable information when a user requests it.

    Information filtering is the process of selecting only the information that matches the user's request.

    This process does not perform any kind of search; its only objective is to filter out inconsistent data.
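As a minimal sketch of filtering, the function below keeps only the items that match a user's request. The matching rule (every requested term must appear as a word in the item) and the sample items are assumptions made for illustration.

```python
def filter_items(items, request_terms):
    """Keep only the items that mention every term in the user's request."""
    wanted = {term.lower() for term in request_terms}
    selected = []
    for item in items:
        words = set(item.lower().split())
        if wanted <= words:  # every requested term appears in the item
            selected.append(item)
    return selected


# Hypothetical incoming items, invented for illustration.
incoming = [
    "bank rates rise again",
    "new shopping mall opens",
    "bank opens new branch",
]
matching = filter_items(incoming, ["bank"])
```

The filter does not look anything up; it only passes or drops items that are handed to it.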


    Software filters are used to provide access control. There are two kinds:

    Local filter

    Remote filter

    Local filters:

    Local filters process the incoming stream of data.

    Remote filters:

    Remote filters are software agents that perform tasks on behalf of users. They help users perform daily tasks, search for and retrieve information, and support decision-making.


    INFORMATION SEARCH AND RETRIEVAL

    Searching is the process of finding the required information in a massive amount of stored semi-structured information. This contrasts with database applications, which deal with structured data that follows certain standards and syntaxes and uses data types with specific meanings.

    Examples: a student database (structured) and email messages (semi-structured).


    The search process is accomplished in two phases:

    End user retrieval phase

    Publisher indexing phase.


    End user Retrieval phase:

    The end user retrieval phase consists of the following steps:

    The user constructs a query, which specifies the search method to be used.

    The query is sent to the server, which examines and processes it and initiates the search. The result is a table listing the matching documents, called the hit list. The generated hit list is then passed back to the user.

    The user selects the pertinent documents according to their requirements, scans them, and prints only the desired parts.


    Publisher Indexing phase:

    This phase is responsible for:

    Making an entry for each document in the database.

    Creating and updating the indexes and pointers that are used when a search is performed.
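The two phases can be sketched together as a toy server. The class name, the word-level inverted index, and the rule that a hit must contain every query word are assumptions for illustration, not details given in the slides.

```python
from collections import defaultdict


class SearchServer:
    """Toy server: the publisher indexes documents, the end user queries them."""

    def __init__(self):
        self.documents = {}            # doc_id -> text (the "database")
        self.index = defaultdict(set)  # word -> ids of documents containing it

    def add_document(self, doc_id, text):
        """Publisher indexing phase: store the document and update the index."""
        self.documents[doc_id] = text
        for word in text.lower().split():
            self.index[word].add(doc_id)

    def search(self, query):
        """End user retrieval phase: return the hit list for a query.

        A document is a hit when it contains every word of the query.
        """
        words = query.lower().split()
        if not words:
            return []
        hits = set(self.index.get(words[0], set()))
        for word in words[1:]:
            hits &= self.index.get(word, set())
        return sorted(hits)


server = SearchServer()
server.add_document(1, "home banking for new users")
server.add_document(2, "electronic shopping and home delivery")
hit_list = server.search("home")
```

Because the index is built when a document is published, a query only has to intersect a few sets instead of scanning every document.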


    What is a Document?

    Examples: web pages, email, books, news stories, scholarly papers, text messages, Word, PowerPoint, PDF, forum postings, patents, IM sessions, etc.

    Common properties:

    Significant text content

    Some structure (e.g., title, author, date for papers; subject, sender, destination for email)


    Documents vs. Database Records

    Database records (or tuples in relational databases) are typically made up of well-defined fields (or attributes), e.g., bank records with account numbers, balances, names, addresses, social security numbers, dates of birth, etc.

    It is easy to compare fields with well-defined semantics to queries in order to find matches.

    Text is more difficult.
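A small sketch may make the contrast concrete. The record, the document text, and both helper functions are hypothetical; the point is only that a field comparison is unambiguous while a text comparison is not.

```python
# Hypothetical bank record and document, invented for illustration.
record = {"account_number": "1001", "name": "A. Smith", "balance": 250.0}
document = "Bank director Smith steals funds"


def record_matches(rec, field, value):
    """Field comparison: the query names a field, so the match is unambiguous."""
    return rec.get(field) == value


def text_mentions(text, word):
    """Text comparison: a substring test cannot tell what the word means."""
    return word.lower() in text.lower()
```

For example, `text_mentions("The blacksmith shop", "smith")` is true even though the text has nothing to do with a person named Smith, which is the kind of ambiguity that well-defined fields avoid.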


    Dimensions of IR

    Content         Applications        Tasks
    Text            Web search          Ad hoc search
    Images          Vertical search     Filtering
    Video           Enterprise search   Classification
    Scanned docs    Desktop search      Question answering
    Audio           Forum search
    Music           P2P search
                    Literature search


    Information Retrieval Tasks

    Ad hoc search: Find relevant documents for an arbitrary text query

    Filtering: Identify relevant user profiles for a new document

    Classification: Identify relevant labels for documents

    Question answering: Give a specific answer to a question


    Models of Information Retrieval:

    Three models are used to retrieve information from the database efficiently:

    Boolean information retrieval model

    Vector space information retrieval model

    Probabilistic information retrieval model


    Boolean information retrieval model:

    In the Boolean model, a query is specified using words or phrases combined with the standard operators AND, OR, and NOT.

    This model fetches the matching text files irrespective of their locations.

    The drawback of this model is that it does not give any preference or priority to the fetched documents.

    The system is effective when a query exactly matches the retrieved documents; it is ineffective when the result is not definite and accurate.
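The Boolean operators map directly onto set operations, as the following minimal sketch shows. The documents and the `docs_with` helper are hypothetical.

```python
# Hypothetical documents, invented for illustration.
DOCS = {
    1: "home banking security",
    2: "electronic shopping security",
    3: "home shopping catalog",
}


def docs_with(term):
    """Return the set of document ids whose text contains the term."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}


ALL_IDS = set(DOCS)

# AND is set intersection, OR is set union, NOT is set difference.
home_and_shopping = docs_with("home") & docs_with("shopping")
home_or_banking = docs_with("home") | docs_with("banking")
security_not_banking = docs_with("security") & (ALL_IDS - docs_with("banking"))
```

Each result is an unordered set: a document either satisfies the query or it does not, which is why this model gives no priority among the fetched documents.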


    Vector space information retrieval model

    The vector space model was developed to overcome the problems of the Boolean model.

    It compares vectors using the cosine correlation similarity measure.

    According to this method, a text matches the query when the text vector is similar to the query vector.
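A minimal sketch of the cosine comparison follows. It assumes raw term counts as vector components, which the slides do not specify; real systems usually apply some term weighting.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a text as a term-frequency vector (word -> count)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, texts):
    """Return the texts sorted from most to least similar to the query."""
    q = to_vector(query)
    return sorted(texts, key=lambda t: cosine_similarity(q, to_vector(t)), reverse=True)
```

Unlike a Boolean query, this produces a graded score, so texts that match the query only partially are still returned, ordered by how similar they are.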


    Probabilistic information retrieval model:

    This model is based on the probability ranking criterion.

    According to this criterion, every text present in the database is given some priority.

    Both the vector space model and the probabilistic model use Boolean queries.
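The slides do not give a scoring formula, so the sketch below is only one possible illustration of giving every text a priority. It assumes a smoothed inverse-document-frequency weight of the kind used in probabilistic scoring functions such as BM25; the documents and function names are hypothetical.

```python
import math

# Hypothetical documents, invented for illustration.
DOCS = {
    1: "home banking security",
    2: "electronic shopping security",
    3: "home shopping catalog",
}


def term_weight(term):
    """Smoothed weight that is larger for terms found in fewer documents."""
    total = len(DOCS)
    containing = sum(1 for text in DOCS.values() if term in text.split())
    return math.log(1 + (total - containing + 0.5) / (containing + 0.5))


def rank(query):
    """Return (doc_id, score) pairs for every document, best first."""
    terms = query.split()
    scored = []
    for doc_id, text in DOCS.items():
        words = set(text.split())
        score = sum(term_weight(t) for t in terms if t in words)
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Every document receives a score, including those that match no query term, so the whole collection ends up ordered by estimated relevance.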


    CHALLENGES AND PROBLEMS ENCOUNTERED IN INFORMATION SEARCH

    The following challenges arise while searching for information online.

    Information is being uploaded to the internet at a high rate.

    Because the turnover of information is rapid, traditional information search tools are not sufficient for consumers. The challenge, therefore, is to design and implement advanced searching, filtering, and data mining tools that optimize an individual's search process in terms of time, cost, and information needs.


    Search results are sometimes overloaded, often confusing consumers.

    Consumers learn the environment slowly by getting to know what is where. Additionally, directories and catalogs may be provided to help consumers navigate and browse the product information of their choice.

    Today, the focus is on human-technology interfaces: information about a customer's preferences is collected, and intelligent, useful information is provided to the consumer. The challenge here is how to represent this useful information on the screen. Developments are under way to use virtual reality for displaying such information to the user.