Indexing and Fast Search

  • 8/8/2019 Indexing and Fast Search

    Indexing and Fast Search

    NBITSearch engine parameters

    www.nbitsearch.com
    Novosib-BIT LLC

    The System Is Designed For

    Compact indexing of huge arrays of data on a hard disk.

    High-speed exact and fuzzy search for objects with minimum use of RAM.

    Exact and Fuzzy Search

    Interval queries provide fuzzy (inexact) search.

    Precise (exact) search is a particular case of fuzzy search.
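The relationship between the two kinds of search can be sketched in a few lines. This is only an illustration, not the NBITSearch implementation: the function names, the in-memory list, and the sample values are assumptions. An interval query keeps every object whose parameter lies inside inclusive bounds, and an exact query is the same call with both bounds equal.

```python
from typing import Dict, List, Tuple

# Hypothetical point objects: every parameter has one exact value.
OBJECTS: List[Dict[str, float]] = [
    {"volume": 54, "weight": 175, "speed": 500},
    {"volume": 100, "weight": 182, "speed": 700},
]

Query = Dict[str, Tuple[float, float]]  # parameter -> (low, high), inclusive


def interval_search(objects: List[Dict[str, float]], query: Query) -> List[Dict[str, float]]:
    """Return objects whose queried parameters all fall inside [low, high]."""
    return [
        obj for obj in objects
        if all(low <= obj[name] <= high for name, (low, high) in query.items())
    ]


def exact_search(objects: List[Dict[str, float]], values: Dict[str, float]) -> List[Dict[str, float]]:
    """Exact search is the interval query whose two bounds coincide."""
    return interval_search(objects, {name: (v, v) for name, v in values.items()})


fuzzy_hits = interval_search(OBJECTS, {"weight": (170, 185)})  # both objects
exact_hits = exact_search(OBJECTS, {"weight": 182})            # second object only
```

A linear scan is used here only to keep the semantics visible; it says nothing about how the engine locates the matches.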

    Indexable Objects

    Objects of any type with precise (exact, point) parameters:

    Volume  Weight  Speed
    54      175     500
    100     182     700

    Indexable Objects

    Objects of any type with fuzzy (inexact, interval) parameters:

    Volume    Weight    Speed
    54-59     175-180   500-600
    100-300   182-200   700-800
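One way to picture objects with interval parameters is to read each pair of numbers in the table as a closed interval. The sketch below does that and matches a query by interval overlap. The overlap rule is an assumption made for illustration; the slides do not define what a match means for fuzzy objects, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (low, high), inclusive

# The two fuzzy objects from the table, one interval per parameter.
FUZZY_OBJECTS: List[Dict[str, Interval]] = [
    {"volume": (54, 59), "weight": (175, 180), "speed": (500, 600)},
    {"volume": (100, 300), "weight": (182, 200), "speed": (700, 800)},
]


def overlaps(a: Interval, b: Interval) -> bool:
    """Two closed intervals overlap when neither lies wholly beyond the other."""
    return a[0] <= b[1] and b[0] <= a[1]


def fuzzy_match(objects: List[Dict[str, Interval]],
                query: Dict[str, Interval]) -> List[Dict[str, Interval]]:
    """Return objects whose intervals overlap the query on every queried parameter."""
    return [
        obj for obj in objects
        if all(overlaps(obj[name], rng) for name, rng in query.items())
    ]


# Only the first object overlaps both query intervals.
hits = fuzzy_match(FUZZY_OBJECTS, {"weight": (178, 181), "speed": (550, 650)})
```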

    Indexable Objects

    Example: see www.nbitsearch.com

    Indexing of Objects

    Step 1:

    First, the user maps the source objects to so-called primitives: precise or fuzzy parameters, piecewise functions, or matrices.

    Indexing of Objects

    Step 2:

    The system automatically transforms the primitives into numeric masks. These masks are spatial hashes of the objects.

    Then the system automatically indexes the masks.
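The slides do not say how a mask is computed. As an illustration of what a spatial hash can look like, the sketch below uses a Z-order (Morton) code, a well-known technique swapped in here purely as an example. Each parameter is quantized to a grid cell, and the bits of the cell numbers are interleaved into one integer. The parameter ranges and function names are assumptions.

```python
from typing import Sequence


def quantize(value: float, low: float, high: float, bits: int) -> int:
    """Map a value in [low, high] to an integer cell index in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    clamped = min(max(value, low), high)  # out-of-range values go to the edge cell
    return round((clamped - low) / (high - low) * cells)


def morton_mask(cells: Sequence[int], bits: int) -> int:
    """Interleave the bits of each cell index into one integer (Z-order code)."""
    mask = 0
    dims = len(cells)
    for bit in range(bits):
        for dim, cell in enumerate(cells):
            mask |= ((cell >> bit) & 1) << (bit * dims + dim)
    return mask


# Hypothetical ranges for volume, weight, and speed; 8 bits per parameter.
cells = [quantize(54, 0, 1000, 8), quantize(175, 0, 1000, 8), quantize(500, 0, 1000, 8)]
mask = morton_mask(cells, 8)
```

Objects that are close in every parameter tend to receive numerically close masks, which is why a sorted list of such integers can serve as a one-dimensional index over several parameters.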

    Sizes of Indexable Arrays

    The effect is most tangible for arrays of primitives that hold 50 to 100 million or more objects per index.

    An array of indexable objects can be 10 to 100 terabytes or larger.

    Indexing Limitations

    One index supports up to 2 billion objects of its own.

    Limits on the number of indexes are artificial.

    What Is a Billion?

    1 billion seconds is about 32 years.

    1 billion pages for a laser printer make a pile 100 km high.
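Both comparisons can be checked with a few lines of arithmetic. The sheet thickness of 0.1 mm is an assumption (a typical figure for office paper), not a number taken from the slides.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, in seconds
SHEET_THICKNESS_MM = 0.1                  # assumed thickness of one sheet

billion = 10**9
years = billion / SECONDS_PER_YEAR                   # a little under 32
pile_km = billion * SHEET_THICKNESS_MM / 1_000_000   # millimetres to kilometres

print(f"{years:.1f} years, {pile_km:.0f} km")
```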

    Indexing Speed

    Estimate:

    $T \sim N \log N$

    where T is the time to form one index and N is the number of indexable objects.
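To see what this estimate implies in practice, the sketch below compares the predicted build time for two index sizes. It only evaluates the stated growth law; the function name is hypothetical, and constant factors such as disk speed are deliberately left out.

```python
import math


def relative_index_time(n_new: int, n_old: int) -> float:
    """Ratio of estimated build times under T ~ N * log(N)."""
    return (n_new * math.log(n_new)) / (n_old * math.log(n_old))


# Doubling an index from 50 million to 100 million objects.
ratio = relative_index_time(100_000_000, 50_000_000)
print(f"{ratio:.2f}x")
```

Under this law, doubling the number of objects costs only slightly more than twice the time (roughly 2.08 times in this case), so build time stays close to linear even for very large arrays.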

    Search Speed

    Estimated time to determine the address of the first potential block of data:

    $T \sim \log N$

    where T is the time of logical probing and N is the number of indexed objects.
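Logarithmic probing is the behaviour of a binary search over sorted keys. The sketch below is an analogy built on Python's standard `bisect` module, not the engine's actual lookup; the function names are assumptions. It finds the first candidate position and gives the worst-case number of comparisons for a given index size.

```python
from bisect import bisect_left
from typing import Sequence


def first_candidate(sorted_masks: Sequence[int], low_mask: int) -> int:
    """Position of the first mask >= low_mask, found by binary search."""
    return bisect_left(sorted_masks, low_mask)


def max_probes(n: int) -> int:
    """Worst-case comparisons of a binary search over n sorted keys."""
    return n.bit_length()  # equals floor(log2(n)) + 1 for n >= 1


position = first_candidate([10, 20, 30, 40], 25)  # index 2, the key 30
probes = max_probes(2_000_000_000)                # 31 comparisons at most
```

Even at the 2 billion object limit, such a search needs only a few dozen comparisons.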

    Search Speed

    For large data arrays, fetching the result of an interval query from a hard disk can be 10 to 100 times faster than a similar operation in a standard relational DBMS.

    Search Speed

    For large data arrays, fetching the result of an interval query from a hard disk can be 1000 or more times faster than solving the same problem by brute force.

    Search Speed

    The time to fetch the result of an interval query from a hard disk depends linearly on the number of objects in the result set.
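A sorted in-memory list shows why fetch time can be linear in the size of the result rather than in the size of the whole array. This is a minimal sketch under that assumption, with a hypothetical function name; the engine's on-disk layout is not described in the slides.

```python
from bisect import bisect_left, bisect_right
from typing import List


def range_query(sorted_keys: List[int], low: int, high: int) -> List[int]:
    """Return all keys in [low, high] from a sorted list.

    Two binary searches locate the ends of the run, and copying the
    slice costs time proportional to the number of hits.
    """
    start = bisect_left(sorted_keys, low)
    stop = bisect_right(sorted_keys, high)
    return sorted_keys[start:stop]


hits = range_query([1, 3, 5, 7, 9, 11], 4, 9)  # the keys 5, 7, 9
```

Keys outside the interval are never touched, so the cost is dominated by the number of results once the two ends have been located.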

    Search Memory

    The size of the memory buffers used to fetch data depends on the user's needs.

    This size is often very small (about 10 megabytes).

    Reading of the Result Set

    Reading the result set from a hard disk into RAM is optimal: the magnetic head does not oscillate.
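If the result set occupies one contiguous byte range on disk (an assumption; the slides do not describe the file layout), it can be read with a single seek followed by sequential reads through a small fixed buffer. The sketch below shows that access pattern; the function name and the default buffer size of about 10 MB are illustrative.

```python
from typing import Iterator

BUFFER_SIZE = 10 * 1024 * 1024  # about 10 MB, an illustrative default


def read_contiguous(path: str, offset: int, length: int,
                    buffer_size: int = BUFFER_SIZE) -> Iterator[bytes]:
    """Yield a contiguous byte range in fixed-size chunks after a single seek."""
    with open(path, "rb") as f:
        f.seek(offset)                  # one positioning step
        remaining = length
        while remaining > 0:
            chunk = f.read(min(buffer_size, remaining))
            if not chunk:               # reached end of file early
                break
            remaining -= len(chunk)
            yield chunk
```

Memory use is bounded by `buffer_size` regardless of how large the result set is, and the reads proceed strictly forward through the file.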

    Multidimensionality of Indexes

    Indexes are multidimensional, but there is no data explosion effect.

    Efficient indexes of objects can be formed over 1 to 32 parameters.

    Multifunctionality

    Indexes are multifunctional: indexing and searching in tables can be arranged over multiple virtual columns whose values are arbitrary functions of the values of actual columns.
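A virtual column can be pictured as a value computed from the real columns at indexing time and then searched like any other key. The sketch below is a minimal in-memory illustration with hypothetical names; the `density` function and the sample rows are assumptions chosen for the example.

```python
from bisect import bisect_left, bisect_right
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]


def build_virtual_index(rows: List[Row],
                        fn: Callable[[Row], float]) -> List[Tuple[float, int]]:
    """Sort (virtual value, row number) pairs; fn derives the value from real columns."""
    return sorted((fn(row), i) for i, row in enumerate(rows))


def query_virtual(index: List[Tuple[float, int]], low: float, high: float) -> List[int]:
    """Row numbers whose virtual value lies in [low, high]."""
    start = bisect_left(index, (low, -1))
    stop = bisect_right(index, (high, float("inf")))
    return [row_id for _, row_id in index[start:stop]]


rows = [{"volume": 54, "weight": 175}, {"volume": 100, "weight": 182}]


def density(row: Row) -> float:
    """Virtual column: weight divided by volume."""
    return row["weight"] / row["volume"]


index = build_virtual_index(rows, density)
dense_rows = query_virtual(index, 3.0, 4.0)  # row 0, density about 3.24
```

The table itself never stores the density; only the index does, which is what makes the column virtual.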

    THANK YOU!

    www.nbitsearch.com

    Technology developed with support from FASIE (www.fasie.ru), formed by the Government of the Russian Federation.

    Novosib-BIT LLC, 2004-2010. Patented.