CQL


  • 8/9/2019 CQL


    MSc project:

    Stream Query Processing with CQL

    Tore Risch

    2010-03-05

    A Data Stream Management System (DSMS) is similar to a database management system
    (DBMS), with the difference that a DBMS allows searching only stored data, while a
    DSMS in addition provides query facilities to search directly in data streaming from
    some source. DSMS queries differ from conventional database queries in, e.g., SQL,
    where a query requests data from tables stored in the database. The result of a
    DSMS query can be not only a set of tuples as in SQL, but also a potentially infinite
    stream of tuples. Furthermore, stream queries are continuous queries (CQs) in that they
    run all the time until they are terminated, while conventional queries are executed on
    demand and run until all requested data is delivered.
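    The difference between an on-demand query and a continuous query can be sketched in
    Python. This is only an illustration of the two execution models, not SCSQ or CQL
    code; the tuple layout, the function names, and the simulated sensor source are all
    assumptions made for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Reading = Tuple[str, float]  # (sensor id, value); hypothetical tuple layout


def on_demand_query(table: List[Reading], threshold: float) -> List[Reading]:
    """Conventional query: scans stored data once and returns a finite result."""
    return [row for row in table if row[1] > threshold]


def continuous_query(stream: Iterable[Reading], threshold: float) -> Iterator[Reading]:
    """Continuous query: emits matching tuples as they arrive, until stopped."""
    for row in stream:
        if row[1] > threshold:
            yield row


def sensor_stream() -> Iterator[Reading]:
    """Simulated unbounded source; a real DSMS would read from a live feed."""
    n = 0
    while True:
        yield ("s1", float(n % 5))
        n += 1


stored = [("s1", 1.0), ("s1", 4.0), ("s1", 2.0)]
finite_result = on_demand_query(stored, 3.0)

# The continuous query never finishes on its own; the consumer terminates it,
# here by taking only the first three results.
first_three = list(islice(continuous_query(sensor_stream(), 3.0), 3))
```

    The on-demand query returns once the stored table has been scanned, whereas the
    continuous query produces results for as long as the consumer keeps reading.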

    Several DSMS research prototypes have been developed, such as STREAM (Stanford),
    Aurora (MIT), Gigascope (Bell Labs), and WaveScope (MIT). StreamBase
    (http://www.streambase.com/) provides the first DSMS product. In UDBL we are
    developing the DSMS SCSQ (SuperComputer Stream Query processor), based on the
    main-memory DBMS Amos II (http://user.it.uu.se/~udbl/amos). Amos II is a functional
    DBMS where data and information are represented as typed functions. In SCSQ, database
    queries over streams are expressed in SCSQL, a query language similar to the OO parts
    of SQL:99 but extended with parallel stream query facilities. SCSQL is an extension of
    the functional query language AmosQL
    (http://user.it.uu.se/~udbl/amos/doc/amos_users_guide.html). Thus, in SCSQ queries and
    views are expressed as functions, i.e. SCSQ has a functional data model.

    Several query languages have been developed for DSMSs, including CQL (Stanford),
    StreamSQL (StreamBase), WaveScript (MIT), and SCSQL (UU). The present project aims at
    providing CQL support integrated with SCSQ.

    The performance of DSMSs is evaluated using the Linear Road Benchmark for DSMSs
    (http://www.cs.brandeis.edu/~linearroad/). Linear Road simulates an expressway system
    with dynamically varying toll rates, producing data streams to be processed by a DSMS. It
    is endorsed by several universities, including Brandeis, Brown, MIT, and Stanford. The
    traffic events in Linear Road (LR) are generated by a traffic simulator from MIT. The
    performance of a DSMS is measured by how many expressways it can handle
    simultaneously, called the L-rating. The Linear Road Benchmark (LRB) was originally
    developed by Stanford in terms of CQL. SCSQ is currently the fastest DSMS in the world
    according to LRB. This performance is achieved by optimized parallelization of CQs on
    a cluster.

    The project is divided into two phases:

    1. The first phase, called "evaluation of the stream query language CQL", is to
       investigate the main properties of CQL and to what extent they are
       implemented by the Stanford STREAM project. The result should be a description
       of how CQL differs from SQL, what the implementation status is, and what APIs
       are provided. A state-of-the-art overview of related query languages should be
       included. This phase counts as a 15 p Xjob on C level.

    2. The second phase, called "processing continuous CQL queries over a functional
       DSMS", is to design and implement a CQL processor in Amos II/SCSQ. It counts
       as a 30 p Xjob on E level.

       a. First it should be investigated what existing functionality in SCSQ and
          Amos II can be utilized to support CQL. This investigation is based on the
          result from the evaluation of the stream query language CQL.

       b. An extension of the current Amos II SQL parser to support CQL should be
          implemented. The parser is mainly a preprocessor that translates SQL to
          AmosQL, but the parser is also integrated with the Amos II kernel and
          may call the kernel during the parsing. The SQL parser is written in
          Lisp, so knowledge of Lisp is needed to extend it to support CQL.

       c. Cases where functionality needed to support CQL is missing in
          Amos II/SCSQ should be identified. If the missing functionality is minor,
          proper extensions to SCSQ should be made; more problematic extensions should
          be documented and possible solutions outlined. In some cases AmosQL
          may need to be extended using its foreign function facilities
          (http://user.it.uu.se/~torer/publ/external.pdf).

       d. For testing the CQL parser, it should be investigated whether there are
          test scripts available for verifying CQL. If that is the case, they
          should be used for verifying that the CQL parser over SCSQ behaves
          properly. Another possibility, in case test scripts for CQL cannot be
          obtained, is to use a subset of LRB as test script. Even if a full
          implementation of LRB may not be possible in the present project, it
          should be investigated whether parts are implementable with the new CQL
          implementation.

       e. The result of the second phase should be a fully functional system with an
          implemented demonstration script that illustrates the functionality. The
          implementation should be made on the Windows platform.
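    The preprocessor idea in step b can be illustrated with a small Python sketch. It is
    not the Amos II parser (which is Lisp-based) and does not produce AmosQL. It only
    shows the general technique of rewriting an extended query into a plain one while
    extracting the stream-specific parts. The `[Rows N]` window notation follows the CQL
    literature, but the exact grammar handled here, the `Window` class, and the
    `preprocess` function are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Window:
    """Hypothetical record of a tuple-based sliding window on a stream."""
    stream: str
    rows: int


# Matches "<name> [Rows <n>]"; other CQL window forms are deliberately ignored.
_ROW_WINDOW = re.compile(r"(\w+)\s*\[\s*Rows\s+(\d+)\s*\]", re.IGNORECASE)


def preprocess(cql: str) -> Tuple[str, List[Window]]:
    """Strip row-window clauses from the query text and return them separately."""
    windows: List[Window] = []

    def strip_window(match: "re.Match[str]") -> str:
        windows.append(Window(match.group(1), int(match.group(2))))
        return match.group(1)  # keep only the stream name in the query text

    return _ROW_WINDOW.sub(strip_window, cql), windows


plain_query, windows = preprocess(
    "SELECT AVG(speed) FROM Cars [Rows 3] WHERE lane = 1"
)
```

    After the call, `plain_query` holds the query without the window clause and
    `windows` lists the extracted window, which a later stage could map onto whatever
    windowing facilities the target system offers.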

    The result from this work should be a report describing both phases of the project. The
    implemented system should be satisfactorily documented, and the report should include an
    overview of related state-of-the-art implementations.