1
Object Oriented Databases
Ioan Despi
1. Advanced Database Applications
2. Object-Oriented Concepts
3. OODBMS
4. Common Issues
5. ODMG 2.0
2
1. Advanced Database Applications
RDBMS: widespread acceptance for traditional business applications: order processing, inventory control, banking, airline reservations
proven inadequate for new technologies: computer-aided design (CAD), computer-aided manufacturing (CAM), computer-aided software engineering (CASE), office information systems and multimedia systems, digital publishing, geographic information systems
3
Disadvantages of Relational DBMS:
Poor representation of “real world” entities.
Semantic overloading
Poor support for integrity and enterprise constraints.
Homogeneous data structure.
Limited operations.
Difficulty handling recursive queries
Impedance mismatch.
Other problems concerning: concurrency, schema access, navigational access, and so on
4
2. Object-Oriented Concepts
Abstraction: the process of identifying the essential aspects of an entity and ignoring the unimportant properties.
1. Encapsulation: an object contains both the data structure and the set of operations that can be used to manipulate it.
2. Information hiding: we separate the external aspects of an object from its internal details, which are hidden from the outside world.
The internal details of an object can be changed without affecting the applications that use it.
Loosely speaking, an object corresponds to an entity in the ER model.
The object-oriented paradigm is based on encapsulating data and code related to an object into a single unit.
5
The current state of an object is described by one or more attributes, or instance variables. The value of each variable is itself an object.
1. A simple attribute: can be a primitive type (integer, string, real, …)
2. A complex attribute: can contain collections and/or references
3. A reference attribute: represents a relationship between objects
it contains a value or collection of values, which are themselves objects (like a foreign key or a pointer)
Complex object: an object that contains one or more complex attributes
Notation: “dot” notation: branch.street, branch.manager, branch.city
Object = a uniquely identifiable entity that contains both the attributes that describe the state of the object and the actions that are associated with it, that is, its behaviour (Simula).
6
The behaviour of an object is given by:
a set of messages to which the object responds
each message may have 0, 1 or more parameters
a set of methods, each of which is a body of code to implement a message
a method returns a value as the response to the message
The physical representation of data is visible only to the implementor of the object.
Messages and responses provide the only external interface to an object.
The term message does not necessarily imply physical message passing. Messages can be implemented as procedure calls.
8
Methods are programs written in a general-purpose language respecting the following restrictions:
1. Only variables in the object itself may be referenced directly
2. Data in other objects are referenced only by sending messages
They can be used to change the object’s state by modifying its attribute values, or to query the values of selected attributes.
A method consists of a name and a body that performs the behavior associated with the method name:
method void update_salary (float increment)
{salary = salary + increment;
}
9
Messages are the means by which objects communicate.
A message is simply a request from one object (the sender) to another object (the receiver), asking the receiver to execute one of its methods.
The sender and receiver may be the same object.
The “dot” notation is generally used to access a method.
staff_object.update_salary(1000)
In a traditional programming language, a message would be written as a function call:
update_salary(staff_object, 1000)
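The two ways of writing a message can be tried out in a small Python sketch. The Staff class, the get_salary method and the sample values are assumptions made for illustration; only update_salary and staff_object come from the slides.

```python
class Staff:
    """Hypothetical Staff class; names and values are illustrative only."""

    def __init__(self, name, salary):
        self.name = name
        self._salary = salary          # state, reached only through methods

    def update_salary(self, increment):
        """Method that changes the object's state."""
        self._salary = self._salary + increment

    def get_salary(self):
        """Method that queries the value of an attribute."""
        return self._salary


staff_object = Staff("Susan Deer", 30000.0)

# Message in "dot" notation: receiver.method(arguments)
staff_object.update_salary(1000)

# The same message written as a traditional function call,
# with the receiver passed explicitly as the first argument
Staff.update_salary(staff_object, 1000)
```

Both calls execute the same method body on the same receiver, so the salary is incremented twice.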
10
Object classes
Similar objects (have the same attributes, respond to the same messages) are grouped into a class.
The attributes and associated methods are defined once for the class.
All objects in a class have the same:
variable types
message interface
methods
They may differ in the values assigned to variables
Classes are analogous to entity sets in the ER model.
Example: all branch objects would be described by a single Branch class.
11
BRANCH
Attributes
bno
street
city
area
…
Methods
update_tel_no
….
bno=B5, street=12 Deer St, city=Sidcup, area=London, ...
bno=B7, street=16 Dever St, city=Dyce, area=Aberdeen, ...
bno=B3, street=154 Main St, city=Partick, area=Glasgow, ...
12
A class is itself an object ===> it has its own class attributes and class methods
The class is an instance of a higher-level class called a metaclass
Class attributes describe the general characteristics of the class, such as totals or averages (e.g. total number of branches)
Class methods are used to change or query the state of class attributes
There are special class methods to create new instances of the class:
new --constructor
destructor
In the following example, employment_length is a derived attribute.
For strict encapsulation, methods to read and set other variables are also needed
13
class employee {
    /* Variables */
    string name;
    string address;
    date   start_date;
    int    salary;
    /* Messages */
    int    annual_salary();
    string get_name();
    string get_address();
    int    set_address(string new_address);
    int    employment_length();
};
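As a rough illustration, the employee class and the idea of class attributes can be sketched in Python. The total_employees counter, the count class method, the explicit today parameter and the reading of salary as a monthly amount are assumptions added for the example, not part of the slide.

```python
from datetime import date


class Employee:
    # Class attribute: describes the class as a whole, not one instance.
    total_employees = 0

    def __init__(self, name, address, start_date, salary):
        self._name = name
        self._address = address
        self._start_date = start_date
        self._salary = salary          # assumption: a monthly amount
        Employee.total_employees += 1

    @classmethod
    def count(cls):
        """Class method: queries the state of a class attribute."""
        return cls.total_employees

    def get_name(self):
        return self._name

    def get_address(self):
        return self._address

    def set_address(self, new_address):
        self._address = new_address

    def annual_salary(self):
        return 12 * self._salary

    def employment_length(self, today):
        """Derived attribute: whole years of service, computed on demand."""
        years = today.year - self._start_date.year
        if (today.month, today.day) < (self._start_date.month, self._start_date.day):
            years -= 1
        return years


e = Employee("Susan Deer", "12 Deer St", date(2010, 6, 15), 2000)
```

The employment length is never stored; it is recomputed from the start date each time the message is sent, which is what makes it a derived attribute.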
14
Inheritance
Inheritance allows one class (subclass) to be defined as a special case of a more general class (superclass).
The process of forming a superclass is referred to as generalization.
The process of forming a subclass is referred to as specialization.
By default, a subclass inherits all the properties of its superclass(es) and, additionally, defines its own unique properties.
A subclass can redefine inherited methods.
All instances of the subclass are also instances of the superclass.
Principle of substitutability: we can use an instance of the subclass whenever a method or a construct expects an instance of the superclass.
15
The relation between the subclass and superclass: A KIND OF (AKO)
The relation between an instance and its class: IS-A.
Examples:
Manager is AKO Staff.
Susan Deer IS-A Manager.
Inheritance:
1. Single inheritance: the subclass inherits from no more than one superclass
2. Multiple inheritance: the subclass inherits from more than one superclass ===> conflicts!
16
[Figure: Single inheritance: Manager and Sales_Staff are subclasses of Staff, which is a subclass of Person. Multiple inheritance: Sales_Manager is a subclass of both Manager and Sales_Staff.]
17
3. Repeated inheritance: a special case of multiple inheritance
the superclasses inherit from a common superclass
The inheritance mechanism must ensure that the subclass does not inherit properties twice.
[Figure: Manager and Sales_Staff both inherit from Staff; Sales_Manager inherits from both Manager and Sales_Staff]
4. Selective inheritance: allows a subclass to inherit a limited number of properties from the superclass.
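The repeated-inheritance case can be sketched in Python, whose method resolution order is one possible way of making sure the common superclass contributes its properties only once. This is an illustration of the idea, not a description of how any particular OODBMS resolves the conflict; the attributes bonus, sales_area and init_calls are invented for the example.

```python
class Staff:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name
        # Counts how often the Staff initializer runs for this object.
        self.init_calls = getattr(self, "init_calls", 0) + 1

    def role(self):
        return "staff"


class Manager(Staff):
    def __init__(self, bonus=0, **kwargs):
        super().__init__(**kwargs)
        self.bonus = bonus

    def role(self):                      # redefines an inherited method
        return "manager"


class SalesStaff(Staff):
    def __init__(self, sales_area="", **kwargs):
        super().__init__(**kwargs)
        self.sales_area = sales_area


class SalesManager(Manager, SalesStaff):  # repeated inheritance from Staff
    pass


sm = SalesManager(name="Susan Deer", bonus=500, sales_area="London")
linearization = [c.__name__ for c in SalesManager.__mro__]
```

Staff appears once in the linearization, so its initializer runs once, and an instance of SalesManager can be used wherever a Staff is expected (substitutability).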
18
Object Identity
Each object is assigned an Object Identifier (OID) when it is created that is:
system generated
unique to that object
invariant
independent of the values of its attributes
invisible to the user
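The difference between identity and equality of attribute values can be sketched as follows. The PersistentObject class and the counter used as an OID generator are assumptions for illustration; a real system generates and hides OIDs itself.

```python
import itertools

_next_oid = itertools.count(1)     # hypothetical system-wide OID generator


class PersistentObject:
    def __init__(self, **attributes):
        self._oid = next(_next_oid)        # assigned once, never changed
        self.attributes = dict(attributes)

    def same_as(self, other):
        """Identity test: compares OIDs, not attribute values."""
        return self._oid == other._oid

    def equal_values(self, other):
        """Value test: compares attribute values, ignores OIDs."""
        return self.attributes == other.attributes


b1 = PersistentObject(bno="B5", city="Sidcup")
b2 = PersistentObject(bno="B5", city="Sidcup")
alias = b1                                  # a second reference to b1
```

Here b1 and b2 carry equal values but are different objects, while alias and b1 remain the same object even after an attribute of b1 is updated.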
Other concepts:
overriding (+ overloading)
polymorphism & dynamic binding
complex objects
persistence
19
3. OODBMS
[Figure: history of data models]
First generation DBMS (1960 - 1970): Hierarchical Data Model (IMS), Network Data Model
Second generation DBMS (1970 - 1980): Relational Data Model (E. Codd, 1970)
ER Data Model (Chen, 1976); Semantic Data Model (Hammer, McLeod, 1981)
Third generation DBMS (1980 - 2000): Object-Relational Data Model, Object Oriented Data Model
20
OODM: a logical data model that captures the semantics of objects supported in object-oriented programming
OODB: a persistent and sharable collection of objects defined by an OODM
OODBMS: the manager of an OODB (W. Kim 1991)
Zdonik & Maier (1991): an OODBMS must (at minimum) satisfy:
must provide database functionality
must support object identity
must provide encapsulation
must support objects with complex state
or (Khoshafian & Abnous 1990)
object oriented = ADT + Inheritance + OID
OODBMS = OO + database capabilities
21
[Figure: origins of the Object Oriented Data Model]
Traditional DBMS: persistence, sharing, transactions, concurrency control, recovery control, security, integrity, querying
Semantic data models: generalization, aggregation
OO programming: object identity, encapsulation, inheritance, types & classes, methods, complex objects, polymorphism, extensibility
Special requirements: versioning, schema evolution
22
Strategies for Developing an OODBMS:
1. Extend an existing object-oriented programming language with database capabilities. Smalltalk, C++, Java --> GemStone
2. Provide extensible object-oriented DBMS libraries. Ontos, Versant, ObjectStore
3. Embed object-oriented database language constructs in a convenient host language. O2 embeds OODL in C.
4. Extend an existing database language with object-oriented capabilities. Extend SQL--> SQL3, OQL.
5. Develop a novel database data model / data language. SIM (semantic information manager, 1988).
23
OODBMS Perspectives:
Modern database systems are characterized by their support of the following features:
1. Data model: a particular way of describing data, relationships between data, and constraints on the data
2. Data persistence: the ability of data to outlive the execution of a program, and possibly the lifetime of the program itself.
3. Data sharing: the ability of multiple applications to access common data, possibly at the same time
4. Reliability: the assurance that the data in the database is protected from hardware and software failures
5. Security : the protection of the data against unauthorized access
24
7. Integrity: the assurance that the data conforms to specified correctness and consistency rules
8. Distribution: the ability to physically distribute a logically interrelated collection of shared data over a network
Traditional programming languages:
1. Provide constructs for procedural control and for data and functional abstraction
2. Lack built-in support for many of the above database features
Novel applications require functionality from both perspectives.
25
Issues:
1. Persistence: persistent objects must survive after the user session or application program that created them has terminated
transient objects: only last for the invocation of the program
To implement persistence in an OODB there are 3 schemes:
A. Checkpointing: copy all or part of a program’s address space to secondary storage
• a checkpoint can only be used by the program that created it
• a checkpoint may contain a large amount of data that is of no use in subsequent executions
26
B. Serialization: copy the closure of a data structure to disk.
A write operation on a data value involves the traversal of the graph of objects reachable from the value and, then, the writing of a flattened version of the structure to disk.
Reading back this flattened structure produces a new copy of the original data structure. The process is known as serialization, pickling or marshaling.
• Does not preserve object identity: if two data structures that share a common substructure are separately serialized, then on retrieval the substructure will no longer be shared in the new copies.
• It is not incremental, and so saving small changes to a large data structure is not efficient.
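The loss of sharing can be observed with Python's standard pickle module, used here only as a convenient serializer. The branch, staff and address records are invented sample data.

```python
import pickle

address = {"street": "12 Deer St", "city": "Sidcup"}    # shared substructure
branch = {"bno": "B5", "address": address}
staff = {"name": "Susan Deer", "address": address}

# Before serialization both structures point at the same address object.
shared_before = branch["address"] is staff["address"]

# Serialized separately: each copy gets its own address object.
branch2 = pickle.loads(pickle.dumps(branch))
staff2 = pickle.loads(pickle.dumps(staff))
shared_separately = branch2["address"] is staff2["address"]

# Serialized together as one closure: sharing inside the closure survives.
branch3, staff3 = pickle.loads(pickle.dumps((branch, staff)))
shared_together = branch3["address"] is staff3["address"]
```

Sharing is preserved only within a single serialized closure, and every save rewrites the whole closure, which is the non-incremental behaviour noted above.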
27
C. Explicit paging: paging objects between the application heap and the persistent store.
Requires the conversion of object pointers from a disk-based scheme to a memory-based scheme.
There are two common methods for creating/updating persistent objects:
a. Reachability-based: an object will persist if it is reachable from a persistent root object
at any time after creation, an object can become persistent by adding it to the reachability tree.
Garbage collection: deletes objects when they are no longer accessible from any other object
Smalltalk, Java
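Reachability-based persistence amounts to a graph traversal from the persistent root. The sketch below assumes the object graph is given as a dictionary from object names to the names they reference; the names themselves are invented.

```python
def reachable_from(root, references):
    """Return the set of objects reachable from root.

    references maps each object name to the list of names it refers to.
    """
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue                     # already visited (handles cycles)
        seen.add(name)
        stack.extend(references.get(name, []))
    return seen


references = {
    "root": ["branch_B5"],
    "branch_B5": ["staff_1", "staff_2"],
    "staff_1": ["branch_B5"],            # cycle back to the branch
    "staff_2": [],
    "temp": ["staff_2"],                 # nothing refers to temp
}

persistent = reachable_from("root", references)
garbage = set(references) - persistent   # candidates for garbage collection
```

Adding a reference from any reachable object to temp would make temp persistent without any explicit declaration.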
28
b. Allocation-based: an object is explicitly declared as being persistent within the application program
i) By class: a class is statically declared to be persistent --> all instances of the class are made persistent when they are created
a class may be a subclass of a system-supplied persistent class
Ontos, Objectivity/DB
ii) By explicit call: an object may be specified as persistent when it is created or, in some cases, dynamically at runtime (added to a persistent collection)
ObjectStore
29
Alternatively, to provide persistence in a programming language: orthogonal persistence, based on the following principles:
1. Persistence independence: the persistence of a data object is independent of how the program manipulates the data object and conversely, a fragment of the program is expressed independently of the persistence of data it manipulates.
2. Data type orthogonality: all data objects should be allowed the full range of persistence irrespective of their type. Examples: PS-algol, Napier88, Galileo, GemStone
In contrast, in some languages persistence is a quality attributable to only a subset of the language data types: Pascal/R, Amber, E, Avalon/C++
3. Transitive persistence: the choice of how to identify and provide persistent objects at the language level is independent of the choice of data types in the language. Most used technique: reachability-based.
30
Orthogonal persistence:
Advantages:
1. There is no need to define long-term data in a separate schema language
2. No special application code is required to access or update persistent data
3. There is no limit to the complexity of the data structures that can be made persistent
4. Improved programmer productivity from simpler semantics
5. Improved maintenance
6. Consistent protection mechanisms over the whole environment
7. Support for incremental evolution
8. Automatic referential integrity
31
Issues:
2. Pointer Swizzling Techniques:
the action of converting object identifiers (OIDs) to main memory pointers, and back again
Aim: to optimize access to objects.
Obvious approach: to hold a lookup table that maps OIDs to main memory pointers
Pointer swizzling: stores the main memory pointers in the place of the referenced OIDs and, vice versa, restores the OIDs when the object has to be written back to disk
32
A. No swizzling: the OID is used every time the object is accessed
the system maintains a lookup table, so that the object’s virtual memory pointer can be located and then used to access the object.
Could be inefficient if the same objects are accessed repeatedly
Could be acceptable if applications access an object once
B. Object referencing: to be able to swizzle a persistent object’s OID to a virtual memory pointer, a mechanism is required to distinguish between resident and non-resident objects.
Most techniques are variations of edge marking or node marking (Hosking & Moss, 1993):
33
Virtual memory is considered to be a directed graph, with objects as nodes and references as directed edges:
1. Edge marking marks every object pointer with a tag bit.
If the bit is set, then the reference is a virtual memory pointer
Otherwise, it is still pointing to an OID and needs to be swizzled when the object it refers to is faulted into the application’s memory space.
2. Node marking requires that all object references are immediately converted to virtual pointers when the object is faulted into memory.
1 is a software-based technique;
2 can be implemented using software or hardware-based techniques.
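Edge marking can be sketched in software as a reference slot carrying a tag bit. The Ref and ObjectCache classes, the dictionary standing in for the disk and the lookups counter are all assumptions made for this illustration.

```python
class Ref:
    """A reference slot inside an object (an edge of the object graph)."""

    def __init__(self, oid):
        self.target = oid        # an OID until swizzled, then the object itself
        self.swizzled = False    # the tag bit used by edge marking


class ObjectCache:
    def __init__(self, disk):
        self.disk = disk         # hypothetical persistent store: OID -> object
        self.resident = {}       # lookup table: OID -> in-memory object
        self.lookups = 0         # how often the lookup table was consulted

    def _lookup(self, oid):
        self.lookups += 1
        if oid not in self.resident:
            self.resident[oid] = self.disk[oid]     # fault the object in
        return self.resident[oid]

    def deref(self, ref):
        if not ref.swizzled:                        # still an OID: swizzle it
            ref.target = self._lookup(ref.target)
            ref.swizzled = True
        return ref.target                           # direct pointer from now on

    def unswizzle(self, ref):
        """Put the OID back before the owning object is written to disk."""
        if ref.swizzled:
            ref.target = ref.target["oid"]
            ref.swizzled = False


disk = {101: {"oid": 101, "bno": "B5", "city": "Sidcup"}}
cache = ObjectCache(disk)
branch_ref = Ref(101)

first = cache.deref(branch_ref)
second = cache.deref(branch_ref)
```

Only the first dereference pays for the table lookup; with no swizzling, every access would go through the lookup table.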
34
C. Hardware-based schemes: use virtual memory access protection violations to detect accesses of non-resident objects (Lamb91)
Use the standard virtual memory hardware to trigger the transfer of persistent data from disk to main memory.
Once a page has been faulted in, objects are accessed on that page via normal virtual memory pointers.
The hardware approach avoids the overhead of residency checks incurred by software approaches, but limits the amount of data that can be accessed during a transaction to the size of virtual memory and complicates other issues, such as recovery and fine-grained locking.
ObjectStore, Texas
35
Issues:
3. Transactions
in classical DBMSs: short duration transactions
in CAD, CASE,…: long duration transactions (hours, days)
a need for new protocols:
nested transactions, sagas, multi-level transactions.
4. Versions: Ontos, Versant, ObjectStore, Objectivity/DB, Itasca
object version = an identifiable state of an object
version history = the evolution of an object
version management = object references always point to the correct version of an object
36
Types of versions:
1. Transient version: unstable, can be updated and deleted.
It can be created from new, by checking out a released version from a public database, or by deriving it from a working or transient version in a private database; in the latter case, the base transient version is promoted to a working version. It is always stored in the creator’s private workspace.
2. Working version: stable and cannot be updated, but it can be deleted by its creator. It is stored in the creator’s private workspace.
3. Released version: stable, cannot be updated or deleted.
It is stored in a public database by checking in a working version from a private database.
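The three kinds of version and the moves between them can be sketched as a small state machine. The Version class, its method names and the workspace labels are assumptions for illustration and do not reproduce the interface of any of the products listed.

```python
from enum import Enum


class State(Enum):
    TRANSIENT = "transient"
    WORKING = "working"
    RELEASED = "released"


class Version:
    def __init__(self, data, state=State.TRANSIENT, workspace="private"):
        self.data = data
        self.state = state
        self.workspace = workspace

    def update(self, data):
        if self.state is not State.TRANSIENT:
            raise PermissionError("only a transient version can be updated")
        self.data = data

    def can_delete(self):
        return self.state is not State.RELEASED

    def derive(self):
        """Derive a new transient version from a transient or working one."""
        if self.state is State.RELEASED:
            raise PermissionError("a released version must be checked out")
        self.state = State.WORKING       # a transient base is promoted
        return Version(self.data)

    def check_in(self):
        if self.state is not State.WORKING:
            raise PermissionError("only a working version can be checked in")
        self.state = State.RELEASED
        self.workspace = "public"

    def check_out(self):
        if self.state is not State.RELEASED:
            raise PermissionError("only a released version can be checked out")
        return Version(self.data)        # new transient, private copy


v1 = Version("design draft")
v1.update("design rev A")     # allowed: v1 is transient
v2 = v1.derive()              # v1 becomes a working version
v1.check_in()                 # v1 becomes a released version
v3 = v1.check_out()           # v3 is a new transient version
```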
37
Issues:
5. Schema evolution: design is an incremental process.
To support this process, applications require flexibility in
dynamically defining and modifying the database schema.
Typical changes to the schema include:
changes to the class definition:
modifying attributes, modifying methods
changes to the inheritance hierarchy:
making a class S the superclass of a class C,
removing a class S from the list of superclasses of C,
modifying the order of superclasses of C
changes to the set of classes:
creating and deleting classes, modifying class names
38
Client - Server Architectures:
1. Object server: distributes the processing between the two components
Server process: responsible for managing storage, locks, commits to secondary storage, logging, recovery, security, query optimization and executing stored procedures
Client process: responsible for transaction management and interfacing to the programming languages
2. Page server: most of the database processing is performed by the client.
Server process: responsible for secondary storage and for providing pages at the client’s request
39
3. Database server: most of the database processing is performed by the server.
Client process: passes requests to the server, receives results and passes them on to the application.
Used by relational DBMS
In each case, the server resides on the same machine as the physical database.
The client may reside on the same or different machine.
If the client needs access to databases distributed across multiple machines, then the client communicates with a server on each machine.
There may be a number of clients communicating with one server: for example, one client for each user or application.
40
Advantages of OODBMSs:
enriched modeling capabilities
extensibility
removal of impedance mismatch
more expressive query language
support for schema evolution
support for long duration transactions
applicability to advanced database applications
improved performance
41
Disadvantages of OODBMSs:
lack of universal data model
lack of experience
lack of standards
query optimization compromises encapsulation
locking at object level may impact performance
complexity
lack of support for views
lack of support for security
42
Object Database Standard ODMG 2.0 (1997)
Object Database Management Group proposed an OODM consisting of:
1. An object model
2. An object definition language (ODL) (like traditional DDL)
3. An object query language, with a SQL-like syntax
ODMG object model is a superset of the Object Management Group (OMG) object model.
1990: OMG published its Object Management Architecture (OMA) Guide document.
It specified a single terminology for object-oriented languages, systems, databases and applications.
43
[Figure: the OMA. An Object Request Broker links application objects and common facilities (labels in the figure: WP, spreadsheet, CAD, help, email, browser) with object services (storage, transaction management, queries, versioning, security).]
44
1. The Object Model -- OM
is a design-portable abstract model for communicating with OMG-compliant object-oriented systems
a requester sends a request for object services to the ORB, which keeps track of all the objects in the system and the types of services they can provide
the ORB then forwards the message to a provider, which acts on the message and passes a response back to the requester via the ORB
[Figure: requester <--> ORB <--> provider]
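The requester, ORB and provider roles can be mimicked with a toy in-process broker. This is only a sketch of the message flow; it is not the CORBA API, and the class, method and service names are invented.

```python
class ObjectRequestBroker:
    """Toy broker: keeps track of which provider offers which service."""

    def __init__(self):
        self._providers = {}              # service name -> provider callable

    def register(self, service, provider):
        self._providers[service] = provider

    def request(self, service, *args):
        provider = self._providers.get(service)
        if provider is None:
            raise LookupError(f"no provider registered for {service!r}")
        # Forward the message to the provider and pass the response back.
        return provider(*args)


def branch_city(bno):
    """Hypothetical provider: returns the city of a branch."""
    return {"B5": "Sidcup", "B7": "Dyce", "B3": "Partick"}[bno]


orb = ObjectRequestBroker()
orb.register("branch_city", branch_city)

# The requester only knows the service name, not where the provider lives.
city = orb.request("branch_city", "B5")
```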
45
2. The Object Request Broker -- ORB
handles distribution of messages between application objects
is a distributed ‘software bus’ that enables objects (requesters) to make and receive requests and responses from a provider
on receipt of a response from the provider, the ORB translates the response into a form the original requester can understand
--> provides a mechanism by which objects make and receive requests and responses transparently
--> interoperability between applications in a heterogeneous distributed environment
46
3. The Object Services --OS
provide the main functions for realizing basic object functionality
collection: a uniform way to create and manipulate most common collections generically:
sets, queues, stacks, lists, binary trees
concurrency control: a lock manager that enables multiple clients to coordinate their access to shared resources
event management: allows components to dynamically register or unregister their interest in specific events
externalization: provides protocols and conventions for externalizing and internalizing objects.
47
externalization: records the state of an object as a stream of data (in memory, on disk, across network)
internalization: creates a new object from it in a different process
licensing: operations for metering the use of components to ensure fair compensation for their use, and to protect intellectual property
lifecycle: operations for creating, copying, moving, and deleting groups of related objects
naming: facilities to bind a name to an object relative to a naming context
persistence: interfaces to mechanisms for storing and managing objects persistently
property: operations to associate named values (properties) with any (external) component
48
query: declarative query statements with predicates, the ability to invoke operations and other object services
relationship: a way to create dynamic associations between components that know nothing of each other
security: services such as identification and authentication, authorization and access control, auditing, security of communication, non-repudiation, administration
time: maintains a single notion of time across different machines
trader: a matchmaking service for objects. It allows objects to dynamically advertise their services, and other objects to register for a service.
transactions: a two-phase commit coordination among recoverable components using flat or nested transactions
49
4. The Common Facilities --CF
comprise a set of tasks that many applications must perform but are traditionally duplicated within each one.
they are made available through OMA-compliant class interfaces
in the latest version, CF are split into
horizontal common facilities (printing, electronic mail, and so on) and
vertical domain facilities (finance, healthcare, manufacturing, e-commerce, transportation, telecommunications)
50
The Common Object Request Broker Architecture -- CORBA
defines the architecture of ORB-based environments
is the basis of any OMG component, defining the parts that form the ORB and its associated structure
1991: CORBA 1.1 defined:
Interface Definition Language (IDL)
Application Programming Interfaces (API) - enable client-server interaction with a specific implementation of an ORB
December 1994: CORBA 2.0 improved interoperability
specified how ORBs from different vendors can interoperate
1997: CORBA 2.1
51
Main elements:
IDL: permits the description of class interfaces independent of any particular DBMS or programming language
a type model that defines the values that can be passed over the network.
an Interface Repository, which provides information on interfaces and types, and is used to construct dynamic runtime requests, by the Dynamic Invocation Interface
Methods for getting the interfaces and specifications of objects
Methods for transforming OIDs to and from strings
From the IDL definitions, CORBA objects can be mapped into particular programming languages, such as C, C++, Smalltalk and Java. This produces interface stubs within the application programming language (client) that are used to invoke the requests. The same stubs are used on the object implementation side (server) to create skeletons, which are completed to provide the requested behavior.
52
The ODMG Object Model
Vendors: GemStone Systems, Object Design, O2 Technology, Versant Object Technology, UniSQL, POET Software, Objectivity, IBEX Computing SA, Lockheed Martin
formed Object Database Management Group (ODMG)
It produced an object model that specifies a standard model for the semantics of database objects.
The model is important because it determines the built-in semantics that the OODBMS understands and can enforce.
The design of class libraries and applications that use these semantics should be portable across the various OODBMSs that support the object model.
53
The major components of the ODMG for an OODBMS are:
1. Object model--OM
2. Object definition language --ODL
3. Object query language -- OQL
4. C++ language bindings
5. Smalltalk language bindings
6. Java language bindings
Initial ODMG standard: 1993
Major version: ODMG 2.0 september 1997
54
1. The Object Model --OM
ODMG object model is a superset of the OMG object model
enables both designs and implementations to be ported between compliant systems
Basic modeling primitives: the object and the literal.
Objects and literals can be categorized into types: all objects of a given type exhibit common behavior and state. A type is itself an object.
Behavior is defined by a set of operations that can be performed on or by the object.
State is defined by the values an object carries for a set of properties
A property may be either an attribute or a relationship between the object and one or more other objects.
55
[Figure: literal types]
Literal_type
    Atomic_type: long, short, unsigned long, unsigned short, float, double, boolean, octet, char, string, enum < >
    Collection_literal: set < >, bag < >, list < >, array < >, dictionary < >
    Structured_literal: date, time, timestamp, interval, structure < >
56
[Figure: object types]
Object_type
    Atomic_object
    Collection_object: Set < >, Bag < >, List < >, Array < >, Dictionary < >
    Structured_object: Date, Time, Timestamp, Interval
57
A database stores objects, enabling them to be shared by multiple users and applications.
A database is based on a schema that is defined in ODL. The database contains instances of the types defined by its schema.
Object types are: atomic, collection or structured types.
Types shown in italics are abstract types. Types shown in normal typeface are directly instantiable. They are the only base types.
Types with < > indicate type generators.
Objects are created using the new() method of the corresponding factory interface provided by the language binding interface.
All objects have an ODL interface which is implicitly inherited by the definition of all user-defined objects:
58
interface Object {
    enum Lock_Type {read, write, upgrade};
    exception LockNotGranted {};
    void lock(in Lock_Type mode) raises (LockNotGranted);
    boolean try_lock(in Lock_Type mode);
    boolean same_as(in Object anObject);
    Object copy();
    void delete();
};
Each object has a unique identity, the OID, which does not change and is not reused when the object is deleted.
In addition, an object may be given one or more names that are meaningful to the user.
Objects can be transient or persistent.
59
Literals : atomic, collections, structured, null
The values of a literal’s properties may not change.
Literals do not have their own OID and cannot stand alone as objects: they are embedded in objects
Structured literals contain a fixed number of named heterogeneous elements of the form <name, value>, where value may be any literal type.
struct Address {
    string street;
    string area;
    string city;
    string post_code;
};
attribute Address branch_address;
60
Collections: contain an arbitrary number of unnamed homogeneous elements, each of which can be an instance of an atomic type, a collection or a literal type
There are ordered and unordered collections. Ordered collections must be traversed first to last or vice versa; unordered collections have no fixed order of iteration.
Set: unordered collections that do not allow duplicates
Bag: unordered collections that do allow duplicates
List: ordered collections that allow duplicates
Array: one-dimensional array of dynamically varying length
Dictionary: unordered sequence of key-value pairs with no duplicate keys
Each subtype has operations to create an instance of the type and insert an element into the collection. Sets and Bags have the usual set operations: union, intersection and difference.
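Python's built-in containers give a rough feel for these collection kinds. They are analogues only: Python's own semantics (for example for Counter) may differ from the ODMG definitions, and the sample values are invented.

```python
from collections import Counter

# Set: unordered, no duplicates, with the usual set operations
a = {"B3", "B5"}
b = {"B5", "B7"}
union = a | b
intersection = a & b
difference = a - b

# Bag: unordered, duplicates allowed (modelled here as element -> count)
bag = Counter(["London", "London", "Glasgow"])

# List: ordered, duplicates allowed
cities = ["London", "Glasgow", "London"]

# Dictionary: key-value pairs with no duplicate keys
branch_city = {"B5": "Sidcup", "B7": "Dyce"}
branch_city["B5"] = "London"      # re-using a key replaces the old value
```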
61
interface Collection : Object {
    exception InvalidCollectionType{};
    exception ElementNotFound{any element;};
    unsigned long cardinality();
    boolean is_empty();
    boolean is_ordered();
    boolean allows_duplicates();
    boolean contains_element(in any element);
    void insert_element(in any element);
    void remove_element(in any element) raises (ElementNotFound);
    Iterator create_iterator(in boolean stable);
    BidirectionalIterator create_bidirectional_iterator(in boolean stable) raises (InvalidCollectionType);
};
ODL interface for collections
62
A type has a specification and one or more implementations.
The (external)specification defines the properties and operations that can be invoked on instances of the type.
An implementation defines data structures, exceptions and methods that operate on the data structures to support the required state and behavior.
Class: the combination of a type specification and an implementation.
An interface definition is a specification that defines only the abstract behavior of an object type: supertypes, extent and keys.
A literal definition defines only the abstract state of a literal type.
63
Properties: in the ODMG object model these are attributes and relationships
Attributes: an attribute is defined on a single object type
it is not a “first class” object (it is not an object) --> no OID
its value is a literal or an OID
Relationships: only binary, and defined between types
cardinality: 1:1, 1:M, M:N
a relationship is not a “first class” object and does not have a name
traversal paths are defined in the interface for each direction of traversal
on the many side: objects can be unordered (set, bag) or ordered (list). The OODBMS maintains referential integrity.
64
Example: a Branch Has a set of Staff and a member of Staff WorksAt a Branch:
interface Branch {
    relationship set<Staff> Has inverse Staff::WorksAt;
};
interface Staff {
    relationship Branch WorksAt inverse Branch::Has;
};
The model has built-in operations to form and to drop members from relationships and to manage the required referential integrity constraints:
attribute Branch WorksAt;
void form_WorksAt(in Branch aBranch);
void drop_WorksAt(in Branch aBranch);
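What it means to keep the two traversal paths consistent can be sketched in Python. The classes below mimic the Has / WorksAt example; the method names follow the form_ and drop_ pattern but the implementation is an assumption, not generated ODMG binding code.

```python
class Branch:
    def __init__(self, bno):
        self.bno = bno
        self.has = set()             # traversal path Has (many side)


class Staff:
    def __init__(self, name):
        self.name = name
        self.works_at = None         # traversal path WorksAt (one side)

    def form_works_at(self, branch):
        """Form the relationship and update the inverse path as well."""
        if self.works_at is not None:
            self.works_at.has.discard(self)   # leave the previous branch
        self.works_at = branch
        branch.has.add(self)

    def drop_works_at(self, branch):
        """Drop the relationship on both sides."""
        if self.works_at is branch:
            branch.has.discard(self)
            self.works_at = None


b5 = Branch("B5")
b7 = Branch("B7")
susan = Staff("Susan Deer")

susan.form_works_at(b5)
susan.form_works_at(b7)      # moving to B7 also removes her from B5.has
```

Because both sides are updated in one operation, neither traversal path can point at an object that no longer points back.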
65
2. The Object Definition Language --ODL
is a specification language for defining the specifications of object types for ODMG-compliant systems.
facilitates portability of schemas between compliant systems
defines the attributes and relationships of types
specifies the signatures of the operations, but does not address their implementation
the syntax of ODL extends the IDL (Interface Definition Language) of the CORBA
will be the basis for integrating schemas from multiple sources and applications
66
3. The Object Query Language --OQL
provides declarative access to the object database using an SQL-like syntax.
does not provide explicit update operators, but leaves this to the operations defined on object types.
can be used as a standalone or as an embedded language in another language (now: C++, Smalltalk, Java).
can invoke operations programmed in these languages
An OQL query is a function that delivers an object whose type may be inferred from the operators contributing to the query expression.
67
Query definition expression:
DEFINE Q AS e /* defines a query with name Q given a query /* expression e
1. Elementary expressions:
• an atomic literal: 10, 17.5, 'c', "qwerty", false, nil
• a named object
• an iterator variable from the FROM clause of the SELECT-FROM-WHERE:
e as x or e x or x in e
if e is of type collection(T), then x is of type T
• a query definition expression (Q above)
68
2. Construction expression:
• If T is a type name with properties p1, p2, …, pn and e1, e2, …, en are expressions, then T(p1 : e1, p2 : e2, …, pn : en) is an expression of type T.
Example: Branch(bno : "B22", manager : "Susan Brand")
• Similarly, we can construct expressions using struct, set, list, bag and array:
struct(bno : "B22", street : "166 Main ST")
is an expression that dynamically creates an instance of this struct type
69
3. Atomic Type Expressions
• Expressions can be formed using the standard unary and binary operations on expressions.
• If S is a string, expressions can be formed using:
the string concatenation operation ( || or + )
a string offset S[i], meaning the (i+1)th character of the string
S[low : up], meaning the substring of S from the (low+1)th to the (up+1)th character
c in S (where c is a char), returning a boolean expression
S like pattern, where pattern contains the characters ? or _ , meaning any single char, or the wildcard characters * or %, meaning any substring; returns a boolean expression
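The string operations above map onto Python roughly as follows. This is a sketch under stated assumptions: `oql_like` is a hypothetical helper written for illustration (not part of any OQL binding), and it translates the pattern characters listed above into a regular expression. Note that the OQL upper bound is inclusive, while Python slices exclude it.

```python
import re


def oql_like(s, pattern):
    """Match s against an OQL-style pattern: ? or _ is any one char, * or % is any substring."""
    regex = ""
    for ch in pattern:
        if ch in "?_":
            regex += "."
        elif ch in "*%":
            regex += ".*"
        else:
            regex += re.escape(ch)
    return re.fullmatch(regex, s, flags=re.DOTALL) is not None


s = "qwerty"
print(s[0])         # S[0]: the 1st character, 'q'
print(s[1:3 + 1])   # S[1:3]: 2nd to 4th character inclusive, 'wer'
print("w" in s)     # c in S, True
print(oql_like(s, "qw%"), oql_like(s, "q?erty"), oql_like(s, "w*"))  # True True False
```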
70
4. Object Expressions
• Expressions can be formed using the equality and inequality operations ( = and != ), returning a boolean.
• If e is an expression of a type having an attribute or a relationship P of type T, then e.P and e->P are expressions of type T.
• In the same way, methods can be invoked to return an expression.
• If a method has no parameters, the brackets in the method call can be omitted.
71
5. Collections expressions
Expressions can be formed using:
universal quantification: for all
existential quantification: exists
membership testing: in
select clause: select ... from ... where
sort-by operator: sort
unary set operators: min, max, count, sum, avg
group-by operator: group
The format of the SELECT clause is similar to the standard SQL SELECT clause:
72
SELECT [DISTINCT] <expression>
FROM <from_list>
[WHERE <expression>]
[GROUP BY <attributes> [HAVING <predicate>]]
[ORDER BY <expression>]
Where:
<from_list> ::= <variable_name> IN <expression> |
<variable_name> IN <expression>, <from_list> |
<expression> AS <variable_name> |
<expression> AS <variable_name>, <from_list>
The result of a SELECT DISTINCT query is a set
The result of a SELECT query is a bag
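The difference between a bag and a set result, together with the quantifiers and aggregates listed earlier, can be mimicked in Python. This is only an analogy: the `staff` records and their field names are made-up sample data, and `collections.Counter` stands in for a bag.

```python
from collections import Counter

# Sample data, invented for the sketch
staff = [
    {"lname": "White", "city": "London", "age": 52},
    {"lname": "Beech", "city": "London", "age": 34},
    {"lname": "Ford", "city": "Glasgow", "age": 45},
]

# select x.city from x in staff            -> bag (duplicates kept)
cities_bag = Counter(x["city"] for x in staff)

# select distinct x.city from x in staff   -> set (duplicates removed)
cities_set = {x["city"] for x in staff}

print(cities_bag["London"])   # 2
print(sorted(cities_set))     # ['Glasgow', 'London']

# for all x in staff: x.age >= 18
all_adults = all(x["age"] >= 18 for x in staff)

# exists x in staff: x.city = "Glasgow"
any_glasgow = any(x["city"] == "Glasgow" for x in staff)

# max(select x.age from x in staff)
oldest = max(x["age"] for x in staff)
print(all_adults, any_glasgow, oldest)   # True True 52
```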
73
6. Indexed Collections Expressions
• If e1 and e2 are lists or arrays and e3 and e4 are integers, then e1[e3], e1[e3:e4], first(e1), last(e1) and (e1 + e2) are expressions
7. Binary Set Expressions
• If e1 and e2 are sets or bags, then the set operators union, except and intersect of e1 and e2 are expressions.
8. Structure Expression
• If e is an expression and p is a property name, then e.p and e->p are expressions that extract the property p of an object e.
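The indexed-collection and binary set expressions above have close Python counterparts. The sketch assumes that list ranges use the same inclusive bounds the slides give for strings, and it reads union, except and intersect on bags in the usual multiset way (add, subtract, and take the minimum of multiplicities); both points are assumptions, not statements from the slides.

```python
from collections import Counter

e1 = ["a", "b", "c", "d"]
e2 = ["e", "f"]

print(e1[1])           # e1[e3] with e3 = 1 -> 'b'
print(e1[1:2 + 1])     # e1[1:2], inclusive upper bound -> ['b', 'c']
print(e1[0], e1[-1])   # first(e1), last(e1) -> a d
print(e1 + e2)         # concatenation -> ['a', 'b', 'c', 'd', 'e', 'f']

# Bags modelled as multisets
b1 = Counter(["x", "x", "y"])
b2 = Counter(["x", "z"])
bag_union = b1 + b2        # multiplicities are added
bag_except = b1 - b2       # multiplicities are subtracted, never below zero
bag_intersect = b1 & b2    # minimum multiplicity of each member

print(bag_union["x"], bag_except["x"], bag_intersect["x"])   # 3 1 1
```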
74
9. Conversion Expressions
If e is an expression, then element(e) is an expression that checks that e is a singleton collection and returns its only element, raising an exception if it is not.
If e is a list expression, then listtoset(e) is an expression that converts the list into a set.
If e is a collection-valued expression, then flatten(e) is an expression that converts a collection of collections into a collection, that is, it flattens the structure.
If e is an expression and c is a type name, then c(e) is an expression that asserts e is an object of type c, raising an exception if it is not.
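The first three conversion expressions can be written as small Python functions. This is an illustrative sketch: the function names copy the OQL operators, but the choice of `ValueError` and of Python lists and sets as the collection types are assumptions.

```python
def element(e):
    """Return the only member of a singleton collection, or raise if it is not a singleton."""
    items = list(e)
    if len(items) != 1:
        raise ValueError("element() needs a collection with exactly one member")
    return items[0]


def listtoset(e):
    """Convert a list into a set, dropping duplicates and ordering."""
    return set(e)


def flatten(e):
    """Turn a collection of collections into a single collection (one level only)."""
    return [item for inner in e for item in inner]


print(element(["B22"]))                # B22
print(listtoset([1, 2, 2, 3]))         # {1, 2, 3}
print(flatten([[1, 2], [3], []]))      # [1, 2, 3]
```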
10. Object Expressions
If e is an expression and f is an operation, then e.f and e->f are expressions that apply an operation to an object. The operation can optionally take a number of expressions as parameters.
75
A query consists of a (possibly empty) set of query definition expressions followed by an expression.
The result of a query is an object with or without identity.
Examples:
A. get the set of all staff (with identity)
staff
B. get the set of all branch managers (with identity):
branch_offices.ManagedBy
76
C. get the set of all staff who live in London (without identity):
define Londoners as
select x
from x in staff
where x.address.city = "London";
select x.name.lname from x in Londoners
returns a literal of type set<string>
D. get the structured set (without identity) containing name, sex, and age for all staff who live in London:
select struct (lname:x.name.lname, sex:x.sex, age:x.age)
from x in staff
where x.address.city = "London"
returns a literal of type set<struct>
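Queries C and D read much like Python comprehensions. The sketch below is an analogy only: the `Person` and `Row` types and the sample rows are hypothetical, the nested name and address structures are flattened to plain fields, and Python sets are used to match the set<...> result types stated above.

```python
from typing import NamedTuple


class Person(NamedTuple):
    """Flattened stand-in for a Staff object (sample fields only)."""
    lname: str
    sex: str
    age: int
    city: str


class Row(NamedTuple):
    """Stand-in for struct(lname, sex, age): a value without identity."""
    lname: str
    sex: str
    age: int


staff = [
    Person("White", "M", 52, "London"),
    Person("Beech", "F", 34, "London"),
    Person("Ford", "M", 45, "Glasgow"),
]

# C: define Londoners as select x from x in staff where x.address.city = "London"
londoners = [x for x in staff if x.city == "London"]
# select x.name.lname from x in Londoners
names = {x.lname for x in londoners}

# D: select struct(lname: ..., sex: ..., age: ...) from x in staff where ...
rows = {Row(x.lname, x.sex, x.age) for x in staff if x.city == "London"}

print(sorted(names))   # ['Beech', 'White']
print(len(rows))       # 2
```

Two `Row` values with equal fields compare equal, which mirrors the idea that a struct result is a literal rather than an object with its own identity.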
77
E. get the structured set (with identity) containing name, sex, and age for all deputy managers over 60:
type deputies {attribute
lname : string; sex: string; age : integer;}
deputies (select
struct ( lname:x.name.lname,
sex:x.sex,
age:x.age)
from x in (select y from y in staff
where y.position = "Deputy")
where x.age > 60)
78
F. get a structured set (without identity) containing branch number and the set of all Assistants at the branches in London:
select struct (bno:x.bno,
assistants: (select y from y in x.Has
where y.position = "Assistant"))
from x in
(select z from z in branch_offices
where z.address.city = "London")
Objects without identity are created using struct (see D and F).
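The nested query in F can be mirrored with nested Python comprehensions. This is a sketch with invented sample data: each branch is a plain dictionary, and the key `staff` stands for the collection of staff reachable from a branch through its relationship.

```python
# Sample data, invented for the sketch
branch_offices = [
    {"bno": "B5", "city": "London", "staff": [
        {"sno": "SL21", "position": "Manager"},
        {"sno": "SL41", "position": "Assistant"},
    ]},
    {"bno": "B3", "city": "Glasgow", "staff": [
        {"sno": "SG37", "position": "Assistant"},
    ]},
]

# Inner query: the branches located in London
london_branches = [z for z in branch_offices if z["city"] == "London"]

# Outer query: one struct per branch, holding the branch number and a nested collection
result = [
    {
        "bno": x["bno"],
        "assistants": [y for y in x["staff"] if y["position"] == "Assistant"],
    }
    for x in london_branches
]

print(result[0]["bno"])                              # B5
print([y["sno"] for y in result[0]["assistants"]])   # ['SL41']
```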