The Index Rulez
Author: Will van Beek
Date: 9th September 2016
eMail: [email protected]
Copyright © 2012-2016 proWill B.V.
Table of Contents
The Index Rulez
Relational Concepts and Misconceptions
Indexes
    The Progress ROWID
    The Progress Index
    Indextypes
    Index Cursors
    Index bracketing
Retrieving Index Information
    Compile-time Index Information using XREF
    Run-time Index Information using INDEX-INFORMATION
    Run-time Index Information using LOG-MANAGER
The Index Rulesets
    The USE-INDEX Ruleset
    The RECID Ruleset
    The AND Ruleset – the multiple indexes capacity
    The OR Ruleset – the multiple indexes capacity
    The Single Index Ruleset
Best Practices
    Index design
    Index use
Summary Index Rulesets
Relational Concepts and Misconceptions
Introduction
The OpenEdge Relational Database Management System or RDBMS is, as the name
implies, a relational database.
Relational databases are grounded in the relational model which was developed in the
seventies by E. F. Codd. The rules he designed were based on the mathematical
principles of the relational algebra.
Elements of the RDBMS
Relational Databases consist of the following elements:
● Tables: collections of information that logically belong together and are
treated as such.
● Columns (or fields): the individual pieces of information that together make up a
table. For example: if the table is called Customer, then the fields can be considered
properties of a customer, such as a Name, a Phone number and a State.
● Row (or record): contains the information for a single instance of a table. For
example: the field values of each customer are stored in a single row.
Customer
Name                Phone         State
Fanatical Athletes  0224-692-903  Alabama
Lift Tours          617-450-0086  Massachusetts
Hoops               617-355-1557  Georgia

Above, you can see 3 records containing the information in the fields for 3
customers.
What is Relational about this?
When people are asked this question, they often give an answer like this:
“Well, suppose you have two tables; one holding Customer information and another
holding Order information. The Customer table then needs a unique field, for
example “CustNum”, which is also present in the Order table, thereby relating one or
more Order records to the Customer information.”
This would suggest that a database containing only one table, and therefore no
relations to other tables, would not be relational. That would also apply to the
database described above, which contains only the “Customer” table.
This is, of course, not true. The concept “relation” refers to something completely
different, namely set theory in mathematics. If you consider the set of all possible
combinations of Name, Phone and State values, the three customers mentioned above
form a subset of that set. In the relational model such a subset is called a relation,
and each of its elements is called a tuple. So every database table is a relation, and
every row or record is a tuple in that relation.
Logical and Technical key
Each of the three customers has a combination of field values that makes it unique.
The combination of field values needed to uniquely identify a record is called a
logical key. In the customers example, we would need a combination of all the fields
to uniquely identify a record. That’s why, in practice, a technical or physical key is
often constructed. It is usually implemented as a separate field containing a unique
number by which the record can be identified.
So, whether based on a logical or a technical key, every table needs a way to uniquely
identify a record. In the relational database model this is the primary key or index.
Although several fields or field combinations may uniquely identify a record, only one
of them can constitute the primary key.
Indexes
Introduction
Besides the primary index used to uniquely identify a record, a table usually contains
more indexes. First let’s go into more detail on what an index actually is.
What is it
An index on a field is like a book index. The book index is a sorted list of topics,
each with one or more page numbers where readers can find information on the topic.
Likewise, a database index is a sorted list of field values, each with an identifier
used to locate the corresponding record in the database. If no book index existed, you
would have to read the entire book, since the topic could be covered in more than one
location. The same goes for database retrieval: if there are no indexes, every record
in the table has to be scanned to find the wanted occurrences.
So, an index is a data structure that accelerates the retrieval of one or more records
from a database table. Ideally, the acceleration is achieved by resolving as much of
the query as possible in the index tables, which minimizes the number of records that
actually have to be read from the database.
This is the main advantage of an index. However, you should not put an index on
every field or field combination, because every time the value of a field that
participates in an index is changed, the corresponding index table has to be updated.
So there is always a performance trade-off between the retrieval and the storage of
records.
In this section the following topics will be covered:
● The Progress ROWID as the unique row identifier and its function in the
retrieval of records.
● The Progress Index as the implementation of a row identifier and a sorted key
value.
● Indextypes covers the types of indexes that Progress can use.
● Index cursors handles how the index position is retained.
● Index bracketing is a way of selecting a subset of index entries based on the
expressions in the where clause.
The Progress ROWID
Introduction
A ROWID is a hexadecimal number containing the encoding of the storage address
of a record within a Progress database. This rowid is a unique identifier for every
row in a table. Because this datatype is 8 bytes wide, it can represent 256^8 = 16^16
distinct values, so Progress can in theory uniquely identify up to 16^16 records per
table.
Here is an example of the contents of the Customer table displaying the rowids.
Customer
ROWID               Name                Phone         State
0x0000000000000063  Fanatical Athletes  0224-692-903  Alabama
0x0000000000000065  Lift Tours          617-450-0086  Massachusetts
0x0000000000000067  Hoops               617-355-1557  Georgia
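
A listing of rowids like this can be produced with the ROWID function. The sketch below assumes a connection to the sports2000 database and uses STRING to convert each rowid to its hexadecimal notation. The format and label are choices made for this illustration only.

```abl
/* Sketch: show the rowid next to some fields of each Customer record. */
for each Customer no-lock:
    display
        string(rowid(Customer)) format "x(18)" label "ROWID"
        Customer.CustNum
        Customer.Name.
end.
```

The records appear in the order of the primary index, so the rowid values are not necessarily ascending.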
A new rowid is assigned to a record as soon as you create it and stays the same for the
row’s entire life, that is: until the record is deleted (see note 1). So the initial
order will be on the rowid, based on the order in which the records are entered into
the database. If the record is deleted, however, the rowid becomes available again and
may be used as the row identifier of a yet to be created new record.
Because the rowid of the record is a unique identifier, you are not obliged to define a
unique and primary index in the OpenEdge RDBMS. However, it is good practice to
do so anyway.
Prior to the rowid, Progress used the recid as a unique identifier, which is still
supported for backward compatibility. The recid is based on an integer and therefore
is only 4 bytes wide, so it can represent 16^8 distinct values.
1. The values of the rowids can be reassigned when the data is dumped and reloaded.
The Progress Index
Introduction
An index is a data structure that accelerates the retrieval of one or more records from
a database table. This is also the main advantage of an index. However, you should not
put an index on every field or field combination, because every time the value of a
field that participates in an index is changed, the corresponding index table has to be
updated. So there is always a performance trade-off between the retrieval and the
storage of records.
Index = sorted keyvalue + ROWID
An index on one or more fields is no more than the combination of the sorted value(s)
of the field(s) and the rowid of the corresponding record. Here you can see an example
of the data structure of the CustNum index.
Index on CustNum
CustNum  ROWID
1        0x0000000000000065
3        0x0000000000000067
6        0x0000000000000063

Table: Customer
ROWID               CustNum  Name
0x0000000000000063  6        Fanatical Athletes
0x0000000000000065  1        Lift Tours
0x0000000000000067  3        Hoops
The rowid is Progress’ primary retrieval mechanism; in order to retrieve any
record, Progress has to use this row identifier, whether through an index or not.
In the code snippet below, for example, the rowid is directly used to retrieve a
customer record using the find statement and the find-by-rowid method. These are in
fact the only cases where Progress retrieves a record without the use of data from an
index table.
Sample
find Customer where rowid(Customer) = <rowid-variable>.
buffer Customer:find-by-rowid(<rowid-variable>).
For the example below, Progress accesses the index table containing the CustNum
index and searches for value 100. It fetches the corresponding rowid and uses it to
retrieve the record from the Customer table.
Sample
find Customer where CustNum = 100.
Index writes
The AVM writes changes in field values to the database at the end of the transaction
scope or the end of the record scope, whichever comes first. This is the normal
behavior for data writes.
For index writes, the mechanism is different: the AVM doesn’t wait for the transaction
or record scope to end, but writes to the affected index tables immediately after the
completion of the statement (assign, create, update, etc.) in which you have changed
the value of an index field.
Because this write is done after each such statement, it makes sense to gather the
changes, if possible, within a single assign statement to limit I/O processing.
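
As an illustration, here is a minimal sketch of gathering index field changes in one statement. It assumes the sports2000 database, in which Name and SalesRep each take part in an index on the Customer table. The record and the values are arbitrary.

```abl
/* Sketch: change two indexed fields of one Customer record. */
do transaction:
    find Customer where Customer.CustNum = 1 exclusive-lock.

    /* One ASSIGN statement: the index entries for Name and SalesRep
       are written when this single statement completes, instead of
       after each of two separate assignment statements. */
    assign
        Customer.Name     = "Lift Tours"
        Customer.SalesRep = "BBB".
end.
```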
Indextypes
Introduction
The Progress 4GL can use any of the following index types:
● Unique indexes
● Primary indexes
● Word indexes
● Multi-component indexes
● Foreign Indexes
Unique and Primary indexes
Note that in the Customer table you would have to combine the values of Name,
Phone, State and maybe even more fields in an index to construct a unique logical
key. In that case it is useful to create a technical key, by adding a (32- or 64-bit)
integer field containing a value that is:
● Shorter. This ensures that it is faster than an index based on the logical key.
● Unique. All values of the field are different.
● Mandatory. If not, the unknown value may be assigned. Since the unknown value (?)
is in fact no value at all, the AVM doesn’t check for multiple occurrences of it.
In the Customer table this technical key is implemented in the CustNum index which
is based on the values of the CustNum field.
Word indexes
In the OpenEdge RDBMS you have the capability to put an index on a character field
that contains all the words from that field, so you can search for records containing
specific words or phrases. All the words in the field are index entries. Word indexes
are accessed when you use the CONTAINS operator.
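
A minimal sketch of a word-index search follows. It assumes the sports2000 database, where the Comments field of the Customer table has a word index. The search word is arbitrary.

```abl
/* Sketch: CONTAINS needs a word index on the field it is applied to. */
for each Customer no-lock
    where Customer.Comments contains "credit":
    display Customer.CustNum Customer.Name.
end.
```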
Multi-component indexes
If an index is based on more than one field, it is a multi-component, compound or
composite index. These indexes are extremely useful when your application needs to
do range matches. For example:
sample
for each Customer where Discount <= 10 and
Balance >= 20 and
CreditLimit >= 100 no-lock.
In this example it is assumed that a multi-component index exists containing these
three fields (see note 1).
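
To show what such an index definition looks like without changing a database, here is a sketch that uses a temp-table. The temp-table, its fields and its index names are hypothetical and are not taken from the xtraIdx.df file.

```abl
/* Hypothetical temp-table, for illustration only. */
define temp-table ttCustomer no-undo
    field CustNum     as integer
    field Discount    as integer
    field Balance     as decimal
    field CreditLimit as decimal
    index idxCustNum is primary unique CustNum
    index idxFinance Discount Balance CreditLimit. /* multi-component */

/* The same filter as in the sample, now on the temp-table. */
for each ttCustomer
    where ttCustomer.Discount    <= 10
      and ttCustomer.Balance     >= 20
      and ttCustomer.CreditLimit >= 100:
    display ttCustomer.CustNum.
end.
```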
Foreign Keys
A foreign key is an index based on a column or set of columns in one (referencing)
table that refers to a column or set of columns in another (referenced) table. The
columns that the foreign key refers to must form the primary key of the referenced
table. Foreign keys are an essential part of reducing redundant information in the
tables (normalization).
The constraint here is that a foreign key value should always relate to an existing
primary key value (referential integrity). If not, the foreign key value is an orphan:
a meaningless value that doesn’t represent anything.
Note that this referential integrity is not built into the Progress RDBMS.
However, it can be enforced using the Progress 4GL in database triggers or a data
access layer.
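
A sketch of such a check in a database trigger could look like this. It assumes a State table whose primary key is the field State, and it assumes that the procedure has been registered as the WRITE trigger of the Customer table in the Data Dictionary. The message text is made up for this illustration.

```abl
/* Hypothetical WRITE trigger procedure for the Customer table. */
trigger procedure for write of Customer.

/* Reject the write when the state code has no matching State record. */
if not can-find(State where State.State = Customer.State) then
do:
    message "State" Customer.State "does not exist in the State table."
        view-as alert-box error.
    return error.
end.
```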
1. The file xtraIdx.df in this package contains the changes in the structure of the
sports2000 database necessary to use these indexes. Start the Data Administration tool;
make a connection to a copy of the sports2000 database; choose, from the menu bar,
Admin ► Load Data and Definitions ► Data Definitions (.df file)…; browse to the
xtraIdx.df file, select and load it.
Below is the new structure for the sports2000 database, with the following changes:
● the unique field “CustNum” was added to the Customer table. This field will
be the basis of the primary key or index for this table.
● the non-unique field “State” in the Customer table now holds a state code. This
field will be the basis of the foreign key that relates to the primary key “State” of
the new table “State”.
Table: State
State  StateName      Region
AL     Alabama        South
GA     Georgia        East
MA     Massachusetts  East

Table: Customer
CustNum  Name                State  Phone
6        Fanatical Athletes  AL     0224-692-903
1        Lift Tours          MA     617-450-0086
3        Hoops               GA     617-355-1557
Index Cursors
Introduction
When a record is retrieved from the database, Progress keeps track of the current
record position using an index cursor – a pointer to the record. This pointer is located
at the entry in the indextable with the key value used for retrieval.
Example
Here is an example:
Sample
do with 2 down:
find first Customer where Name begins "A".
display CustNum Name.
release Customer.
find next Customer.
down.
display CustNum Name.
end.
After the find first statement the index cursor is pointing at the first occurrence of
a Name starting with an “A” in the Name index. Next, the record buffer is released.
This is not necessary; it only demonstrates that a subsequent find statement is not
influenced by the release (see note 1).
On the find next statement that follows, the Name index is not used anymore; instead
the primary index CustNum is accessed. Progress first has to reposition the index
cursor in the CustNum index to the same rowid as it was positioned on in the Name
index. The find next advances the index cursor to the next rowid based on the
CustNum index and retrieves the pertaining record.
So, depending on the nature of the statement, Progress can not only go off in any
direction from the current index cursor location, but can also switch to another index
altogether.
1. Only the FIND CURRENT will fail after a release.
Index bracketing
Introduction
An index bracket is a set of consecutive entries in an index. A bracket is defined by
an index identifier, a low-key value, and a high-key value. All the index entries
starting with the low-key value and ending with the high-key value are included in the
bracket. A bracket scan is an operation which examines the index entries from the
low-key value to the high-key value. There are two types of brackets: equality brackets
and range brackets.
Note that this is the only way index bracketing can function. You have to specify
where you enter (low-key) and leave (high-key) the index. You cannot bracket on <>,
because it provides no entry point.
Range Brackets/Match
Sample
for each Customer where CustNum > 5 and CustNum < 10
In this example the index on CustNum is bracketed at the low-key value 6 and the
high-key value 9. This is also called a range match.
Range brackets are, if possible, used with any of the following operators:
● BEGINS
● > or GT
● < or LT
● >= or GE
● <= or LE
Notes:
● BEGINS is ranked higher than the other range qualifiers
● theoretically a CONTAINS with wildcards defines a range bracket. However,
CONTAINS is a bit of a special case: even with wildcards, it is considered
an equality match.
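
As a further illustration of a range bracket, here is a sketch using BEGINS. It assumes the sports2000 database with its index on the Name field of the Customer table. The bracket runs from the first to the last index entry whose Name starts with the given string.

```abl
/* Sketch: a range bracket on the Name index using BEGINS. */
for each Customer no-lock
    where Customer.Name begins "L":
    display Customer.CustNum Customer.Name.
end.
```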
Equality Brackets/Match
Sample
for each Customer where Name = "Lift Tours"
This is an equality bracket or equality match. In this case the low-key and the
high-key value are the same. However, if more than one record satisfies this filter,
the rowids at the low-key and high-key entries will not be the same.
Equality brackets are, if possible, used with the following operators:
● = or EQ
● CONTAINS, for example:
for each Customer where Comments contains "remark"
for each Customer where Comments contains "remark*"
Note again that the CONTAINS with wildcards is also present here.
WHOLE-INDEX
Progress tries to search for records as efficiently as possible, that is, it will use
index bracketing where possible. If no bracketing is possible, the entire index is
bracketed. This is called WHOLE-INDEX. It results in a full index scan, which means
that every index entry is read to retrieve the corresponding row. In effect, the entire
table is read.
WHOLE-INDEX – no problem
for each Customer
This is an example of a WHOLE-INDEX on CustNum. That is, every index entry and
every record in the Customer table is read, which is just what we want here.
WHOLE-INDEX – problems !
for each Customer where Balance <= 100
In this example things are different. Since there is no index on the Balance field,
Progress cannot bracket on it. And because there are no other index criteria in the
filter, Progress has to use the primary index CustNum again.
However, nothing can be deduced about the value of the Balance field from the sorting
of the CustNum index, so Progress cannot use a portion or bracket of the index.
Instead, the whole index on CustNum has to be bracketed and every record in the
Customer table is accessed to resolve this query.
No index bracketing can occur either when no selection criteria are specified to limit
the range of index keys searched or when no appropriate index is available to
optimize the selection criteria. In the latter case WHOLE-INDEX is an indication of
inefficient record retrieval.
Furthermore, in the case of a non-indexed field, Progress builds an index on the fly,
stores it in the srt file and has to scan the records again to fulfil the condition.
Retrieving Index Information
Introduction
So, using indexes can dramatically accelerate the retrieval of records. However, using
the wrong index or combination of indexes may perform no better than using no index
at all.
Imagine you want to find someone’s name. You know the city the person lives in, the
address and the phone number, but not the name. Using the old-fashioned phone book,
which is indexed on city and name, you would have to go through all the entries for
that city to retrieve the name.
When referencing data from the OpenEdge RDBMS, Progress almost always uses an
index. So, the point here is not “using an index”, but “using the correct index”. But
how do you determine whether the correct index is used? First of all, you need to be
able to retrieve information on index use. After that, the index rules that the
compiler uses are explained.
Compile-time Index Information using XREF
Introduction
When compiling an application, the compiler determines which indexes to use for
every retrieval of records in your code. This only applies to the following static
record retrieval statements:
● find
● for|preselect each
● open query
Note that, for statements depending on the dynamic retrieval of records, you cannot
rely on the compiler for the index information in the form it is displayed here,
because this information only becomes available at run-time.
XREF
XREF is an abbreviation of “Cross Reference”. This is information that can be
produced by the compiler and stored in a file. The file contains information on:
● the file being compiled
● the character tables used for compilation
● the indexes used
● the translatability of strings
● and more.
An XREF can be generated using the following statement:
Syntax
compile <fileName> xref <XREF fileName>.
Here is an example of some simple code using a connection to the sports2000
database:
sample
for each Customer by Phone
Here you can see a portion of the XREF file that was generated:
.\test.p .\test.p 1 COMPILE test.p
.\test.p .\test.p 1 CPINTERNAL ISO8859-1
.\test.p .\test.p 1 CPSTREAM ISO8859-1
.\test.p .\test.p 1 STRING "Customer" 8 NONE UNTRANSLATABLE
.\test.p .\test.p 1 SEARCH sports2000.Customer CustNum WHOLE-INDEX
.\test.p .\test.p 2 SORT-ACCESS sports2000.Customer Phone
.\test.p .\test.p 2 ACCESS sports2000.Customer CustNum
.\test.p .\test.p 2 ACCESS sports2000.Customer Phone
SEARCH
Look for the keyword SEARCH in the XREF file. It indicates that an index bracket
will be used. After the keyword you will find the logical databasename and the table
that was accessed, followed by the name of the index used in your static retrieval
statement. When multiple brackets and indexes are used for the same query, you will
see one search line for each bracket.
Multiple index equality bracketing
for each Customer where Name = "Lift Line Skying" and SalesRep = "BBB"
In this example a filter is used with two equality matches. This will instruct Progress
to use a bracket on the Name and the SalesRep index.
.\test.p .\test.p 1 CPINTERNAL ISO8859-1
.\test.p .\test.p 1 CPSTREAM ISO8859-1
.\test.p .\test.p 1 STRING "Customer" 8 NONE UNTRANSLATABLE
.\test.p .\test.p 1 SEARCH sports2000.Customer Name
.\test.p .\test.p 2 SEARCH sports2000.Customer Salesrep
...
SEARCH...WHOLE-INDEX
This indicates that a suitable bracket could not be constructed and an index scan of
the entire table will be performed using the index noted.
In the sort by Phone example above, the Phone field is not indexed in the sports2000
database, so Progress cannot use an index on it. Instead, it uses the primary index of
the Customer table, being CustNum. Since the order of the customer numbers is
completely independent of the order of the phone numbers, the entire CustNum index
has to be scanned.
SORT-ACCESS (see note 1)
When you find this in your cross-reference file you may have found a performance
bottleneck in your application.
SORT-ACCESS means that you tried to sort on a field without a suitable or
corresponding index, so Progress will use the entire (whole) primary index. Using this
index, every record has to be read, and the ROWID of the record and the value of the
sort field (Phone in the example) are sorted on the fly and written to a temporary
sort file, for example srta04504. After this is done, all the records are read again,
because that is what the FOR EACH requires, in the order that was written to the
sort file.
Remember that the sort file is on the client machine and will be deleted when the
client session ends, so with every new session, this process will have to be repeated.
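
The check for WHOLE-INDEX and SORT-ACCESS can also be scripted. The sketch below compiles a program with the XREF option and shows only the cross-reference lines containing one of these keywords. The file names test.p and test.xrf are examples, and the plain text search is an assumption about what is worth flagging, not a feature of the compiler.

```abl
/* Sketch: list the potentially expensive lines of an XREF file. */
define variable cLine as character no-undo.

compile test.p xref test.xrf.

input from value("test.xrf").
repeat:
    /* Read one line of the XREF file; the loop ends at end of file. */
    import unformatted cLine.
    if index(cLine, "WHOLE-INDEX") > 0
    or index(cLine, "SORT-ACCESS") > 0 then
        display cLine format "x(76)" with no-labels.
end.
input close.
```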
Compiletime index information tool – introduction
Included in this package is a file called xRefAnalysis.w.
This is a procedure that helps you analyse your program by generating and showing
cross reference information. Here is an example of its view.
1. From OpenEdge 11 SORT-ACCESS has been renamed to ACCESS.
Compiletime index information tool – workings
You can run it in Progress Version 9 and subsequent OpenEdge versions. When you do,
it checks if you already have a database connection. If not, you will be prompted to
make one using the default Connect database dialog.
If you need more database connections, just choose File ► Connect to database.
In the editor labeled Source code you can type your 4GL data retrieval statements or
open an existing file using File ►Open.
When you’re ready to generate the cross reference listing, choose File ► Generate
Cross Reference Listing from the menu bar, choose Generate Cross Reference Listing
from the editor’s context menu, or press the key combination Shift-F2. The
listing will be displayed in the browse below.
In the example, the code with the sort by Phone was analysed. The listing displays
the SEARCH and the SORT-ACCESS keywords. Because WHOLE-INDEX and SORT-ACCESS are
potential performance hazards, they get a red background to alert you.
Below is another example. In this case index bracketing and multiple indexes are
used. This is generally good for performance, so the references get a green
background.
Compiletime index information tool – additional features
The browse showing the listing has its own context menu. By default, the entire
contents of the cross reference are shown. You can choose, however, to limit the
information displayed by choosing one of the topics from this menu.
Note that, if you scroll through the browse, the label of the fourth column will change
depending on the information shown in the row you have selected. The label can
hold any of the following: Procedure Name, Page, CharString, Variable Name and
Table Name.
As you may have noticed, the program used temporary files like AAAAAAAA to store
the cross reference information. These files should be deleted when the program is
closed.
Run-time Index Information using INDEX-INFORMATION
Introduction
Progress can also generate index information during runtime. This can be used with a
dynamic or a static query:
● <query-handle>:QUERY-PREPARE(<predicate>)
● query <query-name>:QUERY-PREPARE(<predicate>)
Note that although this is runtime, it is still the Progress compiler that determines
which index to use. This is because the QUERY-PREPARE method is a request to
compile the query predicate.
Also note that there is no way to retrieve index information for the ABL dynamic find
methods FIND-FIRST, FIND-LAST, FIND-UNIQUE and FIND-CURRENT. Here the
same rules apply as for static find statements and you will need to use the index rules
later in this lesson or the xRefAnalysis program.
INDEX-INFORMATION (see note 1)
INDEX-INFORMATION is an attribute of the query object that returns a character
string consisting of a comma-separated list of the index or indexes the query uses for
(one of) its buffer(s). If the index or indexes do not have bracketing, the first entry in
the list is the string WHOLE-INDEX; the second entry is the name of the index.
Syntax
{<query-handle>|query <query-name>}:index-information(<buffer-number>).
Returns: [WHOLE-INDEX,]<indexname>
Buffer-number stands for the sequence number of the buffer in the query for which
index information is requested.
Here is an example of the code using the sort by Phone and the result shown in an alert-box:
Procedure: main block Program: test.p
define variable hQuery as handle no-undo.
define query qCustomer for Customer.
hQuery = query qCustomer:handle.
hQuery:query-prepare("for each Customer BY Phone").
message hQuery:index-information(1)
view-as alert-box title "BY Phone".
1. There is also an INDEX-INFORMATION method which is applied to a buffer handle. It
supplies information on which indexes are available for a particular buffer.
For example: BUFFER Customer:INDEX-INFORMATION(1)
returns: “CustNum,1,1,0,CustNum,0” (Ascending = 0, Descending = 1)
Syntax
{<buffer-handle>|buffer <buffer-name>}:index-information(<index-number>).
Returns csv-list: <indexname>,UNIQ,PRIM,WORD,{<field-name>,ASC|DESC[,...]}
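
To list all indexes of a buffer, the buffer method from note 1 can be called in a loop. The sketch below assumes a connection to the sports2000 database and relies on the method returning the unknown value once the index number exceeds the number of indexes. The variable names are made up for this illustration.

```abl
/* Sketch: show the index information of every index on Customer. */
define variable iIndex as integer   no-undo.
define variable cInfo  as character no-undo.

do while true:
    iIndex = iIndex + 1.
    cInfo  = buffer Customer:index-information(iIndex).
    if cInfo = ? then leave. /* no more indexes */
    message iIndex cInfo view-as alert-box.
end.
```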
Runtime index information tool – introduction
Also contained in this package is a tool to retrieve runtime index information. This
tool consists of the following procedures:
● wSelection.w – enables you to specify a table, a field list, a where clause and a
sort-by clause to be analysed.
● wPresentationDynamic.w – takes care of the display of the resulting records in
a dynamic browse. wSelection.w communicates with wPresentationDynamic.w
using named events.
● fetchData.p – is a generic procedure that uses a dynamic query, a dynamic record
buffer and a dynamic temp-table to retrieve any data from a single table.
Here is an example of its view
This is equivalent to: for each Customer fields(Phone Name) no-lock by Phone
Note that this program uses fieldlists and that the sort field
is placed in front of the list.
Also note that the runtime index information shows the
same as the compiletime version: the WHOLE-INDEX of
CustNum is used for this retrieval.
Note that this is also a good resource if you want to review
dynamic access to data using dynamic temp-tables,
record buffers and queries.
Run-time Index Information using LOG-MANAGER
Introduction to the Log Manager
OpenEdge contains a logging infrastructure that provides a mechanism for logging
run-time diagnostic information based on a set of logging characteristics.
Log-Manager – Workings
Three settings are important here:
● Destination. The logging information is written to a logfile on your file
system. Here is an example where the logging information is written to
C:\temp.log:
Syntax
log-manager:logfile-name = "C:\temp.log".
● Type of Information. There are many information types available, such as:
4GLMESSAGES, 4GLTRACE, 4GLTRANS, QryInfo, SAX and many more.
Here we are only interested in the information type “QryInfo”.
Syntax
log-manager:log-entry-types = "QryInfo".
When using the QryInfo type, you can only get information on for each or
preselect each based record access, whether in a separate blockheader or in a
static or dynamic query. For FIND statements no index information is
generated.
● Amount of Information. You can determine how detailed the information
recorded in the logfile should be. The level varies from None (0) to Extended (4).
Syntax
log-manager:logging-level = 0|1|2|3|4.
When using the QryInfo type, only logging-level 2 (basic) and 3 (verbose) are
valid. If you don’t set the logging level, the default will be basic.
There are several other ways to set the logging level.
Setting it together with the log-entry-type, using an optional parameter; in this
example, the level is set to verbose.
Syntax
log-manager:log-entry-types = "QryInfo:3".
Setting it on individual query handles
Syntax
hQuery:basic-logging = true.
Note that this last option can only be used on actual query objects, not on for eaches
in blockheaders.
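The settings above also apply to a for each in a block header, where no query handle is available. The following is a minimal sketch, assuming a connected sports2000 database and a writable C:\temp.log; close-log() is called at the end so that the file can be inspected right away.

```abl
/* Log query information at verbose level for everything that follows. */
log-manager:logfile-name    = "C:\temp.log".
log-manager:log-entry-types = "QryInfo:3".

/* A plain block header; no query handle is involved. */
for each Customer no-lock by Customer.Phone:
end.

/* Close the logfile so it can be opened in an editor. */
log-manager:close-log().
```

Because the logging level travels with the log-entry-type here, no separate assignment to logging-level is needed.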
Query information logging provides you with extended logging information that helps
you to locate and correct issues with queries that cause performance degradation. You
can direct the log-manager to accumulate query statistics for the specified query and
write them to the log file by using:
Syntax
hQuery:dump-logging-now(true|false).
This way you can gain insight into how long particular record retrievals take. The
logical parameter determines whether the statistics are reset after the dump.
It’s good practice to separate your dumps by some notification in the logfile.
Sample
log-manager:write-message("- - - - -","-- DUMP").
This will direct the log-manager to write this message to the logfile. The first
parameter is the message you want to pass; the second takes the place of the default
log-entry-type in the log line. An example is shown in the next section.
The Logfile – an example
Here is an example of a simple query, again using the sort by Phone. Note the
references to the log-manager handle.
Sample
define variable hQuery as handle no-undo.
define query qCustomer for Customer.
hQuery = query qCustomer:handle.
log-manager:logfile-name = "C:\temp.log".
log-manager:log-entry-types = "QryInfo".
log-manager:write-message("- - - - - - - - -","-- DUMP").
hQuery:query-prepare("FOR EACH Customer BY Phone").
hQuery:query-open().
hQuery:dump-logging-now(true).
Below is the partial output produced in temp.log. Note the customized entry
type -- DUMP, the prepare-string and the indexes used.
Use of the log-manager is built into the runtime index information tool discussed
earlier. Just check the corresponding togglebox. The log information will be
written to log-file C:\temp.log.
[yy/mm/dd@hh:mm:ss.msc+-GMT] processID ThreadID level entrytype message
---------------------------- -------- -------- - ------- --------------------------------------------------
[08/10/02@15:32:18.697+0200] P-004592 T-005072 1 4GL -- Logging level set to = 2
[08/10/02@15:32:18.697+0200] P-004592 T-005072 1 4GL -- Log entry types activated: QryInfo
[08/10/02@15:32:18.697+0200] P-004592 T-005072 1 4GL -- Log entry types activated: QryInfo
[08/10/02@15:32:18.697+0200] P-004592 T-005072 1 4GL -- DUMP - - - - - - - - -
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Plan: C:\OpenEdge\wrk\test.p line 11
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO QueryId: 0xd47c38
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Name: qCustomer
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Handle: 2265
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Type: Dynamically Opened Query
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO PREPARE-STRING: FOR EACH Customer BY Phone
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Prepared at Runtime
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Client Sort: Y
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Scrolling: Y
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Table: C:\OpenEdge\wrk\sports2000.Customer
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Indexes: WHOLE-INDEX,CustNum
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Statistics: C:\OpenEdge\wrk\test.p line -1
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO QueryId: 0xd47c38
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Name: qCustomer
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Handle: 2265
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Times prepared: 1
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Time to prepare (ms): 0
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO DB Blocks accessed to prepare:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO C:\OpenEdge\wrk\sports2000 : 6
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Times opened: 1
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Entries in result list: 1117
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Time to build result list (ms): 0
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO DB Blocks accessed to build result list:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO C:\OpenEdge\wrk\sports2000 : 2246
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO DB Reads to build result list:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Table: C:\OpenEdge\wrk\sports2000.Customer : 1117
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Index: Customer.CustNum : 1118
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO C:\OpenEdge\wrk\sports2000.Customer Table:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Records from server: 1117
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Useful: 1117
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Failed: 0
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Select By Client: N
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Fields: Phone
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Statistics: C:\OpenEdge\wrk\test.p line -1
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO QueryId: 0xd47c38
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Name: qCustomer
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Query Handle: 2265
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Used REPOSITION: N
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO DB Blocks accessed:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO C:\OpenEdge\wrk\sports2000 : 0
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO DB Reads:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Table: C:\OpenEdge\wrk\sports2000.Customer : 0
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO Index: Customer.CustNum : 0
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO C:\OpenEdge\wrk\sports2000.Customer Table:
[08/10/02@15:32:18.697+0200] P-004592 T-005072 2 4GL QRYINFO 4GL Records: 0
---------------------------- -------- -------- - ------- --------------------------------------------------
The Index Rulesets
Introduction
Progress uses a number of rulesets to determine which index or indexes are used
when data is retrieved from the database.
These rulesets all contain a number of conditions that have to be met in order for
Progress to be able to use a particular index or combination of indexes.
1. USE-INDEX ruleset
2. RECID ruleset
3. AND ruleset
4. OR ruleset
5. SINGLE-INDEX ruleset
The rulesets are tested sequentially, until the index is resolved.
Single and Multiple indexes
Since version 7, the Progress index algorithm can use more than one index if
particular conditions, which will be covered in the rulesets, are met. Multiple index
use, however, is only possible with the following FOR EACH based searches:
● for each|first|last <buffer-name> ...
● open query <query-name> for each <buffer-name> ...
● query <query-name>:query-prepare("for each <buffer-name>...")
If you use any of the following FIND based searches, only one single index can be
used for the search:
● find first|next|prev|last|current <buffer-name>
● buffer <buffer-name>:find-last()
● buffer <buffer-name>:find-first()
● buffer <buffer-name>:find-current()
● buffer <buffer-name>:find-unique()
● buffer <buffer-name>:find-by-rowid()
If you use the static FIND with first or last, it is better to use the FOR statement
with first or last (the FOR statement has no next or prev variant). This way you can
benefit from the performance gain when Progress uses multiple indexes, that is, if
you meet its conditions.
Note that the dynamic find methods have no dynamic FOR FIRST or FOR LAST
counterpart; static or dynamic queries always require a FOR EACH.
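To make the difference concrete, here is a minimal sketch of the same filter written both ways, assuming the sports2000 Customer table with its indexes on Name and SalesRep. Whether multiple indexes are actually used still depends on the rulesets described in the following sections.

```abl
/* FIND FIRST: Progress can use only a single index here. */
find first Customer no-lock
    where Customer.Name     = "Lift Tours"
      and Customer.SalesRep = "BBB" no-error.
if available Customer then
    message "FIND FIRST:" Customer.CustNum view-as alert-box.

/* FOR FIRST: same filter, but eligible for multiple index brackets. */
for first Customer no-lock
    where Customer.Name     = "Lift Tours"
      and Customer.SalesRep = "BBB":
    message "FOR FIRST:" Customer.CustNum view-as alert-box.
end.
```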
When Progress uses a single index to retrieve records, the rows are returned in the
sorted order of the index. When multiple indexes are used, the order is undetermined;
that is: determined by the order in which the index brackets are used. You could use
the BY option to enforce a particular order. However, this sorting would require
Progress to go through the resultlist once more.
The USE-INDEX Ruleset
Introduction
Specifying USE-INDEX overrides the index selection algorithm that Progress uses.
The effect being that your index is selected. You can specify only one index in the
USE-INDEX phrase.
Warning
You should be very careful with USE-INDEX. In fact, you should only use it when
you are certain that the Progress index mechanism has chosen an inefficient or
incorrect index. In order to make that judgement you will have to know the rulesets
Progress uses to determine which index bracket(s) is/are used. In addition, always
check your answer by comparing the number of records that are read using the
different indexes. The point is: if you think you're smarter than Progress, prove it!
Don't just blithely throw a USE-INDEX into your code; run tests and gather data about
the number of logical reads required to satisfy your query, and document the results
with comments in the relevant code to make them verifiable.
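One way to gather such data is to read the table statistics from the _TableStat virtual system table before and after the query. The following is only a sketch under several assumptions: the virtual system tables are enabled in the database, the -tablerangesize startup parameter covers the Order table, and no other session is reading the table at the same time (the counters are database-wide). The function name tableReads is hypothetical, and four-digit years are used so that the dates do not depend on the session's century setting.

```abl
/* Hypothetical helper: current read counter for one table (from _TableStat). */
function tableReads returns int64 (input pcTable as character):
    define buffer bFile for _File.
    define buffer bStat for _TableStat.

    find first bFile no-lock
        where bFile._File-Name = pcTable no-error.
    if not available bFile then return ?.

    find first bStat no-lock
        where bStat._TableStat-id = bFile._File-Number no-error.
    return (if available bStat then bStat._TableStat-read else ?).
end function.

define variable iBefore as int64 no-undo.

iBefore = tableReads("Order").

/* The query under test; add or remove USE-INDEX and compare the counts. */
for each Order no-lock
    where Order.OrderDate   > 1/1/2000
      and Order.OrderDate   < 1/1/2005
      and Order.OrderStatus = "Ordered":
end.

message "Order record reads:" tableReads("Order") - iBefore
    view-as alert-box.
```

Running the sketch once without and once with a USE-INDEX phrase gives two numbers that can be recorded in a comment next to the query.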
An Example
Let’s consider the next simple query from the Order table:
sample
for each Order where OrderDate > 1/1/00 and
OrderDate < 1/1/05 and OrderStatus = 'ordered'
If we enter this sample into the Compile-time Index Information Tool, discussed
earlier, the result is an index bracket on the OrderStatus index. It displays with a
green background, so everything should be OK.
The OrderStatus field, however, can only have one of the following values:
Ordered, Partially or Shipped. The point here is that, while we only want the records
with OrderStatus Ordered that fall within the date range, we will have to access all
records with OrderStatus Ordered, because that’s what our index bracket dictates. So,
although a bracketed index is used, it might still not be the most efficient one.
In order to find a more efficient index, we will need to know the number of records
that are read.
Database Access Information Tool – Introduction.
Included in this package is a small tool, dbAccess.w, that
supplies information on the number of records that are read
in the various tables.
If you run the program, you will be prompted to make a
connection to a database. This should of course be the same
database as the one you use to test your queries.
Database Access Information Tool – Workings
If we execute the sample code below in a different session and choose the Refresh
button in the Database Access Information Tool, this will produce 3030 record reads.
It doesn’t matter what the OrderDate values are; the
OrderStatus index dictates that there are 3030 records with
that OrderStatus value and they all need to be read.
sample
for each Order
where OrderDate > 1/1/00 and
OrderDate < 1/1/05 and
OrderStatus = 'ordered'
If we add the use-index phrase in this example and put the
index on OrderDate, then the number of record reads is reduced
to 1. If a period before 2000 is chosen, more record reads
will be necessary.
sample
for each Order
where OrderDate > 1/1/00 and
OrderDate < 1/1/05 and
OrderStatus = 'ordered'
use-index OrderDate
Usually WHOLE-INDEX is selected if USE-INDEX is used. Note, however, that if this
query is entered in the Compile-time Index Information Tool, it shows a bracketed use
of the OrderDate index. That’s because there are range-match references to the
OrderDate field in the WHERE clause. If there is more than one reference, as in the
example displayed here, they should be combined with an AND.
So, in the situation where index brackets are large because of a limited number of
valid values for the field concerned, use-index may provide better performance.
However, it all depends on the number of index entries that will be bracketed; for
example, when you use the value Partially or Shipped, 0 or 927 records respectively
are bracketed.
Here is another example of a sorting of customers by Name.
Progress chooses the bracket on the CustNum index.
sample
for each Customer
where CustNum > 1 by Name
In total 1116 records are selected by the bracket; that’s the entire Customer table
minus one. Next, this range has to be sorted by Name, so the records are read again,
generating a total of 2232 scanned records. So, although Progress has been as
efficient as possible in its data selection, it needs to sort all data before passing it on
to the client, and that makes the retrieval inefficient.
If we specify the use-index on Name, the WHOLE-INDEX
on Name is used, meaning a full table scan; so all 1117
records in the Customer table are accessed.
Sample
for each Customer
where CustNum > 1 use-index Name by Name
This is still fewer than 2232 records.
Again, the efficiency of the use-index phrase fully depends
on the number of records that are bracketed using the
CustNum index.
sample
for each Customer
where CustNum = 1 by Name
In this example the index bracket spans only one record, which of course doesn’t have
to be sorted; use-index Name would still have to access 1117 records.
The RECID Ruleset
Introduction
This ruleset uses the recid/rowid to directly access the pertaining record in the
database. Since the recid/rowid is a unique identifier for a record, a query using it will
always return one record, if any. For that reason there is nothing gained using a for
each based query; instead use the static find…where rowid|recid statement or the
dynamic find-by-rowid method.
Condition 1
► The reference to the recid/rowid needs to be an equality match.
If the example below is entered in the Compile-time Index Information Tool, it will
show the reference to the RECID. The same will happen if you use the rowid in your
filter. The Database Access Information Tool confirms that only one record is
accessed.
sample
find Customer where recid(Customer) = 97.
Note that you don’t have to specify first, next, prev or last.
This is only necessary if more than one record can be
retrieved.
Suppose however, you enter a range match in your filter,
like shown here:
Sample
find Customer where recid(Customer) < 98.
Because the Customer with recid 97 is the first record in
the Customer table, this will return exactly the same result
as the previous example. However, this is of course not useful at all; with the recid
you can retrieve only one record, not a range of records. The result is that Progress
cannot use the recid, is obliged to use the primary index CustNum and does a
complete table scan.
Note that you cannot use RECID in a USE-INDEX phrase, because it isn’t an index.
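The two recommended access forms can be sketched as follows, assuming the sports2000 Customer table. Instead of a hard-coded recid, the sketch first picks up the rowid of an existing record, which is an assumption made only to keep the example self-contained.

```abl
define variable rCustomer as rowid  no-undo.
define variable hBuffer   as handle no-undo.

/* Obtain a rowid once, here from the first Customer. */
find first Customer no-lock no-error.
if not available Customer then return.
rCustomer = rowid(Customer).

/* Static: an equality match on the rowid. */
find Customer no-lock where rowid(Customer) = rCustomer no-error.

/* Dynamic: the equivalent buffer-handle method. */
hBuffer = buffer Customer:handle.
if hBuffer:find-by-rowid(rCustomer, no-lock) then
    message hBuffer:buffer-field("CustNum"):buffer-value
        view-as alert-box title "find-by-rowid".
```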
Condition 2
► The reference to the recid/rowid is the only reference in the where clause
Because referencing the recid/rowid already returns one single record, if any, it is not
meaningful to combine it with another field using an AND. Worse still, the RECID
ruleset will not and cannot use any indexes in combination with the recid.
If you do use a reference to the recid/rowid combined with another field using an
AND, Progress will grab the recid and, for the purpose of index selection, simply
ignore the rest of the where clause.
Note that in the example below even a reference to the non-indexed field Phone using
a range match doesn’t have any influence on the index algorithm whatsoever.
sample
find Customer where recid(Customer) = 97 and Phone begins '0'.
If you combine the recid/rowid reference with another field using an OR, Progress
will completely ignore the where clause and use the entire primary index of the table.
Consider the example below where both conditions in the where clause will return the
first Customer record; Progress however, notices the OR and decides the entire
Customer table should be scanned.
sample
find Customer where recid(Customer) = 97 or CustNum = 1.
The AND Ruleset – the multiple indexes capacity
Introduction
The AND Ruleset is the first ruleset by which Progress can use multiple indexes if all
the components of each index are involved in equality matches and the indexes are
not unique. Let’s go into more detail.
First of all, the AND operator is used and the fields used in the where clause are
indexed. Next Progress determines what’s on both sides of the operator and applies
the following rules:
Condition 1
► Only equality matches are allowed.
In the example below the Name and SalesRep fields are both indexed and referenced
using an equality match; as a result brackets are put on both indexes. At runtime,
Progress will make 2 result lists of the matches, merge them into one on the basis of
the entries that appear in both lists and send them to the client.
sample
for each Customer where Name = 'Lift Tours' and SalesRep = 'BBB'
Here is another example. Note that the begins operator is used, which is a range
match. Progress cannot apply the AND ruleset and chooses a bracket on the
SalesRep index, because this is the index with the most equality matches (see
SINGLE INDEX ruleset, condition 3).
Sample
for each Customer where Name begins 'Lift' and SalesRep = 'BBB'
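Both samples can be checked at run time with the INDEX-INFORMATION attribute discussed earlier. The following is a minimal sketch, assuming a connected sports2000 database; the variable names are illustrative only.

```abl
define variable hQuery as handle    no-undo.
define variable cWhere as character no-undo extent 2.
define variable i      as integer   no-undo.

assign
    cWhere[1] = "Customer.Name = 'Lift Tours' and Customer.SalesRep = 'BBB'"
    cWhere[2] = "Customer.Name begins 'Lift' and Customer.SalesRep = 'BBB'".

create query hQuery.
hQuery:set-buffers(buffer Customer:handle).

/* Prepare each variant and show which index or indexes Progress selects. */
do i = 1 to extent(cWhere):
    hQuery:query-prepare("for each Customer no-lock where " + cWhere[i]).
    message cWhere[i] skip hQuery:index-information(1)
        view-as alert-box title "Indexes used".
end.

delete object hQuery.
```

If the AND ruleset applies, the list for the first variant should name more than one index, while the second variant should name a single index.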
Condition 2
► All components (of multi-component indexes) are present.
Progress needs references to all fields in a multi-component index to be able to
resolve it into an effective index bracket.
In this example a reference is made to the PostalCode and Country fields. These
fields constitute the multi-component index CountryPost. Since there are no more
fields in this index, it can be bracketed by Progress. Of course only equality matches
are allowed.
Sample
for each Customer where SalesRep = "BBB" and
PostalCode = '1234AB' and
Country = "Netherlands"
Note that the sequence in which the PostalCode and Country fields are referenced, is
in reverse, compared to their sequence in the CountryPost index. This has no
influence on the way Progress selects its indexes.
In the example below the Name and SalesRep fields are both indexed and referenced
using an equality match; as a result brackets are put on both indexes. There is also a
reference to the Country field, but not to the PostalCode field, so condition 2 does not
apply.
However, this only excludes the CountryPost index from being bracketed; the indexes
on Name and SalesRep are still being bracketed.
sample
for each Customer where Name = 'Lift Tours' and
SalesRep = 'BBB' and Country = "Netherlands"
Condition 3
► Only non-unique indexes are allowed.
Ask yourself: what’s the point of using multiple indexes if a unique index is present?
A unique index will resolve to an index bracket spanning one index entry.
The example below shows the effect of the presence of the
unique index CustNum; if any one of the expressions fails to
match any of the conditions, only one bracket will be used,
and the Single Index ruleset (see condition 2 of that ruleset)
will apply.
Sample
for each Customer where SalesRep = "BBB" and
Name = "Lift Tours" and
CustNum = 5
Conclusion
In order to use the AND ruleset, all components of non unique indexes should be
combined using equality matches. Only then can Progress construct index brackets. If
not, Progress tries to apply the OR ruleset.
The OR Ruleset – the multiple indexes capacity
Introduction
The OR Ruleset is the second ruleset by which Progress can use multiple indexes if
each expression has matches on the leading index component.
Of course, the OR operator is used and the fields used in the where clause should be
indexed. The OR ruleset is not as strict as the AND ruleset. For example, the OR
ruleset is not only restricted to equality matches; it also allows range matches.
Furthermore, references to unique fields are no problem.
In this example there are 3 range matches of which one is a reference to a unique
index. Still, Progress can put index brackets around each of them. At runtime,
Progress will make 3 result lists of the matches, merge them into one, removing the
duplicates and send them to the client.
sample
for each Customer where Name begins 'Lift' or
SalesRep begins 'BBB' or
CustNum < 5
In fact there is only one condition you have to be aware of when using the OR ruleset.
However, that condition is harsh; if it is not met, all field references on either side of
the OR operator are rendered inactive, meaning that they cannot be used for selection
of an index or index bracket. The result is a selection of a bracket on the entire primary
index.
The logic of this condition is that if one component in the where clause cannot be
used, Progress can never be certain if that condition would include records into the
record set or not, so it is obliged to do an entire table scan to be sure.
Condition 1
► Only the leading component (of multi-component indexes) is present.
Here is an example of this condition. Country is the first (leading) component of
the multi-component index CountryPost. Therefore the OR ruleset can be applied.
sample
for each Customer where Name begins 'Lift' or
Name begins 'Urpon' or
CustNum < 5 or
Country begins "Nether"
Note that the index on Name has 2 separate brackets.
Here is another example, but now with the second component of the CountryPost
index instead of the leading one. The result is that Progress can’t use any of the
indexes and needs to do a complete table scan based on the entire primary index
CustNum. Why is this? Why does Progress throw away the rest of the indexes and
start anew with the primary index, which is unrelated to the query in question?
sample
for each Customer where Name begins 'Lift' or
SalesRep begins 'BBB' or
CustNum < 5 or
PostalCode begins "1234"
Suppose Progress would retain the index brackets on Name, SalesRep and CustNum;
what good would that do? Because no bracket can be put on the PostalCode field, it
acts the same as a non-indexed field. There is no other way to locate the records with
this PostalCode value than by performing a full table scan.
Compare it to searching for all the words in a dictionary that contain “A” as their
second letter. You will have to go through the entire dictionary to locate these words,
and in fact it doesn’t matter how you do it; you can start at the beginning, in the
middle or work your way back from the end; whatever search method you choose,
they are equally inefficient.
So, if any of the expressions fails to generate an index bracket, none of the indexed
fields from the where clause can be bracketed, even if the AND ruleset was
previously successfully applied, and a bracket will be formed surrounding the entire
primary index.
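The effect of a single non-leading component can be observed at run time. The sketch below, which assumes a connected sports2000 database, prepares one query that references the leading component Country and one that references the second component PostalCode, and shows the index information for each. The procedure name showIndexes is hypothetical.

```abl
/* Leading component of CountryPost: the OR ruleset is expected to apply. */
run showIndexes ("Customer.Name begins 'Lift' or Customer.CustNum < 5 "
               + "or Customer.Country begins 'Nether'").

/* Second component of CountryPost: a whole-index scan is expected. */
run showIndexes ("Customer.Name begins 'Lift' or Customer.CustNum < 5 "
               + "or Customer.PostalCode begins '1234'").

/* Hypothetical helper: prepare a dynamic query and show its indexes. */
procedure showIndexes:
    define input parameter pcWhere as character no-undo.

    define variable hQuery as handle no-undo.

    create query hQuery.
    hQuery:set-buffers(buffer Customer:handle).
    if hQuery:query-prepare("for each Customer no-lock where " + pcWhere) then
        message pcWhere skip hQuery:index-information(1)
            view-as alert-box title "OR ruleset".
    delete object hQuery.
end procedure.
```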
In the example below both fields from the CountryPost index are used. If it were the
AND ruleset, this would be necessary to generate a bracket on CountryPost. In the
OR ruleset, it makes the bracket impossible to construct.
sample
for each Customer where Country begins "Nether" or
PostalCode begins "1234"
Because of the OR operator, Progress would be required to make 2 separate index
brackets on 1 index. Because the Country bracket has different entries than the
PostalCode bracket, this is simply not possible.
In the AND ruleset, Progress needed only 1 bracket because the bracket that was
made by the second field in the composite index was, because of the AND,
necessarily a subset of the bracket made on the first field.
The Single Index Ruleset
Introduction
If everything else has failed so far, that is, if no index brackets were formed, the
Single Index Ruleset is applied. The first condition is already an odd one out.
Condition 1 – CONTAINS
► If there is a CONTAINS, use the word-index.
This means in fact: a word-indexed field is always selected, no matter what else is in
the where clause.
Here is an example of the resilience of this condition. Remember from the OR ruleset
that the reference to PostalCode, the second component of the CountryPost index,
forced Progress to use the entire primary index. Note here that, even in that case, the
word-indexed field Comments still receives an index bracket.
sample
for each Customer where Name begins 'Lift' or
SalesRep begins 'BBB' or
CustNum < 5 or
PostalCode begins "1234" or
Comments contains "remark*"
Now let’s take a look at another example. The only change is that the last OR was
changed into an AND.
sample
for each Customer where Name begins 'Lift' or
SalesRep begins 'BBB' or
CustNum < 5 or
PostalCode begins "1234" and
Comments contains "remark*"
The OR ruleset is now applied successfully to the first 3 expressions. The inability to
put a bracket on CountryPost is not an issue anymore. This is because the
PostalCode begins "1234" now belongs to the AND not to the OR anymore.
When it belonged to the OR, Progress had no way of knowing which records were
concerned. Now that it belongs to the AND, Progress can fetch the records using the
index bracket on Comments and checks afterwards if the retrieved records also
satisfy the condition specified for PostalCode. The point here is that this check is not
done while retrieving records, but after the records are retrieved.
Notes:
● The CONTAINS can only be used with FOR EACH based queries, not with
FIND based searches.
● The CONTAINS cannot be used in combination with USE-INDEX.
Condition 2 – Equality Match & Unique
► Use the unique index if all of its components are used in equality matches.
This condition can be considered a compensation or, if you will, extension of the
AND ruleset, which can be summarized as:
● Use the non-unique index if all of its components are used in equality
matches.
Here is the same example, as was used with condition 3 of the AND ruleset.
sample
for each Customer where SalesRep = "BBB" and
Name = "Lift Tours" and
CustNum = 5
CustNum is the only component of the unique and primary index CustNum and will
by definition return 0 or 1 records.
Condition 3 – Equality Match
This condition consists of several subconditions:
► use the index with the most active equality matches
► at least the leading component is referenced
► the AND is used to join the field references, OR is not allowed
Here is an example of this condition, focusing on the first subcondition. The AND
ruleset does not apply because of the range match in the SalesRep reference, so
Progress gives precedence to the indexed field that is most frequently referred to with
an equality match, in this case the Name index.
sample
for each Customer where SalesRep begins "B" and
Name = "Lift Tours"
More frequently, however, you will see this rule applied to multi-component
indexes. That’s where subconditions 2 and 3 come in. An example is
displayed here.
sample
for each Customer where Discount = 10 and
Balance = 20 and
Name = "Lift Tours"
The index DiscBalCredL is such a multi-component index consisting of the fields
Discount, Balance and CreditLimit. Note that you need the xtraIdx.df file to load this
index into the sports2000 database.
Remember that the AND Ruleset required that all components of a multi-component
index should be present, while the OR Ruleset could be applied if only the first
component was referenced.
In this case the field references are joined with an AND and the first component of
the index is present. The second component is also referenced, the third component
isn’t. The rule is that the first component is required, while the other components can
be referenced, but don’t have to be. Furthermore, the order in which the fields are
referenced is of no importance; the first component can be the first one, but doesn’t
have to be.
The index DiscBalCredL is chosen because two of its components are referenced.
If Progress needs to make a choice between a multi-component index that has only
one component in the where clause and a single-component index that has its only
component in the clause, the latter wins.
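One way to check which index Progress selects for the two samples of this condition is
to prepare them as dynamic queries and read the INDEX-INFORMATION attribute. The
following is a minimal sketch, assuming a connected sports2000 database with the
DiscBalCredL index loaded; the procedure name showIndex is made up for illustration.

```abl
/* Sketch: report the index(es) selected for a where clause on Customer. */
/* Assumes sports2000 is connected; showIndex is a hypothetical helper.  */
procedure showIndex:
    define input parameter pcWhere as character no-undo.

    define variable hQuery as handle no-undo.

    create query hQuery.
    hQuery:set-buffers(buffer Customer:handle).

    /* INDEX-INFORMATION can be read once the query has been prepared */
    if hQuery:query-prepare("for each Customer no-lock where " + pcWhere) then
        message pcWhere skip hQuery:index-information(1)
            view-as alert-box information.

    delete object hQuery.
end procedure.

run showIndex ('SalesRep begins "B" and Name = "Lift Tours"').
run showIndex ('Discount = 10 and Balance = 20 and Name = "Lift Tours"').
```

The argument 1 refers to the first (and here only) buffer of the query. The value shown
is a list of index names, prefixed with WHOLE-INDEX when no bracket could be built.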
Condition 4 – Range Match
This condition consists of several subconditions:
► use the index with the most range matches
► at least the leading component is referenced
► the AND is used to join the field references, OR is not allowed
In fact the only difference with the previous condition is the fact that this condition
handles range matches, instead of equality matches.
Here is a plain example. The index DiscBalCredL has two active range matches,
which is more than Name has. Note again that at least the first component has to be
present, but doesn’t have to be the first one referenced.
sample
for each Customer where Balance > 20 and
Discount > 10 and
Name > "Lift Tours"
Here is another example. The index on SalesRep is used because it has the most
active range matches. Remember that the BEGINS operator takes precedence over the
other range matches.
sample
for each Customer where SalesRep begins "B" and
Name > "Lift Tours"
This last example shows a combination of conditions 3 and 4. Note that Name has an
equality match, while the references to the DiscBalCredL index consist of an equality
match and a range match. Because the number of equality matches for Name and
DiscBalCredL is equal, Progress cannot decide which one to use. However, applying
condition 4 of the Single Index ruleset, Progress finds that DiscBalCredL, through
Balance, has the most range matches, and therefore it is chosen.
sample
for each Customer where Balance > 20 and
Discount = 10 and
Name = "Lift Tours"
Condition 5 – Sort Match
A sort match occurs when a field is used for sorting purposes; that is, it is used with
the BY option.
► use the index with the most sort matches
► at least the leading component is referenced
Here is an example where the results are sorted by Country.
sample
for each Customer where Country > "USA" and
Name > "a"
by Country by PostalCode
Note that the AND ruleset doesn’t apply here, because the references are not equality
matches and PostalCode is not referenced in the where clause. Progress
counts the number of range matches and can’t decide between Country and Name, but
because Country and PostalCode are two references to the CountryPost index in the sort
clause, that index is chosen.
Beware of using the sort match in combination with the OR operator, as in this
example. Although the OR ruleset allows more than one index bracket, there may be a
dramatic loss in performance.
sample
for each Customer where Country > "USA" or
Name > "a"
by Country by PostalCode
In the first example, because of the AND operator, only the records that satisfy the
Country condition are read. Using this set, Progress determines if the Name condition
applies. The resulting subset is already sorted on Country but needs to be sorted on
PostalCode.
In the second example, because of the OR operator, a second bracket is needed for the
Name condition. These ‘Name’ records are not sorted yet, so they need to be accessed
twice.
This leads to a dramatic loss in performance. If the time needed by the second query
is set to 100%, the first query does the job in only 6% of that time.
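To measure the difference on your own data, both queries can be timed with the ETIME
function. This is only a rough sketch: it assumes a connected sports2000 database, the
variable names are illustrative, and on a table as small as the standard Customer table
the difference may be hard to see. Run it more than once, because the first pass also
pays for loading the records into the buffer pool.

```abl
/* Sketch: compare the elapsed time of the AND and the OR variant. */
define variable iCount as integer no-undo.
define variable iAndMs as integer no-undo.
define variable iOrMs  as integer no-undo.

etime(yes).    /* reset the millisecond timer */
for each Customer no-lock
    where Customer.Country > "USA" and Customer.Name > "a"
    by Customer.Country by Customer.PostalCode:
    iCount = iCount + 1.
end.
iAndMs = etime.

etime(yes).
for each Customer no-lock
    where Customer.Country > "USA" or Customer.Name > "a"
    by Customer.Country by Customer.PostalCode:
    iCount = iCount + 1.
end.
iOrMs = etime.

message "AND:" iAndMs "ms" skip
        "OR: " iOrMs  "ms"
    view-as alert-box information.
```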
Condition 6 – Alphabet
► use the index that alphabetically comes first
► at least the leading component is referenced
Progress applies this condition when the number of references to each of the candidate
indexes is the same and no decision can be made based on the type of match.
In this example the AND ruleset cannot be applied because there are no
equality matches. The number of range matches (Single index, condition 4) is also
the same. Since there is no way for Progress to make a logical index selection, it
makes an arbitrary choice and selects the index that comes first alphabetically. Name
comes before SalesRep, so Name is selected.
sample
for each Customer where SalesRep begins "B" and
Name begins "Lift Tours"
In this example both referenced fields are the leading components of the indexes
CountryPost and DiscBalCredL. The former comes first alphabetically and is selected.
sample
for each Customer where Country > "US" and
Discount < 200
Condition 7 – Primary
► use the primary index
This is the last-resort condition for Progress in making an index selection. The primary
index is chosen if none of the previous rulesets and their conditions can be applied.
Usually, when this condition is applied, the entire primary index is bracketed and
your query will perform poorly.
Here is an example. In this case you try to filter on a field that is not indexed.
Progress always uses an index and, having no other options, will choose the
primary index.
sample
for each Customer where Phone begins "+316"
This is another example, which in fact suffers from the same problem. Although both
references are components of an index, neither is the first component. The effect is
that Progress has no starting point in the multi-component indexes.
sample
for each Customer where PostalCode = "1234AB" and
Balance = 2000
In this example, the AND ruleset is applied because of the equality matches on Name
and SalesRep. However, the OR ruleset cannot be applied, so the selected index
brackets on Name and SalesRep are rendered useless; Progress has to scan the entire
table to search for the records where City, which is not an indexed field, is equal to
“Waregem”.
sample
for each Customer where Name = 'Lift Tours' and
SalesRep = 'BBB' or
City = "Waregem"
Best Practices
Introduction
Indexes can greatly improve the performance of your queries, but they can also be
very expensive. For every record you create or update, the indexes involved
have to be rewritten. This causes index fragmentation, which in turn can lower the
performance of your queries.
So, for a table with intensive update activity and limited reads, the fewer indexes you
use, the faster your writes will be. The more this ratio reverses, the more indexes you
can use.
The following sections discuss guidelines for:
● designing indexes
● using indexes
Index design
Introduction
As a general rule: only put indexes on the fields you frequently use for filtering.
Multi-component indexes
A multi-component index performs better than a separate index on each
component. Constructing brackets on one index is faster than using brackets on
separate indexes. Furthermore, when using the AND operator between the components,
Progress can keep on using the same index, making its bracket progressively
smaller.
From the index rules you may have gathered that the main requirement to be fulfilled
here is the presence of the first component of the multi-component index in your
where clause. This is necessary, because an index only works if it knows where to
start. If that requirement can’t be met, then there is no point in defining the index.
Using the OR operator works against multi-component indexes, because
Progress cannot put two separate index brackets on these indexes.
Word indexes
A variation on this theme is the use of a word index, instead of several separate
indexes on fields that contain only a small number of distinct values.
Examples here could be fields whose values serve as status indicators, or logical
fields. The point here is that the less variation there is in the values of a field, the
larger your index brackets will be, and therefore the less efficient. Consider combining
these distinct values in a word-indexed character field.
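As an illustration, a query on such a combined field could look like the sketch below.
It assumes the sports2000 Customer table, where the Comments field carries a word
index, as a stand-in for a word-indexed status field; the words "credit" and "hold" are
example values only.

```abl
/* Sketch: one bracket on a word index instead of several low-variation indexes. */
/* Customer.Comments stands in for a combined, word-indexed status field.        */
for each Customer no-lock
    where Customer.Comments contains "credit & hold":
    display Customer.CustNum Customer.Name.
end.
```

The ampersand inside the CONTAINS string asks for records that hold both words, so
several status values can be tested through the same word index.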
Index use
Introduction
When retrieving data, and therefore using indexes, there are some do’s and don’ts you
should consider.
The starting point
An index always needs a reference that tells it where to start.
Here are some examples that only tell Progress which index entry NOT to use, so the
index has no starting point.
sample
for each Customer where Name NE "Lift Tours"
for each Customer where Name <> "Lift Tours"
for each Customer where not Name = "Lift Tours"
You could rephrase this query using:
sample
for each Customer where Name < "Lift Tours" or Name > "Lift Tours"
However, this won’t make much of a difference. It’s like saying: find all references in
a book to all topics, except for topic x. You could do this by reading the entire book,
which is equivalent to using the WHOLE-INDEX on CustNum or you could scan the
index of the book and read all referenced pages, except for the ones referenced by
topic x, which is equivalent to using the bracket on the Name index. Either way, there
is no efficiency to be gained.
So the only solution here is to rephrase your filter.
Note that this is also the reason why a CONTAINS search term cannot
start with a wildcard. The CONTAINS operator was introduced because the
MATCHES operator always uses the entire primary index.
Using the OR operator
● Avoid using an expression that Progress cannot fully satisfy using index brackets.
This has come up for discussion several times already. If any of the expressions in
your where clause on either side of the OR operator cannot be resolved by an index
bracket, none of the indexes pertaining to the referenced fields can be used. As a
result Progress will have to scan the entire table. This occurs for example when using
a non-indexed field or when you do not reference the first component of a
multicomponent index.
Run-time issues
● Don’t use expressions referencing a fieldname.
These expressions can only be evaluated at runtime, so the compiler has no way of
knowing which index to use and therefore uses the entire primary index (Single index
ruleset, condition 7). This also applies to dynamic queries, because the
QUERY-PREPARE method is no more than a compilation of the dynamic where clause.
sample
for each Customer where substring(Name,1,1) = 'a'
for each Customer where (if true then Name else City) begins 'a'
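In many cases such an expression can be rewritten so that the bare field is compared
directly. The sketch below, assuming the sports2000 Customer table and its Name index,
shows the first sample next to a rewrite that selects the same customers.

```abl
/* Not bracketed: the function hides the field from the index selection. */
for each Customer no-lock
    where substring(Customer.Name, 1, 1) = "a":
    display Customer.Name.
end.

/* Same filter on the bare field: Progress can bracket on the Name index. */
for each Customer no-lock
    where Customer.Name begins "a":
    display Customer.Name.
end.
```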
The order of execution
When the compiler examines the where clause, the order in which the expressions
appear has no influence on the selection of indexes, as long as you don’t
change the operators.
You should know however, that:
● the AND operator takes precedence over the OR operator
● the BEGINS operator takes precedence over the other range operators.
Here is an example:
sample
for each Customer where Name = "Lift Tours" and
Country = "USA" or
Country = "Nederland"
This is in fact the same as:
sample
for each Customer where (Name = "Lift Tours" and
Country = "USA") or
Country = "Nederland"
As you might have expected, the order of execution can be influenced by using
parentheses. Here is an example:
sample
for each Customer where Name = "Lift Tours" and
(Country = "USA" or
Country = "Nederland")
Normally the AND operator would take precedence over the OR, but the parentheses
override this. So, first the expression between the parentheses is evaluated and the OR
ruleset is applied. As a result, two index brackets on the CountryPost index are
selected. Next, because of the AND, the AND ruleset is applied, but it fails. This means
that the index brackets selected by the OR ruleset are discarded and
cannot be used anymore. Finally, the conditions of the
Single Index ruleset are evaluated and condition 3 is applied to Name.
Although Country has more equality matches than Name, they are joined with OR,
which condition 3 does not allow. Therefore Name is selected as having the most equality
matches.
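For a static query like this one, the outcome can be checked at compile time with the
XREF option of the COMPILE statement. The sketch below assumes the query has been
saved in a procedure file; the file names orderTest.p and orderTest.xrf are
hypothetical.

```abl
/* Sketch: compile a procedure holding the static query and show its SEARCH lines. */
/* orderTest.p and orderTest.xrf are hypothetical file names.                      */
define variable cLine as character no-undo.

compile value("orderTest.p") xref value("orderTest.xrf").

input from value("orderTest.xrf").
repeat:
    import unformatted cLine.
    /* A SEARCH line names the table and the selected index; */
    /* WHOLE-INDEX at the end of the line flags a full scan. */
    if index(cLine, " SEARCH ") > 0 then
        message cLine view-as alert-box information.
end.
input close.
```

Compiling the variant with and the variant without parentheses in turn makes the
effect of the parentheses on the index selection visible.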
Summary Index Rulesets
The Rulesets compared
Ruleset Name                  Conditions                            Multiple indexes possible
USE-INDEX ruleset             o use only the index specified
RECID ruleset                 o only an equality match
                              o only recid/rowid reference
AND-ruleset                   o only equality matches               √
                              o only non-unique fields
                              o all components
OR-ruleset                    o only leading component              √
SINGLE-INDEX ruleset
  1. Contains                 o use the word index                  √
  2. Equality match & Unique  o only equality matches
                              o only unique fields
                              o all components
  3. Equality match           o count equality matches
                              o at least leading component
                              o only AND
  4. Range match              o count range matches
                              o at least leading component
                              o only AND
  5. Sort match               o count sort matches
                              o at least leading component
                              o preferably AND
  6. Alphabet                 o use first (alphabetically) index
                              o at least leading component
  7. Primary                  o use primary index