FileMaker Konferenz 2010
ESS Hinter den Kulissen (ESS Under the Hood)
Galt Johnson
FileMaker, Inc., Santa Clara, California
Curriculum Vitæ
Red Brick Systems (Ralph Kimball)
• Designed logical SQL compiler
• Materialized view support
IBM DB2 Alphablox
• Architect (Data Stack)
• Multi-dimensional database support (Essbase, MSAS)
• MDX compiler and execution engine
FileMaker
• External SQL Sources
• FQL Engine
Agenda
• What is ESS?
• SQL Database Intro
• Basic ESS Architecture
• ESS from the Inside
• Performance Considerations
• Best Practices
What is ESS?
• External SQL Source (ODBC Data Source)
• Gives FileMaker applications access to SQL data sources
• Hides almost all of the SQL “ugliness” from FileMaker developers
• Provides FileMaker developers with a familiar framework in which to develop their applications
Design Goals
• Goals
• Easy to use
• Provide FileMaker-like functionality on top of SQL data sources. ESS tables are just like FileMaker tables.
• Good performance and scalability
• Anti-Goals
• Not a SQL reporting tool
• Not for processing large amounts of data
SQL Basic Terminology
FileMaker    SQL
Table        Table
Record       Row
Field        Column
Index        Index
SQL Terminology
Key
• Column or set of columns whose values uniquely identify all rows in a table
Examples
• SKU in a product table
• Order Id and Line Item Number in an “Order Details” table
• Customer ID in a customer table
But not
• Last name in an “Employee” table
• Time of day in an “Orders” table
• ProductID in an “Inventory History” table
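The definition of a key can be checked mechanically against sample data. The sketch below is only an illustration: the function name `is_key` and the sample "Order Details" rows are invented here, and passing the check on a sample does not prove that a column set is a key for every row the table may ever hold.

```python
from typing import Iterable, Mapping, Sequence


def is_key(rows: Iterable[Mapping[str, object]], columns: Sequence[str]) -> bool:
    """Return True if the given columns uniquely identify every row."""
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value in seen:
            return False  # two rows share the same value, so this is not a key
        seen.add(value)
    return True


# Hypothetical "Order Details" rows, invented for illustration.
order_details = [
    {"order_id": 1, "line_no": 1, "sku": "A-100"},
    {"order_id": 1, "line_no": 2, "sku": "B-200"},
    {"order_id": 2, "line_no": 1, "sku": "A-100"},
]

print(is_key(order_details, ["order_id", "line_no"]))  # True
print(is_key(order_details, ["sku"]))                  # False
```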
Primary Keys
• Primary Key
• The key the SQL DB uses to identify a row uniquely
• Surrogate
• Not related to the data itself (e.g. auto-increment column, GUIDs)
• Often called ID (Customer ID, Shipment ID, etc.)
• Natural
• Part of the data
• State and City names
• SKU
• UPC
More Terminology
• ODBC
• A technology developed by Microsoft.
• Programming API
• SQL standardization
• De facto industry standard
• FileMaker can access ODBC data sources:
• Microsoft SQL Server 2000, 2005 and 2008*
• Oracle 9i, 10g and 11g*
• MySQL Community Edition 5.0 and 5.1*
* New to FileMaker 10
ESS Architecture - Client
[Diagram. Components shown: FileMaker Pro 10.0, ESS, ODBC, SQL Database; part of the chain is labeled "Third Party"]
ESS Architecture - Server
[Diagram with Client and Host sides. Components shown: FileMaker Pro 10, FileMaker Server 10, ESS, ODBC Driver, SQL Database]
FileMaker ESS Tables
• Treated similarly to external table references
• Shadow fields are created for SQL columns
• Primary key and/or unique columns are automatically discovered
• Unsupported types are not available
• VARBINARY, BLOB, etc.
• INTERVAL types
• Some validations are automatically imported
• Auto-increment (SQL Server and MySQL)
• Text field lengths
• But, ESS tables appear in local table list as shadow tables.
ESS Terminology
• Shadow table: references an ODBC data source
• Shadow fields
• Supplemental fields
• Reload shadow fields
FileMaker ESS Tables – Additional Options
• Some aspects of shadow fields can be modified
• Auto-Enter values
• Some validations (range, value list, calc)
• Some cannot be modified
• Data type is strictly enforced (e.g. no strings in number columns)
• Some range validations and data integrity rules are enforced by the SQL DB.
• Additional fields can be added
• Summary fields
• Calculated fields
• Cannot be stored or indexed
FileMaker ESS Tables – Additional Options
• Shadow fields may be removed
• FileMaker will ignore those fields in the SQL table
• The underlying SQL table is not altered in any way
• Supplemental fields
• Calculated fields based on values in shadow fields or other calculated fields
• Never stored or indexed
• Summary fields
• Value Lists
• FileMaker Pro v10 and later clients only
• The ESS field may appear as either the primary or secondary field, or both.
SQL to FileMaker Data Type Mapping
FileMaker                          SQL
Text                               CHAR, VARCHAR, TEXT, CLOB and UNICODE equivalents
Number                             INTEGER, DECIMAL, NUMBER, REAL, FLOAT, DOUBLE PRECISION, etc.
Date                               DATE, TIMESTAMP, DATETIME
Time                               TIME, TIMESTAMP, DATETIME
Timestamp                          TIMESTAMP, DATETIME, SMALLDATETIME
Container, Calculation, Summary    N/A
MSSQL Server DATETIME Columns
MSSQL Server DATETIME column*
• You may override its type to DATE
• FileMaker assumes all data in the DATETIME column has a time of 12:00AM (midnight).
• This is SQL Server’s default behavior for DATETIME values with no time (e.g. ‘2009-08-16’ implies midnight on that date).
• You may override its type to TIME
• FileMaker assumes all data in the DATETIME column has a date of January 1, 1900
• This is SQL Server’s default behavior for DATETIME values with no date (e.g. ‘13:27’ implies a date of 1900-01-01).
• This is only an option. You can always leave DATETIME columns as Timestamp fields.
* New to FileMaker 10
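The two override rules can be sketched in a few lines. This models only the defaults stated on the slide (midnight for a missing time, 1900-01-01 for a missing date); the function names are invented for illustration and this is not FileMaker's or SQL Server's actual conversion code.

```python
from datetime import date, datetime, time

# Defaults described on the slide for SQL Server DATETIME values.
DEFAULT_DATE = date(1900, 1, 1)  # used when only a time is supplied
DEFAULT_TIME = time(0, 0, 0)     # midnight, used when only a date is supplied


def date_to_datetime(d: date) -> datetime:
    """DATE override: the stored value carries a time of midnight."""
    return datetime.combine(d, DEFAULT_TIME)


def time_to_datetime(t: time) -> datetime:
    """TIME override: the stored value carries a date of 1900-01-01."""
    return datetime.combine(DEFAULT_DATE, t)


print(date_to_datetime(date(2009, 8, 16)))  # 2009-08-16 00:00:00
print(time_to_datetime(time(13, 27)))       # 1900-01-01 13:27:00
```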
Unsupported Data Types
SQL DBMS      Data types
Oracle        All BINARY types including BLOBs; all INTERVAL types; LONG
SQL Server    All BINARY types; XML; limited SMALLDATETIME support
MySQL         All BINARY types; BLOBs
First, a little background…
FileMaker Table Basics
• Every record in a FileMaker table has a unique Record ID (RID)
• Every table has a Master Record List
• List of RIDs in table
• Represented as a sparse bitmap
• Backed by client-side temporary file
• New records get the next available RID
• Deleting records
• Removes the data from the table
• Removes the RID from the Master Record List
Now back to our regularly scheduled programming…
ESS mapping
How does it really work?
• Primary keys are…um…“key”.
• ESS maintains a mapping of PK ↔ RID.
SQL DB (PKs)    FileMaker DB Engine (RIDs)
SKU1            RID45
SKU2            RID12
SKU3            RID39
Master RID list creation
• Every table has a RID list representing every row in the table
• Created when table is first opened
• RID list is allocated based on number of rows in table
select count(*) from ess_table
• RID list creation is very fast
• 1,000,000 row table opened in < 1 second
• No PKs are retrieved.
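As a rough sketch of why opening a table is cheap, the following allocates a RID list from a single row count and leaves the key mapping empty. An in-memory SQLite database stands in for the external SQL source, a plain Python list stands in for the sparse bitmap, and `allocate_rid_list` is a name invented for this example.

```python
import sqlite3

# In-memory SQLite database standing in for the external SQL source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ess_table (key1 INTEGER PRIMARY KEY, d1 TEXT)")
conn.executemany(
    "INSERT INTO ess_table (key1, d1) VALUES (?, ?)",
    [(100000 + i, f"row {i}") for i in range(1, 1001)],
)


def allocate_rid_list(connection: sqlite3.Connection, table: str) -> list[int]:
    """Allocate RIDs 1..N from a row count; no primary keys are fetched."""
    # The table name is interpolated only because this is a toy example.
    (row_count,) = connection.execute(f"SELECT count(*) FROM {table}").fetchone()
    return list(range(1, row_count + 1))


rid_list = allocate_rid_list(conn, "ess_table")
pk_by_rid: dict[int, int] = {}  # mapping starts empty and is filled lazily

print(len(rid_list), len(pk_by_rid))  # 1000 0
```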
Mapping Maintenance
• Mapping is maintained passively
• During Refresh Window command (menu or script step)
• When data for a RID cannot be found
• Data fetch (display, calc, sort, etc)
• Join
• Find
• During add and edit operations
Simple Mapping Scenario
Open layout on table
Simple Mapping Scenario (cont.)
• User opens a layout using an ESS table
• Connect to SQL database
• Query number of rows in table (20,000 rows)
select count(*) from rows20000
• Allocate RID list (20,000 RIDs)
• Connection remains open until file is closed
Simple Mapping Scenario (cont.)
• FileMaker Pro requests 25 records for display
• RID request (1-25), mapping is empty
• Fetch primary keys for RIDs 1-25
select key1 from rows20000 order by 1
• Generate mappings for RIDs 1-25 to fetched PKs
RID 1 → 100001
RID 2 → 100002
…
RID 25 → 100025
• Fetch data for RIDs 1-25 (uses key mapping to determine primary keys)
Simple Mapping Scenario (cont.)
• User scrolls to 5,000th record
• Data requested (RIDs 5,000-5,024)
• Fetch primary keys for RIDs 26-5024
select key1 from rows20000 where key1 > 100025 order by 1
• Map key1 values to RIDs
RID 26 → 100026
RID 27 → 100027
…
RID 5024 → 105024
• Fetch data for RIDs 5000-5024
select key1, d1, d2, d3 from rows20000 where key1 in(105000,105001,…,105024)
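The scrolling scenario can be simulated end to end. This is a sketch under assumptions, not the ESS implementation: SQLite stands in for the SQL database, the table has a single data column `d1`, the helper names are invented, and a `LIMIT` clause is used where a real client would presumably just stop reading the cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows20000 (key1 INTEGER PRIMARY KEY, d1 TEXT)")
conn.executemany(
    "INSERT INTO rows20000 VALUES (?, ?)",
    [(100000 + i, f"data {i}") for i in range(1, 20001)],
)

pk_by_rid: dict[int, int] = {}  # RID -> primary key, filled on demand


def ensure_mapped(last_rid: int) -> None:
    """Fetch keys in key order until RIDs 1..last_rid all have a PK."""
    mapped = len(pk_by_rid)
    if mapped >= last_rid:
        return
    needed = last_rid - mapped
    if mapped == 0:
        sql, args = "SELECT key1 FROM rows20000 ORDER BY 1 LIMIT ?", (needed,)
    else:
        # Resume after the highest key already mapped.
        sql = "SELECT key1 FROM rows20000 WHERE key1 > ? ORDER BY 1 LIMIT ?"
        args = (pk_by_rid[mapped], needed)
    for offset, (pk,) in enumerate(conn.execute(sql, args), start=1):
        pk_by_rid[mapped + offset] = pk


def fetch_records(first_rid: int, last_rid: int) -> list[tuple]:
    """Return data rows for a RID range via a key IN-list."""
    ensure_mapped(last_rid)
    keys = [pk_by_rid[rid] for rid in range(first_rid, last_rid + 1)]
    placeholders = ",".join("?" * len(keys))
    sql = f"SELECT key1, d1 FROM rows20000 WHERE key1 IN ({placeholders}) ORDER BY 1"
    return conn.execute(sql, keys).fetchall()


first_page = fetch_records(1, 25)       # maps RIDs 1-25
later_page = fetch_records(5000, 5024)  # maps RIDs 26-5024, returns 25 rows
print(len(pk_by_rid), later_page[0][0], later_page[-1][0])  # 5024 105000 105024
```

Note that displaying 25 records near row 5,000 forces almost 5,000 keys to be fetched first, which is the cost the scenario is meant to highlight.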
Simple Mapping Scenario
RID - PK Mapping
RID      PK
1        100001
2        100002
…        …
5023     105023
5024     105024
5025     (unmapped)
5026     (unmapped)
…        …
20000    (unmapped)
Rows deleted from SQL DB (Scenario Continues…)
Note that we currently have mappings for RIDs 1-5024 and none for RIDs 5025-20000.
• External to FileMaker
• Rows deleted: 100001, 100002
• Rows deleted: 106000-106099 (total of 102 rows)
• FileMaker user performs Refresh Window
• Refresh Window flushes cached data and causes Master RID list to be updated.
• The mapping remains untouched
Rows deleted from SQL DB (Scenario Continues…)
• How many rows are now in the table?
select count(*) from rows20000
• There are now 19898 rows in the table.
• There are many RIDs in the Master RID list with no mappings, so they are the first to go.
• RIDs 19899 through 20000 are removed from RID list
RID 1 → 100001 Still in mapping!
RID 2 → 100002 Still in mapping!
…
RID 5024 → 105024
RID 5025 through RID 19898 are still unmapped
Rows deleted from SQL DB (Scenario Continues…)
• Mapping
• RID 1 → 100001 Still in mapping!
• RID 2 → 100002 Still in mapping!
• …
• RID 5024 → 105024
• RID 5025 through RID 19898 are still unmapped
• Master RID list:
• 1, 2, 3, …, 19898
Rows deleted from SQL DB (Continued…)
• User scrolls to first record, and refresh cleared the data cache, so…
• FileMaker now requests data for RIDs 1-25
select key1, d1, d2, d3 from rows20000 where key1 in(100001,100002,…,100025)
• No values returned for 100001 or 100002
• Consult mapping to find RIDs…
• Update Master RID list
3, 4, 5, …, 19898
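The two maintenance steps in this scenario (trimming unmapped RIDs on refresh, then dropping RIDs whose keys no longer return data) can be sketched in pure Python. The function names are invented, a list stands in for the Master RID list, and whether ESS also purges the stale mapping entries is not stated on the slides, so the mapping is left untouched here.

```python
def refresh(rid_list: list[int], pk_by_rid: dict[int, int], row_count: int) -> list[int]:
    """Shrink the RID list to row_count, dropping unmapped RIDs from the end first."""
    surplus = len(rid_list) - row_count
    dropped: set[int] = set()
    for rid in reversed(rid_list):
        if len(dropped) >= surplus:
            break
        if rid not in pk_by_rid:
            dropped.add(rid)
    return [rid for rid in rid_list if rid not in dropped]


def drop_missing(rid_list, pk_by_rid, requested_rids, returned_pks) -> list[int]:
    """Remove RIDs whose mapped key was not returned by the data fetch."""
    returned = set(returned_pks)
    stale = {rid for rid in requested_rids if pk_by_rid[rid] not in returned}
    return [rid for rid in rid_list if rid not in stale]


# State before the deletions: 20,000 RIDs, mappings for RIDs 1-5024 only.
rid_list = list(range(1, 20001))
pk_by_rid = {rid: 100000 + rid for rid in range(1, 5025)}

# Refresh Window: count(*) now reports 19898 rows.
rid_list = refresh(rid_list, pk_by_rid, 19898)

# Data fetch for RIDs 1-25 returns nothing for keys 100001 and 100002.
rid_list = drop_missing(rid_list, pk_by_rid, range(1, 26), range(100003, 100026))

print(rid_list[0], rid_list[-1], len(rid_list))  # 3 19898 19896
```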
The Bottom Line
• Scrolling requires lots of maintenance
• There’s a great deal of work going on under the covers
• Aren’t you glad you don’t have to worry about this?
Performance Considerations
• Finds
• Joins
• Scrolling / Browsing
• Value Lists
• General
Finds
• Design goal: Mimic FileMaker
• “Find” operations converted to SQL to the extent possible.
• Push as much as possible into the SQL database.
NB: FileMaker post-processes results if necessary.
Finds: Numeric and Date
• Numeric
• Supported operations
• Simple value matches: =, <, >, <=, >=, !
• Range (87..1006)
• Unsupported operations
• Wildcard (e.g. 3##3)
• Date/Time/Timestamp
• Supported operations
• Simple value matches: =, <, >, <=, >=, !
• Range (e.g. 1/1/2000...1/15/2000)
• Wildcard (e.g. */21/2007 10:*:* AM)
Finds: Text
• ESS fields’ case sensitivity depends on database collation for the column
• Usually a default for the database
• Can be overridden on a column-by-column basis in some SQL DBs.
• Most ESS searches are not word-based
• Find “<per” will match “pencils, wooden” but not “wooden pencils”
• “=” and default searches are word-based
• Find “pen” will match “pen, ink” and “ink pen”
• Exact matches are field based (as expected)
Finds: Text
Remember, FileMaker’s default behavior is word search
This is VERY expensive to do in SQL.
Some finds evaluated completely in SQL
• Numeric finds converted to simple predicates
FileMaker    SQL
=32          where col=32
43..97       where col>=43 and col<=97
==Apples     where col='Apples'
• Date, Time and Timestamp converted using SQL literals or functions for date/time values
FileMaker             SQL
=6/1/2007 10:00:00    tscol=datetime'2007-06-01 10:00:00'
1/*/2007              tscol>=datetime'2007-01-01 00:00:00' and tscol<datetime'2007-02-01 00:00:00'
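A toy translator for the three simple criteria in the first table might look like the sketch below. It is an illustration only: `find_to_predicate` is an invented name, it handles just these three forms, and production code would normally use bound parameters rather than building literals into the SQL string.

```python
import re


def find_to_predicate(column: str, criterion: str) -> str:
    """Translate a few simple find criteria into a SQL predicate string."""
    range_match = re.fullmatch(r"(-?\d+)\.\.\.?(-?\d+)", criterion)
    if range_match:
        low, high = range_match.groups()
        return f"{column}>={low} and {column}<={high}"
    if criterion.startswith("=="):
        text = criterion[2:].replace("'", "''")  # escape quotes for the literal
        return f"{column}='{text}'"
    number_match = re.fullmatch(r"=(-?\d+(?:\.\d+)?)", criterion)
    if number_match:
        return f"{column}={number_match.group(1)}"
    raise ValueError(f"unsupported criterion: {criterion!r}")


print(find_to_predicate("col", "=32"))       # col=32
print(find_to_predicate("col", "43..97"))    # col>=43 and col<=97
print(find_to_predicate("col", "==Apples"))  # col='Apples'
```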
Some finds are post-processed
• Text word searches
• FileMaker’s wildcard search is richer than SQL’s
• SQL wildcards are generated, but will generally return more data than will really match
• FileMaker always post-processes these patterns
This can be VERY expensive.
FileMaker                      SQL
=pen                           where col like '%pen%'
=pen*                          where col like '%pen%'
(Any other wildcard search)    SQL to get a constrained superset of rows matching expression
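The superset-then-filter pattern can be demonstrated with a small example. SQLite stands in for the SQL source, the `products` table and `whole_word_find` helper are invented, and a regular-expression word boundary is only an approximation of FileMaker's word matching.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("pen, ink",), ("ink pen",), ("pencils, wooden",), ("stapler",)],
)


def whole_word_find(word: str) -> list[str]:
    """Fetch a LIKE superset from SQL, then keep rows containing the whole word."""
    superset = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM products WHERE name LIKE ? ORDER BY id",
            (f"%{word}%",),
        )
    ]
    # Post-processing step: the LIKE pattern also matched "pencils, wooden".
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [name for name in superset if pattern.search(name)]


print(whole_word_find("pen"))  # ['pen, ink', 'ink pen']
```

Every row in the superset has to cross the network before it can be rejected, which is why a loose pattern on a large table is costly.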
Find Performance
✘ Avoid post-processed finds
✘ Pattern matches on text fields
✘ Word searches on text fields
✘ Non-leading date/time wildcard searches
Time values are always sorted YYYY/MM/DD HH:MM:SS
So…
“12/6/* 10:00:00” performs poorly because year is wild
“*:15:*” performs poorly because hour is wild
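One way to see why leading wildcards hurt: a pattern whose wildcards are all trailing collapses into a single range predicate, while any other pattern does not. The sketch below is an illustration under that assumption; `timestamp_range` is an invented helper and says nothing about how ESS actually builds its SQL.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple

MINIMUMS = (1, 1, 1, 0, 0, 0)  # smallest year, month, day, hour, minute, second
STEPS = (
    timedelta(days=1),
    timedelta(hours=1),
    timedelta(minutes=1),
    timedelta(seconds=1),
)


def timestamp_range(parts: Sequence[Optional[int]]) -> Optional[Tuple[datetime, datetime]]:
    """Return [low, high) for a pattern whose wildcards (None) are all trailing.

    parts is (year, month, day, hour, minute, second). Returns None when a
    wildcard precedes a fixed part, so no single range covers the pattern.
    """
    fixed = [p for p in parts if p is not None]
    n = len(fixed)
    if n == 0 or list(parts[:n]) != fixed:
        return None
    low = datetime(*fixed, *MINIMUMS[n:])
    if n == 1:
        high = low.replace(year=low.year + 1)
    elif n == 2:
        high = low.replace(year=low.year + low.month // 12, month=low.month % 12 + 1)
    else:
        high = low + STEPS[n - 3]
    return low, high


# 1/*/2007: year and month fixed, everything after is wild.
print(timestamp_range((2007, 1, None, None, None, None)))
# (datetime.datetime(2007, 1, 1, 0, 0), datetime.datetime(2007, 2, 1, 0, 0))

# 12/6/* 10:00:00: the year is wild, so there is no single range.
print(timestamp_range((None, 12, 6, 10, 0, 0)))  # None
```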
Join Performance
• Single field equi-joins (equals predicate) are preferred
SQL is much more succinct
select pkkey from ess_table where join_column in(v1, v2, …, vn)
instead of
select pkkey from ess_table where (column1 = v1 and column2 = u1) or (column1 = v2 and column2 = u2) or (column1 = v2 and column2 = u3) …
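The difference in query size is easy to see by generating both shapes. The builders below are hypothetical helpers written for this comparison; they mirror the two statement forms on the slide using placeholders and are not taken from ESS.

```python
def single_column_join(pk: str, table: str, column: str, values: list) -> tuple[str, list]:
    """Single-field equi-join: one IN-list with one placeholder per value."""
    placeholders = ", ".join("?" for _ in values)
    return f"select {pk} from {table} where {column} in ({placeholders})", list(values)


def multi_column_join(pk: str, table: str, columns: list[str], rows: list[tuple]) -> tuple[str, list]:
    """Multi-field join: one AND-group per value combination, OR-ed together."""
    group = "(" + " and ".join(f"{c} = ?" for c in columns) + ")"
    where = " or ".join(group for _ in rows)
    params = [value for row in rows for value in row]
    return f"select {pk} from {table} where {where}", params


single_sql, single_params = single_column_join("pkkey", "ess_table", "join_column", [1, 2, 3])
multi_sql, multi_params = multi_column_join(
    "pkkey", "ess_table", ["column1", "column2"], [(1, "a"), (2, "b")]
)
print(single_sql)
print(multi_sql)
```

With n value combinations, the multi-field form repeats every column name n times, so the statement text grows much faster than the IN-list.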
Join Performance
• Try to avoid large joins
Filter/find first and then join
• Make sure ESS column(s) have appropriate SQL indexes
NB: Index should cover at least all joined columns
Browsing Performance
• Table-based browsing should be avoided
• Large table performance can be expensive
• All intervening key values must be fetched
• Mappings must be updated and maintained
• Order of rows is not necessarily predictable
• Rows from FileMaker-based tables generally appear in the order in which they were inserted
• ESS-based rows appear in the order in which their keys were fetched
• Browse on found sets instead
• Hide the status area
Value Lists
• Versions 10 and later support value lists on ESS fields
• All functionality is available (no technical restrictions)
• For best performance avoid having ESS table as the secondary field unless it is the same table as the primary field.
• Performance of single field value lists generally not a problem.
General Performance
• FileMaker does all sorting of data
• Fetches all records in found set
• Avoid sorting large quantities of data
• FileMaker does all summaries (count, sum, average, etc.)
• Fetches all records in found set
• Avoid summarizing large quantities of data
• Consider creating SQL views with aggregations
• Delete shadow fields that are never used from shadow table
• Caveat: Keep fields that tell you a record has been modified.
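As an example of pushing aggregation into the database, the sketch below defines a view that pre-computes counts and sums, so a client reads one row per group instead of every detail row. SQLite stands in for the SQL source, and the `order_lines` table and `order_totals` view are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?)",
    [(1, 10.0), (1, 5.5), (2, 20.0), (2, 1.0), (2, 4.0)],
)

# The view performs the aggregation inside the database.
conn.execute(
    """
    CREATE VIEW order_totals AS
    SELECT order_id, count(*) AS line_count, sum(amount) AS total
    FROM order_lines
    GROUP BY order_id
    """
)

totals = conn.execute("SELECT * FROM order_totals ORDER BY order_id").fetchall()
print(totals)  # [(1, 2, 15.5), (2, 3, 25.0)]
```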
Best Practices
• Don’t allow browsing of large tables.
• Find target set and then browse.
• Hide status area.
• Don’t sort or summarize large numbers of rows.
• Use SQL database’s native tools for loading and exporting large amounts of data.
• Use command-line tools from FM scripts where possible
• Or use Execute SQL script step
• Try to optimize “find” operations to avoid expensive post-processing.
Best Practices
• Good relational database design is crucial to good performance
• Normalization
• Surrogate keys
• Good indexes
• Discrete primary key types (not floating point, or large text fields)
Questions?
Thank You!