Understanding ATG Data Anywhere Architecture™
Efficient, transactional data access without writing code using Dynamo Repositories
April 2002
ATG White Paper
Pat Durante, Senior Practice Manager, ATG Education Services
UNDERSTANDING ATG DATA ANYWHERE ARCHITECTURE™
Contents
1 Executive Summary
2 The Challenge of the Data Access Problem
    Hasn’t the Data Access Problem Been Solved?
    Why Should You Care about the ATG Data Anywhere Architecture™?
3 ATG Data Anywhere Architecture™
    Data Source Independence
    Understanding the ATG Data Anywhere Architecture™
    Repository Basics
    Using Repository Data
        The Repository API
        RepositoryFormHandler
        Dynamo Servlet Beans and the Repository Query Language (RQL)
4 Less Java Code: Faster Time-to-Market, Less Maintenance
    Using the Visitor Profile (Out-of-the-Box)
    Extending the Definition of a Repository Item
        Using a Simple Auxiliary Table (to model a one-to-one relationship)
        Using a "Multi" Table (to model a one-to-many relationship)
    Switching to an Alternative Relational Database Management System
    Converting from one type of database to another
5 A Unified View of Customer Interactions
6 Maximum Performance Through Intelligent Caching
    Case 1: Single Dynamo Server
    Case 2: Read frequently, modify rarely or never
    Case 3: Modifications made by one Dynamo server at a time
    Case 4: Modification by multiple Dynamo servers
    Case 5: Modification by a non-Dynamo Application
        Disabling Caching
        Invalidating the Cache
    Controlling the Cache Sizes
7 Simplified Transactional Control
    Overview of Transactional Integrity
    The J2EE Approach to Transactional Integrity
    ATG Data Anywhere Support for Transactional Integrity
    Advantages of the ATG Data Anywhere Approach
    Example Page
    Default Transactional Behavior
    Recommendations
8 Strong Built in Search Capabilities
9 Fine-grained Access Control
    Case 1: Controlling access to all items of the same type
    Case 2: Controlling access to specific items
    Case 3: Controlling access to specific properties
    Case 4: Limiting Query Results
    Creating a Secured Repository
10 Conclusions
Appendix: Other Sources of Information
    Documentation
    Education
1 Executive Summary
Providing good online service requires access to lots of data. At most companies, this data is spread among
different data stores and in different data formats across the enterprise. To provide a single face to their
customers, firms need to utilize the data in all those silos. Companies also benefit by putting together a complete
picture about each customer and driving their marketing and sales efforts more effectively.
Accessing data for online use is difficult. Data has to be cached efficiently to prevent bottlenecks. Software has to
provide transactional integrity, so that accounts will be accurate. It has to provide rich tools for important
functions like searching. And data needs to be secured to prevent unauthorized access. Most importantly,
accessing data has to be easy. Since there is so much work to create and maintain data access, some developers
end up spending the majority of their time simply trying to integrate data sources.
The ATG Data Anywhere Architecture™, featuring Dynamo Repositories, provides a world in which a simple XML
file is all you need to integrate a new data source for online use. This environment provides a wealth of caching
choices, ensures transactional integrity, and offers the rich tools needed to rapidly manipulate, search and secure
data. It also provides a world where access to data stored in file systems, relational databases and LDAP directories
is all accomplished using the same set of interfaces. This world is accessible to all applications built using ATG
products.
What does this mean? Faster time to market, better maintainability, and more extensibility combine to decrease
total cost of ownership of web applications. With ATG Data Anywhere Architecture™, developers can focus on
implementing business logic rather than spending time writing "wrapper classes" for each persistent data type.
ATG Data Anywhere Architecture™ offers several advantages over the standard data access methods such as Java
Data Objects (JDO), Enterprise JavaBeans (EJB), and Java Database Connectivity (JDBC). Among the differences:
• Data source independence – ATG Data Anywhere Architecture™ provides access to relational
database management systems, LDAP directories, and file systems using the same interfaces. This
insulates application developers from changes to the schema and to the storage mechanism. Data can
even move from a relational database to an LDAP directory without requiring re-coding. Java Data
Objects support data source independence, but it is up to vendors to provide an LDAP
implementation.
• Fewer lines of Java code – Less code leads to faster time-to-market and reduced maintenance cost.
Persistent data types created using ATG Data Anywhere are described in an XML file. Absolutely no
Java code is required.
• Unified view of all customer interactions – A unified view of customer data (gathered using web
applications, call center applications, and ERP systems) can be provided without copying data into
a central data source. This unified view of customer data leads to a coherent and consistent
customer experience.
Figure 1 – The unified view of data access provided by the ATG Data Anywhere Architecture™
• Maximum performance – Our intelligent caching of data objects ensures excellent performance and timely,
accurate results. The JDO and EJB standards rely on a vendor implementation of caching which may or may not be
available.
• Simplified Transactional Control – The key to overall system performance is minimizing the impact of transactions
while maintaining the integrity of your data. In addition to full Java Transaction API (JTA) support, ATG Data
Anywhere allows both page developers and software engineers to control the scope of transactions using the
same transactional modes (required, supports, never, etc.) used by EJB deployment engineers.
• Powerful built-in search capabilities – Quality search tools lead to increased visitor satisfaction and efficiency
(which often lead to increased or sustained revenue). Customers can’t buy what they can’t find.
• Fine-grained access control – Control who has access to which data at the level of the data type, the data object, or even
the individual property, using Access Control Lists (ACLs).
• Integration with ATG product suites – Our award-winning personalization, scenarios, commerce, and portal
applications all make use of Repositories for data access. A development team is free to use EJBs alongside ATG
technology, but the easiest way to leverage investment in ATG technology is to follow the example set by our
solution sets. Our solution sets satisfy all of their data access needs using Repositories.
Technical leads and architects face difficult choices when deciding upon the data access mechanism for a new
application. Some think JDO or J2EE/EJB may be the right choice since they both offer portability across application server vendors.
However, in addition to all of the advantages above, the ATG Data Anywhere Architecture™ is also portable across application servers.
With support for Dynamo Application Server, BEA WebLogic and IBM WebSphere, applications built using ATG Data Anywhere
Architecture™ can be deployed on the majority of the application server market. The bottom line: ATG Data Anywhere Architecture is
the most powerful, most flexible, easiest to use data access method available. It saves developers time and frustration. It helps
customers have a better experience. It saves organizations money. Can there be any other choice for your next project?
[Figure 1 labels: Profiles; Physical Data Storage; Customer Profile Data; Employee Directory; Product Catalog; Content Management System; Call Center Database; Sales Force Database; Content; Analytics; EJB Containers; Security]
2 The Challenge of the Data Access Problem
In the first generation of sites for the World Wide Web, most companies developed simple sites with largely static content
describing their goods and services. Known as “Brochure Ware,” early sites proved to be a cost effective tool for displaying
information, and a popular way for clients to do basic research on companies.
As the Internet became more popular, firms recognized that the Web had the possibility of becoming a complete channel,
able to service a large section of their client base for many of their needs. While providing the access that clients increasingly
sought, Web sites also offered firms the ability to significantly decrease the cost of serving customers by making a
self-service option available around the clock. Firms like Amazon and Fidelity recognized that providing excellent service at low
cost could create a significant competitive advantage over their rivals.
However, firms quickly realized that to change customer behavior in large numbers, web sites had to offer a similar or better
level of service than traditional channels offered. Many firms with business strategies predicated on lower cost, without
providing a superior customer service experience, dropped from the market in record numbers.
In order to provide good service via the Internet, firms have had to offer intelligent web sites. These next generation web
sites are able to understand a client’s account, just as a real customer service representative is able to do. To fulfill the vision of
excellent self-service, sites had to develop from simple brochures to rich, transactional environments able to satisfy
customer needs as completely as possible. The more data sources that are available, the better the chance that the
customer’s request can be answered.
Figure 2 – EMC's Powerlink Required Approximately 20 Integrations
EMC’s Powerlink system (see Figure 2) is a great example of the kinds of integrations necessary for a modern, world-class
web site. (A detailed case study on the EMC Powerlink project is available on atg.com.) A typical enterprise web site will
require integrating data from 15 to 50 systems. A more complex site can require far more integrations.
Adding to the challenge of numerous and varied data sources is the problem of transforming data gathered from these
external systems into an object-oriented framework. Information is organized differently in these external systems.
Relational databases store information in tables; file systems and LDAP directories store data hierarchically.
Hasn’t the Data Access Problem Been Solved?
Some data access problems have, in fact, been solved. Java Database Connectivity (JDBC) enables our web applications to
interact with relational database management systems in a vendor independent way. Your applications are finally loosely
coupled with your database vendor. Unfortunately, JDBC does not insulate your application from the database schema, nor
does it map well into the object-oriented space. JDBC is a fairly low-level technology. Application developers execute SQL
statements to interact with the data source and if a result set is returned from a query, the developer must transform the
results into objects by writing code.
Enterprise JavaBean (EJB) technology enables web applications to interact with relational databases in a schema
independent way. The mapping between EJB properties and database columns is provided in an XML deployment descriptor
file. Also, the connection logic and the transactional control are handled outside of application code, freeing
developers to focus on application logic. On the downside, the need for absolute portability across application servers has led
to complexity in both the coding and the configuration required to get an Enterprise JavaBean up and running. Developers
have to write at least 2 interfaces and one Java class for each EJB as well as provide a wide assortment of intricate
configuration details. Modern development tools such as JBuilder (by Borland) have made the process of creating and
configuring EJBs easier than ever, but developers still face a steep learning curve and tedious development and
configuration tasks when working with EJB technology.
Java Data Objects (JDO) is the latest standard data access approach approved through the Java Community Process. JDO
offers a more transparent data access mechanism. JDO allows developers to persist any Java class without source code
modification. On the downside, developers still need to write a Java class for each persistent type (and modify that Java class
to add or remove properties). Also, JDO is going to rely on vendor implementations for caching of data objects and LDAP
access.
JDBC, Enterprise JavaBeans, and JDO focus on solving only some of the data access challenges described above. If your web
application needs to access data in a file system or in an LDAP directory, you'll be forced to use yet another technology.
Why Should You Care about the ATG Data Anywhere Architecture™?
As the chart below shows, ATG Data Anywhere Architecture can do anything the other approaches can do, and much
more. The ATG Data Anywhere Architecture™ was designed to meet the demanding requirements of web applications. Our
technology enables web applications to access data in a data source and schema independent way without writing code to
transform or store data in an object. Data Anywhere Architecture is a higher-level abstraction that leads to faster
time-to-market and higher reliability.
Think about the possibilities: if the integration with individual data sources is simpler, the number of integrations your team
can complete in the same amount of time will increase. The more successful integrations your team builds, the more
intelligent your customer interactions can be. In an economy where customer retention is key, the web experience can make
or break the success of your business.
As an instructor who has taught and used both J2EE/EJB technology and ATG Data Anywhere Architecture since they were
first implemented, I have seen the differences firsthand. To teach a developer the J2EE/EJB approach (create a JSP that
accesses either a JavaBean or a Servlet that in turn accesses a container-managed entity bean) takes 4 days. In contrast, I
teach developers to use the ATG Data Anywhere Architecture™ in 2 days, including covering much of the additional
functionality provided. This is my personal measure of the elegance of the ATG Data Anywhere Architecture.
Chart columns: Challenge; Description; ATG Data Anywhere Architecture; JDO; Enterprise JavaBeans (EJB); Java Database Connectivity (JDBC)

Data Source Independence
    Description: Application logic does not change based upon the type of data source (e.g., relational database, XML file, LDAP directory, etc.)
    Supported by: ATG Data Anywhere Architecture; JDO; EJB (with Connectors or BMP)

Schema Independence
    Description: Application logic is completely independent from the schema (e.g., table names, column names, table relationships, etc.) so that if the schema needs to change (e.g., a column is added or removed), the application doesn't need to be changed.
    Supported by: ATG Data Anywhere Architecture; JDO; EJB

Object Relational Mapping
    Description: Applications interact with objects, not relationally or hierarchically organized data.
    Supported by: ATG Data Anywhere Architecture; JDO; EJB

No Java Classes
    Description: Developers do not need to write, compile and test Java classes (or interfaces) for each persistent data type that they want to use in their application, reducing development time and errors.
    Supported by: ATG Data Anywhere Architecture

Portability Across Application Servers
    Description: Applications that make use of the data access solution are portable to other application servers.
    Supported by: ATG Data Anywhere Architecture; JDO; EJB; JDBC

Intelligent Caching
    Description: The data access mechanism provides a caching mechanism so that frequently accessed information is available in memory, improving application reliability and scalability.
    Supported by: ATG Data Anywhere Architecture; JDO (vendor specific); EJB (vendor specific)

Simplified Transactional Control
    Description: The data access mechanism ensures the integrity of the data, and the transactional scope can be controlled programmatically (using modes) or via provided dynamic page tags.
    Supported by: ATG Data Anywhere Architecture; EJB (one further mark in this row is not recoverable from the source)

Searching
    Description: The ability to search across data sources and types.
    Supported by: ATG Data Anywhere Architecture

Access Control
    Description: The ability to control access to data objects and properties within those data objects.
    Supported by: ATG Data Anywhere Architecture (one further mark in this row is not recoverable from the source)
3 ATG Data Anywhere Architecture™
Data Source Independence
Figure 3 below provides a high-level overview of the ATG Data Anywhere Architecture™.
Figure 3 – The ATG Data Anywhere Architecture™
• With ATG Data Anywhere, the application logic created by developers uses the same approach to
interact with data regardless of the source of that data. One of the most powerful aspects of this
architecture is that the source of the data is hidden behind the Dynamo Repository abstraction. It
would be easy to change from a relational data source to an LDAP directory since none of the
application logic would need to change.
• Once data is retrieved from a data source it is transformed into an object-oriented representation.
Manipulation of the data can then be done using simple getPropertyValue and
setPropertyValue methods.
[Figure 3 labels: Java Object; property; SQL Connector; LDAP Connector; File System Connector; RDBMS; LDAP Directory; Repository API or Droplets]
Understanding the ATG Data Anywhere Architecture™
• A Repository is a data access layer that defines a generic representation of a data store. Application
developers access data using this generic representation by using only interfaces such as Repository and
RepositoryItem.
• Repositories access the underlying data storage device through a connector, which translates the request
into whatever calls are needed to access that particular data store. Connectors for relational databases, LDAP
directories, and file systems are provided out-of-the-box. Connectors use an open, published interface, so
additional custom connectors can be added if necessary.
• Developers use Repositories to create, query, modify, and remove Repository Items.
• A Repository Item is like a JavaBean, but its properties are determined dynamically at runtime. From the
developer’s perspective, the available properties in a particular repository item depend on the type of item
they are working with. One item might represent the user profile (name, address, phone number), while
another may represent the meta-data associated with a news article (author, keywords, synopsis).
• The purpose of the Repository interface system is to provide a unified perspective for data access. For
example, developers can use targeting rules with the same syntax to find people or content.
• Applications that use only the Repository interfaces to access data can interface to any number of back-end
data stores solely through configuration.
• Developers do not need to write a single interface or Java class to add a new persistent data type to an
application.
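The ideas in the list above can be illustrated with a small, self-contained Java sketch. It does not use the ATG classes: the names RepositorySketch, SketchConnector, InMemoryConnector, and SketchItem are hypothetical stand-ins, and the sample data is invented. The sketch only aims to show how an item whose properties are decided at runtime can sit in front of an interchangeable connector.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not the ATG API: runtime-defined items behind a connector. */
public class RepositorySketch {

    /** A connector hides how and where the data is actually stored. */
    interface SketchConnector {
        Map<String, Object> load(String itemType, String id);
    }

    /** Stand-in for a relational, LDAP, or file-system connector; here just a map in memory. */
    static class InMemoryConnector implements SketchConnector {
        private final Map<String, Map<String, Object>> rows = new HashMap<>();

        void put(String itemType, String id, Map<String, Object> properties) {
            rows.put(itemType + ":" + id, new HashMap<>(properties));
        }

        @Override
        public Map<String, Object> load(String itemType, String id) {
            return rows.get(itemType + ":" + id);
        }
    }

    /** Like a JavaBean, but the set of properties is decided at runtime. */
    static class SketchItem {
        private final Map<String, Object> properties;

        SketchItem(Map<String, Object> properties) {
            this.properties = properties;
        }

        Object getPropertyValue(String name) {
            return properties.get(name);
        }
    }

    private final SketchConnector connector;

    RepositorySketch(SketchConnector connector) {
        this.connector = connector;
    }

    /** Application code calls this the same way whatever connector is configured. */
    SketchItem getItem(String id, String itemType) {
        Map<String, Object> data = connector.load(itemType, id);
        return data == null ? null : new SketchItem(data);
    }

    /** Builds a tiny repository with invented data and reads one property back. */
    public static Object demo() {
        InMemoryConnector connector = new InMemoryConnector();
        Map<String, Object> joe = new HashMap<>();
        joe.put("name", "Joe");
        joe.put("age", 42);
        connector.put("user", "9", joe);

        RepositorySketch repository = new RepositorySketch(connector);
        return repository.getItem("9", "user").getPropertyValue("age");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Swapping InMemoryConnector for another SketchConnector implementation would leave getItem and its callers unchanged, which is the point the list makes about configuration-only changes of data store.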
ATG also provides a unified view of your application's data through the ATG Control Center, a graphical user interface
that uses the Repository interfaces to allow users to create, query, update, and remove repository items. Figure 4 below
shows the interface to user repository items; this UI will look the same regardless of the data source used to store the user
data.
Figure 4 – Using the ATG Control Center to Access the User Repository
Repository Basics
Figure 5 below shows an example of a repository that stores customer information.
Figure 5 – Sample Repository
Inside each repository, there can be several types of items (which are called "item-descriptors") and for each type there can be
several repository items. The definition of each type of item is described in a repository definition file using XML. In this
example, the Visitor Profile Repository defines two types of items (user and address).
[Figure 5 labels: Repository Item "Joe"; Repository Item "Sue"; Repository Items for a Main St address and a Park St address; item ids not recoverable]
Developers can model relationships between types of items as shown in figure 6.
Figure 6 – Relationships between Repository Items
Dynamo Repositories use the Java Collections Framework to model complex relationships between items using familiar
object-oriented concepts. You can store the "list" of addresses as a Set, List, Map, or array (whatever makes sense for
your application's needs).
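A collection-valued property of this kind can be sketched in plain Java. This is an illustration only: the class RelationshipSketch, the property names, and the sample data are hypothetical, and plain maps stand in for repository items.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a one-to-many relationship held as a collection-valued property. */
public class RelationshipSketch {

    /** Builds a user item whose "addresses" property is a List of address items. */
    public static Map<String, Object> buildUser() {
        Map<String, Object> home = new HashMap<>();
        home.put("street", "Main St");

        Map<String, Object> work = new HashMap<>();
        work.put("street", "Park St");

        List<Map<String, Object>> addresses = new ArrayList<>();
        addresses.add(home);
        addresses.add(work);

        Map<String, Object> user = new HashMap<>();
        user.put("name", "Joe");
        user.put("addresses", addresses); // could equally be a Set, Map, or array
        return user;
    }

    /** Walks the relationship the way application code would: ordinary collection access. */
    public static List<String> streetsOf(Map<String, Object> user) {
        List<String> streets = new ArrayList<>();
        @SuppressWarnings("unchecked")
        List<Map<String, Object>> addresses =
            (List<Map<String, Object>>) user.get("addresses");
        for (Map<String, Object> address : addresses) {
            streets.add((String) address.get("street"));
        }
        return streets;
    }

    public static void main(String[] args) {
        System.out.println(streetsOf(buildUser()));
    }
}
```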
But the boundary does not fall at the Repository’s wall. Developers can create links between items in different repositories
(see Figure 7 below). This allows you to create repository items that are composed of properties retrieved from more than
one data source. Keep in mind, though, that the properties in the adjunct repositories will not be queryable.
Applications that need to query against properties from multiple data sources can still make use of Repositories, but the
developers will need to query each repository separately.
In the example shown below, the majority of the information about a particular visitor is stored in a relational database. In
many web applications, an LDAP directory is used to store information about the organizational structure of a company
and/or userid/password combinations for authentication. Dynamo Repositories allow you to create an item that has access
to both relational data and LDAP data from the same object.
[Figure 6 labels: Repository Item "Joe"; Repository Item "Sue"; Repository Items for a Main St address and a Park St address; item ids not recoverable]
Figure 7 – Linking Between Repositories
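The linking arrangement in Figure 7 can be approximated with a short Java sketch. Everything here is an assumption made for illustration: the class LinkedItemSketch, the two in-memory maps standing in for a relational store and an LDAP directory, and the sample values. It shows one item view that reads from both stores, with queries reaching only the primary store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one item view over a primary store and an adjunct store. */
public class LinkedItemSketch {

    // Stand-ins for a relational store and an LDAP directory, both keyed by user id.
    static final Map<String, Map<String, Object>> PRIMARY = new HashMap<>();
    static final Map<String, Map<String, Object>> ADJUNCT = new HashMap<>();

    static {
        Map<String, Object> joe = new HashMap<>();
        joe.put("name", "Joe");
        PRIMARY.put("9", joe);

        Map<String, Object> joeDirectory = new HashMap<>();
        joeDirectory.put("department", "Education");
        ADJUNCT.put("9", joeDirectory);
    }

    /** Reads a property from the primary store first, then falls back to the adjunct store. */
    public static Object getPropertyValue(String id, String property) {
        Map<String, Object> primary = PRIMARY.get(id);
        if (primary != null && primary.containsKey(property)) {
            return primary.get(property);
        }
        Map<String, Object> adjunct = ADJUNCT.get(id);
        return adjunct == null ? null : adjunct.get(property);
    }

    /** Queries see only the primary store, mirroring the restriction on adjunct properties. */
    public static List<String> findIds(String property, Object value) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : PRIMARY.entrySet()) {
            if (value.equals(entry.getValue().get(property))) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(getPropertyValue("9", "department"));
        System.out.println(findIds("name", "Joe"));
        System.out.println(findIds("department", "Education"));
    }
}
```

In this sketch a lookup by id sees properties from both stores, while a query on an adjunct property returns nothing, so an application that needs such a query would have to ask the second store directly.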
Using Repository Data
Dynamo provides many powerful ways to make use of repository data in your application:
• Programmatically via the Repository API
• Through the use of RepositoryFormHandler
• On a dynamic page (through Dynamo Servlet Beans and the Repository Query Language (RQL))
The Repository API
The Repository API allows you to programmatically create, retrieve, update, or delete items. The power of the Repository API
is that developers use the same API regardless of data source. An item that contains data from an LDAP directory is
manipulated the same way that an item that contains relational data is manipulated.
Here’s a code example that shows how a developer can retrieve the age property of a user item (assuming that the id of the
user is known – in this case '9'):
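import atg.repository.*;

// Look up the "user" item with id "9" and read its age property.
// getItem can throw a RepositoryException, which the caller must handle.
Repository repository = getRepository();
RepositoryItem user = repository.getItem("9", "user");
Integer age = (Integer) user.getPropertyValue("age");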
[Figure 7 labels: Item "Joe"; Item "Sue"; Item "ATG\Education"; Item "ATG\Services"; RDBMS; LDAP Directory]
The following code snippet shows how a developer can change the age property of a user item:
try {
    // Cast to the mutable interface to obtain an updatable view of the item.
    MutableRepository mutableRepository =
        (MutableRepository) getRepository();
    MutableRepositoryItem mutableUser =
        mutableRepository.getItemForUpdate("9", "user");
    mutableUser.setPropertyValue("age", new Integer(43));
    // Write the change back through the repository.
    mutableRepository.updateItem(mutableUser);
}
catch (RepositoryException exc) ...
Notice that the code created by the application developer uses only the Repository API. The code has no knowledge of the
type of data source nor does the code have any knowledge of the schema. There is much more in the Repository API that you
will want to explore (such as the ability to query the repository, control transactional boundaries, and control the validity of
the repository items that are cached to improve performance), but this should give you a taste of what is involved.
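The read, modify, write-back cycle used above can be modelled in a few lines of plain Java. This is a deliberately simplified sketch, not the ATG implementation: the class UpdateSketch is hypothetical, and it assumes, purely for illustration, that edits are made on a working copy and become visible only when updateItem is called.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a read, modify, write-back cycle; not the ATG implementation. */
public class UpdateSketch {

    private final Map<String, Map<String, Object>> stored = new HashMap<>();

    void addItem(String id, Map<String, Object> properties) {
        stored.put(id, new HashMap<>(properties));
    }

    /** Read-only access to the stored value. */
    Object getPropertyValue(String id, String property) {
        return stored.get(id).get(property);
    }

    /** Hands out a working copy so edits do not touch the stored item yet (an assumption). */
    Map<String, Object> getItemForUpdate(String id) {
        return new HashMap<>(stored.get(id));
    }

    /** Writes the working copy back in one step. */
    void updateItem(String id, Map<String, Object> workingCopy) {
        stored.put(id, new HashMap<>(workingCopy));
    }

    /** Returns the stored age before and after the write-back. */
    public static int[] demo() {
        UpdateSketch repository = new UpdateSketch();
        Map<String, Object> user = new HashMap<>();
        user.put("age", 42);
        repository.addItem("9", user);

        Map<String, Object> copy = repository.getItemForUpdate("9");
        copy.put("age", 43);
        int before = (Integer) repository.getPropertyValue("9", "age");
        repository.updateItem("9", copy);
        int after = (Integer) repository.getPropertyValue("9", "age");
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] ages = demo();
        System.out.println(ages[0] + " then " + ages[1]);
    }
}
```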
RepositoryFormHandler
As you know, ATG provides a robust form handling framework that can be used by developers whenever a web form needs to
be created. ATG provides a specific form handler that can be used to manipulate repository data as well. The
RepositoryFormHandler can be used out-of-the-box to create, update, or delete repository items. And like any Java
class, it can be extended if you need specialized behavior.
Before you can use the RepositoryFormHandler, you'll need to configure a component based on this class. You will most
likely want to configure the repository it will be interacting with as well as the type of repository item. Here's a property file
(this is the configuration syntax used by Dynamo's Nucleus component framework) that shows an example configuration for
this type of form handler:
# /RepositoryFormHandler
#Thu Sep 06 08:41:24 EDT 2001
$class=atg.repository.servlet.RepositoryFormHandler
$scope=request
itemDescriptorName=topic
repository=/MyApplication/TopicRepository
requireIdOnCreate=false
Once you've configured a form handler as shown above, a page designer can make use of it. Here's an example
page that allows a visitor to add a new topic to the TopicRepository.
<H1>Add a New Topic</H1>
<dsp:form action="addTopic.jsp" method="POST">
  <!-- Default form error handling support -->
  <dsp:droplet name="/atg/dynamo/droplet/ErrorMessageForEach">
    <dsp:oparam name="output">
      <B><dsp:valueof param="message"/></B><BR>
    </dsp:oparam>
    <dsp:oparam name="outputStart">
      <LI>
    </dsp:oparam>
    <dsp:oparam name="outputEnd">
      </LI>
    </dsp:oparam>
  </dsp:droplet>
  Enter the Topic Name:<BR>
  <dsp:input bean="/RepositoryFormHandler.value.topicName"
             name="topicName" size="24" type="TEXT"
             required="<%=true%>"/><BR>
  <dsp:input bean="/RepositoryFormHandler.value.topicBody"
             name="topicBody" type="TEXT"/><BR>
  <dsp:input bean="/RepositoryFormHandler.createSuccessURL"
             type="HIDDEN" value="/Discussion/alltopics.jsp"/>
  <dsp:input bean="/RepositoryFormHandler.create" type="Submit"
             value="Add Forum"/>
</dsp:form>
A few notes about this example:
• This form handler makes it extremely easy to tie a form element to a specific item property.
Note the syntax used is /FormHandlerComponentName.value.propertyName.
• This form handler provides several "handlers" to enable a page developer to perform the
various operations (create, update, delete). This example uses the create handler.
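The two notes above can be modelled with a short Java sketch. It is not the RepositoryFormHandler source: the class FormHandlerSketch, its methods, and the sample values are hypothetical. It only shows the general shape of collecting fields addressed as value.propertyName and then turning them into a new item when a create handler runs.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a form handler that gathers field values and creates an item. */
public class FormHandlerSketch {

    private final Map<String, Object> value = new LinkedHashMap<>();
    private final Map<String, Map<String, Object>> repository = new HashMap<>();
    private int nextId = 1;

    /** Applies one submitted field, addressed as "value.<propertyName>". */
    public void setField(String path, Object submitted) {
        String prefix = "value.";
        if (!path.startsWith(prefix)) {
            throw new IllegalArgumentException("Expected a path of the form value.<propertyName>");
        }
        value.put(path.substring(prefix.length()), submitted);
    }

    /** The "create" handler: stores the collected values as a new item and returns its id. */
    public String handleCreate() {
        String id = String.valueOf(nextId++);
        repository.put(id, new HashMap<>(value));
        value.clear();
        return id;
    }

    public Object getStoredProperty(String id, String property) {
        return repository.get(id).get(property);
    }

    /** Simulates one form submission with invented values. */
    public static Object demo() {
        FormHandlerSketch handler = new FormHandlerSketch();
        handler.setField("value.topicName", "Caching");
        handler.setField("value.topicBody", "How should the cache be sized?");
        String id = handler.handleCreate();
        return handler.getStoredProperty(id, "topicName");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```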
Dynamo Servlet Beans and the Repository Query Language (RQL)
ATG provides several Servlet Beans (also known as droplets) to allow page developers to retrieve and display repository data
on dynamic pages (JSP or DSP). In the simplest case (when the page developer knows the unique id of the repository item),
the ItemLookupDroplet can be used. The following code example shows this droplet in action (using JSP in this case):
<%@ taglib uri="/dspTaglib" prefix="dsp"%>
<dsp:page>
  <dsp:importbean bean="/MyApplication/TopicRepository"/>
  <dsp:importbean bean="/atg/dynamo/droplet/ItemLookupDroplet"/>
  <dsp:setvalue bean="ItemLookupDroplet.useParams" value="true"/>
  <dsp:droplet name="ItemLookupDroplet">
    <dsp:param name="id" value="1"/>
    <dsp:param name="repository" bean="TopicRepository"/>
    <dsp:param name="itemDescriptor" value="topic"/>
    <dsp:oparam name="output">
      Name: <dsp:valueof param="element.topicName"/><br>
      Body: <dsp:valueof param="element.topicBody"/><br>
    </dsp:oparam>
  </dsp:droplet>
</dsp:page>
A couple of notes about this example:
• We provided three inputs: the unique id of the item (1), the name of the repository
(TopicRepository) that contains the item, and the type of item (topic).
• The output of the droplet is a RepositoryItem called element. We can retrieve the properties of that
item using simple dot notation (element.topicName, for example).
In many cases, page developers will not know the unique id of the item (or items) they want to display on the page. In fact,
what page developers often need to do is query the repository for a set of items that match some criteria. You might assume
that the page developers will use an industry standard such as SQL to perform this query. The problem with SQL is that it is
designed to work only with relational databases. Since a repository may have a relational database, an LDAP directory, or a
file system behind it, SQL is not an appropriate query language. ATG provides a SQL-like query language for repositories called
the Repository Query Language (RQL). ATG also provides droplets that can be used by page developers to execute RQL queries
and loop over the results.
The code example below shows how a JSP developer can use the RQLQueryForEach droplet to display a list of all the topics
in the TopicRepository that have at least 1 reply associated with them.
<%@ taglib uri="/dspTaglib" prefix="dsp"%>
<dsp:page>
  <dsp:droplet name="/atg/dynamo/droplet/RQLQueryForEach">
    <dsp:param name="queryRQL" value="numReplies >= 1"/>
    <dsp:param name="repository"
               value="/MyApplication/TopicRepository"/>
    <dsp:param name="itemDescriptor" value="topic"/>
    <dsp:oparam name="output">
      Name: <dsp:valueof param="element.topicName"/><br>
      Body: <dsp:valueof param="element.topicBody"/><br>
    </dsp:oparam>
  </dsp:droplet>
</dsp:page>
The primary difference between the ItemLookupDroplet and the RQLQueryForEach droplet is that
RQLQueryForEach requires an RQL statement as an input rather than an id. Also, the output oparam is
rendered once for each item that the RQL query returns.
4 Less Java Code Leads to Faster Time-to-Market and Less Maintenance
Developers who use the ATG Data Anywhere Architecture do not need to write, compile, or test Java classes or interfaces for
each persistent data type that they want to use in their application. A new persistent data type can be created simply by
writing an XML file that defines a mapping between a repository item and the underlying data structure, as shown in
Example 1 below.
Example 1: XML Required to Define a Persistent Type using Dynamo Repositories
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gsa-template PUBLIC "-//Art Technology Group, Inc.//DTD Dynamo Security//EN"
  "http://www.atg.com/dtds/gsa/gsa_1.0.dtd">
<gsa-template>
  <header>
    <name>Account Repository</name>
    <author>Pat Durante</author>
  </header>
  <item-descriptor name="account" default="true">
    <table name="Account" type="primary" id-column-name="account_id">
      <property name="accountId" column-name="account_id" data-type="string"/>
      <property name="type" data-type="string"/>
      <property name="balance" data-type="double"/>
      <property name="customerId" data-type="string"/>
    </table>
  </item-descriptor>
</gsa-template>
In this example, we are creating a new type of repository item that represents a bank account. The account item contains
four properties (accountId, type, balance, and customerId) that are mapped into the columns of the Account database
table.
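To make the idea of a descriptor-driven mapping concrete, here is a minimal Java sketch of how a mapping layer could turn such a description into a query without any per-type class. It is not the ATG Repository API: the class ItemDescriptorSketch and its methods are hypothetical, and it assumes that the id column is account_id and that a property without an explicit column name maps to a column of the same name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for an item descriptor: property name -> column name.
// This is NOT the ATG Repository API; it only illustrates descriptor-driven mapping.
final class ItemDescriptorSketch {
    private final String table;
    private final String idColumn;
    private final Map<String, String> columnsByProperty = new LinkedHashMap<>();

    ItemDescriptorSketch(String table, String idColumn) {
        this.table = table;
        this.idColumn = idColumn;
    }

    // Assumption: with no explicit column name, the column is named after the property.
    ItemDescriptorSketch property(String name) {
        return property(name, name);
    }

    ItemDescriptorSketch property(String name, String column) {
        columnsByProperty.put(name, column);
        return this;
    }

    // Builds the parameterized SELECT a mapping layer could issue to load one item.
    String selectById() {
        StringJoiner columns = new StringJoiner(", ");
        columnsByProperty.values().forEach(columns::add);
        return "SELECT " + columns + " FROM " + table + " WHERE " + idColumn + " = ?";
    }
}
```

Describing the account item with new ItemDescriptorSketch("Account", "account_id") and the four properties above is enough for selectById() to produce a complete statement; a new persistent type needs only a new description, not new code.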
With this item description in place, we can easily display all of the accounts on a dynamic web page, as shown in the
example below:
<%@ taglib uri="/dspTaglib" prefix="dsp"%>
<dsp:page>
<dsp:droplet name="/atg/dynamo/droplet/RQLQueryForEach">
<dsp:param name="queryRQL" value="ALL"/>
<dsp:param name="repository"
value="/MyApplication/AccountRepository"/>
<dsp:param name="itemDescriptor" value="account"/>
<dsp:oparam name="output">
Account Id: <dsp:valueof param="element.accountId"/><br>
Balance: <dsp:valueof param="element.balance"/><br>
</dsp:oparam>
</dsp:droplet>
</dsp:page>
The J2EE mechanism for representing a new persistent data type is to define a new Enterprise JavaBean (specifically an
EntityBean). Creating an EJB requires writing a fair amount of code (as shown in Example 2 below), and deploying an EJB
requires a significant amount of configuration work (XML) as well.
Example 2 – The Code Required for a Container Managed Entity Bean (EJB) (Account.java, AccountHome.java, and AccountBean.java)
Account.java:

package atg.atm.account;

import java.rmi.RemoteException;
import javax.ejb.*;

public interface Account extends EJBObject {
    public void setBalance(double pBalance) throws RemoteException;
    public double getBalance() throws RemoteException;
    public String getType() throws RemoteException;
    public String getCustomerId() throws RemoteException;
}
AccountHome.java:

package atg.atm.account;

import javax.ejb.*;
import java.rmi.RemoteException;
import java.util.*;

public interface AccountHome extends EJBHome {
    public Account create(String accountId, String customerID,
                          double initialBalance, String type)
        throws CreateException, RemoteException;
    public Account findByPrimaryKey(String primaryKey)
        throws FinderException, RemoteException;
    public Enumeration findAccountsForACustomer(String custId)
        throws FinderException, RemoteException;
}

AccountBean.java:

package atg.atm.account;

import java.io.Serializable;
import java.rmi.RemoteException;
import java.rmi.Remote;
import javax.ejb.*;
import java.util.*;

public class AccountBean implements EntityBean {
    private transient EntityContext ctx;
    public String accountId;
    public String customerId;
    public double balance;
    public String type;

    public void ejbActivate() throws RemoteException { ... }
    public void ejbPassivate() throws RemoteException { ... }
    public void setEntityContext(EntityContext ctx) throws RemoteException {
        this.ctx = ctx;
    }
    public void unsetEntityContext() throws RemoteException {
        this.ctx = null;
    }
    public void ejbLoad() throws RemoteException { }
    public void ejbStore() throws RemoteException { }
    public void ejbRemove() throws RemoteException { }

    public String ejbCreate(String accountId, String customerId,
                            double initialBalance, String type) {
        this.accountId = accountId;
        this.customerId = customerId;
        this.balance = initialBalance;
        this.type = type;
        return null;
    }

    public void ejbPostCreate(String accountId, String customerId,
                              double initialBalance, String type) { }

    public void setBalance(double pBalance) { balance = pBalance; }
    public double getBalance() { return balance; }
    public String getType() { return type; }
    public String getCustomerId() { return customerId; }
}
To be fair, there are tools in the marketplace that can generate most of this "boilerplate" code for a new EJB, but still this code
needs to be maintained and extended as the system grows. Also, even with the development and deployment of this new
EJB, the data is still not available to a dynamic page designer. According to the Sun Blueprint methodology, a JSP should not
access an EJB directly. This means that the developer has to write either a JavaBean or a Servlet that interacts with the EJB.
Only then can a JSP be created that includes dynamic information from a data source.
The JDO approach requires just a standard Java class. It provides transparent data access for all Java classes. Developers can
use existing classes or write new classes that need persistence. Example 3 below shows the code needed for our account
type.
Example 3: The Java Class Required By JDO
public class Account {
    private String accountId;
    private String customerId;
    private double balance;
    private String type;

    public Account(String accountId, String customerId,
                   double initialBalance, String type) {
        this.accountId = accountId;
        this.customerId = customerId;
        this.balance = initialBalance;
        this.type = type;
    }

    public void setBalance(double pBalance) { balance = pBalance; }
    public double getBalance() { return balance; }
    public String getType() { return type; }
    public String getCustomerId() { return customerId; }
}
Using the Visitor Profile (Out-of-the-Box)
By default, the visitor profiling data that is used by the ATG e-Business Platform is stored in the Solid relational database
management system that ships with the product suite. The basic information about a visitor is stored in a table called
dps_user (although several auxiliary tables are used to store additional information about visitors).
Figure 8 below provides a conceptual view of the out-of-the-box architecture.
Figure 8 – Out of the Box Profile Architecture
Note that the data source configuration files contain the only dependency on the Solid RDBMS.
[Figure 8 diagram not reproduced. Labels recoverable from it: item properties firstName, login, id; table columns
first_name, login, id; table names Usr_tbl and dps_user.]
Extending the Definition of a Repository Item
Using a Simple Auxiliary Table (To model a one-to-one relationship)
Let's say we want to extend the profile definition to include a subscription id for each visitor to the site. This too is
easy to do. The steps are as follows:
• Create the additional table to store the new data (create a one-to-one relationship between the existing
dps_user table and your new table). For example:
CREATE TABLE elrn_user (
id VARCHAR(40) not null,
subscription_id VARCHAR(32) null,
constraint elrn_user_p primary key ( id ),
constraint elrn_user_f foreign Key ( id )
references dps_user(id)
)
• Create a new userprofile.xml file (somewhere in your CONFIGPATH) to extend the definition of the
out-of-the-box user item descriptor. For example:
<gsa-template>
<item-descriptor name="user">
<table name="elrn_user" type="auxiliary"
id-column-name="id">
<property name="subscription_id"
column-name="subscription_id"
data-type="string"
category="eLearning"
display-name="Subscription Id"/>
</table>
</item-descriptor>
</gsa-template>
• Restart the server. That's it! No code changes required. The new property will show up in the ATG Control
Center and you'll be able to retrieve and/or modify the value of this new property from your dynamic web
pages!
<%@ taglib uri="/dspTaglib" prefix="dsp"%>
<dsp:page>
<dsp:importbean bean="/atg/userprofiling/Profile"/>
Welcome back, <dsp:valueof bean="Profile.firstName"/>!
Your subscription id is:
<dsp:valueof bean="Profile.subscription_id"/>.
</dsp:page>
Adding a Property to an EJB
Adding a single property to an EJB is considerably more involved than adding a property to a Repository Item type.
Let's say we would like to add a single boolean property to our Account EJB presented above (to keep track of
whether or not the account includes overdraft protection).
• Modify the schema of the account table to include a new column. (Alternatively, you can create a new
table and use a vendor-specific mapping to build a relationship between tables.)
• Modify the Account interface code to allow other programmers to gain access to the new property:
public boolean getOverdraftProtection() throws RemoteException;
• Modify the AccountHome interface to allow the overdraft property to be initialized upon account
creation:
public Account create(String accountId, String customerID,
double initialBalance, String type, boolean overdraft)
throws CreateException, RemoteException;
• Modify the AccountBean class to accommodate the additional create parameter:
public String ejbCreate(String accountId, String customerId, double
initialBalance, String type, boolean overdraft)
{
this.accountId = accountId;
this.customerId = customerId;
this.balance = initialBalance;
this.type = type;
this.overdraft = overdraft;
return null;
}
public void ejbPostCreate(String accountId, String customerId,
double initialBalance, String type,
boolean overdraft)
{ }
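• Add a field and a get method for the new property to the AccountBean class (the method name must match the
one declared in the Account interface):
public boolean overdraft;
public boolean getOverdraftProtection() { return overdraft; }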
• Re-deploy the J2EE application, making sure to map the new bean property to the appropriate database
column.
• Modify the JavaBean or the Servlet code that interacts with your EJB (since JSPs should not access an
EJB directly). At a minimum, you will need to add a method that can check to see if the account has
overdraft protection.
• You are now ready to access your new property from a JSP.
Using a "Multi" Table (to model a one-to-many relationship)
Let's say we want to extend the profile definition to include a list of each visitor's favorite subjects. Modeling a one-to-many
relationship is a little more involved, but still generally straightforward. The steps are as follows:
• Create the additional table to store the new data (create a one-to-many relationship between the existing
dps_user table and your new table). For example:
CREATE TABLE elrn_subjects (
id VARCHAR(32) not null,
subject VARCHAR(32) not null,
constraint elrn_subjects_p primary key ( id, subject ),
constraint elrn_subjects_f foreign Key ( id ) references dps_user(id)
)
• Create a new userprofile.xml file (somewhere in your CONFIGPATH) to extend the definition of the
out-of-the-box user item descriptor. Note that you can use a Set, List, array, or Map as the data-type of a
multi-value property. In this example, we will use a Set (since the order of the visitor's favorite subjects is
not important and we want each subject to be included only once). For example:
<gsa-template>
<item-descriptor name="user">
<table name="elrn_subjects" type="multi" id-column-name="id">
<property name="favoriteSubjects" column-name="subject"
data-type="set" component-data-type="string"/>
</table>
</item-descriptor>
</gsa-template>
• Restart the server. That's it! No code changes required. The new property will show up in the ATG Control
Center and you'll be able to retrieve and/or modify the values assigned to this new property from your
dynamic web pages!
<%@ taglib uri="/dspTaglib" prefix="dsp"%>
<dsp:page>
<dsp:importbean bean="/atg/userprofiling/Profile"/>
<dsp:importbean bean="/atg/dynamo/droplet/ForEach"/>
Welcome back, <dsp:valueof bean="Profile.firstName"/>!
Your favorite subjects are:
<dsp:droplet name="/atg/dynamo/droplet/ForEach">
<dsp:param bean="Profile.favoriteSubjects" name="array"/>
<dsp:oparam name="output">
<li><dsp:valueof param="element"/>
</dsp:oparam>
</dsp:droplet>
</dsp:page>
Switching to an Alternative Relational Database Management System
At times, companies need to change from one database system to another. For example, during the prototyping phase of a
new project, many web architects choose to use the free Solid database included with ATG Dynamo. Eventually, the
application will need to switch from the Solid RDBMS to a production-grade RDBMS (such as Oracle). The ATG Data Anywhere
Architecture makes this switch easy: all you have to do is create the appropriate tables in the RDBMS of your choice
and change a few properties files to update the connection information.
We recently built an application that makes use of Microsoft SQL Server instead of Solid. Here's what we needed to do to get
the ATG Dynamo e-Business Platform running against SQL Server:
• Create the necessary tables and indices in a SQL Server database using the provided SQL (for each ATG
product there is a set of SQL files containing DDL that can be used to create the appropriate tables and
indices). For example, under \ATG\Dynamo5.6\DAS\sql\install\mssql there is a file called das_ddl.sql
which you can use for this purpose.
• Create a new data source (or replace the existing configuration). We chose to replace the existing
configuration as shown below:
# /atg/dynamo/service/jdbc/MyDataSource
#Wed Nov 14 15:07:38 EST 2001
$class=atg.service.jdbc.MonitoredDataSource
$description=JTA Participating eLearning Datasource
$scope=global
dataSource=/atg/dynamo/service/jdbc/MyXADataSource
logListeners=/atg/dynamo/service/logging/LogQueue,/atg/dynamo/service/logging/ScreenLog
max=5
min=5
transactionManager=/atg/dynamo/transaction/TransactionManager
# /atg/dynamo/service/jdbc/MyXADataSource
#Wed Nov 14 15:05:06 EST 2001
$class=atg.service.jdbc.FakeXADataSource
$scope=global
URL=jdbc\:inetdae7\:hostname.atg.com\:1433
dataSourceJNDIName=
database=eLearningBeta
driver=com.inet.tds.TdsDriver
logListeners=/atg/dynamo/service/logging/LogQueue,/atg/dynamo/service/logging/ScreenLog
password=thepassword
server=hostname.atg.com\:1433
user=theuserid
IMPORTANT: Note that the change in data source required no changes to the application code, nor did it involve
changing the repository configuration. Changes were isolated to the data source configuration files.
Converting from one type of database to another
If you want to switch over to using an LDAP directory (such as iPlanet Directory Server), you can do that easily as well. This
paper does not provide details on how to accomplish this. To learn more, please read the following sections in the
ATG Personalization Programming Guide:
• Setting Up an LDAP Profile Repository
• Linking SQL and LDAP Repositories
5 A Unified View of Customer Interactions
The ATG Data Anywhere Architecture excels at providing a unified view of customer interactions. A unified view of customer
data leads to a coherent and consistent customer experience. For example, when a service call is logged using a call center
application, your web application is aware of the service call and its status.
One of the biggest challenges faced by a web application is getting access to information about a customer gathered outside
of a web context. As you know, many applications within the enterprise record information about customer interactions.
Call center applications and enterprise resource planning (ERP) systems are good examples of the kinds of systems used to
service customers in the enterprise.
The flexibility of ATG Data Anywhere allows you to "hook into" the important data gathered by call center, ERP, and other
enterprise applications without having to copy it all into a central repository.
Figure 9 below shows an example of an enterprise that gathers data about customer interactions using three disparate
systems (a web application, a call center application, and an ERP system).
Figure 9 – Enterprise data is often managed by disparate systems
[Figure 9 diagram not reproduced. Labels recoverable from it: Usr_tbl (first_name, login, id); Service_request
(description, status, id); Order_history (order_num, date, id).]
With ATG Data Anywhere, you can access all of this customer-focused data without relocating it. Figure 10 below shows one way this
can be accomplished.
Figure 10 – A Unified View of Customer Data using ATG Data Anywhere
[Figure 10 diagram not reproduced. Labels recoverable from it: item properties firstName, login, id, orders, calls;
Web Application Data: Usr_tbl (first_name, login, id); Call Center Application Data: Service_request (description,
status, id); ERP System Data: Order_history (order_num, date, id).]
6 Maximum Performance Through Intelligent Caching
Thoughtful design of database access is a key to achieving acceptable performance in web applications. You want to
minimize how often your application needs to access the database, while still maintaining data integrity. An intelligent
caching strategy is central to achieving these goals. Dynamo SQL Repositories provide intelligent caching of data objects to
ensure excellent performance and timely/accurate results.
The ATG Data Anywhere Architecture supports an intelligent and flexible caching model that provides fine-grained control to
deployment experts. When an item is first retrieved from the database, it is stored in memory (in a cache). Subsequent
queries for the same item will not necessarily need to access the database (as long as the cache is still "valid" the data
currently stored in memory can be used).
ATG Data Anywhere was designed to work in the harsh web environment, while other systems make caching assumptions
that are more appropriate to a low-scale intranet. In most cases, only a single server accesses a given data object (such as
the user profile) at a time. Our locked-mode caching (see Case 3 below) offers both high performance and data integrity.
Locked caching introduces a small amount of overhead to data access, since locks must be checked, set, and removed during
I/O. This checking ensures that cached data is never stale, which preserves data integrity. In the worst case, if the very
same item is being read and written all of the time, the performance effect is similar to omitting caching altogether. In the
normal case, however, when different data elements are being read and written by different systems, locked caching offers
performance similar to simple caching. The bottom line is high-performance caching with assured data integrity: the best of
both worlds.
See the white paper called "Caching Data for Scalability Without Losing Data Integrity" on atg.com for more information
about this topic.
Perhaps the most important thing to understand about Repository caching is that it is highly configurable to meet the needs
of your application. Developers can choose the appropriate caching strategy in the repository definition file (XML).
Dynamo SQL Repositories define four caching modes that can be used by developers as appropriate:
• Simple caching (which is the default)
• Locked caching
• Distributed caching
• Disabled (no caching)
Case 1: Single Dynamo Server
During development (and in some rare cases in production), an application may be deployed entirely on one Dynamo server.
The number of Dynamo servers that you'll be running on your production site will vary depending on the size/complexity of
your application and the number of simultaneous visitors the site needs to support.
Simple caching is recommended for single-Dynamo-server environments.
With simple cache mode, cached items are stored in each individual Dynamo server’s memory. No attempt is made to keep
cache items in sync between Dynamo servers. If one Dynamo server modifies an item, other servers will have inaccurate data
in cache until the cache is manually or automatically flushed (see "Invalidating the Cache" below).
Note that simple caching is the default. Sites running multiple Dynamo servers will want to change the cache mode (to
locked or distributed) on important item types (whose data changes periodically).
Case 2: Read frequently, modify rarely or never
Some types of repository items (such as product catalog items) are modified rarely or never on the production site. Simple
caching can be used in these situations too. This includes all items that you modify only on a staging server and that you do
not modify once they are published to your live site.
Developers will need to flush the cache on all Dynamo instances in the production environment whenever modifications are
being pushed from the staging environment to the live site (see "Invalidating the Cache" below).
Case 3: Modifications made by one Dynamo server at a time
Some types of repository items (such as the User Profile) are consistently used by one server at a time. The data may change
frequently during a visitor's session, but a session is handled by a single Dynamo server (unless a failover occurs).
Locked caching is recommended when typical usage will involve modification by one server at a time. If more than one server
tries to modify an item at the same time, the 2nd server will be locked out until the 1st server completes its modifications. If
you allow this to happen frequently, your site performance will suffer.
Locked Caching is based on write-locks and read-locks. If no servers have a write-lock for an item, any number of servers may
have a read-lock on that item. When a server requests a write-lock, all other servers are instructed to release their read-locks.
Once an item is write-locked, no other server may get a read-lock or write-lock until the first server releases its write-lock. In
other words, once a server has a write-lock on an item, all access to that item is blocked until the write is completed.
A server requests a read-lock the first time it tries to access an item. Once the server has a read-lock on the item, it holds
that read-lock until the lock manager notifies the server to release its read-lock. At that time, it drops the item from its cache.
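The read-lock and write-lock rules described above can be sketched in a few lines of Java. This is a simplified, single-item model written for illustration only; the class ItemLockSketch and its methods are hypothetical and are not ATG's lock manager, and a refused request simply returns instead of blocking.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical lock bookkeeping for ONE repository item; servers are plain strings.
// Illustrates the read-lock/write-lock rules only; it is not ATG's lock manager.
final class ItemLockSketch {
    private final Set<String> readers = new HashSet<>();
    private String writer; // null when no server holds the write-lock

    // A read-lock is granted unless some server holds the write-lock.
    boolean acquireRead(String server) {
        if (writer != null) {
            return false;
        }
        readers.add(server);
        return true;
    }

    // Granting a write-lock instructs every other server to release its read-lock
    // (in the real system, each of them would also drop the item from its cache).
    // Returns the servers told to release, or null if another server is writing.
    Set<String> acquireWrite(String server) {
        if (writer != null && !writer.equals(server)) {
            return null;
        }
        Set<String> released = new HashSet<>(readers);
        released.remove(server);
        readers.clear();
        writer = server;
        return released;
    }

    // Once the write completes, other servers may lock the item again.
    void releaseWrite(String server) {
        if (server.equals(writer)) {
            writer = null;
        }
    }
}
```

In this model, two servers can hold read-locks at once, a write request by one of them evicts the other reader, and every further read or write request is refused until releaseWrite is called.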
Case 4: Modification by multiple Dynamo servers
In extreme cases, in a multiple Dynamo server environment, you need the ability to notify all other servers that an item has
been modified (even if those other servers are not going to modify the item themselves).
Use distributed caching for items that are modified infrequently during runtime. Distributed mode works best if there is
little chance that two Dynamo servers will attempt to access and change a repository item at the same time. For items that
change more frequently, use locked mode.
Distributed mode allows any Dynamo server to read or modify an item in cache. When one Dynamo server modifies an item,
it broadcasts a JMS cache invalidation event to all servers (see "Invalidating the Cache" below). Distributed caching uses
asynchronous message delivery. This means that there is a slight chance of a user getting stale data, until the invalidation
event message is received by all servers: if a user logged in on one server makes a change to an item, and another user logged
in on a different server requests that item after the change is made, but before the second server has received the cache
invalidation event, the second user would get stale data.
This mode is seldom used; in most cases, locked caching is preferable to distributed caching.
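The stale-data window of distributed mode can be reproduced with a small simulation. The sketch below is an assumption-laden model, not ATG code: DistributedCacheSketch is a hypothetical class, the "database" is a map, and a queue stands in for asynchronous JMS delivery.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of several servers sharing one database, each with its own cache.
// Invalidation events are queued to imitate asynchronous delivery; no JMS is involved.
final class DistributedCacheSketch {
    private final Map<String, String> database = new HashMap<>();
    private final Map<String, Map<String, String>> cacheByServer = new HashMap<>();
    // Each pending event is {targetServer, itemId}.
    private final Queue<String[]> pendingInvalidations = new ArrayDeque<>();

    DistributedCacheSketch(String... servers) {
        for (String server : servers) {
            cacheByServer.put(server, new HashMap<>());
        }
    }

    void seed(String itemId, String value) {
        database.put(itemId, value);
    }

    // Reads come from the local cache when possible, otherwise from the database.
    String read(String server, String itemId) {
        return cacheByServer.get(server).computeIfAbsent(itemId, database::get);
    }

    // A write updates the database and the writer's own cache, then queues an
    // invalidation event for every other server.
    void write(String server, String itemId, String value) {
        database.put(itemId, value);
        cacheByServer.get(server).put(itemId, value);
        for (String other : cacheByServer.keySet()) {
            if (!other.equals(server)) {
                pendingInvalidations.add(new String[] {other, itemId});
            }
        }
    }

    // Delivers all queued events; until this runs, other servers may serve stale data.
    void deliverInvalidations() {
        String[] event;
        while ((event = pendingInvalidations.poll()) != null) {
            cacheByServer.get(event[0]).remove(event[1]);
        }
    }
}
```

If one server writes an item that a second server has already cached, the second server keeps returning the old value until deliverInvalidations() runs, which is exactly the window described above.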
Case 5: Modification by a non-Dynamo Application
On some sites, the data used by the web application can be modified by a third party system. In these cases, you will need to
either disable caching or find a way to notify the production Dynamo servers whenever a change is made by the third party
application (using messaging).
Disabling Caching
Disabled caching should be used with great caution, because it will result in database access for every page that
accesses an item of this type. This potentially has a severe impact on performance.
Caching should be disabled when there is a possibility that the underlying data will be changed by a non-Dynamo
Repository application. For instance, if you have an on-line banking application, and the same data is accessed by
other applications in addition to Dynamo, you may want to turn off caching for displaying the user’s account
balances.
The other caching modes can only be set on a per-item-type basis, but disabled caching mode may be set on a per-
property basis. If a request is made for a disabled cache property of a cached item, the database will be queried.
Example from userprofile.xml:
<property category="Login" name="password" data-type="string"
required="true"
column-name="password" cache-mode="disabled"/>
Invalidating the Cache
Usually cache invalidation happens automatically when repository items are changed using the Repository API.
Sometimes it is necessary to force cache invalidation manually, such as when the contents of the database are
changed directly by a third-party application (without going through the Repository API).
One way of handling integration with a third-party application is to have the third-party application send a JMS
message whenever interesting data is modified. Your Dynamo application can then receive the message and
programmatically invalidate the appropriate cache.
To flush all items and all queries in all caches in a specific repository, use:
atg.repository.RepositoryImpl.invalidateCaches()
To flush the caches associated with a specific item type, use:
// The following method empties the item cache for the given item
// descriptor
atg.repository.ItemDescriptorImpl.invalidateItemCache()
// The following method empties the item cache and query caches
// for this item descriptor
atg.repository.ItemDescriptorImpl.invalidateCaches()
// The following method removes a specific repository item
// from the cache
atg.repository.ItemDescriptorImpl.removeItemFromCache(String itemId)
Controlling the Cache Sizes
The size of each repository item cache is configurable as well. By default, the item cache size is 1000 items. After running
your site for some time, you can get a good idea of how well the repository item caches are working by going to the
repository's page in the Dynamo Administration interface. For example, the Administration interface page for the Commerce
Product Catalog repository is:
http://localhost:8830/nucleus/atg/commerce/ProductCatalog
Under the heading Cache usage statistics, this page lists, for each item descriptor, the number of items and queries in the
item and query caches, the cache size, the percent of the cache in use, the hit count, the miss count, and the hit ratio. If you
have a high quantity of misses and no hits, you are gaining no benefit from caching, and you can probably just turn it off, by
setting the cache size to 0. If you have a mix of hits and misses, you might want to increase the cache size. If you have all hits
and no misses, your cache size is big enough and perhaps too big. There is no harm in setting a cache to be too big unless it
will fill up eventually and consume more memory than is necessary.
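These rules of thumb can be written down as a small decision helper. The sketch below is only an illustration of the reasoning in the paragraph above; CacheSizingSketch, its enum, and its method names are hypothetical and are not part of the Dynamo Administration interface or any ATG API.

```java
// Hypothetical helper that encodes the rules of thumb for reading cache statistics.
final class CacheSizingSketch {
    enum Advice { NO_DATA, DISABLE_CACHE, CONSIDER_LARGER_CACHE, SIZE_SUFFICIENT }

    static Advice advise(long hits, long misses) {
        if (hits == 0 && misses == 0) {
            return Advice.NO_DATA;               // nothing measured yet
        }
        if (hits == 0) {
            return Advice.DISABLE_CACHE;         // misses only: caching brings no benefit
        }
        if (misses == 0) {
            return Advice.SIZE_SUFFICIENT;       // hits only: big enough, perhaps too big
        }
        return Advice.CONSIDER_LARGER_CACHE;     // a mix of hits and misses
    }

    // Fraction of lookups served from the cache, in the range 0.0 to 1.0.
    static double hitRatio(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

The helper treats any mix of hits and misses as a reason to consider a larger cache, as the text does; in practice you would also weigh the hit ratio against the memory the cache consumes.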
The cache size can be adjusted in a repository definition file as shown in the following example:
<gsa-template>
<item-descriptor name="topic" cache-mode="locked"
query-cache-size="100" item-cache-size="1500">
...
</item-descriptor>
</gsa-template>
There are actually two types of caches in the repository. The item-cache caches the item and the properties; the
query-cache caches the result set so that you don't need to hit the database to find out which items to return
when the same query is issued again and again. By default, query caching is turned off (the default
query-cache-size is set to zero).
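The division of labor between the two caches can be illustrated with a toy model. The sketch below is hypothetical (TwoLevelCacheSketch is not an ATG class): a map stands in for the database, every item matches the query, and the query cache is switched on, unlike the default described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-level cache: the query cache remembers WHICH ids a query returned,
// the item cache remembers the property values of each id.
final class TwoLevelCacheSketch {
    private final Map<String, Map<String, String>> table = new HashMap<>(); // stand-in database
    private final Map<String, List<String>> queryCache = new HashMap<>();
    private final Map<String, Map<String, String>> itemCache = new HashMap<>();
    int databaseTrips = 0;

    void store(String id, Map<String, String> properties) {
        table.put(id, properties);
    }

    // Toy query: the query text is only used as the cache key, and every item matches.
    List<Map<String, String>> query(String rql) {
        List<String> ids = queryCache.get(rql);
        if (ids == null) {
            databaseTrips++;                       // one trip to find the matching ids
            ids = new ArrayList<>(table.keySet());
            queryCache.put(rql, ids);
        }
        List<Map<String, String>> items = new ArrayList<>();
        for (String id : ids) {
            items.add(item(id));
        }
        return items;
    }

    Map<String, String> item(String id) {
        Map<String, String> cached = itemCache.get(id);
        if (cached == null) {
            databaseTrips++;                       // one trip per uncached item
            cached = table.get(id);
            itemCache.put(id, cached);
        }
        return cached;
    }
}
```

With two stored items, the first query costs three trips (one for the ids, one per item) and repeating it costs none. Without the query cache, the id lookup would be repeated on every call even though the items themselves stay cached.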
7 Simplified Transactional Control
Overview of Transactional Integrity
Web applications need to be built carefully to balance the integrity of the data they manage with their performance goals.
Consider a web application that allows visitors to transfer funds between bank accounts. A "transfer" operation really
involves two actions (a debit from the source account and a credit to the destination account). In order to maintain the
integrity of the data, both actions must complete successfully. If anything goes wrong during the "transfer" operation, the
account balances should be "rolled back" to their original amounts. A system with transactional integrity will allow a
developer to group multiple actions (e.g., a debit and a credit) into a single activity that either succeeds as a whole or fails as a
whole.
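The all-or-nothing behavior of the transfer can be sketched in plain Java. This example is a hypothetical in-memory model (TransferSketch is not part of JTA or ATG): it imitates begin, commit, and rollback with a snapshot of the balances, and it keeps amounts in whole cents to avoid rounding issues.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory ledger; it imitates commit/rollback with a snapshot
// and does not use JTA or any real transaction manager.
final class TransferSketch {
    private final Map<String, Long> balances = new HashMap<>(); // amounts in cents

    void open(String accountId, long cents) {
        balances.put(accountId, cents);
    }

    long balance(String accountId) {
        return balances.get(accountId);
    }

    // The debit and the credit succeed together or not at all.
    boolean transfer(String from, String to, long cents) {
        Map<String, Long> snapshot = new HashMap<>(balances); // "begin"
        try {
            debit(from, cents);
            credit(to, cents);
            return true;                                      // "commit"
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot);                        // "rollback"
            return false;
        }
    }

    private void debit(String id, long cents) {
        long current = require(id);
        if (current < cents) {
            throw new IllegalStateException("insufficient funds in " + id);
        }
        balances.put(id, current - cents);
    }

    private void credit(String id, long cents) {
        balances.put(id, require(id) + cents);
    }

    private long require(String id) {
        Long current = balances.get(id);
        if (current == null) {
            throw new IllegalArgumentException("no such account: " + id);
        }
        return current;
    }
}
```

A transfer to a nonexistent account fails at the credit step, after the debit has already been applied; the rollback restores the source balance, so no money disappears.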
The J2EE Approach to Transactional Integrity
J2EE provides a vendor and data source independent mechanism for managing transactions called Java Transaction API (JTA).
JTA allows developers to control transactional boundaries (start, commit, rollback). J2EE also defines six transaction
demarcation modes (REQUIRED, REQUIRES_NEW, NOT_SUPPORTED, SUPPORTS, MANDATORY, NEVER) for specifying the
scope and impact of transactions on a particular Enterprise JavaBean method. All J2EE containers must provide a
UserTransaction component which exposes programmatic control of transactions to developers.
J2EE paved the way for what is called declarative transactional demarcation that allows the developer to establish the
transactional behavior outside of Java code. In J2EE, the transactional behavior for a particular method is specified at
deployment time in a deployment descriptor (an XML file).
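The six demarcation modes differ only in how they react to a transaction that is (or is not) already in progress when the method is called. The following Java sketch tabulates that behavior as it is commonly described for EJB transaction attributes. The enum names TxModeSketch and TxAction are hypothetical and exist only for this illustration; they are not J2EE or ATG types.

```java
// Hypothetical model of the six modes; it is not the J2EE or ATG API.
enum TxAction {
    JOIN_EXISTING,           // run inside the caller's transaction
    BEGIN_NEW,               // start a transaction for this call
    SUSPEND_AND_BEGIN_NEW,   // set the caller's transaction aside, start a fresh one
    SUSPEND_AND_RUN_WITHOUT, // set the caller's transaction aside, run with none
    RUN_WITHOUT              // run with no transaction at all
}

enum TxModeSketch {
    REQUIRED, REQUIRES_NEW, NOT_SUPPORTED, SUPPORTS, MANDATORY, NEVER;

    // Describes what happens when a method with this mode is invoked.
    TxAction onInvoke(boolean callerHasTransaction) {
        switch (this) {
            case REQUIRED:
                return callerHasTransaction ? TxAction.JOIN_EXISTING : TxAction.BEGIN_NEW;
            case REQUIRES_NEW:
                return callerHasTransaction ? TxAction.SUSPEND_AND_BEGIN_NEW : TxAction.BEGIN_NEW;
            case NOT_SUPPORTED:
                return callerHasTransaction ? TxAction.SUSPEND_AND_RUN_WITHOUT : TxAction.RUN_WITHOUT;
            case SUPPORTS:
                return callerHasTransaction ? TxAction.JOIN_EXISTING : TxAction.RUN_WITHOUT;
            case MANDATORY:
                if (!callerHasTransaction) {
                    throw new IllegalStateException("MANDATORY needs an existing transaction");
                }
                return TxAction.JOIN_EXISTING;
            case NEVER:
                if (callerHasTransaction) {
                    throw new IllegalStateException("NEVER forbids an existing transaction");
                }
                return TxAction.RUN_WITHOUT;
            default:
                throw new AssertionError("unhandled mode: " + this);
        }
    }
}
```

For example, TxModeSketch.REQUIRED.onInvoke(false) yields BEGIN_NEW, while TxModeSketch.SUPPORTS.onInvoke(false) yields RUN_WITHOUT.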
ATG Data Anywhere Support for Transactional Integrity
The ATG Data Anywhere Architecture™ supports all of the requirements of the J2EE specification.
• The ATG Dynamo Application Server™ provides a fully J2EE-compliant TransactionManager,
but if you are building on a third-party application server (such as BEA WebLogic), you can use its
TransactionManager in place of ours.
• Transactional boundaries can be set declaratively (for EJBs) and programmatically (using JTA).
The downside of transactional integrity is that the performance of the data access functions is slowed by the overhead of
tracking data access operations occurring within transactions. The key to good overall system performance is to minimize
the impact of transactions. For this reason, the ATG Data Anywhere Architecture™ allows page developers and Java
developers to control the scope of transactions.
Advantages of the ATG Data Anywhere Approach
• Page developers can use simple droplets to control transactional boundaries (without writing Java code).
• Java developers can leverage the transactional demarcation modes using the provided
TransactionDemarcation class. In J2EE, only EJB methods can use the modal demarcation technique;
with ATG, this technique can be used in simple JavaBeans and Servlets as well.
Example Page
<dsp:page>
<dsp:importbean bean="/atg/dynamo/transaction/droplet/Transaction"/>
<dsp:importbean bean="/atg/dynamo/transaction/droplet/EndTransaction"/>
<dsp:importbean bean="/atg/userprofiling/Profile"/>
<dsp:droplet name="Transaction">
  <dsp:param name="transAttribute" value="required"/>
  <dsp:oparam name="output">
    One transaction instead of three:
    <P> <dsp:valueof bean="Profile.firstName" />
    <P> <dsp:valueof bean="Profile.lastName" />
    <P> <dsp:valueof bean="Profile.city" />
    <dsp:droplet name="EndTransaction">
      <dsp:param name="op" value="commit"/>
      <dsp:oparam name="successOutput">
        The transaction ended successfully!
      </dsp:oparam>
      <dsp:oparam name="errorOutput">
        Failure: <dsp:valueof param="errorMessage"/>
      </dsp:oparam>
    </dsp:droplet>
  </dsp:oparam>
</dsp:droplet>
</dsp:page>
Default Transactional Behavior
In order to protect the integrity of data, SQL Repositories wrap a "required" mode transaction around every property read and
write. This ensures that transactional integrity is enforced by default, but developers need to consider the
performance implications of such granular transactional scope. Unless a developer creates a transaction of his or her own
(programmatically or using droplets), a new transaction is started and ended every time the getPropertyValue or
setPropertyValue method is called on a repository item. To achieve good performance, developers need to be
aware of this default behavior and override it when appropriate. For example, on a dynamic page that displays multiple
properties from the user profile, it is more efficient to read all of the properties in a single transaction than to begin
and end a transaction for each property.
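The cost of the default behavior can be illustrated by counting transactions. In the Java sketch below, TxCounterSketch and ProfileItemSketch are hypothetical stand-ins (they are not ATG classes); the getPropertyValue method imitates "required" mode by joining a transaction that is already active, or by wrapping the single read in a transaction of its own when none is active.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins used to count transactions; these are not ATG classes.
class TxCounterSketch {
    int started = 0;        // how many transactions have been begun so far
    private boolean active; // whether a transaction is currently in progress

    boolean isActive() {
        return active;
    }

    void begin() {
        started++;
        active = true;
    }

    void commit() {
        active = false;
    }
}

class ProfileItemSketch {
    private final Map<String, Object> values = new HashMap<>();
    private final TxCounterSketch tx;

    ProfileItemSketch(TxCounterSketch tx) {
        this.tx = tx;
        values.put("firstName", "Pat");
        values.put("lastName", "Durante");
        values.put("city", "Cambridge");
    }

    // Imitates "required" mode: join the active transaction, or wrap this one read in its own.
    Object getPropertyValue(String name) {
        boolean startedHere = !tx.isActive();
        if (startedHere) {
            tx.begin();
        }
        try {
            return values.get(name);
        } finally {
            if (startedHere) {
                tx.commit();
            }
        }
    }
}
```

Reading three properties with no surrounding transaction starts three transactions in this model. Calling begin() first, reading the same three properties, and then calling commit() starts only one.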
Recommendations
• Use the Transaction droplets when displaying repository information.
• When processing a form, use programmatic demarcation (typically at the start and end of your handler
methods).
8 Strong Built-in Search Capabilities
The ATG Data Anywhere Architecture™ provides a powerful set of Repository searching tools. We've already examined
some of the searching mechanisms that are provided: searching by id (using the ItemLookupDroplet) and querying
against a single item type within a single repository (using the RQLForEachDroplet). ATG also provides a
SearchFormHandler that can be configured for most of your "search page" needs.
The SearchFormHandler supports several types of searching:
• Keyword searches allow you to build a search page in which visitors enter a set of keywords; the search then
queries all of the item properties that hold keywords. For example, "find all products in the catalog
with the keyword tools".
• Text searches allow your visitors to perform full-text searches. Dynamo can simulate full-text searches
or make use of your RDBMS-specific one (if it is available). For example, "find all products in the catalog
whose description contains quality".
• Hierarchical searches allow your visitors to limit a search to a particular subset of items. For example,
"find all products in the catalog with the keyword tools in the home-goods category".
• Advanced searches (also called parametric searches) allow your visitors to limit the search based on a
range of property values ("find all recipes whose cook time is between 5 and 20 minutes") or based on a
specific enumerated value ("find all movies with the keyword action where the rating is PG-13").
• Combination searches: any of the above search types can be combined.
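The search types above are configured on the form handler rather than coded, but their logic can be pictured as filters that combine. The Java sketch below expresses each type as a predicate over a small in-memory catalog. Everything in it is hypothetical and for illustration only: ProductSketch, SearchSketch, the price property, and the sample data are assumptions, and none of it reflects the SearchFormHandler's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical catalog item; not an ATG repository item.
class ProductSketch {
    final String name;
    final String description;
    final String category;
    final List<String> keywords;
    final int price;

    ProductSketch(String name, String description, String category,
                  List<String> keywords, int price) {
        this.name = name;
        this.description = description;
        this.category = category;
        this.keywords = keywords;
        this.price = price;
    }
}

class SearchSketch {
    // Keyword search: match against the property that holds keywords.
    static Predicate<ProductSketch> keyword(String word) {
        return p -> p.keywords.contains(word);
    }

    // Text search: simulated here as a case-insensitive substring match.
    static Predicate<ProductSketch> text(String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        return p -> p.description.toLowerCase(Locale.ROOT).contains(needle);
    }

    // Hierarchical search: limit results to one category.
    static Predicate<ProductSketch> inCategory(String category) {
        return p -> p.category.equals(category);
    }

    // Advanced (parametric) search: limit results to a range of property values.
    static Predicate<ProductSketch> priceBetween(int low, int high) {
        return p -> p.price >= low && p.price <= high;
    }

    // Returns the names of all products that satisfy the criteria, in catalog order.
    static List<String> search(List<ProductSketch> catalog, Predicate<ProductSketch> criteria) {
        List<String> names = new ArrayList<>();
        for (ProductSketch product : catalog) {
            if (criteria.test(product)) {
                names.add(product.name);
            }
        }
        return names;
    }
}
```

A combination search is then simply a conjunction of predicates, for example SearchSketch.keyword("tools").and(SearchSketch.inCategory("home-goods")).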
Searching for content across repositories and item types is an extremely powerful feature. It allows visitors and developers
to find the data they need more rapidly. Quality searching tools lead to higher satisfaction, greater efficiency, and potentially
more revenue.
Once again, developers do not need to write Java code to include a search function in their applications. The provided form
handler can be configured to perform a great variety of searches. If the provided form handler does not meet your developers'
needs, they can use inheritance to extend the provided class. For example, ATG developers specialized the search form
handler for searching the commerce catalog (it allows the results to be presented as matching categories followed by
matching products).
9 Fine-grained Access Control
The ATG Data Anywhere Architecture™ provides a Secured Repository system that works with the Dynamo Security System
to provide fine-grained access control to repository item descriptors, individual repository items, and even individual
properties through the use of Access Control Lists (ACLs).
Any repository can be secured by configuring an instance of a Secured Repository Adapter on top of the repository
instance. Depending on the security features you desire, some new properties (such as an owner property and an acl
property) may have to be added to the underlying repository in order to support access control information storage.
Case 1: Controlling access to all items of the same type
The most basic level of access control is at the item type level. This is similar to controlling access to a particular database
table. For example, you can specify that only members of the administrators group have access to user profile items.
Case 2: Controlling access to specific items
The next level of access control is on specific items. This is similar to access control of a single row in a database. For
example, you can specify that members of the education managers group have access to user profile items for people who
work in the education department.
Case 3: Controlling access to specific properties
You can even control who is allowed to read/write a particular property of an item. For example, you can specify that
members of the human resources group can retrieve the salary property within certain user profile items.
Case 4: Limiting Query Results
You can control who can receive certain repository items as results from a repository query. For example, you can specify that
only the owner can query new items until the owner previews and approves the item.
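The four cases can be thought of as layered checks. The Java sketch below shows one plausible way to layer them; it is an assumption made for illustration, and the order in which the Secured Repository actually evaluates ACLs may differ. AccessControlSketch, its method names, and the group and item names used with it are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the four cases; not the Secured Repository API.
class AccessControlSketch {
    private final Set<String> typeReaders = new HashSet<>();                 // Case 1
    private final Map<String, Set<String>> itemReaders = new HashMap<>();    // Case 2
    private final Map<String, Set<String>> propertyReaders = new HashMap<>(); // Case 3

    // Case 1: grant a group read access to every item of the type.
    void allowType(String group) {
        typeReaders.add(group);
    }

    // Case 2: restrict one item to the listed groups.
    void restrictItem(String itemId, String... groups) {
        itemReaders.put(itemId, new HashSet<>(Arrays.asList(groups)));
    }

    // Case 3: restrict one property to the listed groups.
    void restrictProperty(String property, String... groups) {
        propertyReaders.put(property, new HashSet<>(Arrays.asList(groups)));
    }

    boolean canReadItem(String group, String itemId) {
        if (!typeReaders.contains(group)) {
            return false; // no access to the item type at all
        }
        Set<String> allowed = itemReaders.get(itemId);
        return allowed == null || allowed.contains(group);
    }

    boolean canReadProperty(String group, String itemId, String property) {
        if (!canReadItem(group, itemId)) {
            return false;
        }
        Set<String> allowed = propertyReaders.get(property);
        return allowed == null || allowed.contains(group);
    }

    // Case 4: drop items the group may not see from a query's results.
    List<String> filterResults(String group, List<String> itemIds) {
        List<String> visible = new ArrayList<>();
        for (String itemId : itemIds) {
            if (canReadItem(group, itemId)) {
                visible.add(itemId);
            }
        }
        return visible;
    }
}
```

In this model an item or property with no restriction of its own simply inherits the type-level decision, which mirrors the progression from table-like to row-like to column-like control described above.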
Creating a Secured Repository
1. Modify the underlying repository. For those item descriptors you want to secure, you need to make some
minor modifications to the underlying data and item descriptors to add properties with which to store the
Access Control List (a String or an array of Strings) and owner (a user profile) information. For example:
<item-descriptor name="account">
  <table name="Account" type="primary"
         id-column-name="accountId">
    <property name="accountId" data-type="string" />
    <property name="type" data-type="string" />
    <property name="ACL" data-type="string" />
    <property name="accountOwner" component-type="user" />
  </table>
</item-descriptor>
2. Configure the Secured Repository Adapter component. You need to wrap a Secure Repository component
around the underlying repository. For example:
SecureAccountRepository.properties:
$class=atg.adapter.secure.GenericSecuredMutableRepository
$scope=global
# The name property is for the ACC.
name=Secure Account Repository
repositoryName=SecureAccountRepository
# The repository property refers to the underlying repository
repository=AccountRepository
configurationFile=secureAccountRepository.xml
securityConfiguration=/atg/dynamo/security/SecuredRepository/SecurityConfiguration
3. Write a secure repository definition file to spell out the access control you desire. The following example
first establishes the name of the owner property (accountOwner) and the name of the property holding
the access control list (ACL). It then establishes an ACL that grants read, write, and list (for queries) access to
account items to members of the ACC's administrators-group.
<!DOCTYPE secured-repository-template
  PUBLIC "-//Art Technology Group, Inc.//DTD General SQL Adapter//EN"
  "http://www.atg.com/dtds/security/secured_repository_template_1.1.dtd">
<secured-repository-template>
  <item-descriptor name="account">
    <owner-property name="accountOwner"/>
    <acl-property name="ACL"/>
    <descriptor-acl
      value="Admin$role$administrators-group:list,read,write;"/>
  </item-descriptor>
</secured-repository-template>
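The descriptor-acl value in this example reads as an identity, a colon, a comma-separated list of access rights, and a terminating semicolon. The Java sketch below parses a string of that shape into a map from identity to rights. The grammar is inferred from this single example and is an assumption (the full ACL syntax is defined by the Dynamo Security System, not by this sketch), and AclParserSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical parser; the grammar is inferred from the single example above.
class AclParserSketch {
    // Parses "identity:right,right;identity:right;" into identity -> rights.
    static Map<String, Set<String>> parse(String acl) {
        Map<String, Set<String>> entries = new LinkedHashMap<>();
        for (String entry : acl.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray separators and whitespace
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("missing ':' in ACL entry: " + trimmed);
            }
            String identity = trimmed.substring(0, colon);
            Set<String> rights = new LinkedHashSet<>();
            for (String right : trimmed.substring(colon + 1).split(",")) {
                String name = right.trim();
                if (!name.isEmpty()) {
                    rights.add(name);
                }
            }
            entries.put(identity, rights);
        }
        return entries;
    }
}
```

Applied to the value above, the sketch yields one entry whose identity is Admin$role$administrators-group and whose rights are list, read, and write.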
10 Conclusions: ATG Data Anywhere Architecture™ Decreases Total Cost of Ownership
At the heart of all web applications is data access. Data access makes web sites more intelligent and thereby more useful for
companies and customers. Web applications require a data access mechanism to interact with user profiling information,
web site content, and enterprise data. A customer-facing web site needs to have a unified view of all customer interactions
(including sales force interactions, call center interactions, and web experiences). This unified view of customer data leads to
an integrated and coherent customer experience.
Data access for a web application is especially complex because the object-oriented world of a Java application is quite
different from the structure of data in a relational database, an LDAP directory, or a file system. The way you access each of
these data sources varies dramatically, so developers have to learn the tools and tips of each kind of system.
J2EE provides some support for data access in the form of JDBC and container-managed entity beans (EJBs), but most
implementations of these technologies focus on mapping relational data to objects. Of course, developers can create
bean-managed EJBs that interface to whatever data source they want (as long as they are willing to write a lot of code or
use a tool to help them).
JDO is another data access standard that supports data source independence, but it still requires developers to
write a Java class for each persistent type, and caching of data objects is left to the vendor implementations.
As you can see, the ATG Data Anywhere Architecture™ has several advantages over traditional data access mechanisms, as
summarized below:
• Insulates application developers from changes to the schema and to the storage mechanism (data can move
from a relational database to an LDAP directory without requiring any re-coding).
• Unifies your customer data without copying it all into a central data source.
• Provides intelligent caching of data objects to ensure excellent performance and timely, accurate results.
• Simplifies transactional control (programmatic demarcation using modes, or droplets on a dynamic
page).
• Provides powerful searching tools out-of-the-box that can span data sources and data types.
• Provides fine-grained access control to data, all the way down to the individual property level.
• Is easier to use and more powerful than Java Data Objects (JDO) and Enterprise JavaBeans (shorter
learning curve, no code required to represent a persistent type, simpler configuration, more than just
relational database support).
The ATG Data Anywhere Architecture provides advantages that go well beyond the other options. Dynamo Repositories
allow developers to focus on implementing business logic rather than spending time writing "wrapper classes" for each
persistent data type used by the application. This focus directly improves time-to-market and significantly reduces the total
cost of ownership of web applications.
Appendix: Other Sources of Information
Documentation
ATG Dynamo Application Server Programming Guide
    Part II: Repositories
ATG Dynamo Personalization Programming Guide
    Setting Up a Profile Repository
    Setting Up an LDAP Profile Repository
    Linking SQL and LDAP Repositories
    Working with the Dynamo User Directory
    Setting Up an LDAP User Directory
ATG Dynamo Administration Guide
    Using JDBC with Dynamo
    Configuring Databases
    Managing Database Servers
    Repository and Database Performance
ATG Dynamo Page Developers Guide
    Using Search Forms
Dynamo 5 ER Diagrams
Education
See atg.com for more information about these education offerings.
Instructor-led Training
    ATG Dynamo Essentials for Java Developers (5 days)
    Utilizing Dynamo Repositories (2 days)
Self-directed learning
    Mastering Web Applications
    Mastering Personalized Applications
ATG e-Learning Connection
    Extending the User Profile (an e-Course)
This publication may not, in whole or in part, be copied, photocopied, translated, or reduced to any electronic medium or machine-readable form for commercial use without prior consent, in writing, from Art Technology Group (ATG), Inc. ATG does authorize you to copy documents published by ATG on the World Wide Web for non-commercial uses within your organization only. In consideration of this authorization, you agree that any copy of these documents which you make shall retain all copyright and other proprietary notices contained herein.
This documentation is provided “as is” without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. The contents of this publication could include technical inaccuracies or typographical errors. Changes are periodically added to the information herein; these changes will be incorporated in the new editions of the publication. ATG may make improvements and/or changes in the publication and/or product(s) described in the publication at any time without notice.
In no event will ATG be liable for direct, indirect, special, incidental, economic, cover, or consequential damages arising out of the use of or inability to use this documentation even if advised of the possibility of such damages. Some states do not allow the exclusion or limitation of implied warranties or limitation of liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you.
Acknowledgments
I would like to thank all of the people who have contributed along the way to the creation of this paper. First and foremost, thanks to Bill Morrison, ATG Product Marketing Manager, who sponsored the creation of this white paper.
Extra special thanks to the following trainers and courseware developers whose inspiration, ideas, diagrams, words, and experience have been used as source material for this white paper: Diana Carroll, Blake Crawford, Kevin Johnson, Pierre Billon, Karin Layher, and Paul Donovan.
Thanks go to the following folks who reviewed the paper and provided helpful feedback: Blake Crawford, Karen Kilty, Joyce Wang, and Nathan Abramson.
Final thanks go to my wife Bonnie Durante who put up with long hours spent designing, writing, and proofreading.
www.atg.com/offices
America Headquarters Art Technology Group, Inc.
25 First Street Second Floor
Cambridge, MA 02141 USA
Tel: +1 617 386 1000
Fax: +1 617 386 1111
North American Offices
Atlanta / Chicago / Dallas / Los Angeles / New York / Palo Alto / San Francisco / Toronto / Washington DC
European Headquarters
Art Technology Group (Europe), Ltd
Apex Plaza Forbury Road
Reading RG1 1AX UK
Tel: +44 0 118 956 5000
Fax: +44 0 118 956 5001
European Offices
Amsterdam / Frankfurt / London / Milan / Paris / Stockholm
Asia/Pacific Headquarters
Art Technology Group, Inc.
Suite 46 Level 11 Tower B
Zenith Centre
821 Pacific Highway
Chatswood NSW 2067
Sydney Australia
+61 2 8448 2071
+61 2 8448 2010
Asia/Pacific Offices Hong Kong / Singapore
Japan Headquarters Art Technology Group, Inc.
Imperial Tower, 15th Floor
1-1-1 Uchisaiwaicho
Chiyoda-ku, Tokyo 100-0011, Japan
www.atg.com 6540001-01 April 2002
© 2002, Art Technology Group, Inc. ATG, Art Technology Group, the techmark, the ATG Logo, and Dynamo are registered trademarks, and Personalization Server and Scenario Server are trademarks of Art Technology Group. All other trademarks are the property of their respective holders. All specifications are subject to change without notice. Art Technology Group, Inc. cannot accept liability for any loss or damage arising from the use of information or particulars in the brochure.
NASDAQ:ARTG