Case Study: Distributed Data Integration Framework
Roger Ruttimann
Lead Engineer Enterprise Systems, GroundWork Opensource Inc.
4th International Conference on Computer Science and its Applications (ICCSA-2006)
ICCSA-2006, Roger Ruttimann, June 27, 2006
Objective
Overview of integration of Open Source projects into the development process
Design, risk assessment, and implementation of a new product, leveraging OSS as much as possible
Discuss problems with this approach
Agenda Details
Case study of the development process for a Data Integration Framework for Monitoring
Project requirements overview
Design
Risk assessment
Implementation
Encountered problems / issues
Project life cycle and project maintenance
Lessons learned
Q & A
Project overview
The company offers support and installation assistance for an Open Source monitoring system. One of its components is the Open Source project Nagios.
Runs on Linux/Unix only
Data storage in text files
UI consists of compiled CGI programs (C) that parse the text files
Hard to scale and limited possibilities to improve User Interface
Limited scalability and the inflexible User Interface are the two major issues hindering adoption in larger installations
Overview
Project Requirements
The goal was to come up with a framework that:
Leverages the core features of Nagios, such as the monitoring plug-ins, the scheduler, and the notification engine.
Extends the UI and the back end so that it can be deployed into larger data centers.
Has a generic data model so that other monitoring data can be integrated.
Uses an enterprise-type back end, including fail-over, load-balancing and high throughput.
Mission: The CTO said...
Enable integration with multiple open source and commercial monitoring tools
Provide a platform for a unified enterprise-class solution
Provide real open source flexibility and extensibility
Publish the Monitor Data Integration Framework as an Open Source project so that outside developers can contribute
Development Constraints
We are a startup company with limited development resources and an aggressive schedule, so we had to use existing components.
As a new company we didn't have legacy libraries for re-use. The best alternative was to leverage Open Source components as much as possible.
Design Phase
Final Feature set
Cross-platform application written in Java
Data exchange with XML feeder framework
Pluggable data normalization components
Java, Perl and PHP APIs for accessing data
Property-driven data structure for great flexibility
Architectural overview
[Figure: architecture diagram. Labels shown:]
Legend: extend Nagios 2.0; integrate open source applications; GroundWork-developed software
Monitored infrastructure: networks, servers, applications, databases, other devices
Open source tools: availability monitoring (Nagios 2.0), performance monitoring (RRDtool), SNMP traps (SNMP TT), system logs (Syslog NG), alarm processing (Nagios 2.0)
GroundWork Foundation: adapters, common data model, database, Application Programming Interfaces (APIs)
3rd party systems: network, systems & applications management; configuration management; service desk
API Layer
[Figure: API layer. Components shown:]
Java API, PHP API, Perl API
Lightweight Object Container
Data Access Objects (DAO)
Object Relational Bridge
Data Model
Data Feeder / data normalization layer
[Figure: data feeder and normalization layer. Components shown:]
Feeders: Perl script, PHP script, JMS, C/C++, VB
XML Message
Listener / Message dispatcher
Adapter / Normalizer (four instances shown)
Lightweight Object Container
Data Access Objects (DAO)
Object Relational Bridge
Data Model
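The path from feeder to adapter listed above (XML message, listener / message dispatcher, adapter / normalizer) could be sketched roughly as follows. This is a minimal illustration using only the JDK, not the Foundation's actual code: the message layout, the attribute names (SystemName, Host, Status) and all class names are assumptions made up for the example.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

/** Hypothetical adapter contract: turns one feeder message into normalized properties. */
interface Adapter {
    Map<String, String> normalize(Element message);
}

/** Hypothetical dispatcher: routes each XML message to the adapter registered for its source system. */
final class MessageDispatcher {
    private final Map<String, Adapter> adapters = new HashMap<>();

    void register(String systemName, Adapter adapter) {
        adapters.put(systemName, adapter);
    }

    Map<String, String> dispatch(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed feeder message", e);
        }
        // "SystemName" is an assumed attribute naming the source monitoring system.
        String system = root.getAttribute("SystemName");
        Adapter adapter = adapters.get(system);
        if (adapter == null) {
            throw new IllegalArgumentException("No adapter registered for: " + system);
        }
        return adapter.normalize(root);
    }
}

/** Example adapter: copies every attribute and normalizes the status value to upper case. */
final class AttributeCopyAdapter implements Adapter {
    @Override
    public Map<String, String> normalize(Element message) {
        Map<String, String> props = new HashMap<>();
        NamedNodeMap attrs = message.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            props.put(attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
        }
        props.computeIfPresent("Status", (key, value) -> value.toUpperCase(Locale.ROOT));
        return props;
    }
}
```

Because feeders only have to emit an XML string, they can be written in any language (Perl, PHP, C/C++, VB); only the adapter behind the dispatcher needs to know what a particular source system's fields mean.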
Common Data model
[Figure: common data model. Components shown:]
Collector Normalizers
Common Data Model: Event Data, Log Data, State Data, each with Properties
Application Programming Interfaces
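A property-driven model of this kind can be pictured as an entity with a few fixed fields plus an open set of named properties. The sketch below is an illustration only; the class name, the type strings, and the property names are hypothetical and are not taken from the Foundation schema.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical entity: a few fixed fields plus an open-ended set of named properties. */
final class MonitoredEntity {
    private final String type;  // assumed examples: "STATE", "EVENT", "LOG"
    private final String name;
    private final Map<String, Object> properties = new LinkedHashMap<>();

    MonitoredEntity(String type, String name) {
        this.type = type;
        this.name = name;
    }

    String type() { return type; }

    String name() { return name; }

    /** Adds or overwrites a property; returns this so calls can be chained. */
    MonitoredEntity set(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    /** Returns the property cast to the expected type, or null when it is not set. */
    <T> T get(String key, Class<T> expected) {
        Object value = properties.get(key);
        return value == null ? null : expected.cast(value);
    }

    /** Read-only view, for example for a persistence layer that stores one row per property. */
    Map<String, Object> properties() {
        return Collections.unmodifiableMap(properties);
    }
}
```

New kinds of monitoring data then need new property names rather than new columns. The price is that properties stored as rows must be joined back onto their entity at read time.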
How to choose the components?
Choose point solutions with minimal dependencies
– Business layer should be database agnostic
– Persistence layer should not depend on specific transaction managers or connection pools
Multiple projects with the same functionality available
– Easier to replace a component if problems occur
– License compatibility
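One common way to keep the business layer database agnostic is to let it depend only on a data access interface. The sketch below illustrates that idea with hypothetical names (StatusDao, InMemoryStatusDao, StatusService). It is not the Foundation API, and the in-memory implementation merely stands in for a JDBC- or O/R-mapper-backed one.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical data access contract; the business layer sees only this interface. */
interface StatusDao {
    void save(String host, String status);
    Optional<String> find(String host);
}

/** Stand-in implementation; a database-backed class could replace it without touching callers. */
final class InMemoryStatusDao implements StatusDao {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    @Override
    public void save(String host, String status) {
        rows.put(host, status);
    }

    @Override
    public Optional<String> find(String host) {
        return Optional.ofNullable(rows.get(host));
    }
}

/** Business logic that has no knowledge of which database, pool, or transaction manager is used. */
final class StatusService {
    private final StatusDao dao;

    StatusService(StatusDao dao) {
        this.dao = dao;
    }

    /** Stores the status and returns true only when it differs from the stored one. */
    boolean recordIfChanged(String host, String status) {
        Optional<String> previous = dao.find(host);
        if (previous.isPresent() && previous.get().equals(status)) {
            return false;
        }
        dao.save(host, status);
        return true;
    }
}
```

Swapping the persistence component then means writing one new StatusDao implementation, which is the kind of replaceability the selection criteria above aim for.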
Evaluation / Risk assessment
Choosing the Business Logic to database bridge
Requirements
– Database agnostic; no stored procedures
– Property-based data model requires many cross-table joins to insert and retrieve data. Developers are used to manipulating objects rather than record sets.
– For performance reasons a cache is required
– Data consistency requires transaction support
Hibernate -- www.hibernate.org
– High-performance object/relational persistence and query service
– Most popular and stable O/R persistence tool
– Online documentation and books available
– Active mailing lists and forums
Choosing the Database
Requirements
– Easy to install
– Popular and accepted
– Multi-platform support
MySQL -- dev.mysql.com
– Most popular Open Source database
– Easy to install and maintain
– Download, install, and up and running in 15 minutes
– Online documentation and books available
– Active mailing lists and forums
Choosing the Lightweight object container
Requirements
– Framework to manage Java Bean object creation and maintenance
– Minimal configuration at run time
– Flexible enough to support aspect-oriented programming (AOP) and transaction management
Spring -- www.springframework.org/
– Lightweight container with a far smaller footprint than any available J2EE container
– Configuration through XML-format assemblies that can be injected at any time
– Seamless integration of Hibernate for transaction management
– Online documentation and books available
– Active mailing lists and forums
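What a lightweight object container does for the application can be shown in miniature without any framework. The sketch below is deliberately not Spring and uses none of its API. TinyContainer, MessageFormatter, and Notifier are hypothetical names, and bean definitions are written as Java lambdas where Spring would read them from XML.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy container: holds named bean definitions and creates each bean once, on first request. */
final class TinyContainer {
    private final Map<String, Function<TinyContainer, Object>> definitions = new HashMap<>();
    private final Map<String, Object> singletons = new HashMap<>();

    /** Registers how to build a bean; the factory may ask the container for its dependencies. */
    void define(String name, Function<TinyContainer, Object> factory) {
        definitions.put(name, factory);
    }

    <T> T getBean(String name, Class<T> type) {
        Object bean = singletons.get(name);
        if (bean == null) {
            Function<TinyContainer, Object> factory = definitions.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("No bean named " + name);
            }
            bean = factory.apply(this);
            singletons.put(name, bean);
        }
        return type.cast(bean);
    }
}

/** Plain object with no knowledge of the container. */
final class MessageFormatter {
    String format(String host, String status) {
        return host + " is " + status;
    }
}

/** Receives its collaborator through the constructor instead of creating it. */
final class Notifier {
    private final MessageFormatter formatter;

    Notifier(MessageFormatter formatter) {
        this.formatter = formatter;
    }

    String notifyText(String host, String status) {
        return "ALERT: " + formatter.format(host, status);
    }
}
```

The wiring lives in the definitions, so replacing MessageFormatter with another implementation changes one definition and no application class.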
Risk assessment
Choose popular and well documented projects
Monitor forums to observe common user issues
– Large traffic alone doesn't indicate a successful project
Consider only stable and documented features
Evaluate core components extensively, but not tool/utility components
Even following these rules won't protect you from surprises. Unstable, fast-changing projects can negatively affect your overall schedule.
Implementation
[Figure: implementation stack. Components shown:]
PHP API, Perl API, Java API
Listener, Nagios Feeder, more feeders…
Data Model, Adapter Frameworks
Spring, Hibernate, Apache Commons, Perl DBI, JDBC, MySQL
Legend: open source components; GroundWork open source; GroundWork development
Availability of Open Source projects/components
Encountered issues / problems
Java version. Clients were still running Java 1.3 or Java 1.4.x. Java 5 offers improvements that we couldn't leverage.
By design all components are loosely coupled and therefore replaceable. This requires more upfront work to design the communication interfaces.
Documentation needs to be written!
Training of staff installing and supporting the framework.
Overhead of following Open Source projects to be informed about updates/problems that might affect the project
Project Lifecycle
Feedback from the field needs to be integrated
Improvements / bugfixes from the various Open Source packages need to be evaluated and integrated.
Constant risk evaluation when integrating third party packages
Evaluate new feature requests
– How do they fit into the framework?
– Is there an Open Source package available?
– What's the license?
– Can we integrate it easily? How much custom code?
Release
Data integration Framework was released to Open Source as GroundWork Foundation
http://gwfoundation.sf.net
Used as a part of GroundWork Monitor Professional
Customized by other users to store state and event information not directly related to infrastructure monitoring.
Development continues; milestone releases are available
Since the project is public, developers have a responsibility to support users and guarantee stability
Did the chosen approach work out?
Can we extend current design based on Open Source components?
Is the maintenance manageable since we integrated so many Open Source packages with their own lifecycle?
Is the built in flexibility really needed?
First design challenge: Adding new features
Integration of new features
– Remote API (Web Service)
– Higher throughput: feed 500-1000 messages/sec
– Integration of other monitoring systems such as JMX
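The slides do not say how the throughput target was approached. One widely used technique for absorbing message rates like these is to buffer incoming messages in a bounded queue and hand them to the persistence layer in batches, so that one transaction covers many messages. The sketch below illustrates only that general idea; BatchingListener and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical listener buffer: accepts messages quickly and releases them in batches. */
final class BatchingListener {
    private final BlockingQueue<String> queue;
    private final int batchSize;

    BatchingListener(int capacity, int batchSize) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.batchSize = batchSize;
    }

    /** Non-blocking; returns false when the buffer is full so the feeder can retry or shed load. */
    boolean offer(String xmlMessage) {
        return queue.offer(xmlMessage);
    }

    /** Removes up to batchSize buffered messages and returns them in arrival order. */
    List<String> nextBatch() {
        List<String> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        return batch;
    }
}
```

A worker thread would call nextBatch() in a loop and persist each non-empty batch in a single transaction. The bounded capacity keeps a burst from exhausting memory, at the cost of rejecting messages while the buffer is full.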
Extensible architecture
[Figure: extended architecture. Containers labeled: Application Server, Web Container. Components shown:]
Feeders: Nagios Log Feeder, Nagios Status Feeder, SNMP Trap Feeder, SysLog Feeder, JMX Feeder
Feeder Web Service Post
XML Message
JMS Broker with persistence
Listener / Message dispatcher
Adapter Manager
Adapters: Nagios Adapter, SNMP Trap Adapter, JMX Adapter, Generic Adapter
Java API, Web Service Interface, Admin UI
Data Access Objects (DAO)
Object Relational Bridge
Data Model
Second design challenge: Open Source package upgrades
Upgrade of core components
– Hibernate update to version 3.1 (EJB 3.0 compliant)
– Spring Framework update to 2.0 (JMX support / enhanced AOP)
– Upgrade to Java 5
Open Source packages have dependencies
– Log4j, commons, XML parsers,..
Have unit tests in place to catch any differences and incompatibilities early
Even if the upgrade is a “drop-in” update you should leverage any new features and improvements
Once again check the forums and the mailing lists!
Conclusion
Without the usage of available Open Source components we wouldn't have been able to meet the aggressive release schedule.
Open Source project evaluation and project monitoring need to be built into the development schedule
Mailing lists are a great help
Constant learning; projects change fast
Cleaner code since code is public – developer pride!
Lessons learned
Foundation Project: http://gwfoundation.sf.net
GroundWork Monitor Open Source: http://www.groundworkopensource.com/downloads
Contact: Roger Ruttimann
GroundWork Open Source, Inc., 139 Townsend Street, Suite 100, San Francisco, CA [email protected]
More Info