Case Study Distributed Data Integration Framework Roger Ruttimann Lead Engineer Enterprise Systems, GroundWork Opensource Inc. 4th International Conference on Computer Science and its Applications (ICCSA-2006)


Case Study: Distributed Data Integration Framework

Roger Ruttimann

Lead Engineer Enterprise Systems, GroundWork Opensource Inc.

4th International Conference on Computer Science and its Applications (ICCSA-2006)


ICCSA-2006, Roger Ruttimann, June 27, 2006

Objective

Overview of integration of Open Source projects into the development process

Design, risk assessment, and implementation of a new product, leveraging OSS as much as possible

Discuss problems with this approach


Agenda Details

Case study of the development process for a Data Integration Framework for Monitoring

Project requirements overview

Design

Risk assessment

Implementation

Encountered problems / issues

Project life cycle and project maintenance

Lessons learned

Q & A


Project overview

The company offers support and installation assistance for an Open Source monitoring system. One of its components is the Open Source project Nagios.

Runs on Linux/Unix only

Data storage in text files

UI consists of compiled (C++) classes that parse the text files

Hard to scale, with limited possibilities to improve the User Interface

The limited ability to scale out and the User Interface are the two major issues hindering adoption in larger installations

Overview


Project Requirements

The goal was to come up with a framework that:

Leverages the core features of Nagios, such as the monitor plugins, the scheduler, and the notification engine.

Extends the UI and the back end so that it can be deployed into larger data centers.

Has a generic data model so that other monitoring data can be integrated.

Uses an enterprise-type back end, including fail-over, load-balancing and high throughput.

Overview


Mission: The CTO said...

Enable integration with multiple open source and commercial monitoring tools

Provide a platform for a unified enterprise-class solution

Provide real open source flexibility and extensibility

Publish the Monitor Data Integration Framework as an Open Source project so that outside developers can contribute

Overview


Development Constraints

We are a startup company with limited development resources and an aggressive schedule - we had to use existing components.

As a new company we didn't have legacy libraries for re-use. The best alternative was to leverage Open Source components as much as possible.

Design Phase


Final Feature set

Cross-platform application written in Java

Data exchange with XML feeder framework

Pluggable data normalization components

Java, Perl and PHP APIs for accessing data

Property-driven data structure for great flexibility

Design Phase


Architectural overview

Diagram legend: Extend Nagios 2.0; Integrate open source applications; GroundWork-developed software

GroundWork Foundation
– Application Programming Interfaces (APIs)
– Adapters
– Common Data Model
– Database

Monitoring functions and tools
– Availability monitoring (Nagios 2.0)
– Performance monitoring (RRDtool)
– SNMP traps (SNMP TT)
– System logs (Syslog NG)
– Alarm processing (Nagios 2.0)

Monitored infrastructure: networks, servers, applications, databases, other devices

Other diagram labels: 3rd party systems; open source tools; Network, Systems & Applications Management; Configuration Management; Service Desk

Design Phase


API Layer

Lightweight Object Container

Java API

Data Access Objects (DAO)

PHP API, Perl API

Object Relational Bridge

Data Model

Design Phase


Data Feeder / data normalization layer

Feeders: Perl script, PHP script, JMS, C/C++, VB

XML Message

Listener / Message dispatcher

Adapter / Normalizer (shown four times in the diagram)

Lightweight Object Container

Data Access Objects (DAO)

Object Relational Bridge

Data Model
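The feeder path on this slide (feeders send XML messages to a listener, which hands them to an adapter/normalizer) could be sketched roughly as follows. This is a minimal illustration, not the GroundWork Foundation code: the `Adapter` interface, the `MessageDispatcher` class, and the `<message type="..."/>` layout are all assumptions made for the example, and it uses current Java syntax rather than the Java 1.3/1.4 the deck mentions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical adapter contract: normalizes one kind of feed message.
interface Adapter {
    void process(Map<String, String> attributes);
}

// Hypothetical listener-side dispatcher; names are illustrative only.
final class MessageDispatcher {
    private final Map<String, Adapter> adaptersByType = new HashMap<>();

    // Plug in an adapter for one message type, e.g. "NAGIOS".
    void register(String messageType, Adapter adapter) {
        adaptersByType.put(messageType, adapter);
    }

    // Parses one XML message and routes it by its (assumed) "type" attribute.
    // Returns true if a registered adapter handled the message.
    boolean dispatch(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed XML message", e);
        }
        Adapter adapter = adaptersByType.get(root.getAttribute("type"));
        if (adapter == null) {
            return false; // no adapter registered for this feed type
        }
        // Copy all attributes of the root element into a plain map.
        Map<String, String> attributes = new HashMap<>();
        NamedNodeMap raw = root.getAttributes();
        for (int i = 0; i < raw.getLength(); i++) {
            attributes.put(raw.item(i).getNodeName(), raw.item(i).getNodeValue());
        }
        adapter.process(attributes);
        return true;
    }
}
```

Because feeders only need to emit XML text, they can be written in any language (Perl, PHP, C/C++, VB), while new feed types are supported by registering another adapter.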

Design Phase


Common Data model

Diagram labels: Collector Normalizers; Application Programming Interfaces; Common Data Model holding Event Data, Log Data, and State Data, each with Properties
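One way to picture a property-driven data model like the one on this slide is an entity with a small fixed core plus an open set of named properties. The sketch below is only an illustration under that assumption: `MonitoredEntity`, its methods, and the property names are hypothetical and are not taken from GroundWork Foundation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity: a fixed core (type, name) plus free-form properties.
final class MonitoredEntity {
    private final String type;  // e.g. "HOST_STATUS" or "LOG_MESSAGE" (assumed names)
    private final String name;
    private final Map<String, Object> properties = new LinkedHashMap<>();

    MonitoredEntity(String type, String name) {
        this.type = type;
        this.name = name;
    }

    String getType() { return type; }
    String getName() { return name; }

    // Fluent setter so a normalizer can attach arbitrary attributes.
    MonitoredEntity with(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    // Typed read; returns the fallback if the property is missing
    // or holds a value of a different type.
    <T> T get(String key, Class<T> asType, T fallback) {
        Object value = properties.get(key);
        return asType.isInstance(value) ? asType.cast(value) : fallback;
    }

    // Read-only view of all properties, in insertion order.
    Map<String, Object> getProperties() {
        return Collections.unmodifiableMap(properties);
    }
}
```

New kinds of monitoring data then need new property names rather than new classes or schema columns, which is the flexibility the slides refer to. The trade-off, noted later in the deck, is that storing such properties relationally requires many cross-table joins.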

Design Phase


How to choose the components?

Choose point solutions with minimal dependencies
– Business layer should be database agnostic
– Persistence layer should not depend on specific transaction managers or connection pools

Multiple projects with the same functionality available
– Easier to replace a component if problems occur
– License compatibility
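The "database agnostic business layer" rule is commonly achieved by having business code depend only on a data access interface. The following is a minimal sketch of that idea; `HostStatusDao`, `InMemoryHostStatusDao`, and `StatusService` are hypothetical names invented for this example, and the in-memory class merely stands in for whatever persistence component is chosen.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical DAO contract: the business layer sees only this interface.
interface HostStatusDao {
    void save(String hostName, String status);
    Optional<String> findStatus(String hostName);
}

// Stand-in implementation; a database-backed class (for example one built on
// an O/R mapper) could replace it without touching StatusService.
final class InMemoryHostStatusDao implements HostStatusDao {
    private final Map<String, String> statusByHost = new HashMap<>();

    @Override
    public void save(String hostName, String status) {
        statusByHost.put(hostName, status);
    }

    @Override
    public Optional<String> findStatus(String hostName) {
        return Optional.ofNullable(statusByHost.get(hostName));
    }
}

// Business logic with no knowledge of SQL, drivers, or connection pools.
final class StatusService {
    private final HostStatusDao dao;

    StatusService(HostStatusDao dao) {
        this.dao = dao;
    }

    // True only if the host is known and its last stored status is "DOWN".
    boolean isDown(String hostName) {
        return dao.findStatus(hostName).filter("DOWN"::equals).isPresent();
    }
}
```

With this separation, replacing a problematic persistence component means writing one new `HostStatusDao` implementation rather than changing business code.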

Evaluation / Risk assessment


Choosing the Business Logic to database bridge

Requirements
– Database agnostic; no stored procedures
– The property-based data model requires many cross-table joins to insert and retrieve data, while developers are used to manipulating objects rather than record sets
– For performance reasons a cache is required
– Data consistency requires transaction support

Hibernate -- www.hibernate.org
– High-performance object/relational persistence and query service
– Most popular and stable O/R persistence tool
– Online documentation and books available
– Active mailing lists and forums

Evaluation / Risk assessment


Choosing the Database

Requirements
– Easy to install
– Popular and accepted
– Multi-platform support

MySQL -- dev.mysql.com
– Most popular Open Source database
– Easy to install and to maintain
– Download, install, and up and running in 15 minutes
– Online documentation and books available
– Active mailing lists and forums

Evaluation / Risk assessment


Choosing the Lightweight object container

Requirements
– Framework to manage JavaBean object creation and maintenance
– Minimal configuration at run time
– Flexible enough to support aspect-oriented programming (AOP) and transaction management

Spring -- www.springframework.org/
– Lightweight container with a far smaller footprint than any available J2EE container
– Configuration through XML-format assemblies that can be injected at any time
– Seamless integration of Hibernate for transaction management
– Online documentation and books available
– Active mailing lists and forums

Evaluation / Risk assessment


Risk assessment

Choose popular and well documented projects

Monitor forums to observe common user issues
– Heavy traffic alone doesn't indicate a successful project

Consider only stable and documented features

Do extensive evaluation of core components but not tool/utilities components

Evaluation / Risk assessment

Even following these rules doesn't protect you from surprises. Unstable, fast-changing projects can negatively affect your overall schedule.


Implementation

Availability of Open Source projects/components

PHP API, Perl API, Java API

Listener, Nagios Feeder, More Feeders…

Adapter Frameworks

Data Model

Spring, Hibernate, Apache Commons, Perl DBI, JDBC

MySQL

Legend: Open source components; GroundWork open source; GroundWork development

Implementation


Encountered issues / problems

Java version. Clients were still running Java 1.3 or Java 1.4.x. Java 5 offers improvements that we couldn't leverage.

By design all components are loosely coupled and therefore replaceable. This requires more upfront work to design the communication interfaces.

Documentation needs to be written!

Training of staff installing and supporting the framework.

Overhead of following Open Source projects to be informed about updates/problems that might affect the project

Implementation


Project Lifecycle

Feedback from the field needs to be integrated

Improvements / bugfixes from the various Open Source packages need to be evaluated and integrated.

Constant risk evaluation when integrating third party packages

Evaluate new feature requests
– How do they fit into the framework?
– Is there an Open Source package available?
– What's the license?
– Can we integrate it easily? How much custom code?

Project Lifecycle


Release

The Data Integration Framework was released as Open Source under the name GroundWork Foundation

http://gwfoundation.sf.net

Used as a part of GroundWork Monitor Professional

Customized by other users to store state and event information not directly related to infrastructure monitoring.

Development goes on; milestone releases are available

Since the project is public, developers have a responsibility to support users and guarantee stability

Project Lifecycle


Did the chosen approach work out?

Can we extend current design based on Open Source components?

Is the maintenance manageable since we integrated so many Open Source packages with their own lifecycle?

Is the built-in flexibility really needed?

Project Lifecycle


First design challenge: Adding new features

Integration of new features
– Remote API (Web Service)
– Higher throughput: feed 500-1000 messages/sec
– Integration of other monitoring systems such as JMX
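For the throughput goal, a common technique is to buffer incoming messages between the listener and the adapters so that bursts do not stall the feeders. The deck's extended architecture shows a JMS broker with persistence in that role; the sketch below is only an in-process stand-in using a bounded queue, and `BufferedFeed` and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical in-memory buffer between listener and adapters.
// Unlike a persistent JMS broker, queued messages are lost on restart.
final class BufferedFeed {
    private final BlockingQueue<String> queue;

    BufferedFeed(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called by the listener. Returns false when the buffer is full,
    // so the caller can retry or apply back-pressure to the feeder.
    boolean offer(String xmlMessage) {
        return queue.offer(xmlMessage);
    }

    // Called by the consumer. Removes up to maxBatch messages in arrival
    // order so they can be processed together, e.g. in one transaction.
    List<String> drainBatch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }
}
```

Processing messages in batches amortizes per-transaction overhead, while the bounded capacity keeps memory use predictable when feeders outpace the database.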

Project Lifecycle


Extensible architecture

Application Server, Web Container

Project Lifecycle

Feeders: Nagios Log Feeder, Nagios Status Feeder, SNMP Trap Feeder, Syslog Feeder, JMX Feeder

Feeder Web Service

Post

XML Message

JMS Broker with persistence

Listener / Message dispatcher

Adapter Manager

Adapters: Nagios Adapter, SNMP Trap Adapter, JMX Adapter, Generic Adapter

Java API

Web Service Interface

Admin UI

Data Access Objects (DAO)

Object Relational Bridge

Data Model


Second design challenge: Open Source package upgrades

Upgrade of core components
– Hibernate update to version 3.1 (EJB 3.0 compliant)
– Spring Framework update to 2.0 (JMX support, enhanced AOP)
– Upgrade to Java 5

Open Source packages have dependencies
– Log4j, commons, XML parsers, ...

Have unit tests in place to catch any differences and incompatibilities early

Even if the upgrade is a “drop-in” update you should leverage any new features and improvements

Once again check the forums and the mailing lists!

Project Lifecycle


Conclusion

Without the usage of available Open Source components we wouldn't have been able to meet the aggressive release schedule.

Open Source project evaluation and project monitoring need to be built into the development schedule

Mailing lists are a great help

Constant learning; projects change fast

Cleaner code since code is public – developer pride!

Lessons learned


Foundation Project: http://gwfoundation.sf.net

GroundWork Monitor Open Source: http://www.groundworkopensource.com/downloads

Contact: Roger Ruttimann

GroundWork Open Source, Inc., 139 Townsend Street, Suite 100, San Francisco, CA

[email protected]

More Info