SnapLogic® User Guide

Document Release: October 2013

SnapLogic, Inc.
2 W 5th Ave, Fourth Floor
San Mateo, California 94402
U.S.A.

http://www.snaplogic.com/

Copyright Information

© 2011-2013 SnapLogic, Inc. All Rights Reserved. Terms and conditions, pricing, and other information subject to change without notice. “SnapLogic” and “SnapStore” are among the trademarks of SnapLogic, Inc. All other product and company names and marks mentioned are the property of their respective owners and are mentioned for identification purposes only. “SnapLogic” is registered in the U.S. Patent and Trademark Office.

Table of Contents

Preface
About SnapLogic
About This Guide
Concepts
SnapLogic Architecture
Key Concepts: Components, Pipelines, and Snaps
The SnapLogic Design Process
User Interfaces
Introducing SnAPI
Introducing the Designer
Sidebar: Foundry
Sidebar: Library
Canvas
Slider
Components
Creating and Configuring a Component
Suggestions for Component Properties, Inputs, and Outputs
Components for Connecting to Databases
Components with Pass-Through Fields
Populating Default Values in Output Fields
Component Parameters
Validating a Component
Pipelines
Creating a Pipeline in Designer
Configuring a Pipeline in Designer
Updating a Pipeline in Designer
Mapping Components with Field Linker
Creating Data Service Pipelines and Accessing Feeds
Error Handling to Address Connection Problems and Data Errors
Executing Pipelines
Scheduling Unattended Pipeline Execution
Enabling Email Notifications
Tracing Data to Debug Pipeline Execution
Snaps
Accessing Snaps
Installing Snaps
Configuring Snaps
Developing Snaps and Further Participating in the SnapLogic Community
Administration
Working with Users and Groups
Starting and Stopping Servers
Configuring SnapLogic Server
Signed SSL Certificate Installation
Clustering Servers
SnapLogic Server High Availability and Failover
Memory Configuration Guidelines
Authentication: Active Directory-Based or File-Based
SiteMinder Support
Security Overview
Enabling SSL
Management Console
Log Files
Sandboxing to Protect Your SnapLogic Environment
Importing and Exporting
Snapshots
Running SnapLogic Behind a Proxy
Data Types and Output Representation Formats
SnapAdmin Utility
Sidekick
Appendix: Completing Tasks in SnAPI
Creating and Configuring a Component in SnAPI
Component Suggestions in SnAPI
Configuring Pass-Through Fields with SnAPI
Specifying Parameter Values at Runtime in SnAPI
Validating Components in SnAPI
Creating a Pipeline in SnAPI
Creating a Data Service Pipeline in SnAPI
Mapping Components in SnAPI
Executing Pipelines in SnAPI
Data Tracing in SnAPI
Appendix: SnAPI and PostgreSQL
General
Prerequisites
Triggers
Appendix: ACLs
Glossary

Preface

The SnapLogic User Guide contains instructions for performing data integration using two SnapLogic interfaces: its graphical user interface, Designer, and its application program interface, SnAPI.

About SnapLogic

SnapLogic is a data integration platform with an innovative, open, and extensible data flow architecture and straightforward subscription model. SnapLogic connects to almost any SaaS, Cloud, Web, or enterprise application or data source through Components and Pipelines, providing information as a utility to business users and applications. SnapLogic is an alternative to closed, proprietary, client-server-based integration solutions and the massive amount of hand-coding still being performed to accomplish data integration in the marketplace today.

About This Guide

The SnapLogic User Guide is an instructional guide for using the functionality available within SnapLogic. It contains descriptions, definitions, step-by-step instructions, and examples. The content is addressed to data integration designers using either the SnapLogic Designer or SnAPI.

Concepts

This section of the document provides an overview of the architecture and concepts of the SnapLogic application.

SnapLogic Architecture

SnapLogic consists of a number of architectural constituents, including the SnapLogic Server with its two interfaces and metadata repository, and the Component Servers with their APIs.

The SnapLogic Server and Metadata Repository

The SnapLogic Server is a lightweight server process that accesses data from one or more sources and executes SnapLogic integration Pipelines. A SnapLogic Server can run directly on a data source along with the database, application, or file server. It can also run as a stand-alone integration server, accessing SnapLogic Components across the network.

The SnapLogic Server manages the instantiation and execution of Components and Pipelines. The SnapLogic Server's open API enables anyone to create read, write, and operation Components and simply snap them into the Server, limitlessly extending the Server's connectivity and functionality. The SnapLogic Server features:

• Integration of data from any source (including Web, SaaS, and on-premise data)

• Universally extensible SnapLogic Component API

• SnAPI scripting and command-line API

• Deployment on-premise or in the cloud

• SnapLogic Designer, a browser-based drag-and-drop GUI

• Enterprise ETL functionality

• Searchable metadata repository of Pipelines and Components, including a robust set of Component connectors, functions, and extensions

• Access to downloadable and reusable integration extensions

The SnapLogic User Interfaces: Designer and SnAPI

SnapLogic has a graphical user interface in the form of the SnapLogic Designer, and an application program interface in the form of SnAPI:

• SnapLogic Designer: The SnapLogic Designer is a browser-based visual configuration tool for creating and executing data integration solutions. The Designer provides a simple, drag-and-drop visual interface for combining Components and Pipelines and defining their execution and results. The SnapLogic Designer application runs inside your Web browser, so no client software installation is required. This enables you to access your SnapLogic data flow server from anywhere. The Designer enables you to:

    • Create Components from the SnapLogic Server Foundry

    • Assemble data integration Pipelines

    • Create data services from your data sources

    • Preview and execute Pipelines

    • Schedule unattended Pipeline runs

• SnAPI: The SnapLogic Application Program Interface, SnAPI, is the programmatic interface to the SnapLogic Server that enables you to create and use Components and Pipelines from your application or development environment. SnAPI is ideal for users who do not need the visual interface of the SnapLogic Designer, or for those who wish to create Components and Pipelines through code generation. SnAPI supports the following languages:

    • Python

    • Java

Most of the actions you can perform in SnAPI can also be performed in the Designer.

Java/Python Component Servers

The SnapLogic Component Servers are processes within which the Components run. The isolation of the Components from the main SnapLogic Server process improves reliability, and Component Containers can be distributed across a network of hosts to improve computation power. This architecture enables Components to be implemented in a variety of programming languages.


Key Concepts: Components, Pipelines, and Snaps

When you work with SnapLogic, regardless of the interface you use, the main concepts to understand are Components, Pipelines, and Snaps:

• Components: A Component is an object used to perform a simple subtask, such as read, write, or act on data. Strung together, Components are the building blocks of Pipelines, or data flows. Components are generally classified as Connectors (Components that read or write data) and Operators (Components that perform an action, such as a join or filter, on data). Basic templates for Components are included in your SnapLogic installation (refer to the Component Reference for the list of Component templates that ship as part of SnapLogic), and reside in the Designer's Foundry panel. These generic templates, once configured, become configured Components that are stored in the SnapLogic Server repository, and can be found in the Designer's Library panel.

• Pipelines: A Pipeline is a collection of Components linked together to orchestrate a flow of data between end points. For example, a simple Pipeline may read data from an RSS feed, reformat it, and write it to a database.

• Snaps: A Snap is the encapsulation of an integration task or subtask that performs a complete, and usually high-level, function. It is any “pluggable” piece of code that has been conveniently packaged to run seamlessly in the SnapLogic Server. Specifically, a Snap can take various shapes:

    • A Snap can be a collection of Components that are functionally related, such as the Salesforce Snap, which contains Components for inserting contacts into and deleting contacts from Salesforce.

    • A Snap can also consist of a single low-level building block, such as a filter.

    • A Snap can comprise a complete Pipeline packaged as a simple Component to insert an item.

The definition of "Snap" is therefore a recursive one: A complex Snap can contain multiple Pipelines; a simple Snap can stand alone or participate in a Pipeline.

A Snap in SnapLogic is comparable to a smart phone app, a browser add-on, or an application plug-in. A Snap can perform as simple a task as to read data from a file, or as complex an operation as to connect to an instance of Microsoft® Dynamics CRM, analyze the source data, and provide full access (data and functionality) to all standard and custom objects within Microsoft Dynamics CRM. When changes have been made to standard or custom objects, the Snap adapts and provides you access that takes this change into account.
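The relationship between Components and Pipelines described in this section can be pictured in plain Python. This is only a conceptual sketch and not SnAPI code: the function names (read_items, reformat, write_rows, run_pipeline) and the sample record are hypothetical, and in-memory lists stand in for the RSS feed and the database of the example Pipeline.

```python
from typing import Iterable, Iterator

Record = dict


def read_items(feed: Iterable[Record]) -> Iterator[Record]:
    """Connector (read): yield records from a source, one at a time."""
    for item in feed:
        yield item


def reformat(records: Iterable[Record]) -> Iterator[Record]:
    """Operator: reshape each record as it flows through."""
    for rec in records:
        yield {"title": rec["title"].strip().title(), "url": rec["link"]}


def write_rows(records: Iterable[Record], table: list) -> int:
    """Connector (write): append records to the target and return the count."""
    count = 0
    for rec in records:
        table.append(rec)
        count += 1
    return count


def run_pipeline(feed: Iterable[Record], table: list) -> int:
    """Pipeline: the three Components linked end to end."""
    return write_rows(reformat(read_items(feed)), table)


# Hypothetical sample data standing in for an RSS feed and a database table.
feed = [{"title": "  release notes ", "link": "http://example.com/1"}]
table = []
written = run_pipeline(feed, table)
```

Each function plays the role of one Component, and run_pipeline plays the role of the Pipeline that links them. Packaging run_pipeline for reuse as a single unit is loosely analogous to a Snap that wraps a complete Pipeline.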

The SnapLogic Design Process

There are two main approaches you can take when you create data integration solutions in Designer:

• Sketch, then configure: Drag and drop Components onto the canvas and link them, leaving configuration for another time. (Refer to Mapping Components to Each Other with the Field Linker for details.)

• Configure, then connect: Click on the Pipeline's Components and links displayed on the canvas, and use the slider to configure them. (Refer to the Component Configuration and Mapping Components to Each Other with the Field Linker sections for detailed configuration instructions.) Once all your Components are configured, you can link them to specify data flow.

User Interfaces

SnapLogic has an intuitive graphical user interface, the SnapLogic Designer, where you can visually create data integration scenarios. SnapLogic also has an Application Program Interface, SnAPI, through which you can access all SnapLogic functionality using Python or Java.

This guide provides instructions for completing tasks using both Designer and SnAPI where possible.

Introducing SnAPI

The SnapLogic Application Program Interface, SnAPI, enables you to create and use Components and Pipelines from your application or development environment. SnAPI is ideal for users who do not need the visual interface of the SnapLogic Designer, or for those who wish to create Components and Pipelines by way of code generation.

You can use SnAPI for tighter integration with various relational database management systems. For example, refer to Using SnAPI within PostgreSQL for an introduction to SnAPI. SnapLogic provides SnAPI support for the following languages:

• Python

• Java

Configuring Your SnAPI Environment

If you plan to use SnAPI, SnapLogic provides scripts to configure your environment appropriately. The scripts set the environment variables required to locate the appropriate SnapLogic libraries and executables. Scripts are provided for Linux environments:

• Linux: bin/snaplogic_env.sh and bin/snaplogic_env.csh

Source the appropriate script for your Linux shell to configure the environment.
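As a small illustration of choosing between the two scripts, the following Python sketch maps a shell path to the matching script and the command used to source it. The script paths are the ones given above. The helper names and the mapping itself (csh and tcsh use the .csh script, every other shell uses the .sh script) are assumptions made for this example and are not part of SnapLogic.

```python
import os

# Script paths as given in this guide; the mapping below is an assumption.
ENV_SCRIPTS = {"sh": "bin/snaplogic_env.sh", "csh": "bin/snaplogic_env.csh"}
CSH_FAMILY = {"csh", "tcsh"}


def env_script_for_shell(shell_path: str) -> str:
    """Return the environment script that matches the given shell."""
    shell = os.path.basename(shell_path)
    family = "csh" if shell in CSH_FAMILY else "sh"
    return ENV_SCRIPTS[family]


def source_command(shell_path: str) -> str:
    """Return the command a user would type to source the script."""
    script = env_script_for_shell(shell_path)
    if os.path.basename(shell_path) in CSH_FAMILY:
        return f"source {script}"
    # "." is the portable way to source a file in sh-family shells.
    return f". {script}"


# Example: inspect the current login shell, falling back to /bin/sh.
command = source_command(os.environ.get("SHELL", "/bin/sh"))
```

The command must be typed in the shell session itself. Running the script from a separate process would not change the environment of your current shell.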

Introducing the Designer

Designer is the graphical user interface for SnapLogic.

Launching Designer

Follow these instructions to launch Designer from your computer desktop:

1. Begin by starting the server. See "Starting and Stopping Servers" for information on how to do this.

2. Launch Designer.

• In a browser, go to http://hostname:443/designer, where hostname is the name of the machine where SnapLogic is installed.

Designer is a Flash/Flex application that runs in your web browser. You can also load it by pointing your web browser to your server host and port number; for example, http://hostname:443/designer.
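If you launch Designer from a script, the address can be assembled from the host name and port. The sketch below only builds the address in the form shown above; the function name designer_url and the sample host name are hypothetical, and the default port is the one used in the example address.

```python
def designer_url(hostname: str, port: int = 443) -> str:
    """Build the Designer address in the form given in this guide."""
    if not hostname:
        raise ValueError("hostname must not be empty")
    return f"http://{hostname}:{port}/designer"


# Hypothetical host name used for illustration only.
url = designer_url("integration01")
```

The resulting string can be passed to the standard library call webbrowser.open(url) to open Designer in the default browser.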

A Tour of SnapLogic's Graphical User Interface

The Designer launches in a web browser window. The Designer is divided into a sidebar and a canvas, as illustrated in the following figure.

The sidebar hosts the Library and Foundry panels:

• Library: The Library stores the projects you are building: your Pipelines and configured Components. Refer to the Sidebar: Library Panel section for more details.

• Foundry: The Foundry stores the building blocks from which you can build projects. These building blocks are either generic templates provided with SnapLogic, or more specialized Snaps you have purchased from SnapStore. Refer to the Sidebar: Foundry Panel section for more details.

• Access these sidebar objects by dragging them onto the canvas to sketch, or design, a data flow.

• You can toggle the visibility of the sidebar by clicking the sideways arrow to the left of the canvas, or by selecting View > Sidebar.

• Canvas: The canvas is your work area for sketching and configuring. Drag select items from the sidebar onto the canvas to perform your data integration tasks. The canvas is equipped with a slider that displays in detail the properties and, if available, previews of any highlighted object. The canvas and slider are each discussed in detail in this section.

Designer Menu Bar

Menus occupy the top of the Designer screen and are focused on the following areas:

• Server: Use the Server menu to connect to, disconnect from, and manage SnapLogic servers.

• Library: Use the Library menu to create new pipelines and to copy and paste highlighted Library objects.

• View: Use the View menu to dictate what elements appear in your Designer screen, and to access scheduler, server, and log information. For information on configuring view settings, see "Settings".

• SnapStore: Click SnapStore to open SnapLogic's online marketplace in a new browser window. You can purchase Snaps from SnapStore and open them immediately in Designer.

• Help: Use the Help menu to access SnapLogic resources, check for software updates, and manage your SnapLogic licensing.

• Search Library & Foundry Bar: Use this search bar to search for Components, Snaps, and Pipelines if you know a portion of their name.

Settings

To invoke the Settings dialog box, select View > Settings from the Designer menu bar. Note that you can use the Reset to Defaults button if you do not want to keep your changes. This list describes the settings available and their default values:

• General: General settings include display preferences for the welcome page, animations, and feedback with sound.

    • Show welcome page at startup: Select this option for SnapLogic to display the Start Page tab in the canvas when you launch Designer. Default setting: Yes

    • Show client log at startup: Select this option to display the Designer log information. Default setting: No

    • Show animations: Select this option to display animations in Designer. Default setting: Yes

    • Provide feedback with sound: Select this option to enable short notes that alert you to possible errors or disabled tasks. Default setting: Yes

    • Show last opened pipeline/resource at startup: Select this option to automatically see the last resource you were working on when you next return to Designer. Default setting: Yes

• Registered Servers: This setting displays a list of servers and their default connection settings.

    • Remove: Highlight the server you want to remove and click Remove to delete the server from SnapLogic's registry. You must have at least one server registered in order to use SnapLogic. Default setting: None. When you first install SnapLogic, you are prompted to register a server. That server displays when you open this dialog box. However, if you click Reset to Defaults, you are asked to confirm whether, by restoring all default settings, you also want all servers to be removed from this list. You can answer "No," and your servers will be left in the registry while all the remaining setting defaults are restored.

    • Continually poll connected servers for status (restart required): Highlight the server to which you want to apply this setting, and then either select the option to enable it, or deselect it to disable the polling. Repeat this choice for each server. Note that you will need to restart SnapLogic and the servers in question for your changes to take effect. Default setting: Yes

• Component Options: This setting enables you to control preview and pass-through options.

    • Limit preview to (specify number) of records: This setting affects the preview feature available in the canvas slider for Components that support preview. The maximum number of records SnapLogic can preview is 100. If desired, use this setting to lower the maximum. Default setting: 100

    • Use UTC rather than local time for datetime values: Use this setting to specify whether the system should use UTC instead of the system's local time. This is often useful when integrating with other systems also on UTC. Default setting: Yes

• Pipeline Options: This setting enables you to control which prompts and aids are automatically enabled when you work with pipelines:

    • Perform "Smart Link" automatically when the field linker is displayed: Select this option if you want SnapLogic to link fields automatically as soon as the field linker displays. You can change the automatically generated links if they are not correct. Deselect this option if you want to link fields manually, or to launch auto-link yourself, at your convenience, with the Smart Link button. Refer to the field linker section for more information on the field linker. Default setting: Yes

    • Use server-side linking algorithms: Default setting: Yes

    • Prompt me to provide a node name when I drop a Component on the canvas: When you first drag a Component template from the Foundry to the canvas, SnapLogic automatically assigns it a URI and a node name. The node name is strictly metadata that helps you recognize the Component after you have configured it. SnapLogic then prompts you to override the suggested node name if desired. Select this option if you want to be prompted to provide your own metadata node name when you first drop a Component or Snap on the canvas. Deselect it if you want SnapLogic to assign its own node names without prompting you for approval. You can always configure node names at your convenience, by selecting the Component on the canvas and configuring its properties in the slider. Default setting: No

    • Automatically manage parameter mappings: When this option is selected and you drag a Component from the sidebar into a Pipeline on the canvas, SnapLogic automatically places the Component-level parameters at the Pipeline level as well. Keep this option selected if you want SnapLogic to map Component-level parameters up to the Pipeline level. Deselect this option if you prefer to define Pipeline-level parameters manually. Default setting: Yes

    • Automatically save before running/previewing.

    • Initial zoom level: Modify this option if you want to change the default zoom level for the objects in the canvas. Default setting: 100%

The Settings box provides a Reset to Defaults button should you choose to restore all the options to their default values. Click Close to exit the Settings box.
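The defaults listed above, and the way Reset to Defaults treats registered servers, can be summarized in a short Python sketch. This is an illustration of the documented behavior only, not how Designer stores its settings. The dictionary keys and function names are hypothetical, the lower bound of 1 on the preview limit is an assumption, and the option with no documented default (Automatically save before running/previewing) is left out.

```python
import copy

MAX_PREVIEW_RECORDS = 100  # documented upper bound for preview

# Default values as listed in this section; key names are hypothetical.
DEFAULT_SETTINGS = {
    "show_welcome_page": True,
    "show_client_log": False,
    "show_animations": True,
    "sound_feedback": True,
    "reopen_last_resource": True,
    "poll_servers": True,
    "preview_record_limit": 100,
    "use_utc": True,
    "auto_smart_link": True,
    "server_side_linking": True,
    "prompt_for_node_name": False,
    "auto_parameter_mappings": True,
    "initial_zoom_percent": 100,
}


def set_preview_limit(settings: dict, requested: int) -> int:
    """Store the preview limit, never exceeding the documented maximum."""
    limit = max(1, min(int(requested), MAX_PREVIEW_RECORDS))  # floor of 1 is assumed
    settings["preview_record_limit"] = limit
    return limit


def reset_to_defaults(servers: list, remove_servers: bool = False):
    """Restore every default; keep the server list unless removal is confirmed."""
    kept = [] if remove_servers else list(servers)
    return copy.deepcopy(DEFAULT_SETTINGS), kept


settings = copy.deepcopy(DEFAULT_SETTINGS)
set_preview_limit(settings, 250)       # stored as 100
settings["show_animations"] = False
# Answering "No" to removing servers keeps the registry intact.
settings, servers = reset_to_defaults(["http://hostname:443"], remove_servers=False)
```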

Sidebar: Foundry

The Foundry is the bottom panel in the Designer's sidebar and contains Component templates for your use.

Components are the building blocks of Snaps and Pipelines: they perform a simple subtask, such as connecting to a database, filtering rows of data, or performing a join. A Component has properties; each property has a type (for example, binary or string) and a value (its value can either be fixed, or made a parameter).
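The idea that a property value is either fixed or supplied through a parameter can be sketched as follows. This is a minimal model written for this explanation, not the SnAPI representation of a property; the class name, its fields, the property names, and the parameter name BATCH are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Property:
    name: str
    kind: type                        # the property's type, for example str or int
    value: Any = None                 # used when the value is fixed
    parameter: Optional[str] = None   # set when the value is made a parameter

    def resolve(self, params: dict) -> Any:
        """Return the fixed value, or look up the parameter at run time."""
        raw = params[self.parameter] if self.parameter else self.value
        return self.kind(raw)


# A fixed property and a parameterized one (illustrative names and values).
filename = Property("filename", str, value="leads.csv")
batch_size = Property("batch_size", int, parameter="BATCH")

runtime_params = {"BATCH": "500"}
resolved = (filename.resolve(runtime_params), batch_size.resolve(runtime_params))
```

A fixed value is the same on every run, while a parameterized value is looked up when the Component runs. In this sketch, a missing parameter raises KeyError.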

The Component templates in the Foundry are generic. They are designed to perform specific functions, but without configuration, they remain mere templates. To create Components, you must configure generic Component templates found in your Foundry to your specific needs and connect them to data sources, other Components, or targets.

The templates available in the Foundry include:

• Component templates provided with your purchase of SnapLogic, in their original version (prior to any configuration you have performed).

• Snaps you have purchased from SnapStore, which consist of Component templates, prior to any configuration you have performed.

To configure a Component, drag it from the Foundry to the canvas and edit its properties. At this point, any work you have done on the object is saved in the Library, and the object, whether complete or in progress, is no longer considered a Component template, but rather, a Component. The Foundry is akin to a hardware store where you procure your materials. The moment you begin to manipulate an object, its new home (with the work you have applied to it) becomes the Library. Refer to the Library to locate any object which you have begun to configure.

Foundry Toolbar

The Foundry toolbar has buttons for the following commands:

• Install New Snap: Use this command to install a new Snap that you have purchased from SnapStore. Refer to "Installing Snaps" for details.

• Search for Snaps: Use this command to search for Components or Snaps by their name, or a portion of their name. If no match is found, a link enables you to access SnapStore, where additional Snaps are available for purchase.

• Refresh Component List: Use this command to refresh your view of the Foundry if you have purchased new Snaps that have yet to appear.

Foundry Categories

The Foundry panel separates its Component templates into categories, which you can filter using the drop-down list. This organization is helpful when you have many Component templates and want to look at them by type.

• The ALL category displays all objects available to you in the Foundry in a tree structure.

• The SnapLogic category displays only Component templates that are shipped as part of SnapLogic.

• Additional categories appear when you purchase Snaps from SnapStore. These Snaps consist of Component templates and install into the Foundry under their own category. For example, if you purchase the SalesForce Snap, a SalesForce category is added to the drop-down list.

Foundry View Tabs

The Foundry panel has two views, in the form of tabs:

• The Foundry view tab displays all objects available to you in the Foundry, organized by object type and name.

• The Recently Used view tab enables you to easily access the objects you last selected.

Sidebar: Library

The Library is the top panel in the Designer's sidebar and stores all of your customized solutions and projects in the form of Components and Pipelines.

The Library contains all the objects that you are in the process of configuring or have finished configuring. Configured Components and Pipelines reside in the Library, as do Components that you located in the Foundry and have begun to configure.

If the Foundry is akin to a hardware store where you procure tools and materials, the Library plays the role of your workshop, where your projects come to life. The moment you begin to configure a Component template, this configured Component appears in the Library. You can create any number of Components from a Component template.

Sidekick

As of 3.7, if you have Sidekick configured, it will display under the corresponding SnapLogic Server.

Data Folder

For each server in the Library pane, there is a “data” folder. You can upload files to this folder, or its sub-folders, from within the Designer by right-clicking the destination folder and selecting Upload.

Library Folders

Notice that in the Library view, Library objects appear to be organized by folder. When you configure a Component or create a new Pipeline in SnapLogic, Designer automatically assigns it a URI and a location in this folder structure. This visual organization helps you conceptually sort the Components and Pipelines you build, and makes them easier to locate. While you cannot create folders or alter the folder structure outside of the URI definition, you can drag Components in and out of these folders as necessary.

However, despite the convenient appearance of this folder structure, Designer references all the Components and Pipelines you create by uniform resource identifiers (URIs), and not by a traditional folder structure. The resources in the URIs it assigns are fully qualified "long pointers," which can point to objects located anywhere, in the cloud just as easily as on premise. Designer thus gives the appearance of a folder structure for your convenience, but uses the URI approach to referencing objects in order to build and execute bi-directional data flows between applications, hiding the complexity of your system architecture where needed.
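To see how a folder view can be derived from URIs alone, consider the following Python sketch. It is not how Designer is implemented; it simply groups a list of URIs by host and path segment. The function name folder_view and the sample URIs are hypothetical.

```python
from urllib.parse import urlsplit


def folder_view(uris):
    """Group resource URIs into nested dicts that mimic a folder tree."""
    tree = {}
    for uri in uris:
        parts = urlsplit(uri)
        segments = [s for s in parts.path.split("/") if s]
        if not segments:
            continue  # a bare server address names no resource
        node = tree.setdefault(parts.netloc, {})
        for folder in segments[:-1]:
            node = node.setdefault(folder, {})
        node[segments[-1]] = uri  # the leaf keeps the full URI
    return tree


# Hypothetical URIs for two resources on one server.
uris = [
    "http://server1.example.com/sales/leads/read_leads",
    "http://server1.example.com/sales/leads_to_db",
]
tree = folder_view(uris)
```

The nested keys give the appearance of folders, while every leaf still holds the fully qualified URI that actually identifies the object.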

Library View Tabs

The Library panel has two views, which you can choose between from tabs at the bottom of the panel:

• The Library view tab displays all objects available to you in the Library, organized in folders.

• The Recently Viewed view tab enables you to easily access the objects you last selected.

Canvas

The canvas is your main workspace in Designer. Create data integration solutions in the canvas by sketching, connecting, and then configuring Components and Pipelines. Sketching refers to the process of dragging Component templates to the canvas and linking them to others. Configuring refers to the Component and Pipeline properties you can edit in the slider.

The following are high-level instructions for using the canvas, which are later discussed in detail:

• Begin by creating Components; do this by selecting one or more Component templates provided in the Foundry, dragging them to the canvas, and renaming them. See "Creating a Component in Designer" for more information.

• Connect multiple Components to form Pipelines. See "Creating a Pipeline in Designer" for more information.

• Highlight a Component to configure its properties in the slider that occupies the lower portion of the canvas. For Components that support data preview, you can preview the results in the slider's Preview tab. See "Configuring a Component in Designer" for more information.

• Edit the field links, or connection properties (that is, configure how fields from each Component map to their downstream Component) by clicking on the link between them to display the Field Linker in the slider below. See "Mapping Components with Field Linker" for more information.

• Execute Pipelines directly from the canvas by clicking the Run Pipeline button in the canvas toolbar. See "Executing Pipelines" for more information.

The canvas occupies the entire work area until you select an object or a link in the canvas, or run a Pipeline. Any one of these three actions displays the slider just below the canvas. Use the canvas for sketching: bringing appropriate building blocks into your data flow and connecting them. Use the slider for configuring the building blocks, their connections, and the resulting Pipelines. The slider is further discussed in the Slider section.

Canvas Tabs and Toolbar

The canvas initially displays a Start Page tab, which you can enable or disable with the Show at startup option on the right, or by editing the General Pipeline Settings.

Designer creates a tab for every Pipeline that is open, or any new Pipeline you create. Each tab displays the name of its Pipeline. You can have multiple tabs open, and drag building blocks into the tab of your choice.


The upper right corner of each tab enables you to control the zoom level for the selected area in that tab. Zooming in and out is especially useful when you have a long or cluttered Pipeline. You can change the default zoom percentage in the Pipeline Settings' Pipeline Options screen.

A toolbar is located on the left side of the canvas. Note that, depending on where the canvas and slider are separated, you may not be able to see the entire canvas toolbar without first dragging down the divider to make the slider smaller and allow the entire toolbar to appear on the canvas.

The canvas toolbar contains the following commands:

• Run Pipeline: Use this command to execute the Pipeline in the active tab.

• Pipeline Properties: Use this command to display the properties of the entire Pipeline in the slider below.

• Rearrange Components: Use this command to arrange the Components in an orderly manner.

• Show Grid: Use this command to display a subtle grid to help you visually follow or align the layout of the objects in the tab.

• Snap to Grid: Use this command to have Designer automatically align the objects in the tab.

• Export Pipeline to PDF: Use this command to save the currently displayed Pipeline to a PDF file.

• Search Pipeline: Use this command to search for specific Components in a Pipeline. This is particularly useful in large, complex Pipelines.

• Drag and Drop New Annotation to Canvas: Use this command to add Pipeline annotations to the canvas, like sticky notes for your Pipeline.

Slider

The slider resides below the canvas. Its appearance is dynamic: it displays in a frame under the canvas when you select a Component or a link on the canvas, or when you run a Pipeline. When you first launch Designer and start to work on the canvas, the slider is not visible. Once you select a Component or link, or run a Pipeline, the slider appears.

You can also maximize the slider to occupy the entire area of the canvas, or open it in its own tab alongside the other tabs topping the canvas. The content of the slider varies with the object selected:

• If a Component is highlighted on the canvas, the slider is in Component mode, and displays Component properties you can configure. The properties presented vary with the type of Component and the functions it supports.

• If a connection, or link, is highlighted on the canvas, the slider displays the Field Linker.

• If you execute a Pipeline from the canvas, the slider goes into Pipeline mode and displays Pipeline properties and execution data.

The slider's title bar displays the name and type of object highlighted in the canvas above.

Slider Commands

Regardless of which page the slider displays, the following commands are available:

l Save: Use this command to save the work you are doing in Designer to the SnapLogic Server repository.

l Suggest: Use this command to invoke automatic fill-in suggestions. This button is only enabled when you are viewing Components that are eligible for suggestions. Refer to the "Suggestions for Component Properties, Inputs, and Outputs" section for more details about this function.

l Validate: Use this command to validate Components and Pipelines. Refer to the "Validating a Component" section for more details about this function.

Library Toolbar

The Library toolbar has buttons for the following commands:

l New Pipeline: Use this command to create a new Pipeline. A URI dialog box asks you to name the new Pipeline, and then the canvas is primed for you to begin sketching.

l Add SnapLogic Server: Use this command to add other SnapLogic Servers to the Library.

l Delete Selected Item: Use this command to delete an object you have highlighted in the Library.

l Search for Snaps and Pipelines: Use this command to search for objects in the Library using their name or a portion thereof.

l Refresh Selected Server: Use this command to refresh your view of Library objects.

Slider in Component Mode

Selecting a Component on the canvas opens the slider in the lower frame. Here, you can configure the Component.

The slider's Component information is divided into a series of pages. Navigate between them by clicking the oval-shaped page names. Note that the pages available vary with the type of Component selected in the canvas. For example, if you have selected a Component that accepts inputs but produces no outputs, the slider menu displays an Input page, but neither an Output nor a Preview page.

For more information on using the slider in this mode, refer to the Components section.

Slider in Field Linker Mode

The Field Linker displays when you select the link between two Components.

To connect two Components to each other, select the bottom (output) frame of the first Component and drag your mouse to join it to the top (input) panel of the downstream Component.

For example, the output frame of the Leads Component appears purple, and is joined to the green input frame of the Prospects Component. The line between them indicating their connection has a ring in its center; this ring represents the link between the Components. Selecting the link displays the Field Linker in the slider. The Field Linker displays output fields in a column alongside input fields of the downstream Component.

The Field Linker automatically suggests field-to-field mappings when field names are the same (for example, the "Address" output field of the Leads Component is automatically linked to the "Address" input field of the Prospects Component). You do not have to accept these suggestions. Change individual links by clicking on the output field in question and selecting an alternative from the drop list.
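The name-based suggestion described above can be pictured with a small Python sketch. This is only an illustration, not SnapLogic code: the function name `suggest_links`, the exact-match comparison, and every field name other than "Address" are assumptions made for the example.

```python
def suggest_links(upstream_outputs, downstream_inputs):
    """Propose a link for each downstream input field.

    A field is linked to the upstream output with the same name
    (exact match is assumed here); otherwise it is left unlinked (None).
    """
    available = set(upstream_outputs)
    return {
        field: (field if field in available else None)
        for field in downstream_inputs
    }


# Hypothetical output fields of "Leads" and input fields of "Prospects".
leads_outputs = ["Name", "Address", "Phone"]
prospects_inputs = ["Address", "Email"]

print(suggest_links(leads_outputs, prospects_inputs))
```

In this sketch, "Address" is linked automatically, while "Email" has no same-named upstream field and is left for you to link by hand.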

Slider in Pipeline Mode

If you have Components linked to each other in the canvas, you have a Pipeline. When you are ready to execute the data flows within the Pipeline, click the Run Pipeline button in the canvas toolbar. The slider now displays Pipeline information.

For detailed information on working with Pipelines, refer to the Pipelines section.

Components

A Component is an object used to perform a simple subtask, such as read, write, or act on data. Strung together, Components are the building blocks of Pipelines, or data flows. Components are generally classified as connectors (these are further divided into consumers and producers: Components that read or write data, respectively), and operators (Components that perform an action, such as a join or filter, on data).

Basic templates for Components are included in your SnapLogic installation (refer to the Component Reference for the list of Component templates that ship as part of SnapLogic), and reside in the Designer's Foundry panel. These generic templates, once configured, become configured Components that are stored in the SnapLogic Server repository, and can be found in the Designer's Library panel.

Creating and Configuring a Component

In this context, creating a Component refers to configuring an instance of a generic Component template to read, write, or operate upon data.

To create a Component, drag an unconfigured Component template from the Designer's Foundry panel onto the canvas.

If you are in configuration mode, continue with the instructions in this section. If you are in sketching mode and prefer to configure afterward, connect the Components to data sources, other Components, or data targets to form Pipelines, as described in the Pipelines section--and then return to this section for configuration instructions. Configured Components are stored in the SnapLogic Server repository and can be found in the Designer's Library panel.

Creating a Component in Designer

To create a Component in Designer:

1. Double-click the desired Component template in the Foundry.

2. Specify a URI in the New [Component Type] prompt that displays.

3. Click OK. The Component you just created now displays in the Library. The canvas displays Component properties for you to configure.

4. Configure the Component properties in the canvas, referring to the list of Component Configuration Properties if required.

5. If you wish to validate your work, click Validate, and refer to the "Validating a Component" section for details.

6. Click Save to save your Component (whether complete or not) in the SnapLogic Server repository. You can also return to the configuration step later. Save the Component as it is now, and at some later point, drag it from the Library to the canvas to edit its properties in the slider.

Alternatively, if you wish to accept the default URI, you can create a Component by dragging the Component template onto the canvas.

Configuring a Component in Designer

The following tabs open to Component configuration screens for most Components in the slider:

l General: General information about the selected Component. Refer to the "General Component Information" section for details.

l Properties: Properties vary for every Component. Some properties can be edited. Refer to the "Configuring Component Properties" section for details.

l Output: The Output tab only displays for Components that produce outputs, or data that can serve as inputs to downstream Components. Refer to the "Specifying Component Outputs" section for details.

l Input: The Input tab only displays for Components that consume input data for the purpose of performing functions on them or for passing them through to another downstream Component. Refer to the "Specifying Component Inputs" section for details.

l Parameters: Parameters are variables that can be used for runtime substitution in the properties of Components. You can use parameters to avoid hard-coding property values that are likely to change. Using parameters enables you to use a single Component or a single Pipeline for multiple purposes. The Parameters tab displays Component parameters and their default values. You can edit this information, and add or remove parameters. Refer to the "Component Parameters" section for more information on parameters.

l Preview: The Preview tab only displays for Components that support preview, and enables you to examine a subset of the Component output rows without actually executing the data integration scenario. Refer to the "Previewing Component Execution" section for details.

General Component Information

The General screen describes a Component's basic information:

l URI: The unique URI that SnapLogic assigns to the Component.

l Component: The Component used. Once you drag a Component to the canvas, save it with a name of your choice and configure it, the type of function it performs may not be immediately obvious. This field displays the original Component template used in your Component.

l Version: The version of the Component used in the Pipeline. This may be important when verifying the functionality available in that Component.

l Created by: The username that defined this Component. This is only visible on newly placed Components in release 3.7 or later, not on Components upgraded from a previous release.

l Created on: When the Component was defined in the Library. This is only visible on newly placed Components in release 3.7 or later.

l Modified by: The username that last modified this Component. This is only visible on newly placed Components in release 3.7 or later.

l Modified on: When the Component was last modified. This is only visible on newly placed Components in release 3.7 or later.

l Author: Optionally enter the username of the Component creator.

l Description: Optionally enter a description for the Component.

Configuring Component Properties

Using the buttons on the Properties tab, you can display the properties in either a list view or a tree view. Pausing your cursor over a property name displays an explanation of the property. Use the Value column to edit properties that allow editing. Properties vary for every Component. Refer to the Component Reference Guide for details about each Component's specific properties.

The following list is a small sampling of properties common to many Components:

l Credentials: Username

l Credentials: Password

l Delimiter

l File name

l Is input a list

l Input field

l Output field

l Error Handling (Refer to the "Error Handling to Address Connection Problems and Data Errors" section for details.)

Specifying Component Outputs

Outputs are pieces of data a Component produces that can serve as inputs to downstream Components or Pipelines. Not all Components produce outputs. The Output tab only displays for Components that produce outputs. In this screen, you can configure the Component's output fields. You can change the default field names, types, and descriptions. You can add (using the Add Output button) and subtract (using the Delete Output button) outputs and their fields.

Depending on the Component, a message may appear to the right of the properties list, explaining that the Component being edited does not support pass-through. You can enable or disable pass-through messages by editing the Resource Options in your Pipeline Settings.

You can also add error outputs to Components so that erroneous records are collected in a specified destination file. This practice enables you to address the failed records while the remaining records in the Pipeline execute. Refer to the "Error Handling to Address Connection Problems and Data Errors" section for details.

Specifying Component Inputs

Inputs are pieces of data that a Component consumes for the purpose of performing functions on them or (when supported) passing them through to another downstream Component or Pipeline. Not all Components accept input data. The Input tab only displays for Components that accept inputs. Here, you can configure the Component's input fields. You can change the default field names, types, and descriptions. You can add (using the Add Output button) and subtract (using the Delete Output button) inputs and their fields.

Previewing Component Execution

The Preview tab only displays for Components that support preview. This tab has the following subtabs:

l Run: Begin with this tab. If your Component has runtime parameters, enter their values here to run the preview. Click Run when you are ready.

l Preview Data: Here is where you can view the output rows generated as the preview. The default maximum number of rows generated is 100. If the data source has fewer records than the maximum setting, the preview shows all the records available. If you want faster previews or simply need smaller samples, you can change the maximum from 100 to any lower number in the Resource Options of your Pipeline Settings. When the data is too large to be displayed, the data is truncated by an ellipsis hyperlink. Clicking on that link shows the complete data in another window. Use the Print and Copy buttons to print the rows generated in the preview, or to copy them to your Microsoft Office Clipboard in tab-delimited format that is ready to be pasted into a spreadsheet application.

l Runtime Information: Visit this tab to view runtime results. You can see the start and end times of your preview run, and its status.

Suggestions for Component Properties, Inputs, and Outputs

SnapLogic can make "suggestions" about Component properties, inputs, and outputs, based on the values of their properties. These suggestions can help you complete your Components. For example, after specifying the filename property of a CSV Read Component, the Component can inspect the file and determine the field delimiter, the number and type of output fields, and even the field names (if a header row is present).

Components are not required to suggest all fields, and in some cases, meaningful suggestions are impossible. You can choose to accept or reject any suggestions made.
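To give a feel for the kind of inspection a CSV suggestion involves, the following Python sketch uses the standard library's `csv.Sniffer` to guess a delimiter and detect a header row from a sample of text. It is an analogy only: it does not show how the CSV Read Component works internally, and the function name, the candidate delimiter list, the fallback field names, and the sample data are all assumptions made for this example.

```python
import csv
import io


def suggest_csv_settings(sample, candidates=",;\t|"):
    """Guess the delimiter, header-row count, and field names of CSV text."""
    sniffer = csv.Sniffer()
    # Restrict the guess to a few likely delimiters (an assumption).
    dialect = sniffer.sniff(sample, delimiters=candidates)
    has_header = sniffer.has_header(sample)

    first_row = next(csv.reader(io.StringIO(sample), dialect))
    if has_header:
        fields = first_row
    else:
        # Hypothetical placeholder names when no header row is detected.
        fields = ["Field%03d" % (i + 1) for i in range(len(first_row))]

    return {
        "delimiter": dialect.delimiter,
        "header_rows": 1 if has_header else 0,
        "fields": fields,
    }


# Hypothetical file contents.
sample = "id;name;city\n1;Ann;Paris\n2;Bob;Rome\n3;Cy;Oslo\n"
print(suggest_csv_settings(sample))
```

As with Component suggestions, the result of such a heuristic is a proposal: it can be wrong for unusual files, which is why you can accept or reject it.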

Component Suggestions in Designer

If a Component can provide suggestions, the Suggest button at the bottom of the slider is enabled. When you click Suggest, the Designer displays a dialog containing the Component's suggestions.

For example, setting the filename property to the name of a .csv file produces suggestions from the CSV Read Component regarding the output fields and how many header rows to skip.

Suggestions are followed by a Confirm Action prompt, which you can opt out of seeing.

Components for Connecting to Databases

You can use SnapLogic to read data from or write data to one or more databases. SnapLogic provides the following DB Components:

l DB Reader: Use this Component to select data from a database.

l DB Writer: Use this Component to insert, update or delete data in a database.

l DB Lookup: Use this Component to perform per record lookups, sometimes referred to as probes.

l DB Upsert: Use this Component for "upsert" (merge) functionality, updating existing or inserting new rows into a database.

Creating a Component from these templates requires that you also create a database-specific connection Component using the appropriate DB Connection Component template. SnapLogic provides DB Connection Components for a variety of databases, including:

l DB Connection - DB2

l DB Connection - JDBC

l DB Connection - MySQL

l DB Connection - Oracle

l DB Connection - PostgreSQL

l DB Connection - SQL Server

l DB Connection - SQLite

With a DB Connection Component, you specify the connection details of the target database, providing such information as database name, host name, and port number. Every DB Reader/Writer/Lookup Component contains a DB Connection Component property field in which you specify the URI of the DB Connection Component containing the information about the database to which the Reader/Writer/Lookup Resource is to connect. This allows connection-specific details to be centralized in a single Component that can be shared by multiple DB Reader/Writer/Lookup Components.

See the Component Reference (http://www.snaplogic.com/docs/component-reference/component-reference.htm) for detailed information about Components.

Connecting to Databases in Designer

The following example illustrates the properties of a DB Reader Component that connects to a MySQL database, called "Sales."

The DB Reader Component specifies the URI of the DB Connection Component to use, in addition to the actual SQL statement used to read from the database.

When building a Pipeline with DB Reader/Writer/Lookup Components, drag the DB Connection Component onto the canvas. Note that the DB Components have drop lists where you can select which DB Connection Component to use. This selection overrides the connection property you specified in the Component itself and allows parameter replacement to be performed.

The following illustration shows an example Pipeline that accesses three distinct databases with two DB Readers that read sales and account history data, and one DB Writer that updates the HR database with commission information.

Components with Pass-Through Fields

The pass-through capability allows Components to accept fields that are not specified as inputs, and pass these fields directly as their outputs. When you link two Components, only those fields specified as inputs of the downstream Component must be linked. All the remaining unlinked outputs of the upstream Component are passed through to the downstream Component's output.

The following diagram illustrates the concept of pass-through fields.

With pass-through, the inputs of Components in a Pipeline need not be explicitly designed to handle all the incoming fields from upstream Components. Component inputs only need to specify fields that the Component requires for its computations. This reduces the field linking to the absolute minimum.

Not all Components support pass-through. Each Component description in the Component Reference declares whether it supports pass-through or not.

Examples

The following scenarios provide examples of how pass-through works.

Scenario 1:

[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A; Output: A; Pass-through: Yes)

l Component I is linked to Component II and field A is mapped from I's output view to II's input view.

l With pass-through enabled, fields A,B,C are available in the output of Component II.

Scenario 2:

[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A,B; Output: A; Pass-through: Yes)

l Component I is linked to Component II and field A is mapped from I's output view to II's input view.

l Field B from Component I is mapped to field B in Component II.

l With pass-through enabled, fields A,C are available in the output of Component II. Note that field B is not available in the output of Component II due to the fact that it is only defined in the input view of the Component.

Scenario 3:

[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A,B; Output: A; Pass-through: No)

l Component I is linked to Component II and field A is mapped from I's output view to II's input view.

l Field B from Component I is mapped to field B in Component II.

l With pass-through disabled, field A is available in the output of Component II. Field C is not visible since pass-through is not enabled.

Scenario 4:

[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A,B; Output: A,B; Pass-through: Yes)

l Component I is linked to Component II and field A is mapped from I's output view to II's input view.

l Field B in Component II is mapped to NULL.

l With pass-through enabled, fields A,B,C are available in the output of Component II.
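The four scenarios can be reproduced with a short Python model. It is one reading that is consistent with the scenarios above, not a description of the SnapLogic Server's implementation: the output of the downstream Component is taken to be its declared output fields plus, when pass-through is enabled, every upstream field that is not linked to one of its inputs. The function name and argument layout are assumptions made for the example.

```python
def output_fields(upstream, links, declared_outputs, pass_through):
    """Return the fields available in the downstream Component's output.

    upstream         -- output fields of the upstream Component
    links            -- downstream input field -> upstream field,
                        or None when the input is mapped to [NULL]
    declared_outputs -- fields defined in the downstream output view
    pass_through     -- whether pass-through is enabled
    """
    result = list(declared_outputs)
    if pass_through:
        linked = {src for src in links.values() if src is not None}
        for field in upstream:
            # Unlinked upstream fields flow straight to the output.
            if field not in linked and field not in result:
                result.append(field)
    return sorted(result)


abc = ["A", "B", "C"]
print(output_fields(abc, {"A": "A"}, ["A"], True))                   # Scenario 1
print(output_fields(abc, {"A": "A", "B": "B"}, ["A"], True))         # Scenario 2
print(output_fields(abc, {"A": "A", "B": "B"}, ["A"], False))        # Scenario 3
print(output_fields(abc, {"A": "A", "B": None}, ["A", "B"], True))   # Scenario 4
```

In Scenario 2, field B is linked to an input but is not declared in the output view, so it is consumed and does not appear downstream. In Scenario 4, the input B is mapped to NULL, so the upstream field B remains unlinked and is passed through.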

Configuring Pass-Through Fields with Designer

The SnapLogic Designer provides a simple interface for configuring pass-through. When configuring Component output properties for any Component that supports pass-through, toggle the checkbox of the input whose fields you want to make available for pass-through linking.

The following figure provides an example of the input fields accepted by the Filter Component ("FilterLeads").

"FilterLeads" is the second Component in the Filtered_Qual_CA_Prospects Pipeline. Its upstream Component is a CSV Read Component named "Leads." The Leads Component is reading data from a comma-separated file and passing it through a filter. The Filter Component, "FilterLeads," is selective about which rows it accepts as inputs, but does not operate on these rows, and outputs them directly to the next downstream Component.

The previous figure displays the Input side of the filter Component, "FilterLeads." These are the inputs it is accepting from the upstream "Leads" Component.

The next figure displays the Output side of the Filter Component. Notice that the pass-through option for fields coming from Input1 is enabled; that is, the Component is specified to accept pass-through fields.

Populating Default Values in Output Fields

Providing a Default Field Value to Replace NULL

A common scenario addresses the need to set default values for fields in a record when the field value is NULL. To this end, create a Component using the String Operations Component.

The ifnull(X,Y) operator returns Y if X is NULL; otherwise, it returns X. You can specify constant values for the default value, or reference other fields as shown in the example below.
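The semantics of ifnull(X,Y) can be modeled in a few lines of Python, with None standing in for NULL. This is only a sketch of the behavior described above, not SnapLogic expression syntax; the record and its field names are hypothetical.

```python
def ifnull(x, y):
    """Return y when x is NULL (None here); otherwise return x."""
    return y if x is None else x


# Hypothetical record with two NULL fields.
record = {"Phone": None, "Mobile": "555-0100", "Country": None}

phone = ifnull(record["Phone"], record["Mobile"])  # default taken from another field
country = ifnull(record["Country"], "US")          # default is a constant

print(phone, country)
```

Note that only NULL triggers the default in this model; an empty string is a value and is returned unchanged.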

Providing a NULL Default Field Value

Another common scenario involves a downstream Component with an input field that is not present in the upstream Component. For example, Application A may have a field for PagerNumber, whereas Application B has no such field. You can specify NULL for this value in the Field Linking dialog. For each field, one of the available values in the list is [NULL]. If you select this value, that field's value is set to NULL for each record.

Transforming Field Values

If you need to transform values you can use the CASE expression to achieve complex value replacement.

A simple example is changing True and False strings to Yes and No strings. The following is an example of the expression in this case:

CASE WHEN (${Field001} = 'True') THEN 'Yes' WHEN (${Field001} = 'False') THEN 'No' ELSE NULL END

You can perform complex expressions to calculate the replacement values or reference additional fields.
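For readers more familiar with procedural code, the True/False replacement above corresponds to the following Python sketch, with None standing in for NULL. The function name is hypothetical, and the comparison is assumed to be an exact, case-sensitive string match.

```python
def true_false_to_yes_no(value):
    """Mirror the CASE expression: 'True' -> 'Yes', 'False' -> 'No', else NULL."""
    if value == "True":
        return "Yes"
    if value == "False":
        return "No"
    return None  # the ELSE NULL branch


for raw in ("True", "False", "Maybe", None):
    print(repr(raw), "->", repr(true_false_to_yes_no(raw)))
```

Any value that matches neither branch, including NULL itself, falls through to the ELSE branch.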

Component Parameters

Parameters are variables you can set for run-time substitution in Component and Pipeline properties. Use parameters to avoid hard-coding property values that are likely to change; this enables you to use a single Component for multiple purposes.

For example, to create a Component that reads data files created on a daily basis, use a parameter for the filename property, as follows:

File name $?{INPUTFILE}

You can also use parameters to specify only part of a property. In the following example, the user needs to specify only the date, rather than the entire path to the data file:

File name file://data/logs/revenue_$?{DATE}.csv

A property can contain any number of parameters. In the following example, the user can specify the report type and date:

File name file://data/logs/$?{REPORT}_$?{DATE}.csv

When defining parameters, the Component author has the option to specify a default value, or to require that the Component user provide a value at runtime. If the author specifies a default value, the parameter is called an Optional Parameter. If no default value is specified, the parameter is referred to as a Required Parameter. A Component with a required parameter cannot be executed without a value specified at runtime. Refer to the following table for a breakdown of this concept.

User-Supplied Value    Component Parameter Type    Result
Value                  Required/Optional           Value
None                   Required                    Error: No value provided for required parameter
None                   Optional                    Component parameter's default value

Parameter Syntax in Designer and SnAPI

When authoring a Component using either the Designer or the SnAPI programmatic interface, a parameter is specified using the $?{PARAMETER} syntax.
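The substitution rules for $?{PARAMETER} can be sketched in Python as follows. This is an illustrative model of the behavior described in this section, not SnapLogic code: the function name, the set of characters allowed in a parameter name, and the use of None to mark a required parameter are assumptions made for the example.

```python
import re

# Matches $?{NAME}; the allowed name characters are an assumption.
PARAM = re.compile(r"\$\?\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(template, defaults, supplied):
    """Replace every $?{NAME} in a property value.

    defaults -- parameter name -> default value, or None for a
                required parameter (no default)
    supplied -- parameter name -> value provided at runtime
    """
    def resolve(match):
        name = match.group(1)
        if name in supplied:
            return supplied[name]          # user-supplied value wins
        if defaults.get(name) is not None:
            return defaults[name]          # optional: fall back to default
        raise ValueError("No value provided for required parameter: " + name)

    return PARAM.sub(resolve, template)


file_name = "file://data/logs/$?{REPORT}_$?{DATE}.csv"
defaults = {"REPORT": "revenue", "DATE": None}  # DATE is required

print(substitute(file_name, defaults, {"DATE": "2013-10-01"}))
```

Calling `substitute` without a value for DATE raises an error in this sketch, which corresponds to the "None / Required" row of the table above.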

Specifying Parameter Values at Runtime in Designer

When executing Components or Pipelines from within the SnapLogic Designer, the Run tab in the slider displays the parameter names and their default values (if any), and enables you to specify the values you want to set. In the following example, the Leads_to_Prospects Pipeline is open on the canvas, and the slider displays the Pipeline Run information. The parameter, "LEADS," contains a value pointing to the file required; the value of the parameter, "INPUT_DELIMITER," is set to a comma, and so forth.

Validating a Component

When you create a Component, you can invoke the validation process at any point. Validation can perform basic checks, such as:

l Ensure that all the required properties have values specified

l Confirm that property values with constraints are set to valid values

l Check that all required input and output views are defined

l Verify that field and view name references are correct

In addition, each Component can perform more advanced validation tasks that are specific to the Component's function. For example, validation can ensure that a specified file exists or that a user name and password are valid.

Validating Components in Designer

The Designer displays a Validate button at the bottom of the slider when you view any Component that supports validation. Clicking Validate sends the Component to the SnapLogic Server for validation. If anything fails to validate, the SnapLogic Server replies with a message listing the items in the Component that require attention. Most Components enumerate all the validation failures at once, but there are situations where further validation failures are only detectable after previous ones are rectified. For example, until a missing database hostname property is rectified, it is not possible to determine whether the user credentials are correct.
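The idea of reporting all basic failures in a single pass can be sketched in Python. This is a simplified illustration covering only two of the basic checks (required properties and constrained values); the function name, the shape of the `spec` dictionary, and the message wording are assumptions and do not reflect the SnapLogic Server's actual validation API.

```python
def validate(properties, spec):
    """Collect every basic validation failure instead of stopping at the first.

    properties -- property name -> configured value
    spec       -- property name -> {"required": bool, "allowed": set or absent}
    """
    problems = []
    for name, rule in spec.items():
        value = properties.get(name)
        if value in (None, ""):
            if rule.get("required"):
                problems.append("%s: required property has no value" % name)
            continue  # nothing further to check for an empty property
        allowed = rule.get("allowed")
        if allowed is not None and value not in allowed:
            problems.append("%s: %r is not a valid value" % (name, value))
    return problems


# Hypothetical specification for a file-reading Component.
spec = {
    "File name": {"required": True},
    "Delimiter": {"required": True, "allowed": {",", ";"}},
}

for problem in validate({"Delimiter": "|"}, spec):
    print(problem)
```

Checks that depend on an earlier one succeeding, such as testing credentials once a hostname is present, would run only after these basic failures are cleared.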

Validating Components in the Management Console

Outside of the design process, you can also validate Pipelines and their Components in an execution context. When the Management Console displays Pipelines or the Components within a Pipeline, you can run validation from the console. Refer to the "Management Console" section for details.

Pipelines

A Pipeline is a collection of Components linked together to orchestrate a flow of data between end points. For example, a simple Pipeline may read data from an RSS feed, reformat it, and write it to a database.

Creating a Pipeline in Designer

Creating a Pipeline involves selecting the Components that represent data sources, targets, and any data manipulation operations, and linking them together in the logical order.

The main modes of action you can take on Pipelines in Designer are:

l Sketch: Drag and drop Components onto the canvas and link them, leaving configuration for another time. (Refer to the Field Linking section for details.)

l Configure: Click on the Pipeline's Components and links displayed on the canvas, and use the slider to configure them. (Refer to the Component Configuration and Field Linking sections for detailed configuration instructions.)

l Run: Execute Pipelines from the canvas and examine their runtime information in the slider. (Refer to the Executing Pipelines in Designer and Scheduling Pipeline Execution sections for detailed execution instructions.)

Follow these high-level steps to work with Pipelines in Designer:

1. Click New Pipeline and give the new pipeline a name. A tab for the Pipeline opens in the canvas.

2. Drag the desired Component templates from the Foundry, or Components from the Library, and drop them onto the canvas.

3. Configure the Components in the slider (refer to the section Configuring a Component in Designer for more details). You can also reverse the order of this step with the next; that is, you can link the Components and then configure them.

4. Link the Components together in the desired order by clicking on each Component and dragging the connecting arrow to the desired downstream Component.

5. Configure the field links by clicking on each link that connects two Components to display the slider's field linking dialog.

6. Click Run to edit Pipeline properties and execute the Pipeline.

Configuring a Pipeline in Designer

Configuring a Pipeline can refer to configuring the following types of properties:

l Component properties: this refers to configuring the properties of each Component participating in the Pipeline. This is discussed in detail in the "Configuring a Component in Designer" section.

l Connection properties (field links): this refers to configuring the properties of each link between Components that participate in the Pipeline. This is discussed in detail in the "Mapping Components with Field Linker" section.

l Pipeline properties: this refers to configuring the properties of the Pipeline as a whole. It is discussed in this section.

Pipeline Properties

To access Pipeline Properties, click Run on the canvas toolbar next to the open Pipeline you are viewing. The slider displays Pipeline Properties.

Pipeline Properties include a number of pages. Navigate between them by clicking the oval-shaped page names, as described:

l General: This tab contains general information about the Pipeline. Refer to the "General Pipeline Information" section for details.

l Input: This tab displays inputs defined at the Pipeline level. Refer to the "Specifying Pipeline Inputs" section for details.

l Output: This tab displays outputs defined at the Pipeline level. Refer to the "Specifying Pipeline Outputs" section for details.

l Parameters: Use this page to define and set values for parameters at the Pipeline level and to ensure that all Components with required parameters are mapped to Pipeline parameters. Refer to the "Pipeline Parameters" section for details.

l Run: Use this page for tasks associated with executing pipelines, such as inputting runtime parameter values, selecting a data tracing option, previewing data, executing the Pipeline, and monitoring its status. Refer to the "Executing Pipelines" section for detailed information.

General Pipeline Information

This screen contains general information about the Pipeline.

l URI: The unique URI that SnapLogic assigns to the Pipeline.

l Component: The Component used. The value here is "Pipeline."

l Author: Optionally enter the username of the Pipeline creator.

l Description: Optionally enter a description for the Pipeline.

l Related Pipelines: This button enables you to enter optional metadata information about other Pipelines that together provide useful data streams; this option facilitates "data serendipity." You can describe the correlation between a field from the Pipeline in question and a parameter in the target Pipeline. This information is metadata only; the SnapLogic Server does not act on this information. Click Related Pipelines and follow these steps to add a relation:

1. In the Related Pipelines screen, enter the URI of a Pipeline or click the browse button to locate a Pipeline on any server for which your Designer is configured. If you use the browse option, the Select Pipeline screen displays.

2. Optionally enter a Display name for this entry and click Add.

3. You can then add correlation information, clicking Add Row or Remove when necessary, and specifying the source field and target parameter names for each row.

4. Click OK to finish.

l Scheduler: This button launches the Scheduler, which enables you to execute Pipelines automatically at designated times. Refer to the "Scheduling Unattended Pipeline Execution" section for details.

Specifying Pipeline Inputs

This page displays the inputs defined at the Pipeline level. Pipeline-level inputs are mapped to Component-level inputs in the Pipeline.

You can add and remove inputs by using the + and - buttons. Clicking the Add (+) button displays the Add Input Mapping dialog. Enter a name for the input and select the Component to which your Pipeline-level input maps. Click Finish, and find your new input displayed in an additional tab on the Input page.

Specifying Pipeline Outputs

This page displays any outputs defined at the Pipeline level. Pipeline-level outputs map to Component-level outputs within the Pipeline.

You can add and remove outputs by using the + and - buttons. Clicking the Add (+) button displays the Add Output Mapping dialog. Enter a name for the Pipeline-level output and select the Component to which it maps. Click Finish, and find your new output displayed in an additional tab on the Output page.

Pipeline Parameters

Use this page to define Pipeline-level parameters. Use it also to ensure that all Components with required parameters are mapped to Pipeline parameters, and that all required Pipeline parameters have values.

Parameters are variables you can set for runtime substitution in Component and Pipeline properties. Use parameters to avoid hard-coding property values that are likely to change; this enables you to use a single Component for multiple purposes. Pipelines can define parameters which are then mapped to Component parameters. Like Component parameters, Pipeline parameters can be either Required Parameters or Optional Parameters. Refer to the following table for a list of outcomes for each combination of Pipeline and Component parameters.

User-Supplied Value | Pipeline Parameter Type | Component Parameter Type | Result
Value               | Required/Optional       | Required/Optional        | Value
None                | Required                | Required/Optional        | Error: No value provided for required parameter
None                | Optional                | Required/Optional        | Pipeline parameter's default value

A Pipeline must meet the following requirements in order to execute:

l All required Pipeline parameters must have values provided by runtime.

l All of the Pipeline's Components with required parameters must be mapped to Pipeline parameters.
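The outcomes in the parameter table can be sketched in a few lines of Python. This is only an illustration of the rules, not SnapLogic code: the function name, the exception class, and the sample file path are hypothetical.

```python
from typing import Optional


class MissingParameterError(Exception):
    """Raised when a required Pipeline parameter has no value at runtime."""


def resolve_parameter(name: str,
                      user_value: Optional[str],
                      required: bool,
                      default: Optional[str] = None) -> Optional[str]:
    """Return the value a Pipeline parameter takes at runtime.

    Mirrors the three rows of the table: a user-supplied value always wins,
    a missing value for a required parameter is an error, and a missing
    value for an optional parameter falls back to the Pipeline default.
    """
    if user_value is not None:
        return user_value
    if required:
        raise MissingParameterError(
            f"No value provided for required parameter {name}")
    return default


# Hypothetical values, for illustration only.
print(resolve_parameter("LEADS", None, required=False,
                        default="file://leads.csv"))
```

The Component parameter type does not appear in the function because, per the table, it does not change the outcome.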

Consult the following figure for an example of a parameter mapping.

The first Pipeline Parameter (that is, a parameter at the Pipeline level) is LEADS, which, in the Mapped To column, is mapped to the Leads.INPUTFILE Component parameter, with a Default Value that specifies the path of the .csv input file containing information on sales leads.

Updating a Pipeline in Designer

When you add a Component to a Pipeline, you are adding a copy of that object. The Pipeline recognizes the interfaces of the Component (that is, the Component's inputs, outputs, and parameters--not the Component's property values) as they are defined at that point in time. While the Pipeline always sees a Component's current property values, subsequent changes to the Component's interface are not reflected automatically in the Pipeline's copy of that object.

If you open a Pipeline in Designer, a background check verifies that its Components are up-to-date. If any Components within the Pipeline have been automatically updated to reflect changes, a message displays to inform you.

If you have completed a Pipeline that the Scheduler now executes, and then you modify the interface of one of its Components, future scheduled runs of that Pipeline may fail. To update the Pipeline with the new version of the Component you have changed, you can simply open it in Designer, where it will automatically submit to a background check. You can also delete the object from the Pipeline and add it again.

How to Move a Pipeline

1. Select the Pipeline in the Library, then choose Library > Move. The Select Destination dialog opens.

2. Select the new location for the Pipeline and click OK. The Pipeline and its related resources are copied over.

Mapping Components with Field Linker

After you connect Components to each other to indicate the direction of data flow, you perform the more specific process of field linking. Field linking is the detailed act of mapping Component outputs to inputs of downstream Components.

Field Linking in Designer

To connect two Components in Designer, select the bottom (output) frame of the first Component and drag your mouse to join it to the top (input) panel of the downstream Component. To link fields, select the connection between two Components. The slider displays the Field Linker.

The output frame of the "from" Component appears purple, and is joined to the input frame of the "to" Component, usually green. The line between them indicating their connection has a ring in its center; this ring represents the link between the Components. Selecting the link displays the Field Linker in the slider. The Field Linker displays output fields in a column alongside input fields of the downstream Component.

The Field Linker automatically suggests field-to-field mappings when field names are the same (for example, the "Address" output field of the upstream Component is automatically linked to the "Address" input field of the downstream Component). You do not have to accept these suggestions. Change individual links by clicking on the output field in question and selecting an alternative from the drop-down list.

Note: If you do not want automatic Smart Link suggestions, disable the option in the Pipeline Options of your Pipeline Settings.

When the Field Linker finds no obvious correspondence between fields, it displays only a drop-down list in the output column, from which you can select the appropriate field. For example, perhaps SnapLogic did not suggest an output field corresponding to the "Work_Phone" input. Click Select Field next to "Work_Phone" and choose the other Component's "Phone_w" field as the appropriate output. This link is manual, as opposed to the automatic suggestions provided by SnapLogic.

On the right of the input column is a set of tools. The Field Linker's Tools include:

l Smart Link: Use this command to generate field linking suggestions. The Field Linker automatically suggests field-to-field mappings when field names are the same (for example, the "Address" output field of a CSV Read Component is automatically linked to the "Address" input field of the CSV Write Component). By default, these suggestions already display when the Field Linker first opens. However, if you have changed this default behavior in your settings, or if you have cleared the auto-generated suggestions, click Smart Link to regenerate them. A confirmation prompt displays to warn you that any fields you have manually linked will be unlinked should you continue.

l Clear All: Use this command to clear all field linkings, manual and auto-generated alike. A confirmation prompt displays to warn you that any fields you have manually linked will be unlinked should you continue. The result is that all of the rows in the output Component column prompt you to select a field, as shown in the figure "Field Linker: Clear All Command".

l Null All: Use this command to set all output fields to NULL. This means that the fields in question provide no output values for the downstream Component. A confirmation prompt displays to warn you that any fields you have manually linked will be unlinked should you continue. The result is that all of the rows in the output Component column display "NULL."

l Null Remaining: Use this command to set any unlinked fields to NULL. This means that all fields that have been auto-linked or fields you have linked manually retain their links, while any fields that remain unconnected will provide no output values for the downstream Component. The result is that all unconnected fields in the output Component display "NULL."
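The effect of the four Field Linker tools can be modeled with a small Python sketch. It is not the Designer's implementation: the function names are hypothetical, and exact, case-sensitive name matching is an assumption made for illustration.

```python
from typing import Dict, List, Optional

UNLINKED = None   # the row still prompts you to select a field
NULL = "NULL"     # the row provides no value to the downstream Component

Links = Dict[str, Optional[str]]  # downstream input field -> upstream output field


def smart_link(output_fields: List[str], input_fields: List[str]) -> Links:
    """Suggest a link wherever an output and an input share the same name."""
    available = set(output_fields)
    return {name: (name if name in available else UNLINKED)
            for name in input_fields}


def clear_all(links: Links) -> Links:
    """Drop every link, manual and auto-generated alike."""
    return {name: UNLINKED for name in links}


def null_all(links: Links) -> Links:
    """Set every row to NULL, discarding existing links."""
    return {name: NULL for name in links}


def null_remaining(links: Links) -> Links:
    """Keep existing links and set only the unlinked rows to NULL."""
    return {name: (NULL if source is UNLINKED else source)
            for name, source in links.items()}


links = smart_link(["Address", "Phone_w"], ["Address", "Work_Phone"])
print(links)                  # "Work_Phone" has no same-named output field
links["Work_Phone"] = "Phone_w"   # a manual link
print(null_remaining(links))  # nothing left to null once every row is linked
```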

Creating Data Service Pipelines and Accessing Feeds

SnapLogic Pipelines fall into two broad categories:

l Pipelines that are self-contained, with no inputs or outputs.

l Pipelines that have inputs or outputs, or both.

A Data Service Pipeline is a special form of the second category: it is a Pipeline that has only one output view, and no input views. SnapLogic automatically makes the output of a Data Service Pipeline available as a data service endpoint. Data Service Pipelines provide data in response to an HTTP GET request. These data service endpoints provide a simplified interface to SnapLogic data streams and are easily consumed by any programming language that supports a basic HTTP library. In particular, they are very Ajax-friendly, and can be readily consumed in JavaScript using XMLHttpRequest().

Creating a Data Service Pipeline in Designer

A Data Service Pipeline is created by assigning the output of one of the Pipeline's Components to an output of the Pipeline. The following diagram is an example of a Pipeline that can be used as a data service endpoint: Pipeline P contains Components C1 and C2. Component C1 is linked to Component C2 by connecting C1's output, C1_Output1, to C2's input, C2_Input1. To create a data service endpoint, assign a Component output to the Pipeline's output.

The following figure is an example of a Data Service Pipeline defined in Designer. In this example, C2's output C2_Output1 has been assigned to the Pipeline output P_Output1. This assignment is performed when adding an output to the Pipeline in the SnapLogic Designer.

Accessing a Data Service Feed in Designer

To access a data service feed directly from Designer, in the Pipeline Properties General tab (shown in the following figure), click on the URI link to access the menu. If the Component or Pipeline has a valid feed, you can select it. This opens a new browser window with the feed's URI.

Accessing a Data Service Feed in Your Browser

To access the Data Service Pipeline, point your browser or application at the data service URI, as in the following example:

http://host:port/feed/P/P_Output1

Note that the URI of the service endpoint is preceded by the /feed token and appended with the output name of the Pipeline. The /feed prefix is necessary to indicate to the server that a GET to this URI triggers Pipeline execution. The output name suffix is used to specify the output you are requesting.

You can pass parameters to the Pipeline in the HTTP URI, as follows:

http://host:port/feed/P/P_Output1?P_PARAM1=value&P_PARAM2=someothervalue

The representation of data from a data service endpoint is negotiated between the application and the SnapLogic Server. You can explicitly request a certain representation by specifying the sn.content_type parameter appended to the URI, as in the following example:

http://host:port/feed/P/P_Output1?sn.content_type=text/html
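The URI patterns shown above can be assembled programmatically. The following Python sketch uses only the standard library; the function name build_feed_uri and its arguments are hypothetical, and the host, Pipeline, and parameter names are the placeholders used in the examples above.

```python
from typing import Mapping, Optional
from urllib.parse import quote, urlencode


def build_feed_uri(server: str,
                   pipeline_uri: str,
                   output_name: str,
                   params: Optional[Mapping[str, str]] = None,
                   content_type: Optional[str] = None) -> str:
    """Build a Data Service URI: server + /feed + Pipeline URI + output name."""
    query = dict(params or {})
    if content_type is not None:
        # Explicitly request a representation, as described above.
        query["sn.content_type"] = content_type
    uri = server.rstrip("/") + "/feed" + quote(pipeline_uri) + "/" + quote(output_name)
    if query:
        # Keep "/" and ":" readable, as in the examples above.
        uri += "?" + urlencode(query, safe="/:")
    return uri


print(build_feed_uri("http://host:port", "/P", "P_Output1",
                     params={"P_PARAM1": "value", "P_PARAM2": "someothervalue"}))
print(build_feed_uri("http://host:port", "/P", "P_Output1",
                     content_type="text/html"))
```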

Refer to the Output Representation Formats section for a list of the supported representations. Components that produce structured or user-defined output, such as the HTML Formatter and XML Write Components, output data in a representation that cannot be modified in a meaningful way by specifying the sn.content_type parameter. Refer to the following diagram.

Error Handling to Address Connection Problems and Data Errors

The most common causes of Pipeline run failures are connection problems and data errors. You can design Pipelines with tolerance for these obstacles to minimize run failures. You can configure connection, reader, and writer Components for a specified number of retries at specified intervals when they encounter network connection problems. You can also prevent a subset of data errors from causing an entire Pipeline run to fail by setting a tolerance level for the number of errors and creating an error output that collects the erroneous records.

For example, consider a Pipeline that reads data from a comma-separated file. Such a Pipeline begins with a CSV Read Component that outputs data to not one, but two CSV Write Components: a standard CSV Write Component that writes data to the target file, and a CSV Write Component that collects error rows and writes them to a bad data file.
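The idea of an error output with a tolerance level can be illustrated outside SnapLogic with a short Python sketch. It is not how the CSV Read Component is implemented: the function name is hypothetical, and treating "wrong number of fields" as the only data error is an assumption made to keep the example small.

```python
import csv
import io
from typing import List, Tuple


def split_rows(text: str, expected_fields: int,
               max_errors: int) -> Tuple[List[List[str]], List[List[str]]]:
    """Route well-formed rows to a standard output and bad rows to an error output.

    Fails only when the number of bad rows exceeds max_errors.
    """
    good: List[List[str]] = []
    bad: List[List[str]] = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) == expected_fields:
            good.append(row)
        else:
            bad.append(row)
            if len(bad) > max_errors:
                raise RuntimeError(
                    f"more than {max_errors} data errors; run failed")
    return good, bad


good, bad = split_rows("name,phone\nAnn,555-0100\nbroken\nBob,555-0101\n",
                       expected_fields=2, max_errors=1)
print(len(good), "rows for the target file;", len(bad), "row for the bad data file")
```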

Follow these instructions to configure error handling for the sample Pipeline shown in the diagram of a Pipeline with an Error Output:

1. Create a new Pipeline, starting with a CSV Read Component. Add two CSV Write Components.

2. In the CSV Read Component Properties, click the View/Edit link for the Error Handling property. If you are in tree view, you can expand the Error Handling folder and edit the properties there.

3. Specify your error handling preferences in the Error Handling Options dialog. Error Handling Options address both data errors and connectivity problems. The first option regards data errors; the rest address connection failures:

l Maximum number of data errors: This setting specifies the tolerance for errors before a Component fails with an error status. A Component executes successfully if its errors do not exceed the maximum specified here, and if the errors are sent to error outputs. If you do not specify an error output to collect the bad data, and the Component encounters any errors at all, it will fail even without exceeding this maximum. (Note also that if the number you specify here is greater than the number of records in the source file, the Pipeline executes successfully even if the entire output is written only to an error file.)

l Wait time before retrying: The number of seconds SnapLogic waits before retrying to connect after the initial network connection attempt fails. This time period is the basis for the next option.

l Retry strategy: There are two retry strategies:

l Wait time based: The wait-time-based strategy retries to connect periodically, using the same interval every time. The interval it uses is specified in the Wait time before retrying option. For example, if the Wait time before retrying is set to 15 seconds, the wait-time-based strategy retries 15 seconds after the initial failure, again at 30 seconds, again at 45 seconds, and so on every 15 seconds until it exceeds either the Maximum number of retries or the Timeout in seconds.

l Exponential backoff: With the exponential backoff strategy, the previous wait time is doubled between retries. The initial interval it uses is specified in the Wait time before retrying option. For example, if the Wait time before retrying is set to 15 seconds, this strategy waits 15 seconds before the first retry, 30 seconds before the second retry, 60 seconds before the third retry, and continues to double the interval until it exceeds either the Maximum wait between retries, the Maximum number of retries, or the Timeout in seconds.

l Maximum wait between retries: This sets the upper limit (in seconds) on wait time. If you are using the exponential backoff strategy, the wait time between retries keeps doubling until it reaches the value in this field.

l Maximum number of retries: The maximum number of times the Component retries in the case of a network connection failure. To set an unlimited number of retries, enter -1.

l Timeout in seconds: The time limit on how long the Component keeps retrying to connect in the case of a network connection failure. This value covers the period starting when SnapLogic initiates the first connection attempt; attempts continue until this timeout is reached.

4. In the CSV Read Component Output screen, click the create error outputs link. SnapLogic presents field suggestions. You can agree and click Apply All Changes, or reject the suggestions by clicking Close. The Component Output screen displays the new Error Output you have added.

5. Link the CSV Read Component to the CSV Write Component you have designated to catch the bad data, as shown in the diagram at the beginning of this section. Because the CSV Read Component has two outputs, a Select Outputs to Link dialog prompts you to specify the output that provides data to the error file. Select your Error output.

When you execute the Pipeline, intermittent connection problems or data errors do not have to prevent its successful execution. If it encounters network connection problems, automatic retries often remove the obstacle without causing a failure. If the Pipeline contains a subset of bad records, they can be written to an error file without preventing the successful records from loading. You can inspect the error file and address the erroneous records separately.
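To make the two retry strategies from step 3 concrete, the following Python sketch computes when retries would occur, in seconds after the initial failure. It is an illustration only: the function name and strategy labels are hypothetical, and the exact way SnapLogic applies the cap and the timeout (here, the interval is capped at the maximum wait, and a retry is skipped if it would fall after the timeout) is an assumption.

```python
from typing import List


def retry_offsets(wait: float, strategy: str, max_wait: float,
                  max_retries: int, timeout: float) -> List[float]:
    """Return the times (seconds after the initial failure) of each retry.

    strategy is "wait_time" (fixed interval) or "exponential_backoff"
    (interval doubles, capped at max_wait). max_retries of -1 means
    unlimited, in which case only the timeout ends the sequence.
    """
    if wait <= 0:
        raise ValueError("wait must be positive")
    offsets: List[float] = []
    elapsed = 0.0
    interval = float(wait)
    while max_retries < 0 or len(offsets) < max_retries:
        if elapsed + interval > timeout:
            break  # the next retry would fall outside the timeout window
        elapsed += interval
        offsets.append(elapsed)
        if strategy == "exponential_backoff":
            interval = min(interval * 2, max_wait)
    return offsets


print(retry_offsets(15, "wait_time", 60, 4, 600))
print(retry_offsets(15, "exponential_backoff", 120, 4, 600))
```

With a 15-second wait time, the fixed strategy retries at evenly spaced offsets, while exponential backoff waits 15, 30, 60, and 120 seconds between successive attempts.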

Executing Pipelines

You can execute Pipelines using several different interfaces, starting with the Designer. Using Designer is easy and intuitive, but can be impractical if you prefer to execute Pipelines on a periodic schedule, or trigger them from other applications. For this reason, SnapLogic provides other interfaces. Any of the remaining methods can be invoked from cron or other scheduling software.

l SnapLogic Designer

l Management Console

l SnapAdmin Utility

l SnAPI

l HTTP directly for Pipelines that have mapped output views; that is, for Data Service Pipelines.

If you have configured your servers to run in a cluster, the process of executing Pipelines in a cluster environment is identical to executing them on an independent server. If you submit a Pipeline execution request to a cluster, it is performed by the cluster; if you submit a request to an independent server, it is performed by an individual server.

Executing Pipelines in Designer

To begin the execution process in Designer:

1. Open the desired Pipeline on the canvas, and click Run. The slider displays Pipeline Properties.

2. Go to the Run tab.

The Run tab has the following sub-tabs:

l Run: Use this tab to specify runtime parameters by inputting values directly into the Value column, select a data tracing option if desired (refer to the "Tracing Data to Debug Pipeline Execution" section for details), and execute your Pipeline by clicking the Run button.

l Preview Data: Use this tab to preview data for Pipelines with outputs defined at the Pipeline level. Pipelines that do not produce outputs at the Pipeline level do not support data preview. For Pipelines that do have defined outputs and support data preview, the preview feature is identical to the preview feature at the Component level.

l Runtime Information: Use this tab to view Pipeline status updates and execution logs. The figure, "Pipeline Runtime Information," displays a list of completed Pipeline runs. You can view the log and statistics for each run by clicking on the link provided in its status.

Clicking on the View log/statistics link provided with a Pipeline's status update displays the information for that run in a separate tab, Run1, next to the Runtime Information tab. In the Run1 tab, the Log file displays first. Use the Less and Full buttons to toggle between summary and detailed views of the log. You can also use the Print and Copy icons to send the file content to a printer or to your clipboard, respectively.

Access the Pipeline statistics report by clicking Stats. A sample of the statistics screen is shown in the figure, "Pipeline Run Stats."

The Pipeline statistics report includes the following sections:

l Summary: This section reports the execution time, number of Components, total number of inputs, and total number of outputs in the Pipeline.

l Breakdown By Component: This section reports the execution time of each Component, its number of inputs, and its number of outputs.

l Component Run Times: This graph illustrates the time required to execute each Component pictorially. Components are listed along the Y axis, over time along the X axis.

l Record Breakdown: This graph illustrates the number of records extracted and loaded between Components, pictorially. Components are listed along the Y axis, over Record Count along the X axis.

Aborting a Pipeline

As of 3.7, you can abort a Pipeline from within Designer by clicking the abort link on the Run status of that Pipeline.

Executing Pipelines Using SnapAdmin

You can use the SnapAdmin utility to invoke the pipeline start command. From within a shell script, use the -c option to SnapAdmin to run a series of commands. Refer to this example of what a typical command file contains:

connect server http://localhost:443

pipeline start /Examples/Pipeline1
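If you generate command files from another program, a small helper such as the following Python sketch can produce the same two-line format. Only the connect server and pipeline start commands and the -c option come from this guide; the helper names are hypothetical, the executable name passed to snapadmin_argv is a placeholder, and passing the command file path directly after -c is an assumption to verify against your installation.

```python
from typing import Iterable, List


def command_file_text(server_url: str, pipeline_uris: Iterable[str]) -> str:
    """Build the text of a command file: connect once, then start each Pipeline."""
    lines = [f"connect server {server_url}"]
    lines += [f"pipeline start {uri}" for uri in pipeline_uris]
    return "\n".join(lines) + "\n"


def snapadmin_argv(executable: str, command_file: str) -> List[str]:
    """Build an argument list that passes the command file with -c.

    The executable name and the -c argument form are assumptions.
    """
    return [executable, "-c", command_file]


print(command_file_text("http://localhost:443", ["/Examples/Pipeline1"]), end="")
```

The resulting argument list can be handed to a scheduler or to subprocess.run once the executable path is known for your environment.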

Executing Pipelines from the Management Console

You can execute a Pipeline from the following screens of the management console: History, Events, Pipelines, and Servers.

Select the Pipeline you want to execute by using the checkbox in the left column of any of these screens and click Run in the upper right corner. (The Pipelines screen is the only exception to this: it displays only one Pipeline at a time, so there is no need to specify which Pipeline you want to run. Simply click Run in that screen.)

For more information, refer to the Management Console section.

Executing Data Service Pipelines from HTTP

If the Pipeline you wish to invoke is a Data Service Pipeline (that is, if the Pipeline has mapped Outputs), you can start the Pipeline using the Pipeline's Data Service URI. For example, you can execute the Pipeline /SnapLogic/User/Exercise_3/CensusFeed by typing its URI directly into the address bar of your browser as follows:

http://servername:443/feed/SnapLogic/User/Exercise_3/CensusFeed/Output001?sn.content_type=text/html

The /feed prefix tells SnapLogic that you wish to invoke the specified pipeline and read from the output view you specified, in this case Output001. The argument sn.content_type=text/html tells SnapLogic which output representation to use. You can specify Pipeline parameters in the request using standard HTTP syntax. Refer to this example of the request to specify a value for the CENSUS parameter for the CensusFeed Pipeline:

http://servername:443/feed/SnapLogic/User/Exercise_3/CensusFeed/Output001?CENSUS=file://tutorial/data/alt_census.csv&sn.content_type=text/html

You can invoke Pipelines by this method from cron or other scheduling software.

To view execution logs in this method, use the management console by entering its URI into your browser address bar: http://<hostname>:<port>/console.
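Because a Data Service Pipeline starts on a plain HTTP GET, a script run from cron can trigger it with any HTTP library. The following Python sketch uses the standard library; the function name and the opener argument (which exists only so the call can be exercised without a live server) are hypothetical.

```python
import urllib.request
from typing import Callable


def run_data_service(uri: str,
                     opener: Callable = urllib.request.urlopen,
                     timeout: float = 60.0) -> bytes:
    """Issue an HTTP GET to a Data Service URI and return the response body.

    The GET itself triggers Pipeline execution; the body is the requested
    output in the negotiated representation.
    """
    with opener(uri, timeout=timeout) as response:
        return response.read()


# Example call (requires a reachable SnapLogic Server, so it is not run here):
# body = run_data_service(
#     "http://servername:443/feed/SnapLogic/User/Exercise_3/CensusFeed/"
#     "Output001?sn.content_type=text/html")
```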

Scheduling Unattended Pipeline Execution

Use the Scheduler to schedule periodic, unattended executions of a Pipeline. The SnapLogic Server runs the Pipeline unattended at the dates and times you specify, and using any parameter values you specify.

Scheduling Pipeline Execution in Designer

Access the Scheduler in Designer by clicking the Scheduler button in the General page of Pipeline Properties. The "Scheduled Events For Selected Pipeline" screen displays. With no scheduled runs for a new Pipeline, the screen is initially blank. If you have already set an execution schedule for the Pipeline, its specifications display here.

In a clustered environment, the Scheduler ensures that only one instance of a scheduled Pipeline executes at a time in the cluster. The head node checks whether the same Pipeline is waiting to run or is already running before scheduling a job. If it is already running, the scheduled job is skipped.

Use the "Scheduled Events For Selected Pipeline" screen to add, modify, delete, and execute Pipeline schedules as follows:

l New: Click New to add a new Scheduler entry for the Pipeline. The Scheduler Properties screens display. Complete the properties pages as described in the "Scheduler Properties" section.

l Edit: Select a Scheduler entry on the screen and click Edit to change the runtime specifications for the entry. The Scheduler Properties screen displays. Modify the properties pages as described in the "Scheduler Properties" section.

l Delete: Select a Scheduler entry on the screen and click Delete to remove it.

l Run Now: Select a Scheduler entry on the screen and click Run Now to execute the Pipeline immediately.

l Close: Click Close to leave the Scheduler.

Scheduler Properties

Access the Scheduler Properties pages by clicking the Scheduler button in the General page of Pipeline Properties, and then clicking New to create an entry, or Edit to modify one.

Scheduler Properties consists of several pages, through which you can navigate using the sidebar menu on the left:

l General: The General properties page is shown in the figure, "Scheduler Properties: General Page." It includes the descriptive name of the Pipeline to run and the Pipeline's URI.

l Schedule: The Schedule page is where you specify the Pipeline's run dates and times. You can select the desired attributes of each scheduling column and view the resulting summary description in the Summary field.

You can specify the Month, Day, Date, Hour, and Minute on which the Pipeline should execute. For example, you can schedule a Pipeline to run weekly at 3:00 a.m. on Tuesdays.

If necessary, select multiple non-consecutive entries in any column by holding the Ctrl key while highlighting multiple fields.

Select a range of consecutive entries by holding the Shift key.

To run a Pipeline once an hour, select the Minute at which you want the pipeline to execute; for example, selecting :00 runs the Pipeline at the top of the hour, whereas selecting :15 runs the Pipeline at quarter past the hour.

If you want to run the pipeline every 15 minutes, make multiple selections in the Minute column by holding down the Ctrl key and highlighting :00, :15, :30, and :45.

l Parameters: The Parameters page displays runtime parameters defined for this Pipe-line. The values you enter here are used at the scheduled runtime.

l Exclusions: The Exclusions page enables you to specify exceptions to the execution schedule you defined. The Exclusions page works exactly like the Schedule page, but the settings you select here specify times not to run the Pipeline. Toggle the value of the Exclusions enabled check box to suspend or enforce your exception.

l Notifications: The Notifications page enables you to receive notifications of execution successes and failures. You can receive notifications by way of additions to a specified text file, or through email. Support for email notifications is only enabled if your SnapLogic Server has been configured with information about your outgoing SMTP server. The default snapserver.conf file is configured for logging only.
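The way the Schedule and Exclusions pages combine can be modeled with a short Python sketch. It is not the Scheduler's implementation: the class and function names are hypothetical, and treating a column with no selection as "any value" is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Selection:
    """Selected entries per scheduling column; an empty set means 'any'."""
    months: FrozenSet[int] = frozenset()
    weekdays: FrozenSet[int] = frozenset()  # Monday = 0 ... Sunday = 6
    dates: FrozenSet[int] = frozenset()     # day of the month
    hours: FrozenSet[int] = frozenset()
    minutes: FrozenSet[int] = frozenset()

    def matches(self, when: datetime) -> bool:
        checks = (
            (self.months, when.month),
            (self.weekdays, when.weekday()),
            (self.dates, when.day),
            (self.hours, when.hour),
            (self.minutes, when.minute),
        )
        return all(not chosen or value in chosen for chosen, value in checks)


def should_run(schedule: Selection, when: datetime,
               exclusions: Optional[Selection] = None,
               exclusions_enabled: bool = False) -> bool:
    """Run when the schedule matches, unless an enabled exclusion also matches."""
    if not schedule.matches(when):
        return False
    if exclusions_enabled and exclusions is not None and exclusions.matches(when):
        return False
    return True


every_15_minutes = Selection(minutes=frozenset({0, 15, 30, 45}))
tuesdays_at_3am = Selection(weekdays=frozenset({1}), hours=frozenset({3}),
                            minutes=frozenset({0}))
print(should_run(every_15_minutes, datetime(2013, 10, 1, 10, 30)))
print(should_run(tuesdays_at_3am, datetime(2013, 10, 1, 3, 0)))
```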

Enabling Email Notifications

To enable email notification of execution successes and failures:

1. Edit the snapserver.conf file.

2. Uncomment and specify the appropriate values for the [[email]] section, as shown in the following example:

    # Notifications
    [notification]
    # [[email]]
    # smtp_server = smtp.gmail.com:587
    # smtp_use_tls = yes
    # smtp_login = [email protected]
    # smtp_password = some_password
    # to = some_target_email
    # from = [email protected]
    # subject_prefix = NOTIFICATION
    # success_template = email_success_notification.tmpl
    # failure_template = email_failure_notification.tmpl
    [[file_write]]
    filename = notification.txt
    root_directory = $SNAP_HOME/../logs
    # success_template = file_success_notification.tmpl
    # failure_template = file_failure_notification.tmpl

3. Restart the SnapLogic Server and reconnect to the Server in the Designer.

Notification Templates

Templates have been provided that allow you to send custom notifications for successes and failures, as defined by the success_template and failure_template settings in the Notification section of the snapserver.conf file. The parameters available for use in these templates include:

l $name : Name of the Pipeline execution event

l $uri : Pipeline URI

l $status : Result of the Pipeline execution ("Completed" or "Failed")

l $hostname : Hostname of the server that ran the Pipeline

l $start_time : Time the Pipeline run began

l $end_time : Time the Pipeline execution ended

l $status_uri : URI to get status information on this run

l $log_uri : URI to get log information related to this run

l $err_msg : If status is "Failed", this contains the error message
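To preview how a template might read once the parameters above are filled in, you can mimic the substitution with Python's string.Template, which happens to use the same $name placeholder style. This is only a stand-in: the template engine SnapLogic actually uses is not specified here, and the template text and values below are hypothetical.

```python
from string import Template
from typing import Mapping


def render_notification(template_text: str, values: Mapping[str, str]) -> str:
    """Substitute $-style placeholders; raises KeyError if a value is missing."""
    return Template(template_text).substitute(values)


# Hypothetical failure template and values, for illustration only.
failure_template = (
    "Pipeline event: $name ($uri)\n"
    "Status: $status on $hostname\n"
    "Started: $start_time  Ended: $end_time\n"
    "Error: $err_msg\n"
)
print(render_notification(failure_template, {
    "name": "Nightly load",
    "uri": "/Examples/Pipeline1",
    "status": "Failed",
    "hostname": "localhost",
    "start_time": "02:00",
    "end_time": "02:01",
    "err_msg": "connection refused",
}), end="")
```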

Tracing Data to Debug Pipeline Execution

The SnapLogic Server has a data tracing capability that enables Components in an executing Pipeline to dump the data being sent through each Component's inputs and outputs. Use this data tracing for testing, debugging, and validating Pipelines. With data tracing, you can examine the contents of every record as it enters or exits each Component in the Pipeline.

The data is written out to trace files using a comma-separated format, with each record terminated by a newline. The data traced can include:

l Component inputs: All Component inputs in the Pipeline dump their data into trace files.

l Component outputs: All Component outputs in the Pipeline dump their data into trace files.

l Component inputs and outputs: All Component inputs and outputs are traced.

You can turn data tracing on and off within the Designer, in SnAPI, or through the SnapLogic Server Configuration file.

Data Tracing in Designer

Follow these instructions to enable data tracing when you execute a Pipeline:

1. With your Pipeline open in the Canvas, click Run from the canvas toolbar.

2. Select the Run menu in the slider's Pipeline Properties. From this screen, select the Run tab.

3. Set the parameters and make other adjustments to your Pipeline as required.

4. Click the down arrow on the Run button that resides under the list of Parameters to display a drop-down list of data tracing options.

5. Select a tracing option from the Run drop list. The options are:

l Run, No Trace: This is the default setting. No trace files are created.

l Trace INPUT: Use this setting to force all Component inputs in the Pipeline to dump their data into trace files.

l Trace OUTPUT: Use this setting to force all Component outputs in the Pipeline to dump their data into trace files.

l Trace ALL: Use this setting to force all Component inputs and outputs to be traced.

6. Click the Run button.

7. Examine the trace files, as described in the Data Trace Files section.

Data Tracing through the SnapLogic Server Configuration File

You can enable data tracing by adding a trace_data line in the snapserver.conf file's [component_container] section, within the [[cc1]] or [[cc2]] subsections, as follows: trace_data=input,output.

Set the parameter to input, output, or input,output. Note that data tracing settings you specify when executing a Pipeline in Designer or SnAPI override any settings in the snapserver.conf file.

Refer to the Configuring SnapLogic Server section for more information on the configuration file.

Data Tracing Files

Trace files are written to the log/traces directory by the Component Container process that contains the Component. For each execution of the Pipeline, a subdirectory is created using the Runtime ID of the Pipeline. The trace file name is constructed by concatenating the following three values:

l the Component's name

l the input or output name

l a .in or a .out suffix denoting whether the trace contains input or output data, respectively

For example, a CSV Read Component called "FileReader," whose output is named "Output1," results in the trace file name: FileReader.output1.out.

If the Pipeline has a DB Write Component named "DataWriter," whose input is named "Input1," then the resulting file name is: DataWriter.input1.in.

If the Pipeline itself resides in another Pipeline named "Pipe1," and is then executed, then the names of the files are: Pipe1.FileReader.output1.out and Pipe1.DataWriter.input1.in.
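The naming rule can be expressed as a short Python sketch, which may help when a script needs to locate a particular trace file. The function name is hypothetical, and lower-casing the input or output name is inferred from the examples above rather than stated as a rule.

```python
from typing import Sequence


def trace_file_name(component: str, view: str, direction: str,
                    parents: Sequence[str] = ()) -> str:
    """Build a trace file name: [parent Pipelines.]Component.view.in|out

    direction is "in" for an input trace or "out" for an output trace.
    The view name is lower-cased to match the examples (an inference).
    """
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    return ".".join([*parents, component, view.lower(), direction])


print(trace_file_name("FileReader", "Output1", "out"))
print(trace_file_name("DataWriter", "Input1", "in", parents=["Pipe1"]))
```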

Snaps

A Snap is a SnapLogic software package that adds to the functionality and connectivity provided by the SnapLogic Server. For example, a Snap may add connectivity to Salesforce, or add functionality such as Data Cleansing. Snaps are typically purchased from the SnapStore and installed on a SnapLogic server via Designer. A Snap usually installs one or more of the following SnapLogic objects:

l A collection of Component templates that are functionally related, such as the Salesforce Snap.

l A Wizard that helps create Components from component templates.

l A collection of Components and Pipelines.

A Snap can perform as simple a task as to read data from a file, or as complex an operation as to connect to an instance of Microsoft Dynamics CRM, analyze the source data, and provide full access (data and functionality) to all standard and custom objects within Microsoft® Dynamics CRM. When changes have been made to standard or custom objects, the Snap adapts and provides you access that takes this change into account.

A Snap in SnapLogic is comparable to a smart phone app, a browser add-on, or an application plug-in. A Snap can perform a simple task, such as reading data from a file, or a more involved grouping of tasks, such as adding a comprehensive set of Insert, Update, Delete, Upsert, and Search capabilities to all Microsoft Dynamics CRM objects. You can build your own Snap, or download Snaps built by the SnapStore community.

Accessing Snaps

Snaps are add-on solutions that can be downloaded and installed to enhance the functionality of your SnapLogic Data Server. Snaps can be contributed by the SnapLogic Community to solve a specific integration problem. A Snap may include new Components and Pipelines.

The SnapLogic Server comes prepackaged with a library of commonly used Components. These prepackaged Components are at your disposal as soon as you install SnapLogic. They include:

l field-level operations: for example: arithmetic, string, dates, and type conversion

l complex operations: for example: join, lookup, filter, and sort

l advanced operations: for example: compute, regex, and DB analytics

Additionally, SnapStore, SnapLogic's online marketplace, serves many of your specific integration needs by providing additional, specialized Snaps. Some of the Snaps in the SnapStore are free of charge; others must be purchased.

You can access SnapStore directly from the Designer in the following ways:



l In the Designer menu bar, click SnapStore.

l When searching for a Snap in the Foundry, if you cannot locate your desired Snap, you can click a link that takes you to the SnapStore.

To download a Snap from the SnapStore, browse the product listing using categories, tags, or searches until you find the Snaps you want. Shortly after you complete the checkout and payment process, you receive two emails: one to confirm your purchase, and the other containing any download links to your Snaps. SnapLogic must be installed before you can install your newly purchased Snap.

Installing Snaps

You can integrate with SnapStore (that is, you can install Snaps purchased from the SnapStore) directly from the Designer. After having selected a Snap from SnapStore, you receive an email containing a URI to access the Snap. You can, but are not required to, download the Snap to a temporary directory on your system. When you install the Snap, you are prompted for either the URI you received (if you have not yet downloaded the Snap), or for the Snap's location (if you have already downloaded it).

Follow these instructions to install a Snap purchased from the SnapStore:

l Launch the installation in SnapLogic Designer either by selecting Server > Install New Snap, or by clicking the Install New Snap button in the Snap Foundry toolbar.

l At the installation screen, you can specify the URI you received in your SnapStore email when you selected the Snap, and then select the Component template from the drop-down list. If a subfolder specified in the URI name does not already exist, it will be created automatically. Or, if you have already downloaded the Snap, click Browse to locate it on your system.

l If you have enabled sandboxing, which creates a security sandbox for Java Snaps by limiting access to resources such as network destinations, file system locations, and executing processes, then SnapLogic prompts you for permission to use each Component that the Snap requires. If you deny access to any Component, the Snap installation is canceled. (For more information on this feature, refer to "Sandboxing to Protect Your SnapLogic Environment.")

The installer decompresses the required files and installs them, prompting you when the installation is complete. After the Snap installation is complete, the Foundry displays a new category tab with the Snap's name. This tab contains the available Component templates in the Snap.

Snap Installation in a Cluster Configuration

When you install a new Snap into a cluster configuration, install the Snap only on the head node. The head node then automatically distributes the Snap across the worker nodes. If any worker nodes are offline during Snap installation on the head node, they will be synchronized with the Snap content of the head node when they come online. If you wish to install a Snap on an individual worker node, the installation is not distributed to any other nodes.


Configuring Snaps

Configuring a Snap is also a simple process from within the Designer. For your new Snap, the Foundry contains a folder with the Component templates available for the Snap. For more complex Snaps, a wizard is also provided. You can either use the wizard, or edit the Components directly in the Designer, referring to the "Components" chapter for instructions. SnapLogic strongly recommends using the wizard when it is available. A wizard is intentionally provided for any Snap whose complexity makes editing the Components directly an involved process. The wizard guides you through a series of questions that vary with each Snap, and that dramatically simplify the configuration process.

Follow these instructions to configure a Snap using the SnapMaker wizard:

l In the Foundry, click on the Category tab displaying the Snap's name and expand it.

l Launch the Wizard within the Snap's folder. The SnapMaker screen appears.

l Enter the Snap-related information requested in the SnapMaker screens. The wizard begins by collecting source connectivity information.

l Select which records to generate from the Snap if you need only specific records; otherwise, select all records.

l Review the Summary screen and click Finish.

The SnapMaker wizard displays status messages as it completes the configuration. After the SnapMaker wizard is done, configured Components for your configured Snap appear in the Library.

Developing Snaps and Further Participating in the SnapLogic Community

SnapLogic offers an opportunity for systems integrators, data integration developers, and ISVs to build custom applications on top of SnapLogic, as well as to create integrations between SnapLogic and other systems. This is an opportunity to grow revenue and differentiate your services. You may wish to develop and market custom Snaps if you have a pressing idea for a Snap, or if you wish to respond to Snap requests by developing needed solutions. If so, you can develop and monetize your custom solutions, setting the price for your Snap and receiving 70% of the sales revenue generated each month. To facilitate your interaction with the SnapLogic community, you can participate in the support forums and developer community groups, where you can find answers to your questions. You can also choose to help SnapLogic QA new features and Snaps.

Developing Snaps is a straightforward process. Although Snaps vary in complexity and functionality, all Snaps follow the same pattern, use the same APIs, and take advantage of the common functions provided by the SnapLogic platform. However, developing Snaps is not the same as using SnapLogic. As a SnapLogic user, you create Components, and assemble Pipelines from existing Component templates and the Components you have already created. You focus on data flow and manipulation instead of finer details. By contrast, as a Snap developer, you approach your task from the perspective of how to create Components that others can use, instead of focusing on how to use existing capabilities.

Concepts to Grasp. A good Snap developer must first be comfortable with using SnapLogic, and must understand these main elements of SnapLogic:

l Component templates and Components

l Pipelines

l Data Services

l Data types

Types of Snaps. There are two general categories of Snaps:

l Connectivity Snaps: Connectivity Snaps add connectivity to an application or data source. The SalesForce.com, NetSuite, and SAP Snaps are examples of connectivity Snaps. These Snaps normally include new Components that access the application API and translate it to the SnapLogic API.

l Solution Snaps: Solution Snaps are higher-level Snaps that implement the business logic for a specific integration scenario, such as "quote to bill" between a CRM system and a financial system. Solution Snaps normally include Pipeline and Component definitions that implement business logic. They often also include connectivity Components, or depend on connectivity Components provided by another Snap.

The Twitter Snap in the Snap Developer Tutorial is a connectivity Snap; it adds Twitter read and write capabilities to SnapLogic, but does not provide any predefined Pipeline logic to process Tweets for a specific purpose.

Parts of a Snap. A Snap is composed of three parts:

l Component templates

l Pipelines and Components

l An installation program

All Snaps include an installation program. Connectivity Snaps include primarily Component templates and Components, while solution Snaps are oriented toward Pipelines and Components. Because the Snap Developer Tutorial profiles a connectivity Snap, it includes an installer, Component templates, and some Components to get its users started.

Considerations for Snap Design. The first step in designing a Snap is to shift perspective to that of a potential user of the Snap. How will someone use your Snap? Use your knowledge of the application in question and how it is commonly integrated to determine what your Snap exposes, which functions the existing platform capabilities address, and how the Snap fits into the data flow model. Ask yourself the following questions:

l Which application objects must the Snap expose as data?

l What data access does the Snap require?

l What data must the Snap read?


l What data must the Snap write or update?

l Which application functions may the Snap need to call?

l Which transformations are likely to be used on the data?

l Which utility functions from the application should the Snap expose?

The answers to these questions guide your creation of a model of the Snap. Some of its capabilities may require new Components, while others can be addressed by creating Pipelines or using existing Components. In general, connectivity Snaps include a reader Component and a writer Component. They occasionally include one or two function Components. For example, in CRM integration, the "convert" functionality is neither a data source nor target; rather, it is implemented as a transformation or utility function.

For more information about developing Snaps, visit SnapLogic's Snap Developer Documentation page.



Administration

This section of the document covers administration functionality of your SnapLogic environment, such as:

l Starting and Stopping SnapLogic and Component Servers

l Configuring SnapLogic Server

l Configuring Authentication

l Enabling SSL

l Using the Management Console

l Sandboxing to Protect Your SnapLogic Environment

l Importing and Exporting Components

l Running SnapLogic Behind a Proxy

l Understanding SnapLogic Data Types and Output Representation Formats

l Using the SnapAdmin Utility

l Using the SnapLogic Sidekick

Working with Users and Groups

This topic provides common procedures for working with users and groups in SnapLogic.

How to Create Users

Note: Authentication occurs at the user level. If you have multiple users who require individual logins, each user needs a separate user account.

User names can contain only ASCII alphanumeric characters and must be lowercase.
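The naming rule above can be checked before a user name is passed to SnapAdmin or Designer. The following Python sketch is illustrative only: the helper name is_valid_username is hypothetical and not part of SnapLogic, and it assumes that "ASCII alphanumeric and lowercase" means the characters a-z and 0-9.

```python
import re

# Hypothetical helper; not part of SnapLogic or SnapAdmin.
# Assumption: the rule permits only the characters a-z and 0-9.
_USERNAME_RE = re.compile(r"[a-z0-9]+")


def is_valid_username(name):
    """Return True if name uses only lowercase ASCII letters and digits."""
    return _USERNAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("jsmith", "JSmith", "j.smith", "user42"):
        print(candidate, is_valid_username(candidate))
```

A check like this only mirrors the documented rule; the server remains the authority on which names it accepts.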

Using SnapAdmin

1. Run SnapAdmin as described in "SnapAdmin Utility".

2. Connect to your server by entering: connect server <url>.

3. Set the credentials used for requests to the server to the default set for the admin, as: credential set default admin. Enter the password for the admin when prompted to do so.



4. At the command prompt, enter: users create <username> <password>, where

l <username> is the name for the user

l <password> is the password.

5. Restart the SnapLogic Server for the changes to take effect.

Within Designer

You can create users within Designer provided you are not using an external authentication method.

1. Select Server > Users.

2. Click Add User.

3. Type a name for the user and supply a password, then click OK.

How to Create Groups

Note: Access controls assigned to a group only apply to users that are members of that group.

Using SnapAdmin


2. At the command prompt, enter: group create <groupname>, where

l <groupname> is the name of the group you are creating.


Within Designer

You can create groups within Designer provided you are not using an external authentication method.

1. Select Server > Groups.

2. Click Add Group.

3. Type a name for the group and click OK.

How to Add Users to Groups

Add users to groups to simplify access control management for users with similar roles.

Using SnapAdmin


2. At the command prompt, enter: group adduser <groupname> <username>, where

l <groupname> is the name of the group to which you are adding the user

l <username> is the name of the user you are adding to the group



Within Designer

You can add users to groups within Designer provided you are not using an external authentication method.

1. Select Server > Groups.

2. Select a group and click Assign User.

3. Select the user, then click OK.

How to Set Access Controls

For information on how to set access controls, see "Understanding ACLs".

How to Change a User Password

Users can change their own password in Designer by selecting Server > Change My Password.

Changing the Initial Admin Password

The default password assigned to an admin user is now a standard password, snaplogic_admin_pw. Admin users will be prompted to change their password each time that they log in until they choose a new password that’s different from the default password.

Starting and Stopping Servers

The SnapLogic Server installation provides scripts to start and stop the SnapLogic and Component Servers. The paths in this section apply if you have kept the default installation settings when installing SnapLogic.

Starting and Stopping SnapLogic and Component Servers in Linux

On Linux, the control script, snapctl.sh, is located in /opt/snaplogic/[release number]/bin/.

You can invoke this script with start/stop/restart arguments.

l snapctl.sh start: Start the server processes.

l snapctl.sh stop: Stop the server processes.

l snapctl.sh restart: Stop and restart the server processes.

Note: Invoking this script with --admin_mode enforces single-user mode. It is recommended to run in that mode for installation and upgrade of Snaps to ensure consistency.
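If you drive the control script from automation, the invocations above can be assembled programmatically. The following Python sketch only builds the argument list and does not run anything. The helper name snapctl_command is hypothetical, the release number in the demonstration is a placeholder, and the position of --admin_mode after the action argument is an assumption.

```python
import shlex

# Hypothetical helper. Only snapctl.sh, its start/stop/restart arguments,
# and the --admin_mode option come from this guide.
VALID_ACTIONS = ("start", "stop", "restart")


def snapctl_command(release, action, admin_mode=False):
    """Build the argument list for snapctl.sh under the default Linux path."""
    if action not in VALID_ACTIONS:
        raise ValueError("action must be one of: " + ", ".join(VALID_ACTIONS))
    # Default install location: /opt/snaplogic/[release number]/bin/
    argv = ["/opt/snaplogic/%s/bin/snapctl.sh" % release, action]
    if admin_mode:
        # Assumption: the option is accepted after the action argument.
        argv.append("--admin_mode")
    return argv


if __name__ == "__main__":
    # "X.Y" is a placeholder, not a documented release number.
    print(shlex.join(snapctl_command("X.Y", "restart", admin_mode=True)))
```

The resulting list can be handed to subprocess.run once you have confirmed the path and option order on your own installation.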

Configuring SnapLogic Server

SnapLogic installs several configuration files containing default settings for running the SnapLogic software. You can edit these files and modify or comment out settings. This section discusses the configuration options available for SnapLogic in its primary configuration file, snapserver.conf. Some options that must be enabled or disabled in the configuration file also require additional configuration steps; these are broken out into separate topics within this chapter.

l SnapLogic Server Configuration File: snapserver.conf

l General SnapLogic Server Settings

l Component Container Configuration Parameters

l Data Cache Configuration Parameters

l Notification Instructions

l Management Console Configuration

For information on clustering servers, see "Clustering Servers".

SnapLogic Server Configuration File: snapserver.conf

The bulk of your configuration settings reside in the SnapLogic Server Configuration file, snapserver.conf, which is located in the config folder of your SnapLogic installation directory. For example, if you accepted the default settings during installation, your snapserver.conf file location is:

l Mac: /Applications/snaplogic/config/snapserver.conf

l Linux: /opt/snaplogic/config/snapserver.conf

General SnapLogic Server Settings

The general settings in the [main] section of your snapserver.conf file include:

l log_dir: The location of the logging directory. Example: log_dir = /opt/snaplogic/logs.

l log_level: Sets the SnapLogic Server log level (possible values are ERR, INFO, DEBUG). Note: the log level is set separately for the server and the Component Containers.

l server_hostname: The hostname for this server. Example: server_hostname = hostname.

l server_proxy_uri: If the public hostname for this server is different from the one reported by `hostname`, set server_proxy_uri to the external URI. Example: server_proxy_uri = http://HOSTNAME:DATAPORT

l server_address: The address to which this server binds. Example: server_address = 0.0.0.0.

l server_port, server_secure_port: The port numbers used by the server. Example: server_port = 80 and server_secure_port = 443.

l server_secure_cert: Location of the server certificate. Example: server_secure_cert = /opt/snaplogic/config/host.pem.


l server_secure_ignore = no: This setting tells SnapLogic to enable SSL. To disable SSL, change the setting to yes. Example: server_secure_ignore = yes.

l polling_interval: The interval of time, in seconds, that a Pipeline receives status updates regarding the Components inside it. Example: polling_interval = 60.

l static_dir: Specifies the directory from which static content is served. All requests for the /__snap__/__static__ URI space are served from within this directory. The configuration file names this directory __static__, but you can assign any other name. If the directory name starts with a forward slash (/), for example /tmp/static, it is interpreted as an absolute path name. Example: static_dir = /opt/snaplogic/static.

l state_dir: Directory to store state data needed across server restarts. Example: state_dir = /opt/snaplogic/repository.

l explorer_uri: Location of the explorer (as a fully qualified URI). Example: explorer_uri = http://Snap05:443/__snap__/static/designer/index.html?mode=explorer

l pipe_to_http_uri_prefix: Specify the prefix that should be added to resource URI, in order to have it executed via pipe_to_http. Example: pipe_to_http_uri_prefix = /feed.

l auth_file_config, auth_file_passwords: If the auth_file option is present, then all authentication information is read from the specified file. In the absence of this option, SnapLogic Server does not start. Examples: auth_file_config = "/opt/snaplogic/config/snapaccess.conf" and auth_file_passwords = "/opt/snaplogic/config/passwords".

l license_file: Location of the license key file. Example: license_file = "/opt/snaplogic/config/license.txt".

l allow_proxying_to: To enable the management console, uncomment this line. For tighter security, use a comma-delimited list. For more information, refer to the "Configuring the Management Console" section. Example: allow_proxying_to = server1.somedomain.com:443, server2.another.domain:443.

l auth_plugin, auth_plugin_args, and proxy_auth_header: these properties are used to implement custom authentication.

l LDAP_address: To enable LDAP authentication, uncomment this line and update it to point to your LDAP URL. This is no longer in the file by default, but is supported for legacy reasons. The recommendation is to use auth_plugin and auth_plugin_args instead.

l log_backup_count: Set the maximum number of backups to create if log rotation has been enabled. Each backed up log gets suffixed with .1, .2, up to number of backups specified. The default value is 5.



l max_log_size: Set a maximum size that applies to each SnapLogic log file. By default there is no limit on the size. When a file reaches the upper limit, the contents of the file are rotated to a backup log name (suffixed with .1, .2, etc.) as long as log_backup_count has a nonzero value. The size can be specified in bytes, MB, or GB.

l client_token_cache_limit, client_token_timeout: If token-based authentication is being used in the client, then client_token_cache_limit is the maximum number of client tokens cached by the server (default is 10000 entries). client_token_timeout is the time a token is valid for, specified in minutes (default is 1440 minutes, or 24 hours). If the account password is changed or the account is deleted, the next request with that token fails with a 401 error. To renew a token, call the get_token API with the old token, which returns a new token valid for 24 more hours. When the cache limit is exceeded, the expired tokens or the oldest token are removed.

l max_job_limit: Adding this property sets a limit on the maximum number of jobs a data server will accept. Can be used on a worker node and also on a standalone server. Setting it on a head node does not do anything since job execution happens on workers. If one worker hits the throttle limit and fails, the job is tried on all other workers one by one. If it fails on all workers, the job is failed.
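As a rough illustration of how the [main] settings above fit together, the following Python sketch reads a small excerpt built from the example values in this section. It assumes the excerpt can be parsed as plain INI text with the standard configparser module; a complete snapserver.conf uses nested sections such as [[cc1]], which this sketch does not attempt to handle. The function name summarize_main is hypothetical.

```python
import configparser

# Excerpt assembled from the example values in this guide.
SAMPLE = """
[main]
log_dir = /opt/snaplogic/logs
log_level = INFO
server_address = 0.0.0.0
server_port = 80
server_secure_port = 443
server_secure_ignore = no
polling_interval = 60
"""


def summarize_main(text):
    """Return a few [main] settings as typed Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    main = parser["main"]
    return {
        "log_level": main.get("log_level"),
        "server_port": main.getint("server_port"),
        "server_secure_port": main.getint("server_secure_port"),
        # server_secure_ignore = no means SSL is enabled.
        "ssl_enabled": not main.getboolean("server_secure_ignore"),
        "polling_interval": main.getint("polling_interval"),
    }


if __name__ == "__main__":
    for key, value in summarize_main(SAMPLE).items():
        print(key, "=", value)
```

The inverted sense of server_secure_ignore is the detail most worth checking when reviewing a configuration: a value of yes turns SSL off.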

Component Container Configuration Parameters

The [component_container] section of the snapserver.conf file is subdivided into sections for each Component Container on your server (for example, [[cc1]] and [[cc2]]). The parameters in this section address SSL, sandboxing, and data tracing functionality:

l log_dir, component_dirs, component_conf_dir, cc_port, cc_secure_port, cc_secure_cert, cc_address, and cc_hostname: These parameters are used for your SSL configuration. Refer to the Enabling SSL section of this chapter for details.

l log_level: Sets the Python Component Container log level. Possible values are CRITICAL, ERROR, WARNING, INFO, DEBUG.

l heartbeat_seconds: The server pings the CC every heartbeat_seconds (set to 0 to disable).

l filesystem_root: Changes the filesystem root. By default, the root is the SnapLogic install directory. For information on the use of this feature in clustered deployments, see "Clustering Servers".

l disable_sandbox: This parameter controls sandboxing. Sandboxing enables you to run Components in a restricted environment provided by the JVM. By default, sandboxing is enabled. You can disable it by setting the disable_sandbox property to true. Refer to the section on Sandboxing to Protect Your SnapLogic Environment for details.

l trace_data: To enable data tracing, add this line to the [component_container] section of the snapserver.conf file, and set the parameter to either input, output, or input,output, as follows: trace_data=input,output. To turn data tracing off, comment out or remove the line from the file. The data tracing settings you specify when executing a Pipeline in Designer or SnAPI override this parameter in the configuration file. Refer to the "Data Tracing via the SnapLogic Server Configuration File" section for details.

l cc_resdef_cache_enabled and cc_resdef_cache_max_entries: control resdef caching.

Data Cache Configuration Parameters

The parameters in the [data_cache] section of the snapserver.conf file describe your caching settings.

l cache_dir: The location of the cache directory.

l cache_timeout: The amount of time, in seconds, that data should be cached. Example: cache_timeout = 300.

l cache_size: The maximum allowed size of the cache. You can specify the size in bytes (for example, 10000), in kilobytes (in this case, use a KB/kb suffix; for example, 10KB or 10kb), in megabytes (using an MB or mb suffix), in gigabytes (using a GB or gb suffix), or in terabytes (using a TB or tb suffix). Example: cache_size = 10MB.

When caching large data output (greater than 2GB) from a single resource, confirm that the operating system supports large files. The Python interpreter might also need to be compiled with special flags to handle large files (see http://docs.python.org/lib/posix-large-files.html).

l high_water_mark: An integer percentage value (0 - 100) specifying the percentage of the maximum size at which cache cleanup is initiated. The cache can temporarily exceed the maximum size, so SnapLogic recommends specifying a value less than 100% to allow for these temporary overruns. Example: high_water_mark = 90.

l low_water_mark: An integer percentage value (0 - 100), specifying the percentage of maximum size that must be reached once cache cleanup is initiated. Example: low_water_mark = 60.
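To make the relationship between cache_size and the two water marks concrete, the following Python sketch converts a size string to bytes and computes the points at which cleanup would start and stop. It is an illustration, not SnapLogic's implementation: the function names are hypothetical, and it assumes each unit step is a factor of 1024, which this guide does not specify.

```python
import re

# Assumption: KB, MB, GB, and TB are powers of 1024. The guide does not say
# whether the server uses 1000 or 1024 as the multiplier.
_UNITS = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}
_SIZE_RE = re.compile(r"\s*(\d+)\s*([A-Za-z]*)\s*")


def parse_cache_size(value):
    """Convert a size such as '10000', '10KB', or '10mb' to bytes."""
    match = _SIZE_RE.fullmatch(value)
    if match is None or match.group(2).upper() not in _UNITS:
        raise ValueError("unrecognized cache size: %r" % value)
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def cleanup_thresholds(cache_size, high_water_mark, low_water_mark):
    """Return (cleanup_starts_at, cleanup_stops_at) in bytes."""
    for mark in (high_water_mark, low_water_mark):
        if not 0 <= mark <= 100:
            raise ValueError("water marks must be between 0 and 100")
    total = parse_cache_size(cache_size)
    return total * high_water_mark // 100, total * low_water_mark // 100


if __name__ == "__main__":
    start, stop = cleanup_thresholds("10MB", 90, 60)
    print("cleanup starts at", start, "bytes and stops at", stop, "bytes")
```

With the example values from this section (10MB, 90, and 60), and under the 1024 assumption, cleanup would begin at 9437184 bytes and continue until the cache is down to 6291456 bytes.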

Repository Configuration

The parameters in the [repository] section of the snapserver.conf file describe your metadata repository.

l type: The type of database system hosting your metadata repository. Example: type = sqlite.

l path: The location of your metadata repository database file. Example: path = /opt/snaplogic/repository/repository.db.

Notification Instructions

Configure the parameters in the [notification] section of the snapserver.conf file to specify how the Scheduler notifies you about Pipeline execution. For details on configuring these parameters, refer to the section on "Enabling Email Notifications".





Management Console Configuration

SnapLogic includes a browser-based management console that provides details about the performance of executed Pipelines. To enable the management console, open your SnapLogic Server Configuration file, snapserver.conf, and add a configuration directive in the form of a line specifying which servers can be proxied, as follows: allow_proxying_to = <domain>. You can use an asterisk (*) wildcard to allow proxying to any domain, but SnapLogic does not recommend unlimited access. To limit access to specific domains, use a comma-delimited list, as follows:

allow_proxying_to = some_other_domain:443, some_other_domain2:443

Additional setup involves logging into the console on your primary server to add extra servers as required. Refer to the section on "Configuring the Management Console" for detailed information.
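The allow_proxying_to format described above can be pictured with a short Python sketch that splits the comma-delimited value and checks a target against it. This is a simplified model with hypothetical function names. It assumes an exact match on the host:port string and treats a lone asterisk as "allow everything"; the matching rules the server actually applies are not documented here.

```python
def parse_allow_list(value):
    """Split a comma-delimited allow_proxying_to value into entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_proxy_allowed(target, allow_value):
    """Return True if target (host:port) is permitted by allow_value.

    Assumption: entries are compared as exact strings, and "*" permits
    any target. The server's real matching logic may differ.
    """
    allowed = parse_allow_list(allow_value)
    return "*" in allowed or target in allowed


if __name__ == "__main__":
    setting = "some_other_domain:443, some_other_domain2:443"
    print(is_proxy_allowed("some_other_domain2:443", setting))
    print(is_proxy_allowed("unlisted_domain:443", setting))
```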

Clustering Configuration Parameters

Add a [cluster] section to your snapserver.conf file to run your SnapLogic Servers in a cluster. The following parameters are supported in this section:

l node_type: Designates whether a node is a head or worker node. See "Clustering Servers" for more information.

l head_node: Used on a worker node to designate the head node in the cluster.

l workers: Used on a head node of a cluster to list the worker nodes.

l jobs_per_worker: The maximum number of jobs a worker can run at a time. Set to -1 to disable it. Both jobs_per_worker and max_job_limit can be used together. In that case, jobs_per_worker is applied on the head node (with queuing of jobs) and max_job_limit is applied on the worker (with no queuing). See "Configuring Job Distribution Across Workers" for more information.

Signed SSL Certificate Installation

The following procedure explains how to replace the SnapLogic instance's self-signed certificates with digitally signed CA certificates.

Prerequisites:

l Private key used to create a certificate signing request (CSR)

l Signed SSL certificate

l CA bundle certificate

To replace the certificates:

1. (Optional) Back up the existing keystore and certificate files: host.cert, host.jks, host.pem.

2. Change the current directory:

cd /opt/snaplogic/config

3. Create a PKCS12 file using the CA signed certificate:

openssl pkcs12 -export -chain -CAfile <BUNDLE_CERT> -in <SIGNED_CERT> -inkey <PRIVATE_KEY> -out host.p12 -name tomcat -passout pass:changeit

where:

l <BUNDLE_CERT> refers to the CA bundle certificate

l <PRIVATE_KEY> refers to the private key used to create the signed SSL certificate CSR

l <SIGNED_CERT> refers to the signed SSL certificate

4. Convert the PKCS12 file created in Step 3 into a keystore:

keytool -importkeystore -srckeystore host.p12 -srcstoretype PKCS12 -srcstorepass changeit -destkeystore host.jks -deststorepass changeit

5. Create the new server certificate:

cp <SIGNED_CERT> host.cert

where:

l <SIGNED_CERT> refers to the signed SSL certificate


6. Create a new PEM file:

cat <SIGNED_CERT> <PRIVATE_KEY> > host.pem

where:

l <PRIVATE_KEY> refers to the private key used to create the signed SSL certificate CSR


7. Restart SnapLogic services:

/opt/snaplogic/bin/snapctl.sh restart
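If you repeat this procedure across several servers, it can help to generate the commands for steps 3 through 5 from the three input file names. The following Python sketch only assembles the argument lists shown above and does not execute them. The function name certificate_commands and the file names in the demonstration are hypothetical placeholders. Step 6 is omitted because it relies on shell redirection.

```python
import shlex


def certificate_commands(bundle_cert, signed_cert, private_key,
                         password="changeit"):
    """Return the argument lists for steps 3, 4, and 5 of the procedure."""
    return [
        # Step 3: create a PKCS12 file from the CA-signed certificate.
        ["openssl", "pkcs12", "-export", "-chain",
         "-CAfile", bundle_cert, "-in", signed_cert, "-inkey", private_key,
         "-out", "host.p12", "-name", "tomcat",
         "-passout", "pass:" + password],
        # Step 4: convert the PKCS12 file into a Java keystore.
        ["keytool", "-importkeystore",
         "-srckeystore", "host.p12", "-srcstoretype", "PKCS12",
         "-srcstorepass", password,
         "-destkeystore", "host.jks", "-deststorepass", password],
        # Step 5: create the new server certificate.
        ["cp", signed_cert, "host.cert"],
    ]


if __name__ == "__main__":
    # Placeholder file names; substitute the files issued by your CA.
    for argv in certificate_commands("ca-bundle.crt", "signed.crt",
                                     "private.key"):
        print(shlex.join(argv))
```

Printing the commands first, rather than running them directly, leaves room to review them before anything in the config directory is overwritten.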

Clustering Servers

Add a [cluster] section to your snapserver.conf file to run your SnapLogic Servers in a cluster. Clustering is a method of improving performance by allowing multiple Pipeline execution requests to proceed in parallel in a cluster of worker nodes. In clustering, data servers are assigned roles of either head node or worker nodes. The head node then maintains a queue of job submissions. As jobs come in, they are scheduled to the worker nodes, with an emphasis on load balancing.



A SnapLogic clustering configuration consists of a data server designated as the head node and one or more data servers designated as worker nodes:

l Head node: Jobs get queued by the head node and then de-queued and distributed when a worker node is available to run them. The head node interacts directly with SnapLogic Designer and SnAPI, and provides cluster execution and reporting data to the management console and to Designer.

New Snaps are installed in the head node, which automatically repeats the Snap installation on each of the worker nodes, provided that the worker nodes are online during installation.

l Worker nodes: The worker data servers accept pipeline execution requests and report completion back to the head node. The worker nodes do not access the repository directly; they point to the head node repository to retrieve Component and Pipeline parameter values. The status of Pipeline jobs on worker nodes can be polled using a status interface. Worker nodes report job completion to the head node.

If you submit a Pipeline execution request to a cluster, it is performed by the cluster; if you submit a request to an independent server, it is performed by the individual server. The process of executing Pipelines in a cluster environment is identical to executing them on an independent data server. Follow the process described in the "Executing Pipelines" section regardless of your server environment.

Configuring Clusters

To establish a SnapLogic cluster, designate a data server as a head node and one or more data servers as worker nodes by modifying the SnapLogic Server Configuration file (snapserver.conf) on each node. Follow these instructions to configure your cluster:

l Identify your head node and worker nodes, and note the URI or IP address of each node (for example, http://head:443, http://worker1:443, http://worker2:443).

l Install SnapLogic onto each node. The nodes in a cluster configuration do not have to run the same operating system.

l If not using LDAP authentication, ensure the SnapLogic passwords are synced up on all machines of the cluster. One way to do this is to copy the passwords, cc1_creds and cc2_creds files from the config directory of the head node to all the worker nodes. All accounts (including user accounts created after initial configuration/deployment) need to be synchronized between all cluster nodes. In addition, ACLs must be synchronized across all nodes if custom ACLs are implemented.

l Modify the SnapLogic Server Configuration file (snapserver.conf) on each node as follows:

To designate a data server as the cluster head node, add the following [cluster] section to its snapserver.conf file:

[cluster]

node_type = head

workers = http://worker1:443, http://worker2:443


To designate a data server as a cluster worker node, add the following [cluster] section to its snapserver.conf file:

[cluster]

node_type = worker

head_node = http://head:443

l Restart the SnapLogic Servers on the head node and all the worker nodes.
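When a cluster has more than a few nodes, the [cluster] sections above can be generated rather than typed by hand. The following Python sketch produces the same text as the two examples in this section. The function names are hypothetical, and the sketch does not write to snapserver.conf; it only returns the text to add.

```python
def head_cluster_section(worker_uris):
    """Return the [cluster] section text for a head node."""
    if not worker_uris:
        raise ValueError("a head node needs at least one worker URI")
    lines = [
        "[cluster]",
        "node_type = head",
        "workers = " + ", ".join(worker_uris),
    ]
    return "\n".join(lines) + "\n"


def worker_cluster_section(head_uri):
    """Return the [cluster] section text for a worker node."""
    lines = [
        "[cluster]",
        "node_type = worker",
        "head_node = " + head_uri,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(head_cluster_section(["http://worker1:443", "http://worker2:443"]))
    print(worker_cluster_section("http://head:443"))
```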

Test the cluster by using the Designer to run a pipeline or using the test interface on the head node: http://head:443/__snap__/__static__/console/cluster-info.html. (You can also navigate to this page from the head node data server page: http://head:8088.) Go to server_cluster, and then to cluster-info.html. From this page, you can submit Pipeline jobs to the cluster.

Configuring Job Distribution Across Workers

Every incoming job is sent to the worker node with the least number of running jobs. Each worker node can run a maximum of one job at a time by default. To increase this, set the jobs_per_worker property in the [cluster] section of the head node. For example, if this is set to five, the head node will first distribute one job to every worker in round robin. If all workers are running one job, then the next job will be sent to the first worker. If all workers are running five jobs already, the incoming job is queued up and the next worker which has a completed job will be assigned the queued job.
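The distribution rule described above can be modeled in a few lines of Python. This is a simplified sketch of the behavior as documented, not the server's implementation: the class name HeadNode and its methods are hypothetical, ties between equally loaded workers are broken by listing order, and a jobs_per_worker value of -1 is treated as "no limit".

```python
from collections import deque


class HeadNode:
    """Toy model of least-loaded job distribution with a queue."""

    def __init__(self, workers, jobs_per_worker=1):
        self.jobs_per_worker = jobs_per_worker
        self.running = {worker: [] for worker in workers}
        self.queue = deque()

    def submit(self, job):
        """Assign job to the least-loaded worker, or queue it.

        Returns the worker name, or None if the job was queued.
        """
        worker = min(self.running, key=lambda w: len(self.running[w]))
        limited = self.jobs_per_worker != -1
        if limited and len(self.running[worker]) >= self.jobs_per_worker:
            # Every worker is at its limit, so the job waits in the queue.
            self.queue.append(job)
            return None
        self.running[worker].append(job)
        return worker

    def complete(self, worker, job):
        """Mark job finished; hand the freed slot to the oldest queued job.

        Returns the queued job that was started, or None.
        """
        self.running[worker].remove(job)
        if self.queue:
            next_job = self.queue.popleft()
            self.running[worker].append(next_job)
            return next_job
        return None


if __name__ == "__main__":
    head = HeadNode(["worker1", "worker2"], jobs_per_worker=2)
    for name in ("job1", "job2", "job3", "job4", "job5"):
        print(name, "->", head.submit(name))
    print("started from queue:", head.complete("worker2", "job2"))
```

In this model, the fifth job is queued because both workers already hold two jobs, and it starts on whichever worker finishes a job first.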

Job execution requests that are directed to the head node are load balanced across the workers. If a job execution request is sent directly to a worker, the job runs on the same worker and does not apply towards the worker's job limit. If a Pipeline has an Execute Resource Component, the target Pipeline should be changed to point to the head node to ensure that the jobs spawned by the Execute Resource are load balanced across the cluster. The status of the job queue can be monitored by using the URI http://head:8088/__snap__/cluster/info.htm. This shows the queued jobs, the running jobs, and the last few completed jobs.

To add a worker node to the cluster, configure the new worker node with the right credentials and [cluster] section. Install the required Snaps on the new worker by connecting the Designer directly to the worker node. Then add the new worker to the head node's [cluster] section and restart the servers. See the Worker section of "SnapAdmin Commands" for information on how to add or delete worker nodes.

Input/Output Files

Pipeline execution requests sent to the head node are forwarded to one of the available worker nodes. The job executes on that worker node, so any input files are read from, and any output files are created on, the worker node where the job runs. To ensure that an output file can be read without knowing which node the job ran on, create a shared filesystem mount and perform all filesystem operations on the shared mount directory. One way to do this is to change the filesystem_root property to point to the shared mount directory. The filesystem browse option in the Designer shows the files on the head node by default. Using a shared mount point ensures that the files created as Pipeline output can be used for Suggest and previewed using the filesystem browser.



Troubleshooting

l Since the Pipeline execution happens on one of the worker nodes, the execution logs are created on the worker node. If a Component has to be debugged, the debugger should be connected to the worker node.

l Previewing the output of a Pipeline or a Component from the Designer or reading the output of a SnAPI program requires the client to be able to talk to the worker node. If there are firewall rules which prevent the client from talking directly to the worker nodes, these operations will fail. The firewall rules need to be changed appropriately to fix this.

l The jobs_per_worker property can be used to increase the level of parallelism by increasing the number of jobs to run on a worker. Setting this to a very high value can cause issues with excessive memory usage on the worker node, possibly leading to out of memory errors.

l SnapLogic installation by default installs self-signed certificates. If the Designer uses an SSL connection to the head node, then for Pipeline or Component preview to work, the browser must trust the self-signed certificates used by the worker nodes. To do this, open each of the SSL URIs (https://head:8091, https://worker1:8091, https://worker2:8091) in the browser and accept the prompt which asks whether the certificate should be trusted.

Head Node Failover for Clustering

A SnapLogic cluster consists of one head node and multiple worker nodes. If one of the worker nodes goes down, the head node ensures that no more jobs are sent to the worker which is down and the other worker nodes handle the subsequent jobs. If the head node goes down for some reason, the cluster is no longer accessible since the clients cannot submit any jobs. Head node failover can be used to prevent such a scenario and provide protection against having a single point of failure.

Failover Configuration Options

To configure head node failover, a second machine is assigned to be the backup head node. The clients talk to the master head node using its hostname, and if the master goes down, the requests are sent to the backup head node instead. The failover from master to backup can be done in multiple ways: HTTP failover, DNS failover, or IP address failover. The following documentation uses IP address failover with the keepalived package. keepalived has to be installed on the master and backup head nodes, and keepalived assigns a virtual IP address to the master. When the master goes down, the virtual IP switches to the backup node. All client requests are sent to the host name corresponding to the virtual IP, so clients can continue to use the cluster after the master goes down.

Failover Configuration using keepalived

Note: Installing and configuring keepalived is operating system dependent. Depending on how the network is configured, assigning a virtual IP might require changes in the network configuration. If virtual machines are being used, then the VM settings might have to be changed to enable virtual IP addresses.


The following assumes that the master head node is named headmaster, the backup head is headbackup, headvirtual is the name for the virtual IP address and worker1 and worker2 are the worker nodes. The steps to configure the failover are:

1. Install keepalived on both head nodes. Installation packages are available from the keepalived download page, and the keepalived user guide describes the configuration options.

A sample keepalived.conf file for the headmaster is

global_defs {
    router_id my_router
}

vrrp_instance app_master {
    state MASTER
    interface eth1
    virtual_router_id 36
    priority 150
    advert_int 1
    preempt
    garp_master_delay 5
    virtual_ipaddress {
        60.57.189.127 dev eth0
    }
}

A sample keepalived.conf file for the headbackup is

global_defs {
    router_id my_router
}

vrrp_instance app_master {
    state BACKUP
    interface eth1
    virtual_router_id 36
    priority 100
    advert_int 1
    preempt
    garp_master_delay 5
    virtual_ipaddress {
        60.57.189.127 dev eth0
    }
}


http://www.keepalived.org/download.html

http://www.keepalived.org/pdf/UserGuide.pdf




2. Install SnapLogic as usual on the cluster machines. Configure the cluster as documented in the Clustering Servers section. On the master and backup head nodes, update all references to the local hostname to headvirtual. On each worker node, update the head_node property to point to headvirtual as well. On the master head node, add a backup_head_node property pointing to the backup head node.

[cluster]

node_type = head

workers = http://worker1:443,http://worker2:443

backup_head_node = http://headbackup:443

3. Restart keepalived and SnapLogic on all the machines. Change the Designer to point to http://headvirtual:443.

4. Test the failover by shutting down the headmaster machine and trying to run jobs on the cluster. The client requests should go to headbackup and the jobs should run successfully.

If failover is configured by some other means, such as HTTP failover or DNS failover, the SnapLogic configuration changes remain the same as described above: the master head node has a new backup_head_node entry pointing to the backup head node, and all the configuration entries pointing to the head node use the virtual name. Since the head node maintains a job queue and keeps track of job distribution, multiple head nodes cannot be enabled at the same time. So if HTTP failover is used, only one of the head nodes should be enabled at a time.

Repository Synchronization

In a failover configuration, both the master and backup head node maintain a copy of the SnapLogic repository database. Any repository changes are replicated from the master to the backup. If the backup head node is offline for some time, the repository can go out of sync with the repository on the master. To resynchronize the repository, the repository database can be copied from the master head node onto the backup head node. If the master machine goes down and there are repository changes on the backup head node, the repository from the backup head node has to be synced up to the master repository after the master node comes back up again.

SnapLogic Server High Availability and Failover

Introduction

Multiple SnapLogic Server instances can be installed behind a load balancer. This can be used as a means to achieve failover in case one instance goes down, particularly in a cluster environment. Either software or hardware load balancers can be used. If a request is sent to the load balancer, it will forward the request to one of the SnapLogic instances based on how the load balancer is configured. When a new resource or Pipeline is created, it will get saved on each SnapLogic instance, so each instance maintains a copy of the SnapLogic repository database. Pipelines will execute on the instance which gets the initial execution request. A leader is chosen among the SnapLogic instances. All scheduled jobs execute on the current leader. If the leader goes down, a new leader is chosen when the next scheduled job has to run.


During a Pipeline execution, any output files created by the Pipeline are created on the machine which ran the Pipeline. A shared filesystem can be used to ensure that input and output files are available on every machine. The filesystem_root setting controls the location of the input and output files used by the Components. The log file and trace data for each Pipeline execution also are available only on the instance which ran the Pipeline.

Configuration

Each SnapLogic instance is installed using the installer. The user accounts on each instance need to be synced up. One way to do this is to:

1. Copy the passwords, cc1_creds and cc2_creds files from the /config directory of the first instance to the /config directory of all other instances.

2. Add the failover-related entries to the snapserver.conf config file.

Two entries need to be added to the [main] section of snapserver.conf. The first is backup_servers, a comma-separated list of the backup instances. The second is failover_proxy_uri, the URI of the load balancer. For example, if the two instances are running at https://instance1.mydomain.com:443 and https://instance2.mydomain.com:443 and the load balancer is running at https://loadbalancer.mydomain.com:443, then the entries on instance1 are:

backup_servers=https://instance2.mydomain.com:443

failover_proxy_uri=https://loadbalancer.mydomain.com:443

The entries on instance2 are:

backup_servers=https://instance1.mydomain.com:443

failover_proxy_uri=https://loadbalancer.mydomain.com:443

3. Install and configure the load balancer in front of the SnapLogic instances. The load balancer configuration details depend on the type of load balancer being used.
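The two snapserver.conf entries from step 2 follow a simple pattern, sketched below for an arbitrary list of instances. The failover_entries function is a hypothetical helper, not a SnapLogic tool, and it assumes that with more than two instances each instance lists all the others in backup_servers; the guide itself only shows the two-instance case.

```python
def failover_entries(instances, load_balancer):
    """Return, per instance URI, the two [main] entries described above."""
    entries = {}
    for uri in instances:
        # Assumption: every other instance acts as a backup for this one.
        backups = [other for other in instances if other != uri]
        entries[uri] = [
            "backup_servers=" + ",".join(backups),
            "failover_proxy_uri=" + load_balancer,
        ]
    return entries


entries = failover_entries(
    ["https://instance1.mydomain.com:443", "https://instance2.mydomain.com:443"],
    "https://loadbalancer.mydomain.com:443",
)
for uri, lines in entries.items():
    print(uri)
    for line in lines:
        print("  " + line)
```

For the two instances used in the example above, this prints the same backup_servers and failover_proxy_uri lines shown for instance1 and instance2.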

Sidekick Configuration

Each of the instances can be configured with a Sidekick. In this configuration, there are two SnapLogic Servers and two Sidekicks, one for each SnapLogic Server, with a single load balancer in front of the two SnapLogic Servers. The steps to configure the servers are similar to those above. Since the user accounts need to be synced up between the SnapLogic Servers, you need to ensure that each Sidekick gets the updated server credentials. The sequence of steps is:

l Install SnapLogic on the two server instances. Sync the user credentials by copying the passwords, cc1_creds and cc2_creds files from the config directory of the first instance to the config directory of the second instance.



l Run /opt/snaplogic/bin/snaplogic_sidekick_generate_config.sh on both the instances to ensure that the updated credentials are available to the Sidekick instance for download.

l Install the sidekick machines, and run /opt/snaplogic/bin/snaplogic_sidekick_download_config.sh to download the sidekick configs.

l Start all the instances and enable sidekick for each instance individually. Each instance should be functional with its sidekick.

l Add the backup_servers and failover_proxy_uri entries to the server instances. Install and configure the load balancer and restart the servers.

Troubleshooting

l If using self-signed certificates for the load balancer, then the load balancer's SSL certificate needs to be added to the trust keystore of the Java CC of all the SnapLogic instances. This can be done using the command

$SNAP_HOME/pkg/java/jre1.6.0_20/bin/keytool -importcert -alias proxy -file /etc/ssl/certs/myssl.crt -keystore $SNAP_HOME/../config/host.jks -storepass changeit

where $SNAP_HOME is the install location of the SnapLogic version and /etc/ssl/certs/myssl.crt is the certificate being used by the load balancer.

l Configuration changes are not currently synced between the instances, so changes such as creating new user accounts need to be made on each instance separately.

l In a failover configuration, each instance maintains a copy of the SnapLogic repository database. Any repository changes are replicated to every instance. If one instance is down for some reason, the repository can go out of sync with the other instances. To resynchronize the repository, the repository database can be copied from the working instance onto the failed instance.

Memory Configuration Guidelines

The SnapLogic Server does not provide a monitoring capability to determine if memory allocation is exceeded at runtime. If this happens, the Java process will exit with an "out of heap space" exception. Refer to the configuration scenarios below if you are receiving heap space errors during Pipeline/Component execution.

Memory Configuration

In some cases, Pipeline execution might require you to increase the memory allocation of the Java process. The Java Component Container memory allocation can be defined in the cc2_java.sh/bat files, located in the installdir/product/version/bin/init.d directory.

You can set the option using set SNAP_JAVA_MEMORY=-Xmx'MEM_VALUE' to allocate 'MEM_VALUE' memory for the Java CC process. The default value is 256MB. Other valid values are -Xmx512, -Xmx1048 or -Xmx2048. If the requested value exceeds the memory available on the machine at startup, the Java CC process fails to come up and writes its log to installdir/logs/javacc_stderr. For example, some Linux kernels provide only 1.7GB of available process memory; if you set 'MEM_VALUE' to 2048 on such a system, the process fails to start. Set the option to the maximum available physical memory or below.

The memory configuration allows you to increase performance for certain Components such as the Sort or the Aggregate Component. Both allow you to configure their memory usage during execution. For example, the Sort Component lets you configure how much memory the Component can allocate to sort records in memory before they are written to disk. Setting this to 200 MB, for instance, allows 200 MB of records to be sorted in memory. If the whole input record set fits into the allocated memory, the sort is much faster because it does not need to write records to disk during the sort operation.

Buffer Configuration

During Pipeline execution (in a Java-only execution environment), a buffer is kept between two Java Components that are linked to each other. The first one (the upstream Component) links to the second one (the downstream Component). Between them resides a buffer with a default size of 1000, meaning up to 1000 records are kept in memory between the two Components. If the downstream Component has lower throughput than the upstream Component, the buffer fills up over time, and 1000 records remain in memory until the downstream Component consumes all remaining records. The assumption is that the upstream Component is either very fast or the process runs long enough to allow the buffer to fill up.

Let's say you have a larger Pipeline with ten Java Components (meaning nine buffers) and the last downstream Component is the slowest Component in the Pipeline. In this case all nine buffers will fill up over time, leading to 9000 records in memory. This might exceed the allocated memory depending on the record size. For this scenario it is advised to decrease the buffer size by defining the -Dsnap.pipe.size='BUFFER_SIZE' option as a Java argument in the cc2_java.sh/.bat files. The 'BUFFER_SIZE' setting should not be lower than the actual throughput of the slowest Component, meaning if the slowest Component produces 200 records/sec then a buffer size of 200 should be defined.
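The arithmetic in this scenario can be written out as a short calculation. The sketch below assumes a linear Pipeline, so that N Components have N - 1 buffers between them; the function names and the average record size are illustrative assumptions, not SnapLogic settings.

```python
def buffered_records(component_count, buffer_size=1000):
    """Worst-case number of records held in buffers for a linear Pipeline."""
    # A linear chain of N Components has N - 1 buffers between them.
    return max(component_count - 1, 0) * buffer_size


def buffered_bytes(component_count, avg_record_bytes, buffer_size=1000):
    """Rough memory estimate; avg_record_bytes is a value you measure yourself."""
    return buffered_records(component_count, buffer_size) * avg_record_bytes


default_records = buffered_records(10)       # ten Components, default buffer size
reduced_records = buffered_records(10, 200)  # buffer size lowered to 200
estimate = buffered_bytes(10, 2048)          # hypothetical 2 KB average record
```

Comparing the estimate against the memory allocated to the Java CC process gives a rough indication of whether the default buffer size is safe for a given Pipeline.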

Concurrent Pipeline Execution

Running large record sets through multiple Pipelines that execute concurrently on the same Java CC process requires that Components running at the same time have sufficient memory available. The implications described above matter in this scenario. If one or more Pipelines have slow downstream Components, then the buffers of all upstream Components in those Pipelines will fill up over time during execution. In that case, configure the buffer size as described above using the lowest throughput value. If one or more Components of the Pipelines allocate memory, such as the Sort or Aggregate Component, then the combined memory allocation of all Components should not exceed the maximum allocated memory of the Java process (even though they might not be executed at the same time).



Authentication: Active Directory-Based or File-Based

The SnapLogic Server supports an authentication and privilege model that allows the administrator to grant, limit, or restrict access to Components and Pipelines. Users can access the server in "public" (unauthenticated) mode, or they can authenticate with the server during connection. The server can apply access rules to all requests, and grant or deny access depending on the type of operation attempted by the user. Users who share a particular responsibility can be assigned to groups.

By default, the SnapLogic Server is installed with a basic authentication configuration that allows "public" users to perform all operations, with the exception of modifying the Tutorial examples located in the /SnapLogic/Tutorial namespace. Depending on your requirements, you can further modify the authentication configuration to enhance the security configuration.

You can perform Active Directory-based authentication by configuring SnapLogic for your LDAP database, or file-based authentication by configuring the Snap Access Configuration (snapaccess.conf) file.

Note: If a user account name is defined both locally within SnapLogic as well as through the authentication service, the credentials are checked against the built-in authentication first. If it is not a valid built-in credential, it is checked against the plug-in. If the same user name is defined in both places, it is a valid account as long as the password matches with either definition.

The SnapLogic Data Server logs all interactions in the access log files: main_process_access.log, cc1_access.log, and cc2_access.log. Each access request logs the time, originating IP address, username, operation, and URI.

Note: If you are in a token-based authentication environment, see client_token_cache_limit and client_token_timeout in the "General SnapLogic Server Settings" section.

Permissions and User Credentials

The SnapLogic Server supports the following three types of permissions that are applied to the Components within a URI namespace:

l Read: Allows access to basic metadata including description, inputs and outputs, required arguments, and similar objects.

l Write: Allows users to save Components they create.

l Execute: Allows execution of the Component or Pipeline.

A SnapLogic user has an identity comprised of a username and set of groups to which the user belongs. There are two default groups: public and known. All users belong to the public group, but only users who have authenticated by providing a username and password belong to the known group as well.

Note: The administrator can create users and assign passwords with the SnapAdmin utility users command.

The SnapLogic Designer prompts for authentication when you initially connect to a Data Server, or when you add a new Data Server. SnAPI users can provide their credentials through the appropriate interface routines.

Authentication Using the snapaccess Configuration File

The snapaccess.conf configuration file is an XML document containing the rules that specify which users can access which URIs, and how. It contains the following sections:

l Users: Enumeration of individual users.

l Groups: Enumeration of logical groups. These groups are optional and in addition to the system public and known groups.

l UserGroups: Specifies to which groups a user belongs. This is optional, because a user is not required to belong to any administrator-defined groups.

l ACLs: Specifies, per URI, which user or group has which role or privileges.

Understanding ACLs

The SnapLogic Server reads the Access Control Lists (ACLs) from the snapaccess.conf file on server startup. These ACL rules apply to every REST request to the server. ACLs can be defined at a user level or at a group level. Every group should have an entry in the <Groups> section of snapaccess.conf. Users are assigned to groups by adding entries in the <UserGroups> section.

Access rules are configured by adding entries in the <ACLs> section. Each entry specifies a rule for a particular URI and the default SnapLogic installation comes with a set of predefined rules.

Note: Rules for URIs beginning with '/__snap__/' are defined as required by the product and usually should not be changed by the user.

You can add entries for Pipeline URIs. For example, if a Pipeline is defined at URI '/Test/MyPipeline', then an ACL can be added as follows.

<Location name="/Test/MyPipeline">

DENY USER joe

ALLOW GROUP dev_group PERMISSION READ WRITE EXECUTE NONRECURSIVE

</Location>

Some things to note:

l All rules are recursive by default, unless the NONRECURSIVE directive is specified.

l The directive names are case-insensitive, but user and group names are case-sensitive.

l Permissions cannot be specified for DENY rules; a DENY rule denies all access to the specified URI.



l One or more permissions must be specified for ALLOW rules. Permissions are separated by spaces and the order in which they are listed does not matter.

The syntax for a rule is:

DENY USER/GROUP <name> [NONRECURSIVE]

ALLOW USER/GROUP <name> PERMISSION READ [WRITE EXECUTE] [NONRECURSIVE]
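To make the rule syntax concrete, the following sketch parses a single rule line into a dictionary. It is not the server's parser: parse_rule and the dictionary keys are hypothetical, and the sketch accepts exactly one user or group name per rule, as in the syntax lines above.

```python
PERMISSIONS = {"READ", "WRITE", "EXECUTE"}


def parse_rule(line):
    """Parse one ACL rule line; raise ValueError if it does not fit the syntax."""
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError("rule too short: %r" % line)
    # Directive names are case-insensitive; the user or group name is not.
    action, kind, name = tokens[0].upper(), tokens[1].upper(), tokens[2]
    if action not in ("ALLOW", "DENY") or kind not in ("USER", "GROUP"):
        raise ValueError("unknown directive in rule: %r" % line)
    rest = [token.upper() for token in tokens[3:]]
    recursive = True  # rules are recursive unless NONRECURSIVE is given
    if rest and rest[-1] == "NONRECURSIVE":
        recursive = False
        rest.pop()
    if action == "DENY":
        if rest:
            raise ValueError("DENY rules take no permissions: %r" % line)
        permissions = set()
    else:
        valid = len(rest) >= 2 and rest[0] == "PERMISSION" and set(rest[1:]) <= PERMISSIONS
        if not valid:
            raise ValueError("ALLOW rules need one or more permissions: %r" % line)
        permissions = set(rest[1:])
    return {"action": action, "kind": kind, "name": name,
            "permissions": permissions, "recursive": recursive}


allow = parse_rule("ALLOW GROUP dev_group PERMISSION READ WRITE EXECUTE NONRECURSIVE")
deny = parse_rule("deny user joe")
```

The two calls at the end parse the rules from the /Test/MyPipeline example; the second shows that lowercase directives are accepted while the name joe is kept as written.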

The amount of information returned depends on what permissions the user has for this data:

l READ: Can access anything within the resdef.

l WRITE: Only has permission to see that the resource exists. No resdef data.

l EXECUTE: Can see the resource exists and any resdef data related to execution (known as DESCRIBE view).

In general:

l GET requests require READ access.

l POST/PUT requests require EXECUTE.

l DELETE requests require WRITE.

However, there are variations.

Matching locations are evaluated with the longest prefix first. So, you can specify:

<Location name="/foo">

deny user joe

</Location>

<Location name="/foo/bar">

allow user joe permission read

</Location>

This gives 'joe' read access to '/foo/bar' (and everything below), but denies all access to '/foo' (and everything below, except 'bar').

If a specific match for a user is found, the traversal of the path stops and the permissions are taken from that match. Permissions through group matches (a user can be a member of multiple groups), however, are cumulative. If a user is a member of multiple groups and any of them allows access to a path, then access is granted.
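The longest-prefix evaluation for user rules can be sketched as follows. This is a simplified model rather than the server's implementation: it handles user-level rules only (group accumulation is omitted), the rule dictionaries and the user_permissions function are hypothetical, and it assumes that a user with no matching rule gets no access.

```python
# Hypothetical in-memory form of the /foo and /foo/bar example above.
ACLS = {
    "/foo": [
        {"action": "DENY", "name": "joe", "permissions": set(), "recursive": True},
    ],
    "/foo/bar": [
        {"action": "ALLOW", "name": "joe", "permissions": {"READ"}, "recursive": True},
    ],
}


def user_permissions(acls, user, uri):
    """Return the permissions granted to `user` at `uri` by user-level rules."""
    # Evaluate matching locations with the longest prefix first.
    for location in sorted(acls, key=len, reverse=True):
        exact = uri == location
        below = not exact and uri.startswith(location.rstrip("/") + "/")
        if not (exact or below):
            continue
        for rule in acls[location]:
            if rule["name"] != user:
                continue
            if below and not rule["recursive"]:
                continue  # a NONRECURSIVE rule covers only the location itself
            # The first user-specific match ends the traversal.
            if rule["action"] == "ALLOW":
                return set(rule["permissions"])
            return set()
    return set()  # assumption: no matching rule means no access


bar_child = user_permissions(ACLS, "joe", "/foo/bar/report")
foo_child = user_permissions(ACLS, "joe", "/foo/other")
```

Here joe receives READ below /foo/bar because that location is the longest matching prefix, while every other URI under /foo falls back to the DENY rule.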

Note: Denying read access to the root namespace / may disable some features.

Read access to the / (root) namespace and to the /__snap__/meta/info is required by clients to whom you want to allow dynamic discovery of the SnapLogic Data Server capabilities. Without it, a client must know the exact URI, instead of being able to discover it.

Note: To access files outside of SNAP_HOME (which is the install directory; an example on Linux is /opt/snaplogic/[release number]), the file system root property needs to be configured. For example, filesystem_root = /tmp will allow access to all files from the Components. This can be done for each Component Container individually.

Example

As an introduction to the flexibility of the SnapLogic authentication model, consider the following scenario:

l Harry and Sally are both members of the Finance department. They have been given permission to execute the Pipelines in the /dept/finance/ space, but are not allowed to modify them.

l Jane is also a member of the Finance department, but she is the maintainer of the Pipelines in the /dept/finance/ space, so she has been given full permissions for that space.

l No one outside of the Finance department is allowed access to that space.

The Access Configuration file should appear as follows:

<AccessConfig>

<Users>

#Username Description

harry Harry Smith - Senior Analyst

sally Sally Bell - Senior Analyst

jane Jane Burton - Finance Manager

</Users>

<Groups>

#Groupname Description

finance Finance Department

</Groups>

<UserGroups>

#Username Group1 Group2 ...

harry finance

sally finance

jane finance

</UserGroups>

<ACLs>

<Location name="/">

deny group public

allow group public permission read NONRECURSIVE

allow group finance permission read write execute

</Location>

<Location name="/dept/finance">

deny group public known

allow group finance permission read execute

allow user jane permission read write execute

</Location>

</ACLs>

</AccessConfig>



Note: Privileges are assigned to namespaces. They can be applied to a single resource, or to a group of resources with a common prefix.

The server checks the permissions based on the longest matching prefix first, as follows:

<Location name="/dept/finance/">

...

</Location>

<Location name="/dept/finance/sales_report">

...

</Location>

<Location name="/dept/finance/users/jane/">

...

</Location>

l If a user accesses the URI /dept/finance/budget or /dept/finance/audit/results, the longest matching prefix rule applies to the rules for location /dept/finance/.

l If a user accesses /dept/finance/sales_report, the rule for /dept/finance/sales_report is used.

l If a user accesses the URI /dept/finance/users/jane/Q1_budget, the rules from the location /dept/finance/users/jane/ are applied.

Configuring Authentication Based on Active Directory

You can integrate the SnapLogic Data Server with an existing external Active Directory server, which is then consulted for authentication. Authorization for access requests continues to be managed through the snapaccess.conf file.

By default, the user "admin" is the administrative user. If you define an "admin_group" group in the snapaccess.conf file and add one or more users to it, then only the users in this group have admin privileges. In that case, the "admin" user has no special privileges.

Follow these instructions to configure and use Active Directory-based authentication:

1. Edit the snapserver.conf file's [main] section to add an LDAP_address entry. By default, this line is commented out in the snapserver.conf file. To enable Active Directory authentication, uncomment this line and update it to point to the LDAP server URL. Example: LDAP_address=ldap://myserver.mydomain.com:389.

2. If the user name for the Active Directory instance is complex, a user name template can be configured to simplify the user name to be entered by the user. This is usually required when connecting to an OpenLDAP server. To configure the user name template, uncomment the LDAP_user_template entry. Example: LDAP_user_template="uid=%USER%,ou=Users,dc=mycompany,dc=com". %USER% is a keyword that gets replaced by the user name supplied by the user. If the user logs on as "abc", the user name sent to the LDAP server would be "uid=abc,ou=Users,dc=mycompany,dc=com".

3. Create a group named admin_group in the snapaccess.conf file. Add users that should have administrative privileges to this group in the UserGroups section. For example, if admin_user@ad_domain.mycompany.com is the account in LDAP for the admin user, the following are your entries in the snapaccess.conf file:

<Groups>

test_group test group

admin_group Admin users group

</Groups>

<UserGroups>

admin_user@ad_domain.mycompany.com admin_group

test1@ad_domain.mycompany.com test_group

</UserGroups>

4. Restart the SnapLogic Server and Component Containers.
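The user name template from step 2 amounts to a keyword substitution, illustrated below. The expand_user_template function is a hypothetical stand-in for what the server does internally and is shown only to make the %USER% replacement concrete.

```python
def expand_user_template(template, user):
    """Replace the %USER% keyword with the name typed by the user."""
    return template.replace("%USER%", user)


template = "uid=%USER%,ou=Users,dc=mycompany,dc=com"
bind_name = expand_user_template(template, "abc")
print(bind_name)
```

With the template from step 2 and the logon name "abc", the result is the full name that is sent to the LDAP server.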

Custom Authentication Plug-ins

As of 3.7, more generic properties are available in the snapserver.conf file to allow for custom authentication plug-ins.

l auth_plugin: the name of the plug-in to use.

l auth_plugin_args: a comma-separated list of arguments for the plug-in

l proxy_auth_header: for use with a proxy-generated header as authentication. If this is enabled, the server must be behind a proxy that protects against unauthenticated access.

Credentials and Requests

From SnapLogic 3.4 onwards, most incoming HTTP requests to the SnapLogic Server need to pass credentials. In earlier versions, the default configuration was to allow anonymous requests; now the default configuration is to disallow them. Scripts or applications using SnAPI or REST calls to the SnapLogic Server might have to be modified to provide credentials. If sub-Pipelines are executed within a Pipeline using the Execute Resource Component where the target Pipeline is on another SnapLogic instance, credentials have to be configured for the Execute Resource Component.

SiteMinder Support

As of 3.7, SnapLogic offers support for Single Sign-On with CA SiteMinder®.

How to Configure SiteMinder for Use with SnapLogic

The goal of this procedure is to get SnapLogic installed on a machine behind a SiteMinder proxy.

1. Once SnapLogic Server is installed, modify the snapserver.conf file as follows:

a. Change the default port from HTTPS port 443 to HTTP port 80.

Uncomment this line:

# The port number used by server.



# To enable HTTP, uncomment this line.

server_port = 80

And comment out:

# The secure port number used by the server.

#server_secure_port = 443

b. Set the proxy uri to be the SiteMinder proxy, with the appropriate DNS name and port.

# If the public hostname for this server is different than reported

# by `hostname` then set the server_proxy_uri to the external URI.

#server_proxy_uri = http://<whatever the siteminder proxy address is>

c. Set the proxy_auth_header to be SM_USER

# To use a proxy-generated header as authentication, uncomment and update property

# below.

# NOTE: If this is enabled, the server MUST be behind a proxy that

# protects against unauthenticated access

# proxy_auth_header = "SM_USER"

2. Start the server:

/opt/snaplogic/bin/snapctl.sh restart

3. Verify that it is up and running, ideally by connecting to the server machine on port 80 with a web browser. If you only have shell access, run:

netstat -a | grep -i http

4. Configure the SiteMinder proxy to forward all requests from a proxy URI to SnapLogic on port 443.

For example, configure an Apache proxy (using mod_proxy) and modify httpd.conf with:

ProxyPass / http://hostname.com/

ProxyPassReverse / http://hostname.com/

That forwards HTTP port 80 from the proxy to the SnapLogic Server at hostname.com.

If you make the proxy_auth_header X_FORWARDED_FOR (which is added by mod_proxy), you should be able to log in with Designer with no credentials as your IP address.

5. Configure SiteMinder Policy Server to allow access to the SnapLogic Server. You only need to allow access to the main server port (80 by default) and support GET/POST/PUT.

6. Once the proxy is set up, and SnapLogic is configured and running, try going to the SnapLogic root URI on the proxy. It should redirect you to the SiteMinder login screen. Once you enter your credentials, it should redirect you to the SnapLogic landing page (the one with all the API links).

7. If that is successful, launch Designer. You should see Designer automatically log in as the SiteMinder user (identified by a numeric id). You should be able to build Pipelines and run them.

Note: You may not be able to import/export resources or install Snaps because you are logged in as a non-admin user.

8. To enable a SiteMinder user to have admin credentials, you need to edit the /opt/snaplogic/config/snapaccess.conf file and add one group:

<Groups>

# groupname description

...

admin_group admins

</Groups>

and then add the numeric user id that you want to have admin privilege to the user groups section:

<UserGroups>

# username groupname1 groupname2 ...

...

1234455678345 admin_group

</UserGroups>

Restart the server for it to take effect.
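The group and user-group entries above can be checked with a short script before restarting the server. The following is a minimal sketch, assuming the file uses only the <Groups> and <UserGroups> layout shown in this section; the function names parse_sections and is_admin are hypothetical and are not part of SnapLogic.

```python
import re

def parse_sections(text):
    """Return {section name: list of whitespace-split rows} for <Name>...</Name> blocks."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "...":
            continue  # skip blanks, comments, and elided entries
        opening = re.fullmatch(r"<(\w+)>", line)
        closing = re.fullmatch(r"</(\w+)>", line)
        if opening:
            current = opening.group(1)
            sections[current] = []
        elif closing:
            current = None
        elif current is not None:
            sections[current].append(line.split())
    return sections

def is_admin(text, user_id, admin_group="admin_group"):
    """True if user_id is listed with admin_group in the <UserGroups> section."""
    rows = parse_sections(text).get("UserGroups", [])
    return any(row[0] == user_id and admin_group in row[1:] for row in rows)

# Sample content mirroring the example in this section.
SAMPLE = """
<Groups>
# groupname description
admin_group admins
</Groups>
<UserGroups>
# username groupname1 groupname2 ...
1234455678345 admin_group
</UserGroups>
"""
```

Calling is_admin(SAMPLE, "1234455678345") reports whether the SiteMinder numeric id has been placed in the admin group.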

Security Overview

SnapLogic's security points consist of:

l The SnapLogic Designer and SnapLogic Server communicate with each other through a secure HTTPS connection.

l The SnapLogic Server and the Component Containers communicate with each other through a secure HTTPS connection.

l How the Component Containers communicate with the various data sources depends on the following:

l Does the source support secure communication? For example, if the data source is an FTP server (not an SFTP server), it is impossible to communicate with the data source through a secure connection. However, if the data source does support secure communication, then there is a dependency on the Snap.



l Assuming the data source supports secure communication generally, it is the Snap developer's discretion whether or not to support secure communications. Many existing Snaps support secure communication.

Enabling SSL

The SnapLogic Server, Python Component Container, and Java Component Container have SSL listeners on ports 8091, 8092, and 8093, respectively. To enable SSL, modify the snapserver.conf file's component_container section. A Component Container is a process that runs a Component. Include the name of each Component Container in brackets, followed by parameters describing the Component Container, as shown:

# Configuration of component containers (CC)

[component_container]

# Name of CC1

[[cc1]]

# the location of the log directory for this CC

log_dir = C:\Program Files\snaplogic/logs

# The location of component directory

component_dirs = "$SNAP_HOME","/opt/snaplogic/extensions/components"

component_conf_dir = "/opt/snaplogic/component_config"

cc_port = 8089

cc_secure_port = 8092

cc_secure_cert = /opt/snaplogic/config/host.pem

cc_hostname = YourHost-PC

# Name of CC2

SSL is enabled if server_secure_port is defined in the main section and/or cc_secure_port is defined in the cc section. By default, SSL is enabled. To disable SSL, comment out the server_secure_port and cc_secure_port properties in snapserver.conf.

Note: The servers need to be restarted after any change in the snapserver.conf.

Parameters

The parameters in the component_container section of the snapserver.conf file include:

l CC Name: Enter the name of each Component Container before specifying its parameters. Example: [[cc1]].

l log_dir: The location of the log directory for this Component Container. Example: log_dir = /opt/snaplogic/logs.

l component_dirs: The location of component directory. Example: component_dirs = "$SNAP_HOME","/opt/snaplogic/extensions/components".


l component_conf_dir: The location of the component configuration directory. Example: component_conf_dir = "/opt/snaplogic/component_config".

l cc_port: The port number used by the Component Container process. Example: cc_port = 8089.

l cc_secure_port: Number of the port on which secure communication is available via SSL. Comment out this parameter to disable HTTPS. Example: cc_secure_port = 8092.

l cc_secure_cert: The location of the certificate file required for SSL communication. Example: cc_secure_cert = /opt/snaplogic/config/host.pem.

l cc_hostname: Name of the host running the Component Container. Example: cc_hostname = YourHost-PC.
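To see at a glance which processes have SSL enabled, the uncommented properties can be read from the configuration text. The following is a rough sketch, not a full parser for snapserver.conf: it assumes that properties appearing before any section header belong to the main section, that every section other than component_container is a Component Container, and the cc2 entry with port 8090 is a made-up example.

```python
import re

def active_settings(conf_text):
    """Map each section name to its uncommented key/value pairs."""
    settings = {"main": {}}
    section = "main"
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out property
        header = re.fullmatch(r"\[+\s*(\w+)\s*\]+", line)
        if header:
            section = header.group(1)
            settings.setdefault(section, {})
        elif "=" in line:
            key, value = line.split("=", 1)
            settings[section][key.strip()] = value.strip()
    return settings

def ssl_enabled(conf_text):
    """Report SSL status: server_secure_port for the server, cc_secure_port per CC."""
    settings = active_settings(conf_text)
    result = {"server": "server_secure_port" in settings["main"]}
    for name, props in settings.items():
        if name not in ("main", "component_container"):
            result[name] = "cc_secure_port" in props
    return result

# Hypothetical excerpt; cc2 and its port are illustrative only.
SAMPLE = """
server_port = 443
server_secure_port = 8091
[component_container]
[[cc1]]
cc_port = 8089
cc_secure_port = 8092
[[cc2]]
cc_port = 8090
# cc_secure_port = 8093
"""
```

For the sample above, ssl_enabled(SAMPLE) marks the server and cc1 as SSL-enabled and cc2 as not, because its cc_secure_port line is commented out.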

SSL Usage

To use SSL, enable SSL in snapserver.conf and restart the server. The default SSL port for the server is 8091. To connect to the Designer over SSL, use the URL https://machinename.domainname.com:8091/designer. By default, self-signed certificates are installed on the server, so the browser prompts you to confirm whether the certificate should be trusted. Trusting the certificate opens the Designer over HTTPS. The machine name has to be specified when installing the product, because the SSL certificate contains the machine name and the browser validates it.

Note: Accessing SSL URLs through the IP address does not work since SSL certificates do hostname validation.
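The URL rule above can be captured in a small helper that builds the Designer address and refuses IP addresses. This is only an illustrative sketch; the function name designer_ssl_url is hypothetical, and the default port is the 8091 value quoted in this section.

```python
import ipaddress

def designer_ssl_url(hostname, port=8091):
    """Build the HTTPS Designer URL, rejecting IP addresses."""
    try:
        ipaddress.ip_address(hostname)
    except ValueError:
        # Not an IP address, so treat it as a machine name.
        return "https://%s:%d/designer" % (hostname, port)
    raise ValueError("use the machine name, not an IP address; "
                     "certificate hostname validation would fail")
```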

If the client (Designer) connects to the SnapLogic Server through HTTPS, all subsequent operations for that connection are done with SSL. If SSL is not desired for Pipeline execution, SSL can be disabled for the CCs while keeping SSL for the server only.

Disable non-SSL access

To disable non-SSL access (enable only SSL), comment out the server_port property in the main section and the cc_port property in the cc section of snapserver.conf. This changes SnapLogic to use SSL for all interactions. Pipeline execution involves the transfer of data between CC processes. Since the CCs will be running on the same machine, SSL is not required for inter-CC communication. To use SSL for the data server and non-SSL for the CCs, comment out the server_port property in the main section and the cc_secure_port property in the cc section of snapserver.conf.
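The two combinations described above amount to commenting out different pairs of properties. The following sketch shows one way to do that on the configuration text; it is an assumption-laden illustration (it ignores which section a property is in, and the names comment_out, SSL_ONLY, and SSL_SERVER_PLAIN_CC are hypothetical), so review the result before replacing a real snapserver.conf.

```python
import re

# Property pairs taken from the paragraph above.
SSL_ONLY = ["server_port", "cc_port"]
SSL_SERVER_PLAIN_CC = ["server_port", "cc_secure_port"]

def comment_out(conf_text, names):
    """Prefix '# ' to each uncommented 'name = value' line whose name is listed."""
    pattern = re.compile(r"^(\s*)(%s)(\s*=)" % "|".join(map(re.escape, names)))
    return "\n".join(pattern.sub(r"\1# \2\3", line)
                     for line in conf_text.splitlines())
```

For example, comment_out(text, SSL_ONLY) leaves server_secure_port and cc_secure_port active while disabling the non-SSL ports. Restart the servers after applying any such change.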

Prior to 3.5, to disable non-SSL access (enable only SSL), comment out the server_port property in the main section and the cc_port property in the cc section of snapserver.conf. A few updates to the startup scripts are also needed:

On Linux or OSX

Update the snaplogic_include.sh script in INSTALL_VERSION_DIR/bin. Change DATAPORT to the port used for server_secure_port in snapserver.conf. Change SERVER_URI to start with https instead of http. Restart the SnapLogic servers.

Management Console

The browser-based Management Console provides details about the performance of executed Pipelines, whether in a cluster or on individual SnapLogic Servers. The Management Console draws on comprehensive log message access, Pipeline- and Component-level statistics, and analysis of Pipeline run history to enable quick drill-down to the root causes of Pipeline execution failures. Use the Management Console to view data for multiple distributed SnapLogic instances, including:

l A dashboard view of Pipeline executions

l Historic reports of all Pipeline executions

l A detailed drill-down on each Pipeline's execution history, results, and contents

l An overview of your SnapLogic Servers at both the cluster and individual server levels, cluster configuration and server designations, as well as jobs run by the cluster

Registering Servers in the Management Console

When you edited the SnapLogic Server configuration file (snapserver.conf) to specify which servers can be proxied, you enabled the Management Console for your SnapLogic installation. To make full use of the Management Console, register all of your remaining SnapLogic Servers by following these instructions:

l Access the Management Console by entering its URI into your browser's address bar: http://<hostname>:<port>/console. A Login screen displays.

l Log in to your primary server; that is, the SnapLogic instance on the same domain as the console into which you are logging. The Wall page displays, with an overview of recent statistics for your SnapLogic instances. Because you are only logged in to your primary server, the information summarizes only that instance.

l To register additional servers, go to the Setup screen.

l Under the Server URI heading, add each server by entering its hostname, port, username, and password. Click Add Server.

l Enter the SnapLogic instance as host:port, without specifying the protocol (that is, without specifying http://), as follows: some_snaplogic_instance:443. The server you added now appears in the Setup screen.

l Repeat this step for each of your SnapLogic Servers. You can also remove servers from the list by clicking Remove Server to the right of the server's credentials. Clicking on the URI of a server in this page navigates you to the Servers page for detailed server information.
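Because the Setup screen expects host:port without a protocol, it can help to normalize entries copied from a browser before pasting them in. The helper below is a minimal sketch; the name to_host_port is hypothetical and the function is not part of the Management Console.

```python
def to_host_port(entry):
    """Strip any scheme and path so the entry reads host:port."""
    entry = entry.strip()
    if "://" in entry:
        entry = entry.split("://", 1)[1]   # drop http:// or https://
    entry = entry.split("/", 1)[0]         # drop any trailing path
    host, sep, port = entry.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected host:port, got %r" % entry)
    return "%s:%d" % (host, int(port))
```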

Using the Management Console

Access the Management Console by entering its URI into your browser's address bar, as follows: http://<hostname>:<port>/console. Log in with your username and password.


Click the tab of the screen you wish to access. The screens of the Management Console are listed in a horizontal menu panel across the top of the console:

l The Wall: Use this screen as the dashboard view of your most recent Pipeline executions. From it, you can drill down to server, Pipeline, or execution details.

l Events: Use this screen to examine historical information for every Pipeline executed from an event-based approach.

l Pipelines: Use this screen to drill down to the details of a single Pipeline. Access the Pipelines screen from the Wall, History, Events, or Servers screen by clicking on the name of a Pipeline.

l Server Info: Use this screen to monitor server information, configuration, and activity for every server that you registered in the Setup screen. Access the Servers screen by clicking a server name in the Wall, History, Events, or Pipelines screens.

The Wall

The Wall is a dashboard view of your latest SnapLogic runs. It displays each Pipeline executed in the last 48 hours, the server on which it was run, and the result of its last run, which is easily discerned by its color. Amber-colored Pipelines indicate Pipelines that have failed in the past but are now running successfully.

l Time Range: Along the top of The Wall are links to control the time range in display. The Wall initially displays the Pipelines run in the last 48 hours. Click Last Week to display all Pipelines run within the last week, or specify a custom range using the calendars that pop out of the Date Range from and to fields.

l Paging: Use the left and right arrows on the sides of the screen to page through the Pipelines displayed in your specified time range.

l Drilling Down: Each Pipeline is presented in overview format on The Wall. Click the Pipeline's Server name to display the Servers screen where you can examine server details. Click the Pipeline's Last Run time to display the Pipelines screen, where you can examine the log of that run in the bottom panel.

Events

The Events screen also contains historical information for every Pipeline executed, but takes an event-based focus. The top panel of the page displays Pipeline names, the servers on which they executed, their status, their start and end times, the number of records written, and the number of errors encountered. The bottom panel of the Events page initially displays the main server log, but is used to display Pipeline Components when you select a Pipeline to examine.

You can manipulate the event-driven display as follows:

l Time Range: Along the top of the screen are links to control the time range in display. The screen initially displays the Pipelines run in the last 48 hours. Click Last Week to display all Pipelines run within the last week, or specify a custom range using the calendars that pop out of the Date Range from and to fields.



l Run or Validate Pipelines: Select one or more Pipelines using the check boxes in the left column. Click Run or Validate in the upper right corner of the screen to run or validate your selected Pipelines.

l Sorting: Click on any of the column headers (Pipeline, Server, Status, Started, Ended, Records, and Errors) to sort the data by that criterion. Click again to alternate between ascending and descending order.

l Filtering by Event Type: Initially, the Events screen displays All executions, regardless of result. Use the Type column on the left of the screen to filter the display by type of event. By clicking on one of the event types, you can limit the display to Pipelines that ran with Success, Pipeline runs that Failed, Pipelines currently Running, or Pipelines whose runs were Stopped.

l Pipeline Components: Drill down to a single Pipeline's contents by selecting a Pipeline using the checkbox in the left column. The bottom panel of the Events screen displays a list of the Pipeline's Components and their execution status.

l Pipeline Details: Drill down to Pipeline-level historical and content details by clicking on a Pipeline's name to navigate to the Pipelines screen.

l Server Details: Drill down to server details by clicking the Pipeline's Server name to display the Servers screen.

Pipelines

The Pipelines screen displays when you want to drill down to the details of a single Pipeline. This screen accesses all log information related to Pipelines. Access the Pipelines screen from the Wall, History, Events, or Servers screen by clicking on the name of a Pipeline.

The top of the screen identifies the Pipeline-server combination you are examining. The tabular display reports the Pipeline's runs on that server: each run's Status, the time each run Started and Ended, the number of Records processed and the number of Errors encountered. The graphic display includes a chart of data records processed by date. The bottom panel of the screen breaks down the Pipeline's Components and reports on their individual performance for the execution selected in the top panel.

You can manipulate the detailed Pipeline display as follows:

l Time Range: Along the top of the screen are links to control the time range in display. The screen initially displays executions that occurred within the last 48 hours. Click Last Week to display all executions occurring within the last week, or specify a custom range using the calendars that pop out of the Date Range from and to fields.

l Run or Validate the Pipeline: Click Run or Validate in the upper right corner of the screen to run or validate the Pipeline on display.

l Sorting: You can sort the tabular data by clicking on any of the column headers (Status, Started, Ended, Records, and Errors) to specify the sort criterion. Click again to toggle between ascending and descending order.


l Pipeline Components: The bottom panel of the Pipelines screen displays a list of the Pipeline's Components and their execution status.

l Pipeline Log: Click the Logs view in the bottom panel to view the execution log for the server on display.

Server Info

The Server Info screen enables you to monitor server information, configuration, and activity for every server that you registered in the Setup screen, as instructed in the "Registering Servers in the Management Console" section. Access the Servers screen by clicking a server name in the Wall, History, Events, Pipelines, or Setup screens. If you are in a cluster environment, a Cluster pane displays providing a real-time health status of the cluster.

The left panel of the screen displays static Summary information about the server you are monitoring: the edition, version, installation date, license type, and expiration date of SnapLogic it is running, as well as the operating system and architecture of the machine. It also displays information about the Cluster to which this server belongs, if you are using that configuration. The main portion of the screen is devoted to the activity of the server you are examining: which Pipelines it has executed, their Status, time Started and Ended, and the number of Records processed and Errors encountered. The bottom of the panel displays the server Log file content.

You can manipulate the server information display as follows:

l Time Range: Along the top of the screen are links to control the time range in display. The screen initially displays the Pipelines run on the server in the last 48 hours. Click Last Week to display all Pipelines run within the last week, or specify a custom range using the calendars that pop out of the Date Range from and to fields.

l Run or Validate Pipelines: Select one or more Pipelines using the check boxes in the left column. Click Run or Validate in the upper right corner of the screen to run or validate your selected Pipelines.

l Sorting: Click on any of the column headers pertaining to Pipeline executions (Pipeline, Status, Started, Ended, Records, and Errors) to sort the data by that criterion. Click again to toggle between ascending and descending order.

l Pipeline Components: Drill down to a single Pipeline's contents by selecting a Pipeline using the checkbox in the left column. The bottom panel of the Servers screen displays, in its Details view, a list of the Pipeline's Components and their execution status.

l Pipeline Details: Drill down to Pipeline-level historical and content details by clicking on a Pipeline's name to navigate to the Pipelines screen.

l Examine a Different Server: If you have configured your servers to run in a cluster, then the left panel of the Servers page displays a Cluster section. This section identifies the head and worker nodes of the cluster to which this server belongs. Click on any of the other nodes in this section to view details about that server. The Servers screen displays information for the head node of a cluster.



Log Files

Log files provide a behind-the-scenes look into how your SnapLogic system is running.

You can view log files in Designer from the View > Server Logs and View > Client Log commands.

Sandboxing to Protect Your SnapLogic Environment

Sandboxing only applies to Java Components, and enables you to run Components in a restricted environment provided by the JVM. Java Snaps run inside a security sandbox, which limits access to resources, such as network destinations, file system locations, and executing processes. Each Snap declares which resources it intends to use for it to run successfully. This declaration occurs at Snap installation time, enabling you to grant or deny permissions at install time.

Every Snap includes a permissions file that dictates the permissions required by each Component in the Snap. With sandboxing, when you install a Snap, you are prompted to grant or deny each of its requests to use JVM Components. Upon approval of the Snap's declared permissions, the server puts the approved permissions into the repository for use during execution.

By default, sandboxing is already enabled in the Java Component Container. You can disable it by setting the disable_sandbox property to true in the snapserver.conf file. Refer to the "Configuring SnapLogic Server" section for more information on the snapserver.conf file, and to its "Component Container Configuration" topic for details.

Importing and Exporting

The SnapLogic Server supports exporting Component definitions in a portable format that can be subsequently imported by the same or other servers. The export format is portable regardless of operating system, and starting with SnapLogic release 2.0, import compatibility is supported across at least one major release number. For example, release 3.1 supports the import of release 2.0 exports. (Release 2.0 does not support the import of 1.0 exports. For assistance migrating 1.0 definitions, please contact SnapLogic Professional Services.)

Note: You must authenticate to the SnapLogic Server you are connecting to before running the commands documented in this section.

Importing within Designer

As of release 3.5, importing resources can be done by selecting Import from the Server menu.

Importing using SnapAdmin

Perform imports using the SnapAdmin resource import command, as follows:

snapadmin> resource import /Samples/Demo1/Resources/Emp

Imported /Samples/Demo1/Resources/Emp as /Samples/Demo1/Resources/Emp

Imported 1 resource from file 'snaplogic.dmp'


The following is a list of additional import options:

l Specifying the import file: Import reads from the default file, snaplogic.dmp, unless you specify an alternate filename using the -i option, as follows:

snapadmin> resource import -i /home/fred/Leads.dmp /SnapLogic/Tutorial/Exercise_1/Resources/Leads

Imported /SnapLogic/Tutorial/Exercise_1/Resources/Leads

Imported 1 resource from file '/home/fred/Leads.dmp'.

l Importing multiple Components: You can import numerous Components with one command by listing the URI of each Component, by using a wildcard in your command, or by performing a recursive import.

l Importing multiple Components by listing URIs: You can specify a list of Components to import by listing multiple Component URIs, separated by spaces, as follows:

snapadmin> resource import /SnapLogic/Tutorial/Exercise_1/Resources/Leads /SnapLogic/Tutorial/Exercise_1/Resources/Prospects

Imported /SnapLogic/Tutorial/Exercise_1/Resources/Leads

Imported /SnapLogic/Tutorial/Exercise_1/Resources/Prospects

Imported 2 resources from file 'snaplogic.dmp'.

l Importing multiple Components using a wildcard: You can use the asterisk (*) wildcard character to specify all of the items in a particular folder, as follows:

snapadmin> resource import /Samples/Demo1/Resources/*

Imported /Samples/Demo1/Resources/Emp

Imported /Samples/Demo1/Resources/Dept

Imported 2 resources from file 'snaplogic.dmp'.

l Recursive import: To recursively import everything in a location, use the -r option, as follows:

snapadmin> resource import -r /Samples/Demo1

Imported /Samples/Demo1/Pipelines/Emp_Dept_Pipeline

Imported /Samples/Demo1/Resources/Emp

Imported /Samples/Demo1/Resources/Dept

Imported 3 resources from file 'snaplogic.dmp'.

l Overwriting existing Components: By default, import preserves the URI name of the exported Components. If the specified Component already exists in your SnapLogic Server, you are prompted for permission to overwrite the existing Component. You may also specify the -f option to force the overwrite without prompting you first. You may also choose to rename the Component's URI, either by re-rooting the URI relative to a provided path, or by text substitution within the URI.

l Converting absolute URI paths to relative: When importing Pipelines containing absolute URIs that point to a different server, you can direct SnapLogic to automatically convert these URIs to relative paths with the -R option. For import, the -R option converts all contained Component URIs, regardless of the server.



l Renaming Components on import: You can rename Components when you import them. For example, if you do not wish to overwrite an existing Component, you can import it to a new location, using the -s option, as follows:

snapadmin> resource import -s Emp=EmpCopy /Samples/Demo1/Resources/Emp

Imported /Samples/Demo1/Resources/Emp as /Samples/Demo1/Resources/EmpCopy

snapadmin> resource import -s /Demo1/=/MyDemo1/ /Samples/Demo1/Resources/Emp

Imported /Samples/Demo1/Resources/Emp as /Samples/MyDemo1/Resources/Emp
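The effect of the -s option in the two examples above can be pictured as a plain text substitution on the URI. The sketch below reproduces those two results; it assumes that every occurrence of the old text is replaced, which the examples do not confirm, and the function name rename_uri is hypothetical.

```python
def rename_uri(uri, substitution):
    """Apply an OLD=NEW text substitution to a Component URI."""
    old, sep, new = substitution.partition("=")
    if not sep or not old:
        raise ValueError("expected OLD=NEW, got %r" % substitution)
    # Assumption: all occurrences are replaced, not only the first.
    return uri.replace(old, new)
```

Including the surrounding slashes in the substitution, as in /Demo1/=/MyDemo1/, limits the change to a whole path segment.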


Exporting within Designer

As of release 3.5, exporting resources can be done by:

l right-clicking on a resource in the Library and selecting Export Resource.

l selecting Export All from the Server menu.

Exporting using SnapAdmin

Perform exports using the SnapAdmin resource export command, as follows:

snapadmin> connect server http://snaplogic1:443

Success: Connected to server.

Server URL: http://snaplogic1:443

Server version: 3.1.0

Server copyright: Copyright (c) 2007 - 2010, SnapLogic Inc. All rights reserved.

snapadmin> credential set default admin

Password: ******

Success: credential is set

snapadmin> resource export /SnapLogic/Tutorial/Exercise_1/Resources/Leads

Exported /SnapLogic/Tutorial/Exercise_1/Resources/Leads

Exported 1 resource to file 'snaplogic.dmp'.

The following is a list of additional export options:

l Specifying the export file: Export writes to the default file, snaplogic.dmp, unless you specify an alternate filename using the -o option, as follows:

snapadmin> resource export -o /home/fred/Leads.dmp /SnapLogic/Tutorial/Exercise_1/Resources/Leads


Exported 1 resource to file '/home/fred/Leads.dmp'.

l Exporting multiple Components: You can export numerous Components with one command by listing the URI of each Component, by using a wildcard in your command, or by performing a recursive export.


l Exporting multiple Components by listing URIs: You can specify a list of Components to export by listing multiple Component URIs, separated by spaces, as follows:

snapadmin> resource export /SnapLogic/Tutorial/Exercise_1/Resources/Leads /SnapLogic/Tutorial/Exercise_1/Resources/Prospects


Exported /SnapLogic/Tutorial/Exercise_1/Resources/Prospects

Exported 2 resources to file 'snaplogic.dmp'.

l Exporting multiple Components using a wildcard: You can use the asterisk (*) wildcard character to specify all of the items in a particular folder, as follows:

snapadmin> resource export /SnapLogic/Tutorial/Exercise_1/Resources/*




l Recursive export: To recursively export everything in a location, use the -r option, as follows:

snapadmin> resource export -r /SnapLogic/Tutorial/Exercise_1


Exported /SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects



l Converting absolute URI paths to relative: By default, export preserves the URIs of Components contained in a Pipeline. These URIs may be absolute URIs that point explicitly to the originating server. The originating server to which they point may be the same server containing the Pipeline, or a different SnapLogic server. Use the -R flag to make these URIs relative when they are written to the dump file.

l Exporting Pipeline dependencies: When exporting Pipelines, you normally need to export the objects included in the Pipeline. SnapLogic export can do this automatically for any Components that are co-located on the same server as the Pipeline you are exporting. To export dependencies, use the -d flag as follows:

snapadmin> resource export -d /SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects


Exported /SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects


Exported 1 resources and 2 dependencies to file 'snaplogic.dmp'.
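The difference between listing a URI, using a wildcard, and exporting recursively can be illustrated by the set of URIs each form selects. The following sketch models that selection over the tutorial URIs used in this section; it is an interpretation of the behavior described above, not SnapAdmin's actual matching code, and the function name select is hypothetical.

```python
import fnmatch

URIS = [
    "/SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects",
    "/SnapLogic/Tutorial/Exercise_1/Resources/Leads",
    "/SnapLogic/Tutorial/Exercise_1/Resources/Prospects",
]

def select(uris, pattern, recursive=False):
    """Return the URIs a single export argument would cover."""
    if recursive:
        # -r: everything at or below the given location.
        prefix = pattern.rstrip("/") + "/"
        return [u for u in uris if u == pattern or u.startswith(prefix)]
    # Without -r, '*' is assumed to match items in one folder only,
    # so the folder depth must equal that of the pattern.
    return [u for u in uris
            if fnmatch.fnmatchcase(u, pattern)
            and u.count("/") == pattern.count("/")]
```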

Snapshots

Under the Server menu there is a new option named Snapshots. This allows users to create a dump file of all the Pipelines on a specific SnapLogic Server. Importing the dump file back on the Server will restore all the Pipelines to the state they were in when the dump was created.



The difference between the Snapshots and Export All functionalities is that, while the contents of the dump file generated by either action are the same, Snapshots creates the dump file on the server while Export All provides the dump file as a local download. This means Snapshots saves time and is less of a hassle when it comes to recovery/rollback.

Running SnapLogic Behind a Proxy

SnapLogic supports running behind a proxy server. This section provides an example of configuring the nginx web server to be used as a front end for the SnapLogic processes.

Configuring the Proxy

A running SnapLogic installation has at least three server processes: The SnapLogic Server (the main server) and one or more Component Container processes. Consequently, you must configure the proxy for all three. If the SnapLogic Server is configured with both SSL and non-SSL, then both the SSL and non-SSL ports have to be proxied for each of the three servers.

In the case of nginx, configuring proxy forwarding is straightforward. Add a server configuration for each process to your nginx.conf file, as follows:

...

http {

...

underscores_in_headers on;

server {

# For the SnapLogic main server process

listen 80; # The port on which nginx should listen

# for requests for this server (can be

# anything you want, of course)

location / {

# Replace the example URI with your actual Server URI.

proxy_pass http://myhost.mycompany.com:443;

proxy_redirect off;

proxy_buffering off;

}

}

server {

# One of these entries for each CC (component container) process

# If multiple CCs are running on the same host, please configure

# a different port for each of these entries in the nginx

# config file.

listen 81;

location / {

# Replace the example URI with your actual CC URI.

proxy_pass http://myhost.mycompany.com:8089;

proxy_redirect off;

proxy_buffering off;

}


}

...

}

Please note that it is currently NOT possible to separate requests to the various SnapLogic servers by path (simply using separate 'location' directives within a single 'server' definition). Instead, a different front-end port must be chosen in the proxy for each SnapLogic server and the location must be '/', as shown in the preceding config file example.

The underscores_in_headers property is required to ensure that all headers used by SnapLogic are forwarded by nginx. 'proxy_buffering' has to be disabled to ensure that output view reads do not fail with network read errors.
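When several Component Containers are involved, writing one server block per process by hand becomes repetitive. The following sketch generates the http section from a list of front-end ports and back-end URIs, following the rules described above (one front-end port per process, location '/', buffering off). The function name nginx_http_section is hypothetical, and the generated text should be merged into nginx.conf by hand.

```python
SERVER_BLOCK = """\
    server {{
        listen {listen_port};
        location / {{
            proxy_pass {backend_uri};
            proxy_redirect off;
            proxy_buffering off;
        }}
    }}
"""

def nginx_http_section(backends):
    """backends: list of (front-end port, back-end URI), one per SnapLogic process."""
    ports = [port for port, _ in backends]
    if len(set(ports)) != len(ports):
        raise ValueError("each SnapLogic process needs its own front-end port")
    blocks = "".join(SERVER_BLOCK.format(listen_port=port, backend_uri=uri)
                     for port, uri in backends)
    return "http {\n    underscores_in_headers on;\n" + blocks + "}\n"
```

For the configuration shown in this section, the input would be [(80, "http://myhost.mycompany.com:443"), (81, "http://myhost.mycompany.com:8089")].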

Configuring SnapLogic

In order to ensure correct operation of the SnapLogic Server processes, you must edit the SnapLogic configuration file. Add proxy-URI definitions for each server process, as follows:

server_hostname = myhost.mycompany.com

server_port = 443

server_proxy_uri = http://myproxy.mycompany.com:80

Likewise, in the CC section of the config file, add proxy-URI definitions for each CC:

cc_hostname = myhost.mycompany.com

cc_port = 8089

cc_proxy_uri = http://myproxy.mycompany.com:81

The server_proxy_uri and cc_proxy_uri parameters define the complete URI of the proxy front end for that process. Your proxy may run on a different host than your SnapLogic servers, or it may run on the same host, in which case the hostnames in the examples above would be the same.

SSL Proxy Configuration

In order to secure the network connection between clients and the server of a SnapLogic installation, the main server and CC processes are run behind a proxy front end. The front end then takes care of all SSL encryption.

Extending the example described in the "Running SnapLogic Behind a Proxy" section, the following example shows how this can be done using nginx as a proxy front end:

...

http {

...

server {

# SSL proxy for the SnapLogic Server

listen 443;

ssl on;

ssl_certificate /etc/ssl/certs/myssl.crt;

ssl_certificate_key /etc/ssl/private/myssl.key;

location / {

...



}

}

server {

# SSL proxy for the SnapLogic CC

listen 444;

ssl on;

ssl_certificate /etc/ssl/certs/myssl.crt;

ssl_certificate_key /etc/ssl/private/myssl.key;

location / {

...

}

}

...

}

...

Note: Replace the example paths to the CRT and KEY files with the correct paths to your CRT and KEY files.

When using the proxy front end to take care of SSL, SSL support can be disabled in the SnapLogic configuration. Comment out the server_secure_port and cc_secure_port properties in the snapserver.conf. The proxy URI has to be modified to express the new scheme and, possibly, port, as follows:

server_proxy_uri = https://myproxy.mycompany.com:443

...

cc_proxy_uri = https://myproxy.mycompany.com:444
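The change from the non-SSL proxy URIs to the SSL ones is a matter of swapping the scheme and port. As a small illustration, the sketch below derives the new values with Python's standard urllib.parse module; the function name to_ssl_proxy_uri is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def to_ssl_proxy_uri(proxy_uri, ssl_port):
    """Return proxy_uri with the https scheme and the given front-end SSL port."""
    parts = urlsplit(proxy_uri)
    netloc = "%s:%d" % (parts.hostname, ssl_port)
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
```

For example, to_ssl_proxy_uri("http://myproxy.mycompany.com:81", 444) yields the cc_proxy_uri value shown above.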

Self-signed Certificates

If self-signed certificates are being used, the certificate used by the proxy should be added to the trust store used by the Java CC. The keytool command, which is part of the JRE, can be used for this. For example:

keytool -import -alias proxy -file /etc/ssl/certs/myssl.crt -keystore /opt/snaplogic/3.4.1.19583PE/pkg/java/jre1.6.0_20/lib/security/cacerts

If the SnapLogic Server is being accessed through a Java SnAPI program, the proxy's certificate has to be added to the trust store of the JRE running the Java client process.

Data Types and Output Representation Formats

Pipeline Data Types

The following data types are the field-level data types supported by SnapLogic, which are used by input and output records within Pipelines:

l String: This is a Unicode string.

l Number: This is a decimal number, with a precision of 28 digits.


l Datetime: This is a combined date and time data type, which stores time with microsecond resolution.
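To get a feel for what these limits mean in practice, the standard Python decimal and datetime modules offer close analogues. The sketch below is only an illustration of 28-digit precision and microsecond resolution; this guide does not state which library SnapLogic uses internally, and the sample date is arbitrary.

```python
from datetime import datetime
from decimal import Context, Decimal

# A decimal context limited to 28 significant digits.
CTX = Context(prec=28)

# 1/3 is rounded to 28 significant digits.
third = CTX.divide(Decimal(1), Decimal(3))

# A timestamp carrying a microsecond component (the last argument).
stamp = datetime(2013, 10, 1, 12, 30, 45, 123456)
```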

General Data Types

The following data types are used in Component properties:

l SnapLogic Identifier: This is an ASCII string which follows the same rules as Python identifiers. A SnapLogic identifier is a string beginning with a letter or underscore, followed by a sequence of letters, digits, or underscores.

l Filename: Most file read components support a URL specification for their input using the format 'scheme://input_path'. The valid schemes and their meanings are:

l file: Specifies a file local to the SnapLogic Server. input_path is the local filename. If input_path begins with a /, the filename is treated as absolute; otherwise, the path is relative to the server root.

l http: Specifies data is read via HTTP. input_path is the HTTP URL.

l https: Specifies data is read via secure HTTP. input_path is the HTTPS URL.

l ftp: Specifies data is read from an ftp data source.
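The identifier rule and the 'scheme://input_path' convention described above can be checked with a few lines of Python. This is a minimal sketch, not SnapLogic code: the function names are hypothetical, and the sketch treats everything after '://' as input_path, which is an assumption for the http, https, and ftp schemes.

```python
import re

# A letter or underscore, followed by letters, digits, or underscores (ASCII only).
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# The schemes listed in this section.
KNOWN_SCHEMES = ("file", "http", "https", "ftp")


def is_snaplogic_identifier(name):
    """Return True if name satisfies the identifier rule stated above."""
    return IDENTIFIER_RE.fullmatch(name) is not None


def split_filename(spec):
    """Split 'scheme://input_path' into (scheme, input_path)."""
    scheme, separator, input_path = spec.partition("://")
    if not separator or scheme not in KNOWN_SCHEMES:
        raise ValueError("unsupported filename specification: %r" % (spec,))
    return scheme, input_path


def is_absolute_file(spec):
    """Return True for a file: specification whose input_path begins with '/'."""
    scheme, input_path = split_filename(spec)
    return scheme == "file" and input_path.startswith("/")
```

For instance, 'file://demo/data/emp.csv' is relative to the server root, while 'file:///tmp/emp.csv' is absolute because its input_path is '/tmp/emp.csv'.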

Output Data Representation Formats

When SnapLogic sends data over the network, between Components and Pipelines, or to the applications requesting output data, it converts the data stream into one of several formats. Requestors can specify which format they prefer. SnapLogic currently supports the following data representations:

l ASN.1: information encoded in Abstract Syntax Notation One format. Mime type: 'application/x-snap-asn1'

l JSON: information encoded in JavaScript Object Notation format. Mime type: 'application/json'. (SnapLogic Version 1.0 had a JSON Component for explicitly returning output records from Pipelines to the application. This Component is obsolete as of Version 2.0.)

l HTML: text in the form of Hypertext Markup Language. Mime type: 'text/html'

l CSV: text with comma-separated values; introduced in version 2.0.3. Mime type: 'text/csv'

l TSV: text with tab-separated values; introduced in version 2.0.3. Mime type: 'text/tab-separated-values'

Applications can request a specific format by specifying the sn.content_type parameter in the HTTP request. For example, a web browser can read the output from Output1 of a Component whose URI is http://server:443/SnapLogic/Tutorial/Exercise_1/Resources/Leads using the following /feed URI:

http://server:443/feed/SnapLogic/Tutorial/Exercise_1/Resources/Leads/Output1?sn.content_type=text/html
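The /feed URI pattern shown above can be assembled programmatically. The following sketch uses only the Python standard library; the function name feed_uri is hypothetical, and the construction rule (prefix the Component path with /feed, append the output view name, add sn.content_type as a query parameter) is inferred from the single example in this section rather than from a formal specification.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def feed_uri(component_uri, output_view, content_type):
    """Build a /feed URI for one output view of a Component.

    component_uri: the Component URI, e.g. http://server:443/SnapLogic/...
    output_view:   the output view name, e.g. 'Output1'
    content_type:  one of the Mime types listed in this section
    """
    parts = urlsplit(component_uri)
    path = "/feed" + parts.path.rstrip("/") + "/" + output_view
    # Keep '/' unescaped so the Mime type reads as in the example above.
    query = urlencode({"sn.content_type": content_type}, safe="/")
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


leads_html = feed_uri(
    "http://server:443/SnapLogic/Tutorial/Exercise_1/Resources/Leads",
    "Output1",
    "text/html",
)
```

Swapping the last argument for 'text/csv' or 'application/json' requests the same output in a different representation.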



SnapAdmin Utility

SnapAdmin is a simple command line interface that provides basic SnapLogic Server administration functions. By default, SnapAdmin starts an interactive command session. Alternatively, by using the -c filename option, you can give SnapAdmin a file containing commands to execute.

Starting SnapAdmin

The SnapAdmin utility is located in the /bin directory of the SnapLogic Server installation. To start SnapAdmin:

l Linux: Execute snapadmin.sh at the command prompt.

SnapAdmin Commands

The SnapAdmin command library consists of the following commands.

acl

l acl addrule: Add a new acl rule. Admin credentials must be used.

l acl create: Create a new acl. Admin credentials must be used.

l acl delete: Delete an existing acl from the username/password file. Admin credentials must be used.

l acl delrule: Delete an existing acl rule. Using acl get <aclname>, note the <index> of the specific rule in order to specify it here. Admin credentials must be used.

l acl get: Print info for a particular acl in the username/password file. SnapAdmin must be connected to a server. Admin credentials must be used.

l acl list: Print the list of acls in the username/password file. SnapAdmin must be connected to a server. Admin credentials must be used.

bye

l bye: Exit the SnapAdmin utility.

cluster

l cluster set: Sets the specified parameter to the new value. The only parameter allowed currently is jobs_per_worker. Example: cluster set jobs_per_worker 5

l cluster syncsnaps: Reads all the Snap .zip files on the head node and installs each one on the given worker node URI.

l cluster workers: Prints all the worker nodes configured on the head node. Returns an error if not connected to a cluster configured server.


component

l component list: List the Components available to this server.

l component print <name>: Print the information about a specific Component. The <name> parameter must be a full Component identifier name, such as snaplogic.components.CsvRead. Refer to the Component Reference Guide for a comprehensive list of Component names.

connect

l connect server <url>: Connect to a server by specifying its URL.

l connect switch <connection-index>: Set the active server connection. Use this command when you are connected to multiple SnapLogic Servers. Use the connect list command to view each connection and its index.

l connect list: List the current server connections and their index. You can use the index to switch the active connection with the connect switch command.

credential

l credential set {default | current | <connection-index>} <user>: Set the credentials used for requests to the server. Use default to specify the default credentials used for any connection that does not have its own credentials set. Specifying current sets credentials only for the current connection. Lastly, a connection index sets the credentials for a specific connection. Use the connect list command to view each connection and its index.

The command prompts for a password, and the resulting user/password pair is used for all server requests for the relevant connections. Note that if you are using a command script (for example, snapadmin -c command_file) and you don't want the prompt, you can place the password after the user credential, as follows:

connect server http://localhost:443
credential set default admin yourpassword
resource import -r -R -f -i ./demo.dmp /path/to/...
...more commands that follow...

Note: You may need to edit your ACL configurations for this command to work. For example, you may need to disable ALLOW GROUP known PERMISSION read write execute and enable ALLOW GROUP public PERMISSION read write execute.

l credential reset {default | current | <connection-index>}: Reset the credentials for a server connection. Use default to specify the default credentials used for any connection that does not have its own credentials set. Specifying current resets credentials only for the current connection. Lastly, a connection index resets the credentials for a specific connection. Use the connect list command to view each connection and its index.
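A command script of the kind described under credential set can be generated from another program before it is handed to SnapAdmin with the -c option. The sketch below only writes the file and does not invoke SnapAdmin; the helper name write_command_file is hypothetical, and the one-command-per-line layout is assumed from the example script in this section.

```python
import os
import tempfile


def write_command_file(commands, directory=None):
    """Write SnapAdmin commands, one per line, to a new private file.

    tempfile.mkstemp creates the file readable only by the current user,
    which matters here because the script contains a password.
    Returns the path of the file that was written.
    """
    fd, path = tempfile.mkstemp(suffix=".snapadmin", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        for command in commands:
            handle.write(command + "\n")
    return path


commands = [
    "connect server http://localhost:443",
    "credential set default admin yourpassword",  # placeholder password
    "resource list",
]
command_file = write_command_file(commands)
# The file could then be passed to SnapAdmin, for example:
#   snapadmin.sh -c <command_file>
```

Because the password is stored in clear text, the file should be deleted once the script has run.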



disconnect

l disconnect: Disconnect the current connection.

exit

l exit: Exit the SnapAdmin utility.

group

l group adduser <groupname> <username>: Add a user to a group. Admin credentials must be used.

l group create <groupname>: Create a new group. Admin credentials must be used.

l group delete <groupname>: Delete an existing group from the username/password file. Admin credentials must be used.

l group deluser <groupname> <username>: Delete a user from a group. Admin credentials must be used.

l group get <groupname>: Get group. Admin credentials must be used.

l group list: Print the list of groups in the username/password file. SnapAdmin must be connected to a server. Admin credentials must be used.

help

l help: Print-related help.

log

l log search [-l limit] [-o offset] pipeline_rid [regex_search_string]: Displays the Pipeline logs for the given Pipeline runtime ID. You can find the Pipeline runtime ID by using the pipeline list command.

If the optional regex search string is specified, only the log lines matching the search string are displayed. If the search string consists of several words, enclose it in quotes. Lines are shown in reverse chronological order; that is, the most recent records are returned first.

Options

l -l LIMIT, --limit=LIMIT: how many log lines to show. Default is 0 (show all).

l -o OFFSET, --offset=OFFSET: how many log lines to skip at the beginning.
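The filtering behavior of log search can be pictured with a small stand-alone function. This is an illustration of the semantics described above, not SnapAdmin's implementation: the function name search_log is hypothetical, and applying the offset after the regex filter, counted from the most recent line, is an assumption.

```python
import re


def search_log(lines, pattern=None, limit=0, offset=0):
    """Return log lines newest first, optionally filtered and windowed.

    lines:   log lines in chronological order (oldest first)
    pattern: optional regular expression; only matching lines are kept
    limit:   maximum number of lines to return; 0 means show all
    offset:  number of lines to skip at the beginning of the result
    """
    newest_first = list(reversed(lines))
    if pattern is not None:
        regex = re.compile(pattern)
        newest_first = [line for line in newest_first if regex.search(line)]
    selected = newest_first[offset:]
    if limit > 0:
        selected = selected[:limit]
    return selected


sample = ["a start", "b error 1", "c ok", "d error 2"]
errors = search_log(sample, "error")
```

With the sample data, errors lists the two matching lines with the most recent one first.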

pipeline

l pipeline list: List the Pipeline execution history.

l pipeline print <pipeline-uri>: Print the history of a specific Pipeline identified by URI.


l pipeline start <pipeline-uri>: Start the execution of a Pipeline identified by URI. The command returns immediately, but the Pipeline continues to run asynchronously until completed or forcibly stopped.

l pipeline stop <pipeline-id>: Stop the execution of a Pipeline identified by its runtime ID. You can get the runtime ID through the pipeline list command.

repository

l repository create -t sqlite <filename>: Create a new SQLite repository database at the given path. The path must either point to a nonexistent file to create, or to a SQLite database that does not already contain a SnapLogic repository.

l repository create -t mysql [-H <host>] [-P <port>] -u <username> [-p <password>] <database>: Create a new MySQL repository. The optional host and port specify the hostname and port of the MySQL server. The username is the MySQL user. If the password option is not supplied, the command prompts the user for the password. The last argument specifies the name of the database in the MySQL server in which to create the repository.

l repository destroy -t sqlite <filename>: Destroys the SQLite repository at the given path. The database file is left at the location, and the user can recreate it with the repository create command.

l repository destroy -t mysql [-H <host>] [-P <port>] -u <username> [-p <password>] <database>: Destroys the MySQL repository. The optional host and port specify the hostname and port of the MySQL server. The username is the MySQL user. If the password option is not supplied, the command prompts the user for the password. The last argument specifies the name of the database in the MySQL server containing the repository. The database is left in the MySQL server and can be recreated with the repository create command.

l repository destroy -c <filename>: Destroys the repository configured in the config file. It is the same as running one of the above repository destroy commands, but the necessary options are parsed directly from the config file.

l repository encrypt [options]: Encrypts the repository database used by SnapLogic. Supported only if the repository type is set to SQLite. The user needs to be connected to the server as an admin user to be able to run this command.

l Options:

-n, --no_save Do not save the new repository encryption password on the server. If this option is used, the repository password has to be entered every time the server is restarted. If this option is used, the user needs to ensure that the repository password is saved in a secure manner. If the repository password is lost, the repository contents CANNOT be recovered. The repository contents can be backed up using export before using this command.

-p PASSWORD, --password=PASSWORD The password to use to encrypt the repository database. Use empty password to disable encryption.

-c CIPHER, --cipher=CIPHER The cipher to use to encrypt the repository database. The supported options are aes-128-ecb, aes-128-cbc, aes-128-cfb, aes-256-ecb, aes-256-cbc and aes-256-cfb. The default is aes-256-cbc.

l repository password [options]: Changes the encryption password for the repository database used by SnapLogic. Supported only if the repository type is set to SQLite. The user needs to be connected to the server as an admin user to be able to run this command.

l Options:

-n, --no_save Do not save the new repository encryption password on the server. If this option is used, the repository password has to be entered every time the server is restarted. If this option is used, the user needs to ensure that the repository password is saved in a secure manner. If the repository password is lost, the repository contents CANNOT be recovered. The repository contents can be backed up using export before using this command.

-p PASSWORD, --password=PASSWORD The password to use to encrypt the repository database. Use empty password to disable encryption. Using this option takes you out of the --no_save mode.

-c CIPHER, --cipher=CIPHER The cipher to use to encrypt the repository database. The supported options are aes-128-ecb, aes-128-cbc, aes-128-cfb, aes-256-ecb, aes-256-cbc and aes-256-cfb. The default is aes-256-cbc.

-o CURRENT_PASSWORD, --current_password=CURRENT_PASSWORD The current password to use to decrypt the repository database.

l repository upgrade {-t <db-type> | -c <config-file>} [options] ...: Upgrades the structure of the repository database. This command is used by the SnapLogic installer. In general, it should not be run manually unless you are sure of what you are doing.

l repository wait_on_upgrade: Wait for the startup repository upgrade to complete.

resource

l resource delete [-f] {<uri-list> | *}: Delete Components from the repository. If supplied, uri-list is a list of URIs of Components to delete, separated by whitespace. If * is supplied instead of uri-list, all Components are deleted. The -f flag forces the delete without requiring user interaction. Otherwise, the command confirms whether to proceed for each Component.

l resource export: Refer to the "Import and Export" section for details.

l resource import: Refer to the "Import and Export" section for details.

l resource list: List the Components in the repository.

l resource print <uri>: Print Component details, including its properties and input/output definitions. In the case of Pipelines, the command also displays the list of contained Components.


l resource upgrade [options] {<uripatterns> | *}: Upgrade Components from prior Component versions to the current versions.

l Options: -r, --recursive: descend recursively into matched subfolders

server

l server shutdown: Shut down the SnapLogic Server gracefully. The command waits for all currently running jobs to finish. No new job executions are permitted while the server is shutting down.

shell

l shell: Execute an operating system shell command.

source

l source: Execute SnapAdmin commands contained in a file.

users

l users create <username> <password>: Add a user.

l users delete [-f] <username>: Delete an existing user from the password file. The -f flag forces deletion, which means that the user is deleted without a prompt.

l users list: Print a list of users in the username/password file.

l users makepasswordfile [-f] <password_file> [<admin_password>]: Creates a new password file containing only the admin user. Specifying the name of an already existing password file overwrites the existing file, and all user information is lost. An admin user is always created. If the admin password is not specified on the command line, the command prompts for it.

l users setpassword <username> <password>: Modify the password of an existing user.

verbose

l verbose lasterr: Print additional information about the last error, if available.

l verbose off: Turn off verbose mode. Succinct error reporting.

l verbose on: Verbose mode. Print additional error information.

worker

l worker add: Add a worker node to the cluster configuration. Example: worker add http://worker.mydomain.com:443

l worker delete: Delete a worker node from the cluster configuration. Example: worker delete http://worker.mydomain.com:443

l worker list: List all the worker nodes configured on the head node. Returns an error if not connected to a cluster-configured server.



Sidekick

The SnapLogic Sidekick is a locally installed service that lets you access data both on site behind a firewall and in the cloud. The Sidekick can be installed anywhere on your network that has access to the data that you want to use. The advantage of using Sidekick instead of an on-premises installation of the Server is a lighter install and footprint on the ground. Sidekick's lighter footprint means that all Pipelines and associated metadata are stored in the cloud instance of SnapLogic Server.

Sidekick is essentially a Java Component Container (Java CC). A Server running in the cloud stores the metadata and controls the Sidekick. When a Pipeline needs to run, the server tells the Sidekick to run it; actual Java code runs in the Sidekick on the ground.

Snap output streams are made available to the Server in such a way that there is no difference whether the Java CC is running on the ground (as Sidekick) or in the cloud. Snap logs are stored with the Sidekick, but they can be shown in the Designer. If communication between the Server and the Sidekick is interrupted, the Server can no longer reach the Sidekick, so Pipelines are not executed. Scheduled Pipelines have an option to be executed if they miss their window, so those Pipelines can still be run.

For more information, see the SnapLogic Sidekick Guide.


Appendix: Completing Tasks in SnAPI

This section provides instructions for completing tasks using the SnapLogic Application Program Interface.

Creating and Configuring a Component in SnAPI

Creating a Component with SnAPI consists of these basic steps:

l Create an instance of a Component.

l Set the Component properties.

l Define the inputs and/or outputs.

l Validate and save the Component to the server.

Refer to the following example for Component creation in SnAPI.

from snaplogic import snapi

# URL of the SnapLogic data server to which we connect
server_uri = 'http://localhost:443'

# Create an instance of a Component
emp_reader_resource = snapi.create_resource_object(server_uri,
    'snaplogic.components.FixedWidthRead')

# Set the properties
# (input_def, the field specification, is assumed to be defined earlier)
emp_reader_resource.set_property_value('filename',
    'file://demo/date/emp_fixed.txt')
emp_reader_resource.set_property_value('Field specs', input_def)
emp_reader_resource.set_general_info('description',
    'Read employee records from emp_fixed.txt')
# etc. etc.

# Define the output view
output_fields = (
    ('empno', 'string', 'Employee Number'),
    ('ename', 'string', 'Employee Name'),
    ('job', 'string', 'Job'),
    ('mgr', 'string', 'Manager'),
    ('hiredate', 'string', 'Hiredate'),
    ('sal', 'string', 'Salary'),
    ('comm', 'string', 'Commission'),
    ('deptno', 'string', 'Department')
)
emp_reader_resource.add_record_output_view("output1", output_fields,
    "output view")

# Validate and save it to the server
validate_error = emp_reader_resource.validate()
emp_reader_resource.save(server_uri + '/SnapLogic/Demo/Resources/Emp')

Component Suggestions in SnAPI

When using SnAPI, the Suggest method returns a complete Component definition containing the suggested changes. You can inspect and compare the original and suggested resdefs to determine whether to use the suggested resdef.

Example: Component Suggestions in SnAPI

from snaplogic import snapi

# Create an instance of a component
server_uri = 'http://localhost:443'
emp_reader_resource = snapi.create_resource_object(server_uri,
    'snaplogic.components.CsvRead')
emp_reader_resource.set_property_value('filename',
    'file://demo/data/emp_csv.txt')

# Run suggest/auto-fill and get a new resource image back with
# suggested changes.
emp_suggest_resource = emp_reader_resource.suggest()

print "Original Resource"
print emp_reader_resource
print "Suggested Resource"
print emp_suggest_resource

#...

Configuring Pass-Through Fields with SnAPI

You can specify pass-through in SnAPI using the set_output_view_pass_through() call.

Example: Configuring Pass-Through Fields with SnAPI

# This resource needs only empno, deptno and sal. The rest are passthrough.
# (The variable is named raise_res because raise is a reserved word in Python.)
raise_res.add_record_input_view("Input1", (
    ('empno', 'number', 'Employee number'),
    ('deptno', 'number', 'Department'),
    ('sal', 'number', 'Salary')), "Input View")
raise_res.add_record_output_view("Output1", (
    ('empno', 'number', 'Employee number'),
    ('deptno', 'number', 'Department'),
    ('sal', 'number', 'Salary')), "Output View")
raise_res.set_output_view_pass_through("Output1", ["Input1"])

Specifying Parameter Values at Runtime in SnAPI

When using the SnAPI interface, you can specify parameter values and pass them into the snapi.exec_resource call.

import time
from snaplogic import snapi

SERVER_URI = 'http://myhost.example.com:443'
PIPELINE_NAME = '/SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects'

# You can specify values for any of the parameters declared by the Pipeline
parameters = {'LEADS' : 'file://tutorial/data/leads.csv',
              'INPUT_DELIMITER' : ',' }
handle = snapi.exec_resource(SERVER_URI + PIPELINE_NAME,
                             params=parameters)

print "Waiting for completion of pipeline execution..."
while not handle.get_current_status(False).is_finished():
    time.sleep(1)
    print "Polling for completion of pipeline..."
print "All done!"



Validating Components in SnAPI

Refer to the following example to validate a Component in SnAPI. For more information, refer to the online SnapLogic SnAPI information on creating a Component in Python and creating a Component in Java.

Example: Validating a Component in SnAPI

from snaplogic import snapi

# Create an instance of a component
server_uri = 'http://localhost:443'
emp_reader_resource = snapi.create_resource_object(server_uri,
    'snaplogic.components.FixedWidthRead')

# Define the output view
output_fields = (
    ('empno', 'string', 'Employee Number'),
    ('ename', 'string', 'Employee Name'),
    ('job', 'string', 'Job'),
    ('mgr', 'string', 'Manager'),
    ('hiredate', 'string', 'Hiredate'),
    ('sal', 'string', 'Salary'),
    ('comm', 'string', 'Commission'),
    ('deptno', 'string', 'Department')
)
emp_reader_resource.add_record_output_view("output1", output_fields,
    "output view")

# Validate and save it to the server
validate_error = emp_reader_resource.validate()

# We forgot to set the filename property, so validate
# should return an error indicating this.
if validate_error:
    print validate_error
else:
    emp_reader_resource.save(server_uri +
        '/SnapLogic/Demo/Resources/Emp')

Creating a Pipeline in SnAPI

Using SnAPI, you can create Pipelines by adding Components to your Pipeline object, and then linking the Components together using simple SnAPI calls. The following example is an excerpt from the SnAPI files used to load the examples featured in the tutorials:



.

.

.

p = snapi.create_resource_object(server, snapi.PIPELINE)

# Add the resources to the pipeline.

p.add(leads_res_def, "Leads")

p.add(prospects_res_def, "Prospects")

# Specify the field linkage

field_links = (('First', 'First_Name'),

('Last', 'Last_Name'),

('Address', 'Address'),

('City', 'City'),

('State', 'State'),

('Zip', 'Zip_Code'),

('Phone_w', 'Work_Phone'))

p.link_views('Leads','output1', 'Prospects','input1', field_links)

# Save the pipeline

p.save(p_uri)

.

.

.


Creating a Data Service Pipeline in SnAPI

Follow this example to assign outputs in Data Service Pipelines in SnAPI:



# p_resdef is my pipeline object to which resource C2 has been added.

p_resdef.assign_output_view("C2", "C2_Output1", "P_Output1")

Mapping Components in SnAPI

To link fields with SnAPI, you must define a list of field pairings and add the list to the Pipeline. The following is an example of a field linking script in SnAPI:

# Define the field name mapping
emp_record_link = (
    ('empno', 'empno'),
    ('ename', 'ename'),
    ('job', 'job'),
    ('mgr', 'mgr'),
    ('hiredate', 'hiredate'),
    ('sal', 'sal'),
    ('comm', 'comm'),
    ('deptno', 'deptno')
)

# Define the resource linkage and the mapping used for each link
p.link_views('Emp Read', 'output1', 'Emp Write', 'input1', emp_record_link)

Executing Pipelines in SnAPI

You can use the SnAPI interface to launch Pipelines. This gives you more flexibility to start and monitor Pipelines from Python and Java scripts. First, configure your environment using the appropriate snaplogic_env script. You can then run Pipelines using snapi.exec_resource(), pass arguments at runtime, and monitor the execution. Refer to the following example of SnAPI Pipeline execution:

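#!/usr/bin/env python
import time
from snaplogic import snapi

SERVER_URI = 'http://myhost.example.com:443'
PIPELINE_NAME = '/SnapLogic/Tutorial/Exercise_1/Leads_to_Prospects'

# You can specify values for any of the parameters declared by the Pipeline
parameters = {'LEADS' : 'file://tutorial/data/leads.csv',
              'INPUT_DELIMITER' : ',' }
handle = snapi.exec_resource(SERVER_URI + PIPELINE_NAME,
                             params=parameters)

print "Waiting for completion of pipeline execution..."
while not handle.get_current_status(False).is_finished():
    time.sleep(1)
    print "Polling for completion of pipeline..."
print "All done!"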

To view the execution logs of Pipelines started this way, open the Management Console by entering its URI into your browser address bar: http://<hostname>:<port>/console.

Data Tracing in SnAPI

Enable data tracing during Pipeline execution in SnAPI by specifying the parameter sn.trace_data. The parameter value can be set to:

l input: Use this setting to force all Component inputs in the Pipeline to dump their data into trace files.

l output: Use this setting to force all Component outputs in the Pipeline to dump their data into trace files.

l input,output: Use this setting to force all Component inputs and outputs to be traced.

The following is an example of the data tracing command to trace both inputs and outputs:


from snaplogic import snapi

SERVER_URI = 'http://snaplogic1.snaplogic.org:443'
TUTORIAL_1_PIPE = '/SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects'

pipe_resdef = snapi.get_resource_object(SERVER_URI + TUTORIAL_1_PIPE)
handle = pipe_resdef.execute(None,
                             None,
                             {"sn.trace_data": "input,output"},
                             "Pipeline with tracing enabled")

# ...
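The sn.trace_data value can be assembled and checked before it is passed to execute(). The following sketch is independent of SnAPI; the helper name trace_params is hypothetical, and returning an empty dictionary when tracing is not requested is an assumption of this example.

```python
# The two trace targets accepted by sn.trace_data, per the list above.
VALID_TRACE_TARGETS = ("input", "output")


def trace_params(*targets):
    """Build the parameter dictionary that enables data tracing.

    targets: any of 'input' and 'output'; no targets means no tracing.
    """
    unknown = [target for target in targets if target not in VALID_TRACE_TARGETS]
    if unknown:
        raise ValueError("unsupported trace target(s): %s" % ", ".join(unknown))
    # Emit targets in a fixed order so the value is 'input,output', not 'output,input'.
    ordered = [target for target in VALID_TRACE_TARGETS if target in targets]
    if not ordered:
        return {}
    return {"sn.trace_data": ",".join(ordered)}


both = trace_params("output", "input")
```

The resulting dictionary has the same shape as the third argument of the execute() call in the example above.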



Appendix: SnAPI and PostgreSQL

General

This page describes the usage of SnAPI from within PostgreSQL. This has been tested with the following:

l Ubuntu 10.04

l PostgreSQL 8.4 (installed with apt-get install postgresql-plpython-8.4)

Prerequisites

PL/Python

This assumes PL/Python has been installed and CREATE LANGUAGE plpythonu; has been run on the appropriate databases. For more details, see http://www.postgresonline.com/journal/index.php?/archives/99-Quick-Intro-to-PLPython.html.

First, check the version of Python PL/Python will be using. It has to be 2.5 or 2.6. This can be done by executing the following at the psql prompt:

CREATE FUNCTION get_plpython_ver() RETURNS text

AS $$

import sys

return str(sys.version)

$$ LANGUAGE plpythonu;

SELECT get_plpython_ver();

If the version returned from the query above is not 2.5 or 2.6, you may have to rebuild lib/postgresql/plpython.so.

Log Level

Sample snippets below assume SET client_min_messages='LOG'; has been run at the psql prompt. For more information, see the "41.3. Database Access" chapter of the PostgreSQL manual (http://www.postgresql.org/docs/8.4/interactive/plpython-database.html).

Triggers

This example presents integration with triggers as a useful case for change data capture: new data is written to a CSV file as it gets inserted into a table. Triggers make a good use case, but because triggers use functions, ordinary functions can be used to integrate with SnapLogic as well. For the purposes of this example, we assume that every time a row is inserted into a table, we want to capture the changed data and write it out to the CSV file. A more advanced scenario could set up the trigger on both insert and update, and use the DBUpsert component to propagate the change to another database. We will go through the following steps:

Table to Capture Changes on

For the purposes of this example, we will create the table whose changes we will propagate to the CSV file, as follows:

CREATE TABLE snap_cdc_example (

city VARCHAR(100),

state VARCHAR(2));

Create CSVWrite Resource

We will create a CSVWrite resource at /postgres_cdc URI that has a single input view, Input1, with two String fields, city and state (corresponding to the above table).

Create Function

Create a function that will be used as a trigger:

CREATE OR REPLACE FUNCTION write_to_csv() RETURNS trigger

AS $$

import sys

# Path to the SnapLogic Python code base

sys.path.append('/opt/snaplogic/2.2.2PE')

from snaplogic.snapi_base import resdef

from snaplogic import snapi

from snaplogic.rp import SNAP_ASN1_CONTENT_TYPE

# URL for the CsvWrite resource created above

write = snapi.get_resource_object('http://localhost:443/postgres_cdc')

inputs = {'Input1' : SNAP_ASN1_CONTENT_TYPE}

h = write.execute(inputs, None)

h.dictionary_format_record = True

inp = h.inputs['Input1']

# In this example, we are focusing on responding to just INSERTs,

# Thus, we are only interested in the 'new' value of the TD object.

# however, a trigger may act not just on INSERT but also on

# UPDATE or DELETE. In such cases we would be interested in other

# parts of TD object. For more information, see

# http://www.postgresql.org/docs/8.4/interactive/plpython-trigger.html

rec = TD['new']

plpy.log("Writing to SnapLogic: " + str(rec))

inp.write_record(TD['new'])

inp.close()

h.wait()

$$ LANGUAGE plpythonu;


Create the Trigger

Create a trigger to run the above function, write_to_csv(), on every insert:

DROP TRIGGER IF EXISTS cdc_trig ON snap_cdc_example;

CREATE TRIGGER cdc_trig AFTER INSERT ON snap_cdc_example
FOR EACH ROW EXECUTE PROCEDURE write_to_csv();

Try it

Now, every insert into the snap_cdc_example table will result in writing the inserted data to the SnapLogic CSV Write resource created above. Try it for yourself. Execute

INSERT INTO snap_cdc_example VALUES ('New York', 'NY');

You should then see that the line New York,NY has been written to the postgres.csv file.



Appendix: ACLs

The access control lists, stored in the snapaccess.conf file, let you configure which users or groups have which roles or privileges. Information on configuring ACLs can be found at "Understanding ACLs".

The following is the list of predefined ACLs in the snapaccess.conf file.

l Location Name: /

l Description: Root directory. Access is denied by default, except for nonrecursive read access to the root itself, which is required by the landing page (http://<hostname>:<port>).

l Default Permissions:

l DENY GROUP public

l ALLOW GROUP public PERMISSION read NONRECURSIVE

l ALLOW GROUP known PERMISSION read write execute

l Location Name: /__snap__

l Description: Allow all Snap handlers, most of which protect themselves further with admin restrictions or secret keys. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP known PERMISSION read write execute

l Location Name: /__snap__/__static__

l Description: Allow static handlers, needed by anonymous users to login. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP public PERMISSION read

l Location Name: /__snap__/__static__/protected

l Description: Only admin can access it. Required by the product and usually should not be changed by the user.

l Default Permissions:

l DENY GROUP public

l DENY GROUP known

l Location Name: /__snap__/auth

l Description: Allow auth entry point. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP public PERMISSION read NONRECURSIVE


l Location Name: /__snap__/uri_check

l Description: Allow uri checks. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP public PERMISSION read write execute NONRECURSIVE

l Location Name: /__snap__/auth/acl/list

l Description: Allow auth entry point. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP known PERMISSION read NONRECURSIVE

l Location Name: /__snap__/cluster/status

l Description: Allow cluster status check. Required by the product and usually should not be changed by the user.


l Location Name: /__snap__/runtime/status

l Description: Allow runtime status check. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP public PERMISSION read write execute

l Location Name: /__snap__/auth/check

l Description: Allow auth login point. Required by the product and usually should not be changed by the user.


l Location Name: /__snap__/cc/register

l Description: Allow cc register uri, which is further protected by tokens. Required by the product and usually should not be changed by the user.

l Default Permissions: ALLOW GROUP public PERMISSION write

l Location Name: /__snap__/meta/info

l Description: Allow info page, which is needed by snapadmin. Required by the product and usually should not be changed by the user.




l Location Name: /__snap__/resources/upgrade/status

l Description: Allow polling of repository upgrade status. Required by the product and usually should not be changed by the user.


l Location Name: /__snap__/self_check

l Description: Used for determining whether resource references are remote or local. Required by the product and usually should not be changed by the user.


l Location Name: /console

l Description: Runs the SnapLogic Management Console.


l Location Name: /crossdomain.xml

l Description: Serve /crossdomain.xml


l Location Name: /designer

l Description: Runs the SnapLogic Designer.


l Location Name: /extensions

l Description: By default, Snaps will instantiate resources here.

l Default Permissions: ALLOW GROUP public PERMISSION read write execute

l Location Name: /favicon.ico

l Description: Access to the SnapLogic icon (from the root directory) for the browser.

l Default Permissions: ALLOW GROUP public PERMISSION read

l Location Name: /__snap__/cc_proxy/__snap__/runtime/

l Description: Allow proxying for querying runtime info


l Location Name: /robots.txt

l Description: Serve /robots.txt



l Location Name: /public

l Description: Public folder


l Location Name: /SnapLogic

l Description: Allows access to the SnapLogic directory.


l Location Name: /SnapLogic/Tutorial

l Description: Allows access to the SnapLogic Tutorial directory.

l Default Permissions: ALLOW GROUP public PERMISSION read execute
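The default permission strings listed in this appendix share a visible shape: an action (ALLOW or DENY), the keyword GROUP and a group name, an optional PERMISSION list, and an optional NONRECURSIVE flag. The Python sketch below parses only that shape, as it appears in the entries above. It is an illustration, not the SnapLogic Server's own parser, and the names `AclRule` and `parse_rule` are hypothetical. Two points are assumptions rather than statements from this guide: that a rule without NONRECURSIVE applies recursively, and that a rule with no PERMISSION clause is represented by an empty permission set.

```python
from dataclasses import dataclass

ACTIONS = {"ALLOW", "DENY"}
PERMISSIONS = {"read", "write", "execute"}


@dataclass(frozen=True)
class AclRule:
    """One parsed rule. Hypothetical structure, for illustration only."""
    action: str
    group: str
    permissions: frozenset
    recursive: bool


def parse_rule(text):
    """Parse a rule such as 'ALLOW GROUP public PERMISSION read NONRECURSIVE'."""
    tokens = text.split()
    if (len(tokens) < 3
            or tokens[0].upper() not in ACTIONS
            or tokens[1].upper() != "GROUP"):
        raise ValueError(f"unrecognized rule: {text!r}")
    action, group, rest = tokens[0].upper(), tokens[2], tokens[3:]

    # Assumption: a rule is recursive unless it ends with NONRECURSIVE.
    recursive = True
    if rest and rest[-1].upper() == "NONRECURSIVE":
        recursive = False
        rest = rest[:-1]

    # Assumption: a missing PERMISSION clause is kept as an empty set.
    permissions = frozenset()
    if rest:
        if rest[0].upper() != "PERMISSION" or len(rest) < 2:
            raise ValueError(f"unrecognized rule: {text!r}")
        permissions = frozenset(p.lower() for p in rest[1:])
        unknown = permissions - PERMISSIONS
        if unknown:
            raise ValueError(f"unknown permissions: {sorted(unknown)}")
    return AclRule(action, group, permissions, recursive)
```

For example, `parse_rule("ALLOW GROUP public PERMISSION read NONRECURSIVE")` returns a rule for the group `public` with the single permission `read` and `recursive` set to `False`. Refer to the ACL syntax section of this guide for the authoritative rule format.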



Glossary

C

canvas: The canvas is your main workspace in Designer. Create data integration solutions in the canvas by sketching, connecting, and then configuring Components and Pipelines. Drag generic Component templates from the Foundry or configured Components from the Library in the sidebar to the canvas. Connect these objects to each other, configure, and execute them all from the canvas. Refer to the section about the canvas for greater detail.

Component: A Component is an object used to perform a simple subtask, such as read, write, or act on data. Strung together, Components are the building blocks of Pipelines, or data flows. Components are generally classified as Connectors (Components that read or write data) and Operators (Components that perform an action, such as a join or filter, on data). Basic templates for Components are included in your SnapLogic installation (refer to the Component Reference for the list of Component templates that ship as part of SnapLogic), and reside in the Designer's Foundry panel. These generic templates, once configured, become configured Components that are stored in the SnapLogic Server repository, and can be found in the Designer's Library panel.

Component Container: A process that runs a Component.

Component template: An unconfigured Component in the Foundry. Component templates are generic objects used to perform a simple subtask, such as read, write, or act on data. They are included in the SnapLogic installation, and reside in the SnapLogic Designer Foundry panel. To create a Component, you must configure a Component template to your specific needs. To create a Pipeline, you must configure one or more Component templates, and connect them to data sources, other Components, or data targets.

Connectivity Snaps: Snaps that add connectivity to an application or data source.

D

Designer: The SnapLogic graphical user interface where you can visually create data integration scenarios.

F

Foundry: The bottom panel in the Designer's sidebar that stores the building blocks from which you can build projects.

I

input: Inputs are pieces of data that a Component consumes for the purpose of performing functions on them, or (when supported) passing them through to another downstream Component. Not all Components accept input data.

L

Library: The top panel in the Designer's sidebar that stores the projects you are building: your Pipelines and configured Components.

link: A link defines a mapping between the fields of any two Components.

M

Management Console: The SnapLogic browser-based management console provides details about the performance of executed Pipelines, whether in a cluster or on individual SnapLogic servers. The management console draws on comprehensive log message access, Pipeline- and Component-level statistics, and analysis of Pipeline run history to enable quick drill-down to the root causes of any Pipeline execution failures.

mapping: The identification of data relationships between different entities. In SnapLogic, field linking is used to specify how fields or columns from one Component map to those of a downstream Component.

O

output: Outputs are pieces of data a Component produces that can serve as inputs to downstream Components. Not all Components produce outputs.

P

parameter: Parameters are variables that can be used for run-time substitution in the properties of Components or Pipelines. You can use parameters to avoid hard-coding property values that are likely to change. Using parameters enables you to use a single Component or a single Pipeline for multiple purposes. Parameters defined at the Pipeline level must be mapped to corresponding parameters at the Component level within the Pipeline. Refer to the Component Parameters section for more information on parameters.

pass-through: The pass-through capability allows Components to accept fields that are not specified as inputs, and pass these fields directly as their outputs. When you link two Components, only those fields specified as inputs of the downstream Component must be linked. All the remaining unlinked outputs of the upstream Component are passed through to the downstream Component's output. With pass-through, the inputs of Components in a Pipeline need not be explicitly designed to handle all of the incoming fields from upstream Components. Component inputs only need to specify fields that the Component requires for its computations. This reduces the field linking to the absolute minimum.



Pipeline: A collection of one or more Components linked together to orchestrate a flow of data between end points.
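The pass-through behavior defined in the pass-through entry above can be pictured with a short Python sketch. It is not SnapLogic code and does not use SnAPI. The function `run_component`, the helper `full_name`, and the sample field names are hypothetical, and the sketch assumes that fields consumed as declared inputs do not reappear in the output unless the Component produces them again.

```python
def run_component(record, declared_inputs, compute):
    """Apply `compute` to the declared inputs and pass other fields through.

    Hypothetical illustration of pass-through, not a SnapLogic API.
    """
    missing = [name for name in declared_inputs if name not in record]
    if missing:
        raise KeyError(f"declared inputs are not linked: {missing}")

    # Only the declared inputs are handed to the Component's computation.
    consumed = {name: record[name] for name in declared_inputs}
    produced = compute(consumed)

    # Every upstream field that was not declared as an input passes through.
    passed_through = {
        name: value for name, value in record.items()
        if name not in declared_inputs
    }
    return {**passed_through, **produced}


def full_name(fields):
    """Example computation that needs only the 'first' and 'last' fields."""
    return {"full_name": f"{fields['first']} {fields['last']}"}


upstream = {"first": "Ada", "last": "Lovelace", "email": "ada@example.com"}
result = run_component(upstream, ["first", "last"], full_name)
```

Here only `first` and `last` had to be linked. The `email` field was never declared as an input, yet it still reaches the output alongside the computed `full_name` field.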

S

Scheduler: The SnapLogic utility for scheduling automatic, periodic executions of a Pipeline. The SnapLogic Server runs the Pipeline unattended at the dates and times you specify, and using any parameter values you specify.

slider: A feature in the SnapLogic Designer that enables you to navigate through Component, connection, and Pipeline properties below the canvas, while still viewing the corresponding object in the canvas above.

Snap: An object that performs a complete, and usually high-level, function. A Snap can be a collection of Components that are functionally related, such as the Salesforce Snap, which contains Components for inserting contacts into and deleting contacts from Salesforce. A Snap can also consist of a single low-level building block, such as a filter. A Snap can comprise a complete Pipeline packaged as a simple Component to insert an item. The definition of "Snap" is therefore a recursive one: A complex Snap can contain multiple Pipelines; a simple Snap can stand alone or participate in a Pipeline. You can purchase a Snap in SnapStore and install it into the Foundry. After you configure its Component templates to your specific sources and targets, the Snap resides in the Library.

SnapAdmin: A simple command line interface that provides basic SnapLogic Server administration functions. Refer to the "SnapAdmin Utility" section for more information.

SnAPI: SnapLogic Application Program Interface is the programmatic interface to the SnapLogic Server that enables you to create and use Components and Pipelines from your application or development environment. SnAPI is ideal for users who do not need the visual interface of the SnapLogic Designer or for those who wish to create Components and Pipelines by way of code generation.

SnapStore: SnapLogic's online marketplace, SnapStore, enables developers, SIs, and ISVs to develop and monetize custom Snaps, extending the SnapLogic Server's connectivity and functionality.

Index

A

ACLs

in snapaccess.conf 81

syntax 82

understanding 81

admin

password 65

administration

buffer configuration 79

memory configuration 78

overview 63

architecture 9

authentication

Active Directory 84

overview 80

permissions 80

snapaccess.conf 81

user credentials 80

B

buffer

configuration 79

C

canvas

overview 20

tabs 20

toolbar 20

clustering

head node 72

worker node 73

Component

configuring 26

creating in Designer 25

creating in SnAPI 109

creation overview 25

database related 29

definition 11

overview 25

parameters 36

optional 36

required 36

syntax 37

pass-through fields

configuring in Designer 33

configuring in SnAPI 110

overview 32

properties 27

suggestions

Designer 29

overview 28

SnAPI 110

Components

exporting using SnapAdmin 96

exporting within Designer 96

general information 26

importing using SnapAdmin 94

importing within Designer 94

inputs 28



linking fields in Designer 43

linking fields in SnAPI 114

outputs 27

previewing 28

validating

Designer 38

management console 38

overview 37

SnAPI 112

concepts

overview 9

credentials

http requests 85

D

data folder

Library 19

data representations 101

data service

access feed from browser 46

access feed from Designer 45

creating Pipeline in Designer 44

data tracing

configuration file 55

designer 54

trace files 55

data types; output representation formats 100

Designer

launching 13

menu bar 14

overview 13

E

error handling 47

F

failover

server configuration 77

Sidekick configuration 77

feed 46

Field Linker 43

files

uploading from Library 19

Foundry

categories 18

Foundry view 18

overview 17

Recently Used view 18

toolbar 18

view tabs 18

G

groups

adding users 64

creating 64


known 80

public 80

H

high availability 77

http requests

credentials 85


L

Library

data folder 19

folders 19

overview 18

toolbar 22

view tabs 19

M

management console


Events 91

Pipelines 92

registering servers 90

Servers 93

using 90

Wall 91

memory

configuration 78

O

output fields

default values 35

providing NULL default value 35

replacing NULL with default value 35

transforming field values 36

P

parameters

specifying at runtime in Designer 37

specifying at runtime in SnAPI 111

pass-through fields

configuring in Designer 33

configuring in SnAPI 110

overview 32

passwords

admin 65

changing users 65

permissions

execute 80

read 80

write 80

Pipeline

concurrent execution 79

configuring 39

dataservice


definition 11

properties 40

Pipelines

aborting 51

creating in Designer 39


data types 100

definition 39

executing 49

executing from HTTP 51

executing from management console 51

executing from SnapAdmin 51

executing in Designer 50

executing in SnAPI 114

execution

scheduling 52

general information 40

inputs 41



outputs 41

parameters 41

running 49

updating 42

proxy server

configuring for SnapLogic 98

self-signed certificates 100

R

repository

encrypt 105

password 106

requests

anonymous 85

http and credentials 85

S

sandboxing 94

Scheduler

email notifications 53

properties 52

security

overview 87

server configuration 77

servers

clustering 71

head node 72

head node failover 74

job distribution 73

worker nodes 72

settings 15

Sidekick 108

in Library 19

Sidekick configuration 77

slider

commands 22

Component mode 22

Field Linker mode 22

overview 21

Pipeline mode 23

Smart Link 43

Snap

definition 11

SnapAdmin

commands 102

overview 102

SnAPI

configure 13

configuring pass-through fields 110

overview 13

procedures 109

SnapLogic

about 7

Component servers 10

design process 11

user interfaces 10

SnapLogic Server

architecture 9

Component Container parameters 68

configuration file (snapserv.conf) 66

configuring 65

configuring for proxy server 99

data cache parameters 69

general settings 66



notification 69

repository configuration 69

starting 65

stopping 65

SnapLogic Sidekick 108

Snaps

accessing 57

configuring 59

defined 57

developing 59

installing 58

installing in a cluster 58

overview 57

SnapStore

defined 57

SSL

disabling non-SSL access 89

enabling 88

proxy configuration 99

usage 89

T

token-based authentication

server setting 68

U

user interfaces

overview 13

usergroups


users

changing password 65

creating 63

