EMC® Atmos™

Version 2.3.0

CAS Programmer’s Guide

P/N 302-002-037
REV 01

Copyright © 2008 - 2015 EMC Corporation. All rights reserved. Published in the USA.

Published May, 2015

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners.

For the most up-to-date regulatory document for your product line, go to the technical documentation and advisories section on the EMC online support website.

CONTENTS

Preface

Chapter 1 Introduction
    Introducing EMC Atmos
    Introducing Atmos CAS
    Key concepts
        Fixed content
        CAS
        Content Address
        Pool
        Access credentials
        C-Clip
        Tags and tag attributes
        Blob
        CDF
        Stream
    Basic calling sequences
        Code examples

Chapter 2 Streaming Data
    Understanding streams
        Stream types
        Buffer types
    File streams
    Buffer streams
    Generic streams
        Using the generic stream API
        Blob write flow via generic stream
        Blob read flow via generic stream
    Factors that affect streaming data

Chapter 3 Application Authentication and Authorization
    Access security model
        Capabilities
    Application authentication and authorization
        Creating PEA files

Chapter 4 Best Practices
    Programmer’s notes
        Time formats
    Pool functions
        Load balancing
        Connection pooling
        Set pool options
        Profile clips
        Blob-slicing
    Clip functions
        Traverse C-Clips
        Check modifications before FPClip_RawRead()
        C-Clip IDs in canonical format
    Tag functions
        Writing blobs
        Reading blobs
    Stream functions
        Streaming recommendations
    Query functions
        Performing a query
        Resuming interrupted queries (incremental queries)
        Other query information
    Buffers
        Memory stream
    Error handling
        Parameter errors
    Logging
        Use case description
    Unicode and wide character support
    Synchronization in multithreaded programs
        Increasing performance
    Retry function
        Changing the retry settings
    Content Address calculation
        Content Address collision avoidance
    Performance
        Fixed overhead and object size
        Policy definition
        Embedding data in the CDF
        Aggregating data in a single blob
        Threads
        Application server
        Pluggable SDK MD5 module
        SHA256 algorithm
    IBM z/OS notes
        Porting overview
        Branch interfaces not supported
        No reserved ports for IBM z/OS client
        Memory optimization runtime recommendations for XPLINK
        Multiprogramming support
        LRECL, fixed block datasets, and trailing NULLs
        IBM C/C++
        Running an Atmos CAS IBM z/OS application under TSO
        Dataset name specification
        Atmos CAS IBM z/OS API memory utilization notes
        APF-authorization support
        Invoking the Atmos CAS API from assembler

Index

PREFACE

As part of an effort to improve its product lines, EMC periodically releases revisions of its software and hardware. Therefore, some functions described in this document might not be supported by all versions of the software or hardware currently in use. The product release notes provide the most up-to-date information on product features.

Contact your EMC representative if a product does not function properly or does not function as described in this document.

Note: This document was accurate at publication time. New versions of this document might be released on the EMC online support website. Check the EMC online support website to ensure that you are using the latest version of this document.

Audience

This guide is for experienced programmers who are developing applications that interface with an Atmos CAS-enabled node using the Atmos CAS API. This guide is a companion supplement to the EMC Atmos CAS API Reference Guide. It provides feature descriptions and best practice recommendations.

Readers of this document are expected to be familiar with the following topics:

• Content Addressed Storage (CAS)

• EMC Atmos architecture and features

Related documentation

The EMC Atmos documentation set includes the following titles:

• EMC Atmos Release Notes

• EMC Atmos Administrator’s Guide

• EMC Atmos Programmer’s Guide

• EMC Atmos System Management API Guide

• EMC Atmos Security Configuration Guide

• EMC Atmos CAS Programmer’s Guide

• EMC Atmos CAS API Reference Guide

• EMC Atmos Installable File System (IFS) Installation and Upgrade Guide

• EMC Atmos online help

• EMC Atmos Series Open Source License and Copyright Information

• EMC Atmos Series Open Source License and Copyright Information for GPLv3

Conventions used in this document

EMC uses the following conventions for special notices:

Note: A note presents information that is important, but not hazard-related.

IMPORTANT

An important notice contains information essential to software or hardware operation.

Typographical conventions

EMC uses the following type style conventions in this document:

Normal
    Used in running (nonprocedural) text for:
    • Names of interface elements, such as names of windows, dialog boxes, buttons, fields, and menus
    • Names of resources, attributes, pools, Boolean expressions, buttons, DQL statements, keywords, clauses, environment variables, functions, and utilities
    • URLs, pathnames, filenames, directory names, computer names, links, groups, service keys, file systems, and notifications

Bold
    Used in running (nonprocedural) text for names of commands, daemons, options, programs, processes, services, applications, utilities, kernels, notifications, system calls, and man pages
    Used in procedures for:
    • Names of interface elements, such as names of windows, dialog boxes, buttons, fields, and menus
    • What the user specifically selects, clicks, presses, or types

Italic
    Used in all text (including procedures) for:
    • Full titles of publications referenced in text
    • Emphasis, for example, a new term
    • Variables

Courier
    Used for:
    • System output, such as an error message or script
    • URLs, complete paths, filenames, prompts, and syntax when shown outside of running text

Courier bold
    Used for specific user input, such as commands

Courier italic
    Used in procedures for:
    • Variables on the command line
    • User input variables

< >    Angle brackets enclose parameter or variable values supplied by the user
[ ]    Square brackets enclose optional values
|      Vertical bar indicates alternate selections; the bar means “or”
{ }    Braces enclose content that the user must specify, such as x or y or z
...    Ellipses indicate nonessential information omitted from the example

Where to get help

EMC support, product, and licensing information can be obtained as follows:

Product information - For documentation, release notes, software updates, or information about EMC products, go to EMC Online Support at:

https://support.emc.com

Technical support - Go to EMC Online Support and click Service Center. You will see several options for contacting EMC Technical Support. Note that to open a service request, you must have a valid support agreement. Contact your EMC sales representative for details about obtaining a valid support agreement or with questions about your account.

Your comments

Your suggestions will help us continue to improve the accuracy, organization, and overall quality of the user publications. Send your opinions of this document to:

[email protected]

CHAPTER 1 Introduction

This chapter provides a brief overview of EMC Atmos CAS and its key concepts.

• Introducing EMC Atmos
• Introducing Atmos CAS
• Key concepts
• Basic calling sequences

Introducing EMC Atmos

Atmos is a multi-petabyte platform for information storage and distribution. It combines massive scalability with automated data placement to efficiently deliver content worldwide. The Atmos platform provides the following features for cloud storage:

• Policy-based information management
• Multi-tenancy
• REST and SOAP web service APIs
• CAS API
• A browser-based administration tool
• Universal namespace
• Advanced auto-managing and auto-healing capabilities

Figure 1 shows the front view of an EMC Atmos unit.

Figure 1 An EMC Atmos unit

Figure 2 shows the front view of an EMC Atmos unit with the panels removed and the tray pulled out.

Figure 2 An EMC Atmos unit with panels removed

Introducing Atmos CAS

Atmos CAS is an access method for Atmos specifically designed to store and provide fast and easy access to fixed content (information in its final form). Fixed content is the fastest growing category of information that is traditionally stored offline after a certain period of time. Atmos CAS enables a cost-effective means of delivering this fixed information online, while improving information availability.

Atmos CAS as the access method guarantees longevity and authenticity of content stored. In addition, Atmos CAS leverages policy-based data placement to ensure the availability of data.

Applications can use the CAS API (also known as the EMC Centera SDK) to interact with Atmos CAS. Using Atmos CAS requires CAS API version 3.1 or higher.

In this guide, the EMC Centera SDK terms listed in Table 1 map to the Atmos equivalents shown. Each pair of terms describes the same concept, so the terms can be used interchangeably.

Table 1 Mapping of terms between Centera and Atmos

Centera term    Atmos equivalent
Pool            A combination of subtenant ID and user ID (UID)
Profile         Subtenant UID
Cluster         CAS-enabled access nodes
PEA file        PEA file or the shared secret of the UID
C-Clip ID       Atmos object ID

Note: In Atmos, each PEA file can only contain a single set of access credentials.

Key concepts

As an Atmos CAS application developer, you must be familiar with the following concepts, which make up the core Atmos CAS components:

• “Fixed content”
• “CAS”
• “Content Address”
• “Pool”
• “Access credentials”
• “C-Clip”
• “Tags and tag attributes”
• “Blob”
• “CDF”
• “Stream”

Fixed content

A fixed content object is a piece of unstructured application data that exists in its final form. Fixed content is unchanged data with long-term value. A few examples of fixed content data include medical images like CT scans and digital X-rays, audio/video files, archived email, and business documents of all kinds.

CAS

Content Addressed Storage (CAS) is an object-oriented, location-independent approach for archiving large quantities of fixed content data. Atmos CAS provides valet-style archiving of data objects, and guarantees the integrity of all archived objects regardless of whether they are stored for 1 year or 100+ years.

Content Address

A Content Address is a data object’s unique identifier. A Content Address is the claim ticket that is returned to the client application when an object is stored to the archive. The Content Address must then be presented back to the archive when a request to retrieve the content is made.

Each Content Address is generated by applying a naming-scheme algorithm to the binary representation of a fixed content object. Because the Content Address is derived from the object, it also can be used to guarantee the integrity of the object. Atmos CAS continuously monitors the integrity of all stored objects, and the SDK automatically checks the integrity of each object retrieved from the archive.
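The idea that an address derived from an object's bytes can also serve as an integrity check can be illustrated with a toy example. The sketch below is not the naming scheme that Atmos CAS uses; it substitutes the 64-bit FNV-1a hash purely for illustration, and the names toy_content_address and toy_verify are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a content-address function: 64-bit FNV-1a over the
 * object's bytes. This is NOT the algorithm used by Atmos CAS. */
uint64_t toy_content_address(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint64_t hash = 14695981039346656037ULL;   /* FNV-1a offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 1099511628211ULL;              /* FNV-1a prime */
    }
    return hash;
}

/* Integrity check: recompute the address from the retrieved bytes and
 * compare it with the address returned when the object was stored. */
int toy_verify(const unsigned char *data, size_t len, uint64_t stored_address)
{
    return toy_content_address(data, len) == stored_address;
}
```

Two objects with identical bytes yield the same address, and changing even one byte of a retrieved object makes toy_verify return 0 for the original address.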

Pool

A pool is a logical entity representing an archive. In Atmos, the pool is represented by a combination of subtenant ID and UID.

A pool is opened for I/O operations by invoking the FPPool_Open API and providing a series of node IP addresses and access credentials for the subtenant and UID combination, which the pool represents. The SDK creates a connection to the first available access node running the CAS service on that list. The pool object also auto-discovers all available access nodes that are running the CAS service. Write and read operations can occur on any CAS-enabled node.

Access credentials

Applications need to provide access credentials when opening a pool for I/O operations. An access credential consists of a name and a shared secret. In Atmos, the name is a combination of subtenant ID and UID joined together by a colon. The shared secret is a base64-encoded shared secret for the UID. You can retrieve the subtenant ID, UID, and shared secret by logging into the Atmos management GUI as a Tenant Admin.

Alternatively, you can use the Atmos-CAS PEA File Generator to generate a PEA file that contains the access credentials (subtenant ID, UID, and shared secret of the UID).

You can download the Atmos-CAS PEA File Generator from Powerlink following this path for registered users: Support > Software Downloads and Licensing > Downloads A-B > Atmos Software and Virtual Edition (VE).

The following connection string is used to access Atmos through Atmos CAS:

<Atmos CAS IP>?<peafile>

or

<Atmos CAS IP>?name=<subtenant ID>:<uid>,secret=<shared secret>

Note: The EMC Atmos Administrator’s Guide provides the procedure on using the Atmos-CAS PEA File Generator.
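As an illustration of the two connection string forms above, the following sketch assembles them with snprintf. The function names, the rejection of empty fields, and all sample values are assumptions made for this example; they are not part of the CAS API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "<ip>?name=<subtenant ID>:<uid>,secret=<shared secret>" into out.
 * Returns 0 on success, -1 if a field is empty or the result does not fit. */
int build_connection_string(char *out, size_t out_size,
                            const char *cas_ip, const char *subtenant_id,
                            const char *uid, const char *shared_secret)
{
    assert(out && cas_ip && subtenant_id && uid && shared_secret);
    if (strlen(cas_ip) == 0 || strlen(subtenant_id) == 0 || strlen(uid) == 0)
        return -1;
    int n = snprintf(out, out_size, "%s?name=%s:%s,secret=%s",
                     cas_ip, subtenant_id, uid, shared_secret);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Builds the PEA-file form "<ip>?<peafile>" into out.
 * Returns 0 on success, -1 if a field is empty or the result does not fit. */
int build_pea_connection_string(char *out, size_t out_size,
                                const char *cas_ip, const char *pea_path)
{
    assert(out && cas_ip && pea_path);
    if (strlen(cas_ip) == 0 || strlen(pea_path) == 0)
        return -1;
    int n = snprintf(out, out_size, "%s?%s", cas_ip, pea_path);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For example, calling build_connection_string with the made-up values "10.0.0.1", "subtenant1", "app1", and "c2VjcmV0" produces 10.0.0.1?name=subtenant1:app1,secret=c2VjcmV0, which an application would then pass to FPPool_Open.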

C-Clip

A C-Clip™ is an application object that represents a bundle of fixed content data and metadata. A C-Clip is created by calling FPClip_Create. This operation creates a new C-Clip in memory so that it may be populated by the application. The application can associate any number of fixed content objects with a C-Clip, and can utilize metadata to describe the associated content. Each C-Clip also has a set of standard metadata values such as the application-specified name and the system-specified creation date and modification date.


Tags and tag attributes

Tags and tag attributes allow client applications to create self-describing objects in nodes. A C-Clip can have any number of tags associated with it. Each tag is created by calling FPTag_Create. A tag can have a name, a series of attributes, and a single data object associated with it. An attribute is a name-value pair that is associated with the tag.

Blob

A blob is a series of bytes that represents a fixed content object stored in Atmos. The format and structure of the blob is wholly owned by the client application, and neither the SDK nor the CAS-enabled node attempts to interpret the binary object.

Each tag in a C-Clip may have a single logical blob associated with it. Blobs are streamed to the access node via the FPTag_BlobWrite operation. After a blob has been successfully streamed to the access node, the associated Content Address is stored to the tag in the C-Clip. This Content Address represents a link between the two objects—the C-Clip tag and the blob.

CDF

The C-Clip Descriptor File (CDF) is the physical object stored to Atmos nodes that represents an application-defined C-Clip. The CDF contains all of the metadata specified by the client application, and the links to all associated blobs. The CDF-to-blob relationship can have a cardinality of one-to-one, one-to-many, or many-to-many.

A C-Clip is serialized to the CAS access node, in the form of a CDF, when the application calls FPClip_Write. FPClip_Write will return a Content Address to the client application, and this value must be used to retrieve the object. An existing C-Clip may be opened by calling FPClip_Open and providing the Content Address for the desired CDF.
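The relationship between a C-Clip, its tags, their attributes, and the linked blobs can be pictured as a small data model. The sketch below is a hypothetical in-memory model written for this explanation only: the SDK exposes opaque references rather than these structs, and the fixed capacities are a simplification (a real C-Clip can hold any number of tags).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ATTRS 8   /* fixed capacities keep the sketch short */
#define MAX_TAGS  8

struct attr { const char *name; const char *value; };   /* name-value pair */

struct tag {
    const char *name;
    struct attr attrs[MAX_ATTRS];
    int attr_count;
    const char *blob_address;   /* NULL until a blob has been linked */
};

struct clip {
    const char *name;
    struct tag tags[MAX_TAGS];
    int tag_count;
};

/* Adds an empty tag to the clip; returns it, or NULL if the clip is full. */
struct tag *clip_add_tag(struct clip *c, const char *name)
{
    assert(c != NULL && name != NULL);
    if (c->tag_count == MAX_TAGS)
        return NULL;
    struct tag *t = &c->tags[c->tag_count++];
    *t = (struct tag){ .name = name };
    return t;
}

/* Appends a name-value attribute; returns 0, or -1 if the tag is full. */
int tag_set_attr(struct tag *t, const char *name, const char *value)
{
    assert(t != NULL && name != NULL && value != NULL);
    if (t->attr_count == MAX_ATTRS)
        return -1;
    t->attrs[t->attr_count++] = (struct attr){ name, value };
    return 0;
}

/* Looks up an attribute value by name; returns NULL if it is not set. */
const char *tag_get_attr(const struct tag *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (int i = 0; i < t->attr_count; i++)
        if (strcmp(t->attrs[i].name, name) == 0)
            return t->attrs[i].value;
    return NULL;
}

/* Links the single blob a tag may carry; returns -1 if one is already linked. */
int tag_link_blob(struct tag *t, const char *content_address)
{
    assert(t != NULL && content_address != NULL);
    if (t->blob_address != NULL)
        return -1;
    t->blob_address = content_address;
    return 0;
}
```

In this model, the CDF corresponds to the serialized form of struct clip: the metadata plus the blob addresses, but not the blob bytes themselves.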

Stream

A stream is an SDK mechanism used to exchange content with an Atmos CAS-enabled access node. The SDK provides easy-to-use file- and buffer-based streams for the efficient transfer of content. The SDK also supports a generic stream mechanism that can be used if the application content cannot be provided to the SDK as a file- or buffer-based stream. Examples of instances when generic streams are required include streaming live video and recording call center audio.

Note: For more information about streaming data, refer to Chapter 2, “Streaming Data.”

Basic calling sequences

This section shows the basic structure for writing code with the EMC Centera API.

C

FPPool_Open()         Opens a connection to the cluster.
FPClip_Create()       Creates a new, empty C-Clip structure in memory.
FPClip_GetTopTag()    Opens the top-level (root) tag as a starting point for C-Clip tag navigation.
FPTag_Create()        Creates a new tag in a tree that has already been opened.
FPTag_BlobWrite()     Writes the data.
FPTag_Close()         Closes the connection to an opened tag.
FPClip_Write()        Writes the C-Clip and returns its Content Address (the actual C-Clip reference).
FPClip_Close()        Closes an opened C-Clip.
FPPool_Close()        Closes an opened pool.

Note: When you open a C-Clip (FPClip_Open()) and a tag (for example, using FPTag_Create()) in the C-Clip, always close the tag (FPTag_Close()) before you close the C-Clip (FPClip_Close()).

Java

    // Opens a connection with the cluster.
    FPPool vPool = new FPPool("pooladdress");

    // Creates a new, empty C-Clip structure in memory.
    FPClip vClip = new FPClip(vPool, "clipname");

    // Opens a top-level (root) tag as a starting point for C-Clip navigation.
    FPTag vTopTag = vClip.getTopTag();

    // Creates a new tag in a tree that has already been opened.
    FPTag vTag = new FPTag(vTopTag, "tagname");

    // Writes the data.
    vTag.BlobWrite(new FileInputStream(…));

    // Closes the connection to the opened tag.
    vTag.Close();

    // Closes the root tag.
    vTopTag.Close();

    // Closes the opened C-Clip.
    vClip.Close();

    // Closes the connection to the cluster.
    vPool.Close();

Code examples

EMC provides complete code examples, both C and Java, that model how you should structure any code that you develop.

The examples are available in the SDK distribution in the samples directory. Each example has a subdirectory under samples containing C and Java implementations of the example, and a document that describes the example.

CHAPTER 2 Streaming Data

This chapter describes the various types of streaming data and streaming API operations in EMC Atmos CAS, from the use of SDK-provided streams to generic streams.

The main sections are:

• Understanding streams
• File streams
• Buffer streams
• Generic streams
• Factors that affect streaming data

Understanding streams

A stream is a transfer mechanism that manages the exchange of blob data to and from Atmos CAS. A stream is like a logical pipe that facilitates the flow of blob data between a source (data source) and a sink (target). A source can be a file, a pointer to in-memory data on a server, or a device.

A stream is either an input stream or an output stream. The use of the term input or output relates to the flow direction of content as it affects Atmos CAS. An input stream ingests (writes) data to Atmos CAS, which means the SDK must read the data from the input stream. An output stream reads data from Atmos CAS, which means the SDK must write the data to the output stream.

Figure 3 shows an example of an input stream that transfers content from a data source to Atmos CAS. Calling FPTag_BlobWrite() or FPTag_BlobWritePartial() causes the SDK to migrate the data contained in the input stream to the Atmos CAS-enabled node.

Figure 3 Blob data input to Atmos CAS

Similarly, calling FPTag_BlobRead() or FPTag_BlobReadPartial() causes the SDK to transfer data from the Atmos CAS-enabled node into the output stream.

In addition, FPClip_RawRead() and FPClip_RawOpen() both use streams to write and restore CDFs, respectively.

Stream types

Depending on the nature of the source data, you use one of the following types of streams for content transfer. Each stream type, except the generic stream, has a separate API call for each input and output streaming function. The generic stream uses one API call, which you can configure for input or output:

• File stream — Transfers data via file.

• Buffer stream — Transfers data via a single memory buffer.

• Generic stream — Allows for the aggregation of multiple memory buffers.


Note: The file and buffer streams are SDK-provided streams. They are wrappers around the standard mechanism of a generic stream. The SDK generic stream determines the overall streaming behavior of a file or buffer stream.

Buffer types

The SDK uses the following types of buffers:

Protocol buffer
    For read and write operations, the SDK and the Atmos CAS-enabled node exchange data using the proprietary HPP protocol. This protocol uses buffers with a fixed size of 16 KB (16*1024 bytes). For every write and read operation, data is partitioned and transmitted in packets of 16 KB. This size cannot be changed.

Internal buffer
    When reading from the CAS-enabled nodes using a stream, the SDK provides a buffer whose size depends on the amount of data provided by the server. The SDK and the stream can both determine the size of the buffer.

Application buffer
    When writing to the CAS-enabled nodes using a stream, the application provides the buffer and the buffer size. Although the SDK requests a size of 16 KB, it does not reject larger buffers, which can cause performance degradation.
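To make the 16 KB partitioning concrete, the following sketch walks a memory buffer in 16 KB steps and hands each slice to a callback. It only illustrates the arithmetic; it is not the HPP protocol, it says nothing about how a short final packet is framed on the wire, and the names PACKET_SIZE, for_each_packet, and record_last_size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PACKET_SIZE ((size_t)16 * 1024)   /* 16 KB, as described above */

/* Callback invoked once per packet with a pointer to its payload. */
typedef void (*packet_fn)(const unsigned char *payload, size_t size, void *ctx);

/* Walks `data` in PACKET_SIZE steps and passes each slice to `fn`.
 * Returns the number of packets produced; the last one may be short. */
size_t for_each_packet(const unsigned char *data, size_t len,
                       packet_fn fn, void *ctx)
{
    assert(fn != NULL);
    size_t count = 0;
    for (size_t offset = 0; offset < len; offset += PACKET_SIZE) {
        size_t remaining = len - offset;
        size_t size = remaining < PACKET_SIZE ? remaining : PACKET_SIZE;
        fn(data + offset, size, ctx);
        count++;
    }
    return count;
}

/* Example callback: records the payload size of the most recent packet. */
void record_last_size(const unsigned char *payload, size_t size, void *ctx)
{
    (void)payload;
    *(size_t *)ctx = size;
}
```

A 100,000-byte blob, for instance, splits into seven packets: six full 16,384-byte packets followed by one carrying the remaining 1,696 bytes.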

File streams

File streams transfer content that is accessible via a single file. The client application points the SDK to the file, and the SDK completely manages the data transfer.

Input (Write)

To transfer data to Atmos CAS from a file, use the API call FPStream_CreateFileForInput() to create the file stream.

FPStream_CreateFileForInput() uses the following parameters:

• const char *pFilePath: The file path for the content to be written to Atmos CAS.

• const char *pPerm: The buffer that contains the open permission for the file. The string can be rb (read binary) only.

• const long pBuffSize: The size of the memory buffer used to transfer the file through the SDK.

Example

    // Open a new stream
    FPStreamRef vStream = FPStream_CreateFileForInput( pPath, "rb", 16*1024 );

    if (ENOERR == FPPool_GetLastError())
    {
        // write it to the pool
        FPTag_BlobWrite( vFileTag, vStream, 0 );

        // check if it succeeded
        if (ENOERR != FPPool_GetLastError())
        {
            // error handling...
        }

        // and don't forget to close it...
        FPStream_Close( vStream );

        // and check the close succeeded (as above)
    }

To transfer partial data to Atmos CAS from a file, use the API call FPStream_CreatePartialFileForInput() to create the file stream.

FPStream_CreatePartialFileForInput() uses the following parameters:

• const char *pFilePath: The file path for the content to be written to Atmos CAS.

• const char *pPerm: The buffer that contains the open permission for the file. The string can be rb (read binary) only.

• const long pBuffSize: The size of the memory buffer used to transfer the file through the SDK.

• const long pOffset: The starting point in the file (in bytes) from which Centera starts the read.

• const long pLength: The amount of data (in bytes) from the offset to be transferred to Atmos CAS. You can use FP_STREAM_EOF to request all remaining data in the file from the given offset.

Example

{
    // Open a new stream
    // Write 100 bytes from offset 0
    FPStreamRef vStream = FPStream_CreatePartialFileForInput(pPath, "rb", 16*1024, 0, 100);

    if (vStream != 0)
    {
        // write it to the pool
        FPTag_BlobWritePartial(vFileTag, vStream, FP_OPTION_CLIENT_CALCID, seqID);

        // and don't forget to close it...
        FPStream_Close(vStream);
    }
    vStatus = FPPool_GetLastError();
}
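To make the interaction between the offset and length parameters concrete, the following sketch computes how many bytes a partial transfer would cover for a file of known size. It does not call the SDK. DEMO_STREAM_EOF is a hypothetical stand-in for FP_STREAM_EOF (the real value comes from the SDK headers), partial_transfer_length() is an illustrative helper rather than an API function, and clamping a request that runs past the end of the file is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FP_STREAM_EOF; the real constant comes from the SDK headers. */
#define DEMO_STREAM_EOF (-1)

/*
 * Returns the number of bytes a partial transfer would cover, given the file
 * size, the starting offset, and the requested length. Passing DEMO_STREAM_EOF
 * as the length selects everything from the offset to the end of the file.
 * Requests that run past the end of the file are clamped (an assumption).
 */
int64_t partial_transfer_length(int64_t file_size, int64_t offset, int64_t length)
{
    assert(file_size >= 0 && offset >= 0);
    assert(length >= 0 || length == DEMO_STREAM_EOF);

    if (offset >= file_size) {
        return 0; /* nothing left to read */
    }

    int64_t remaining = file_size - offset;
    if (length == DEMO_STREAM_EOF || length > remaining) {
        return remaining;
    }
    return length;
}
```

For a 1000-byte file, an offset of 200 with DEMO_STREAM_EOF covers the remaining 800 bytes, while an offset of 950 with a length of 100 is clamped to 50 bytes in this sketch.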

Output (Read)

To transfer data from Atmos CAS to a file, use the API call FPStream_CreateFileForOutput().

FPStream_CreateFileForOutput() uses the following parameters:

• const char *pFilePath: The file path to which the Atmos CAS content is written.

• const char *pPerm: The buffer that contains the open permission for the file. For a write, the string is typically wb (write binary); other choices are ab, rb+, wb+, or ab+.

Example

// Create a new binary file for write
FPStreamRef vStream = FPStream_CreateFileForOutput(pPath, "wb");

if (ENOERR == FPPool_GetLastError())
{
    // read stream
    FPTag_BlobRead(pTag, vStream, 0);

    if (ENOERR != FPPool_GetLastError())
    {
        // error handling...
    }

    // close file stream
    FPStream_Close(vStream);

    // Check for error as before
}

Note: The EMC Atmos CAS API Reference Guide provides more information on these functions and parameters.

To transfer partial data from Atmos CAS to a file, use the same API call FPStream_CreateFileForOutput().

Example

{
    // Open a new stream
    // Read 100 bytes from offset 0
    FPStreamRef vStream = FPStream_CreateFileForOutput(pPath, "ab+");

    // Use FPTag_BlobReadPartial instead of FPTag_BlobRead
    if (vStream != 0)
    {
        // read it from the pool
        FPTag_BlobReadPartial(vFileTag, vStream, 0, 100, FP_OPTION_DEFAULT_OPTIONS);

        // and don't forget to close it...
        FPStream_Close(vStream);
    }
    vStatus = FPPool_GetLastError();
}

Buffer streams

Buffer streams transfer content that is available via a single in-memory buffer. The client application points the SDK to the buffer, and the SDK completely manages the data transfer.

Input (Write)

To transfer data to Atmos CAS from a memory buffer, use the API call FPStream_CreateBufferForInput().

FPStream_CreateBufferForInput() uses the following parameters:

• const char *inBuffer: The pointer to the memory buffer containing the data to be written to Atmos CAS.

• const unsigned long inBuffLen: The length of the memory buffer in bytes.

Note: The entire blob must be in memory prior to the start of the transfer.

Example

// open a new stream
FPStreamRef vStream = FPStream_CreateBufferForInput(myDataSource, A_BIG_SIZE);

Output (Read)

To transfer data from Atmos CAS to a memory buffer, use the API call FPStream_CreateBufferForOutput().

FPStream_CreateBufferForOutput() uses the following parameters:

• const char *inBuffer: The pointer to the memory buffer that receives the data from Atmos CAS.

• const unsigned long inBuffLen: The length of the memory buffer in bytes.

For more information on these functions and their parameters, refer to the EMC Atmos CAS API Reference Guide.

Example

// open a new stream
FPStreamRef vStream = FPStream_CreateBufferForOutput(myDataSource, A_BIG_SIZE);

Generic streams

Generic streams form the underlying basis of file and buffer streams. An application can define a generic stream if the content is not available via a single file or memory buffer. EMC recommends that you use a generic stream only if the following conditions exist; otherwise, use the appropriate file or buffer stream for a simpler implementation:

• The data is not completely held in memory.

• The data is not an application file.

• Your application needs granular control over the behavior of the data transfer (for example, defining callback functions to read streaming video off a device).

Using the generic stream API

The function FPStream_CreateGenericStream creates either an input or output stream with various callback functions that behave differently depending on the purpose of the stream. During stream creation, the SDK uses the following callback functions in FPStream_CreateGenericStream:

• PrepareBufferProc
• CompleteProc
• SetMarkerProc
• ResetMarkerProc
• CloseProc

Each function takes a pointer to an FPStreamInfo structure as a parameter.

FPStreamInfo structure

The generic stream allocates and maintains an FPStreamInfo structure. This structure contains fields that define the required input parameter for each callback function in the generic stream. These values convey stream handling information to and from the SDK. An application can read and write fields from this structure. If a callback returns a non-zero value, the blob read or write operation fails with an FP_STREAM_ERR (code -10209).

Each of the following fields in FPStreamInfo passes the applicable value to the methods. Many values are context-driven, depending on the callback function used and whether for an input or output stream.

Field validation and error handling

In v3.2, a validation of several FPStreamInfo fields occurs automatically, if the FP_OPTION_STREAM_STRICT_MODE option is enabled. (The default of this option enforces the strict mode.) If validation fails (with or without logging), the SDK generates the appropriate error message—either FP_STREAM_VALIDATION_ERR or FP_STREAM_BYTECOUNT_MISMATCH_ERR, as indicated. If only logging is turned on, the SDK issues the appropriate warning message, which does not result in an error.

Note: All error codes are defined in the EMC Atmos CAS API Reference Guide.

The SDK validates the following FPStreamInfo fields:

• mVersion (short): The current version of the FPStreamInfo structure specified by the SDK. The callback functions should not modify this field. A failed validation generates an FP_STREAM_VALIDATION_ERR.

• mUserData (void*): The pointer to client-specific data passed by the application during the creation of the generic stream. The user data is passed in when the stream object is created and made available to the callback functions. The SDK does not access this field.

• mStreamPos (FPLong): The current position of the data in the stream. The application must update this field. A failed validation generates an FP_STREAM_VALIDATION_ERR.

• mMarkerPos (FPLong): The end position of the last successfully transferred block of data recorded by the stream. A failed validation generates an FP_STREAM_VALIDATION_ERR.

• mStreamLen (FPLong): The total number of bytes to be transferred. If unknown, this field defaults to -1. The application should provide this value when transferring data to Atmos CAS. A failed validation can generate either an FP_STREAM_VALIDATION_ERR or an FP_STREAM_BYTECOUNT_MISMATCH_ERR.

Note: In order for the SDK to use this field correctly, mStreamLen must be set before FPTag_BlobWrite() is called.

• mAtEOF (Boolean): The true/false indicator that shows the last transfer buffer is completed. The application must update this field when transferring data to Atmos CAS.

• mReadFlag (Boolean): The true/false indicator that shows the flow direction of the data from an Atmos CAS perspective. This field is true if the data transfer is to Atmos CAS; false, if the data transfer is to the application. A failed validation generates an FP_STREAM_VALIDATION_ERR.

• mBuffer (void*): The data buffer being transferred. The application populates this field if transferring data to Atmos CAS.

• mTransferLen (unsigned long): The number of bytes provided in the buffer. The application populates this field when transferring data to Atmos CAS.

Note: For complete details on the FPStreamInfo fields, refer to FPStream_CreateGenericStream in the EMC Atmos CAS API Reference Guide.
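The role of mStreamLen and its -1 default can be illustrated with a small self-check that an application might run on its own counters before closing a stream. This is only a sketch: the function, the constant, and the result codes below are hypothetical and do not reproduce the SDK's internal validation or its error codes.

```c
#include <stdint.h>

/* Mirrors the documented "unknown" value of mStreamLen. */
#define DEMO_LENGTH_UNKNOWN (-1)

/* Illustrative result codes; these are not the SDK's error codes. */
enum demo_check { DEMO_OK = 0, DEMO_INVALID = 1, DEMO_MISMATCH = 2 };

/*
 * Compares the length declared up front with the number of bytes actually
 * handed over. An unknown declared length (-1) is accepted as is; any other
 * declared length must match the transferred byte count exactly.
 */
enum demo_check check_byte_count(int64_t declared_len, int64_t transferred)
{
    if (transferred < 0 || declared_len < DEMO_LENGTH_UNKNOWN) {
        return DEMO_INVALID;
    }
    if (declared_len == DEMO_LENGTH_UNKNOWN) {
        return DEMO_OK;
    }
    return transferred == declared_len ? DEMO_OK : DEMO_MISMATCH;
}
```

A check of this kind is cheap to run in the application's CloseProc and can surface a length mismatch before it turns into a failed write.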

Callback functions

The following callback functions appear in the order called by the SDK in FPStream_CreateGenericStream.

PrepareBufferProc

PrepareBufferProc prepares the buffer used for a data transfer. It is required for an input stream and optional for an output stream. PrepareBufferProc allows the application to allocate and manage the data buffer used to facilitate the data transfer. The SDK calls PrepareBufferProc for each data segment that transfers to Atmos CAS. The SDK assumes ownership of the buffer until the buffer is no longer needed.

During data transfers to Atmos CAS, the application is responsible for the following actions:

• Populate the data buffer with the content to write to a blob in Atmos CAS.

• Indicate the data size in the mTransferLen field.

In the case of data transfers out of Atmos CAS, the application can allocate its own buffer for the data transfer by assigning a pointer to it in the mBuffer field and setting its size in the mTransferLen field.

CompleteProc

CompleteProc confirms that a data segment has transferred successfully into or out of Atmos CAS. For an input stream, the application resumes ownership of the buffer. For an output stream, the buffer contains the next data segment retrieved from the blob for transfer. The SDK records the data size in the mTransferLen field and refers to the pointer in the mBuffer field for the client-specific data target.

Note: Processing in pPrepareBufferProc and pCompleteProc must complete within 60 seconds; otherwise, an error results.

SetMarkerProc and ResetMarkerProc

SetMarkerProc and ResetMarkerProc are optional callback functions that manage stream positions for write operations to Atmos CAS. The SDK calls these functions as necessary, and the application must record or restore the stream position when asked to do so. SetMarkerProc records the position after the last successful transfer. ResetMarkerProc resets the current position to the end of the last successful transfer. These callback functions are not mutually exclusive; each gives context to the other.

If marking support is provided, streams must provide adequate staging for up to 100 MB of data. This allows the SDK to rewind the stream on 100 MB boundaries, if necessary.

CloseProc

CloseProc confirms that the data transfer is completed and the stream is closed. The SDK does not provide or accept any more buffers. The application can close the source or sink and deallocate any memory, if necessary.

Note: For complete details about each callback function, refer to FPStream_CreateGenericStream in the EMC Atmos CAS API Reference Guide.

Blob write flow via generic stream

Figure 4 shows an example of the process flow of writing blob data from an input generic stream to Atmos CAS. A high-level explanation of each numbered step follows the diagram.

Figure 4 Example of blob write

1. The application creates a generic stream by calling FPStream_CreateGenericStream.

2. The application sets the total length of data in mStreamLen. If the length is unknown, the field keeps its default value of -1.

3. The application calls FPTag_BlobWrite().

4. The SDK first calls back into the application via PrepareBufferProc and requests the first 16 KB of data.

5. The application’s callback function PrepareBufferProc is invoked. PrepareBufferProc allocates a memory buffer (set in mBuffer), populates it with the blob data, and sets the buffer size in mTransferLen.

6. The application sets the Boolean value in mAtEOF to indicate whether the final transfer buffer has occurred.

7. The SDK receives the callback information implemented in steps 5 and 6 and processes the input by first reading the block of data and then writing it to Atmos CAS.

8. The SDK calls CompleteProc, which signals the end of a successful data transfer for your specified buffer. Remember that PrepareBufferProc and CompleteProc must complete within 60 seconds or an error occurs.

9. The ownership of the buffer transfers back to the application, which frees up the memory buffer for the next block of data to be sent via PrepareBufferProc. Each data chunk requires a buffer allocation and a repeat of steps 4 - 9 until the end of file (EOF) is reached.

10. The application closes the stream.

11. When the SDK and the application finish their operations on a stream, the SDK calls CloseProc to allow the application to clean up any opened resources and deallocate memory, if necessary, as a consequence of the application closing the stream.

12. The application deallocates its memory buffer.
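The callback sequence in the steps above can be modeled without the SDK. The following sketch is a simulation only: DemoStreamInfo is a simplified, hypothetical stand-in for FPStreamInfo, demo_blob_write() plays the role of the SDK side of FPTag_BlobWrite, and the 16-byte segment size is chosen purely to keep the example small. None of these names are part of the Atmos CAS API.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for FPStreamInfo (not the SDK definition). */
typedef struct {
    void      *mUserData;    /* application state handed to every callback */
    long long  mStreamPos;   /* bytes handed over so far */
    long long  mStreamLen;   /* total length, or -1 if unknown */
    int        mAtEOF;       /* set by the callback on the final buffer */
    void      *mBuffer;      /* buffer for the current segment */
    long long  mTransferLen; /* bytes available in mBuffer */
} DemoStreamInfo;

/* Application state: an in-memory source plus a reusable segment buffer. */
typedef struct {
    const char *data;
    size_t      size;
    char        segment[16]; /* tiny segment size to keep the demo readable */
} DemoSource;

/* Plays the role of PrepareBufferProc for an input stream. Returns 0 on success. */
static long demo_prepare_buffer(DemoStreamInfo *info)
{
    DemoSource *src = (DemoSource *)info->mUserData;
    size_t pos = (size_t)info->mStreamPos;
    size_t left = src->size - pos;
    size_t n = left < sizeof src->segment ? left : sizeof src->segment;

    memcpy(src->segment, src->data + pos, n);
    info->mBuffer = src->segment;
    info->mTransferLen = (long long)n;
    info->mStreamPos += (long long)n;
    info->mAtEOF = (info->mStreamPos == (long long)src->size);
    return 0;
}

/* Plays the role of CompleteProc: the application owns the buffer again. */
static long demo_complete(DemoStreamInfo *info)
{
    info->mBuffer = NULL;
    info->mTransferLen = 0;
    return 0;
}

/*
 * Stand-in for the SDK side of a blob write: invoke the callbacks until the
 * application reports EOF. Returns the number of bytes copied to sink, or -1
 * if a callback fails or the sink is too small.
 */
long long demo_blob_write(DemoSource *src, char *sink, size_t sink_size)
{
    DemoStreamInfo info = { src, 0, (long long)src->size, 0, NULL, 0 };
    long long written = 0;

    while (!info.mAtEOF) {
        if (demo_prepare_buffer(&info) != 0) return -1;
        if ((size_t)(written + info.mTransferLen) > sink_size) return -1;
        memcpy(sink + written, info.mBuffer, (size_t)info.mTransferLen);
        written += info.mTransferLen;
        if (demo_complete(&info) != 0) return -1;
    }
    return written;
}
```

In this model a 40-byte source is delivered in three segments (16, 16, and 8 bytes), and the callback sets mAtEOF on the last one, which mirrors steps 4 through 9.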

Figure 5 shows how the SDK and application use the callback functions to implement the transfer of data packets to Atmos CAS.

Figure 5 Data transfer of FPTag_BlobWrite

If the application provides data to the SDK in memory buffers that are larger than the requested size, the SDK sends the data to Atmos CAS in packets of 16 KB and smaller as shown in Figure 6.

Figure 6 Memory buffers larger than SDK request

If the application provides data to the SDK in memory buffers that are smaller than the requested size, the SDK sends the data to Atmos CAS in the supplied size.
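The packet arithmetic described above can be sketched as follows. The helper names are hypothetical, and the sketch assumes that every packet except the last is exactly 16 KB; the text only states that packets are 16 KB or smaller.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PACKET_SIZE (16u * 1024u) /* 16 KB, the packet size named in the text */

/* Number of packets needed to send buffer_len bytes in packets of at most 16 KB. */
size_t packet_count(size_t buffer_len)
{
    return (buffer_len + DEMO_PACKET_SIZE - 1) / DEMO_PACKET_SIZE;
}

/* Size of packet number index (0-based); only the last packet may be short. */
size_t packet_length(size_t buffer_len, size_t index)
{
    assert(index < packet_count(buffer_len));

    size_t offset = index * DEMO_PACKET_SIZE;
    size_t left = buffer_len - offset;
    return left < DEMO_PACKET_SIZE ? left : DEMO_PACKET_SIZE;
}
```

Under these assumptions, a 40 KB application buffer is sent as two 16 KB packets followed by one 8 KB packet, while a 10 KB buffer goes out as a single 10 KB packet.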

Blob read flow via generic stream

Figure 7 shows an example of the process flow of reading blob data from an output generic stream received from Atmos CAS. A high-level explanation of each numbered step follows the diagram.

Figure 7 Example of blob read

1. The application creates a generic stream by calling FPStream_CreateGenericStream.

2. The application calls FPTag_BlobRead().

3. The SDK first calls back into the application via PrepareBufferProc.

4. The application implements the callback function PrepareBufferProc by either allocating a memory buffer or allowing the SDK to supply one. If the application allocates its own buffer, it sets the pointer in mBuffer and sets the buffer size in mTransferLen. If the application wants the SDK to allocate the buffer, then there is no need to implement this callback.

5. The SDK receives the callback information implemented in step 4 and processes the input by writing the block of data received from Atmos CAS to the transfer buffer.

6. The SDK calls CompleteProc, which signals the end of a successful data transfer for the next buffer. Remember that PrepareBufferProc and CompleteProc must complete within 60 seconds or an error occurs.

Note: To terminate an output read stream with an error prior to reading all the data requested in an FPTag_BlobRead or FPTag_BlobReadPartial call, set the mStreamPos variable to 0 and return a non-zero code from the CompleteProc callback. For example:

pStreamInfo->mStreamPos = 0;
return n; // where n is the non-zero return code

7. The application receives the memory buffer containing the blob data, checks the number of bytes in mTransferLen, and accepts the content into the client application.

8. The application checks whether it is the end of the file. If so, mTransferLen has a 0 value and mAtEOF is True.

9. The application processes the blob data from the buffer.

10. The application closes the stream.

11. The SDK calls CloseProc to clean up any opened resources and deallocate memory, if necessary.

12. If the SDK did not allocate the buffer, the ownership of the buffer transfers back to the application, which frees up the memory buffer for the next data block to be sent via PrepareBufferProc. Each data block requires a buffer allocation and a repeat of steps 4–9.

Figure 8 shows the transfer of data packets from the Atmos CAS server to the application, which reads the blob data from mBuffer and mTransferLen.

Figure 8 Data transfer of FPTag_BlobRead by internal buffer

An FPTag_BlobRead that is handled by an SDK internal buffer is the fastest means of data transfer because the data does not need to be rearranged.

If the application wants to manage its own buffers, the SDK can copy the blob data into these buffers. The PrepareBufferProc callback then needs to specify that:

• mBuffer points to an application buffer.

• mTransferLen is the actual length of that buffer.

The SDK then copies the data from the network packet into the application buffer and calls the complete callback to notify the application that the data is available for application processing.

Figure 9 on page 29 shows the use of the application buffer to transfer the blob data from Atmos CAS.

Figure 9 Data transfer of FPTag_BlobRead by application buffer

If the application buffer is larger than the data received from the network packet, the SDK fills up the remaining space with data from subsequent network packets, as shown in Figure 10 on page 29.

Figure 10 Application buffer larger than network packet

If the application buffer is smaller than the data received from the network packet, the SDK calls PrepareBufferProc and CompleteProc to get additional buffer space, as shown in Figure 11 on page 30.

Because of all the data rearrangements, a data transfer using application buffers is slower than using internal buffers.

Figure 11 Application buffer smaller than network packet

Factors that affect streaming data

The following factors affect the handling and performance of streaming data, depending on your system environment and Atmos CAS configurations. For definitions of these options, refer to FPPool_SetGlobalOption, FPPool_SetIntOption, FPTag_BlobWrite, and FPTag_BlobWritePartial in the EMC Atmos CAS API Reference Guide:

• ID calculation — Enabling FP_OPTION_CLIENT_CALCID requires marking support in a generic stream. When the SDK initially reads content, it goes through the blob data twice if you use FP_OPTION_CLIENT_CALCID. The client calculates the address before sending the data to the CAS-enabled nodes. It proactively checks to see if the specified blob content already exists on the CAS-enabled nodes. If it does, the client does not send the data. This option also assumes that the collision avoidance feature is disabled.

As FP_OPTION_CLIENT_CALCID_STREAMING (default) calculates the Content Address of the blob data while it is being streamed to the CAS-enabled nodes, streams are not required to support marking.

Note: You can easily switch CLIENT_CALCID_STREAMING to operate in the SERVER_CALCID_STREAMING mode by setting the option FP_OPTION_DISABLE_CLIENT_STREAMING in FPPool_SetGlobalOption or by setting it to True as an environment variable. If the latter, the change does not require a recompilation of application code. The SDK then handles and processes all references to CLIENT_CALCID_STREAMING as SERVER_CALCID_STREAMING.

FP_OPTION_SERVER_CALCID_STREAMING enables the Atmos CAS server to calculate the Content Address of the blob data as it is being streamed to the server side. There is no client check of the data before it is streamed to the cluster.

• FP_OPTION_PREFETCH_SIZE — If the total size of a stream is unknown (mStreamLen is set to -1 in FPStreamInfo), the SDK reads the blob bytes into memory up to the prefetch size to determine what the stream length might be. The SDK uses this option value to know how best to proceed with other decision processes:

• For consideration of embedded blob threshold — If blob data is larger than what the prefetch buffer size specifies, the SDK may not embed the blob in the CDF even if the blob is smaller than the embedded threshold. (See FP_OPTION_EMBEDDED_DATA_THRESHOLD below.)

• FP_OPTION_EMBEDDED_DATA_THRESHOLD — The SDK ignores this option’s threshold if the data size is unknown, if FP_OPTION_CLIENT_CALCID_STREAMING is enabled, and if the size of the data exceeds the prefetch size. However, the embed option set in a blob write overrides any global threshold setting.

CHAPTER 3 Application Authentication and Authorization

This chapter describes the Atmos CAS application authentication and authorization process. Customized Pool Access Information (PAI) modules and server capabilities are also covered in this chapter.

The main sections are:

• Access security model
• Application authentication and authorization

Access security model

The process of application authentication and authorization starts with the access security model, which is enforced in Atmos CAS. The Atmos CAS access security model prevents unauthorized applications from storing data on or retrieving data from Atmos. This security model operates at the application level, not from the level of individual end-users. Atmos CAS authenticates applications before giving them access to the content.

The Atmos CAS security model is based on the concept of pools and profiles.

For more information on the access security model, refer to the EMC Atmos Security Configuration Guide.

Capabilities

Table 2 on page 34 lists the capabilities associated with profiles. Each capability is set to a value of either Enabled or Disabled.

Table 2 Profile capabilities

Capability (setting) | Definition
Write (w) | Writes to a C-Clip.
Read (r) | Reads a C-Clip.
Delete (d) | Deletes a C-Clip.
Exist (e) | Checks for the existence of a specified C-Clip.
Privileged delete (D) | Disabled in this release.
Profile-driven metadata (P) | Not supported in this release.
Query (q) | Queries the contents of CAS-enabled nodes. If set to Enabled, C-Clips can be searched for in the CAS-enabled nodes using a time-based query.
Clip-copy (c) | Not supported in this release.
Litigation hold (h) | Not supported in this release.
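For applications that display or log profile settings, the setting letters in Table 2 can be mapped back to capability names. The lookup below is a minimal sketch built only from the table; capability_name() is a hypothetical helper, not an SDK function, and it makes no claim about how the SDK encodes capabilities internally.

```c
#include <stddef.h>

/*
 * Maps a capability setting letter from Table 2 to its name.
 * The lookup is case-sensitive because 'd' and 'D' are different capabilities.
 * Returns NULL for a letter that does not appear in the table.
 */
const char *capability_name(char setting)
{
    switch (setting) {
    case 'w': return "Write";
    case 'r': return "Read";
    case 'd': return "Delete";
    case 'e': return "Exist";
    case 'D': return "Privileged delete";
    case 'P': return "Profile-driven metadata";
    case 'q': return "Query";
    case 'c': return "Clip-copy";
    case 'h': return "Litigation hold";
    default:  return NULL;
    }
}
```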

Application authentication and authorization

Application authentication is the process whereby the application provides authentication information to Atmos CAS before access is granted. This information is available in the PEA file that the application integrator or system administrator has generated using the PEA file generator tool.

The PEA file contains the following information:

• User name: A combination of subtenant ID and UID joined together by a colon (SubtenantId:UID).

• Secret: A password that is used to authenticate the application.

Note: In Atmos, each PEA file can only contain a single set of access credentials.
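As an illustration of the SubtenantId:UID user name format, the following sketch splits such a string into its two parts. split_user_name() is a hypothetical helper, the sample values in the usage note are made up, and splitting at the first colon is an assumption about how the two parts are delimited.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Splits "SubtenantId:UID" at the first colon (an assumption) into two
 * caller-supplied buffers. Returns 0 on success, or -1 if the colon is
 * missing, either part is empty, or a buffer is too small.
 */
int split_user_name(const char *user_name,
                    char *subtenant, size_t subtenant_size,
                    char *uid, size_t uid_size)
{
    assert(user_name != NULL && subtenant != NULL && uid != NULL);

    const char *colon = strchr(user_name, ':');
    if (colon == NULL) return -1;

    size_t sub_len = (size_t)(colon - user_name);
    size_t uid_len = strlen(colon + 1);
    if (sub_len == 0 || uid_len == 0) return -1;
    if (sub_len >= subtenant_size || uid_len >= uid_size) return -1;

    memcpy(subtenant, user_name, sub_len);
    subtenant[sub_len] = '\0';
    memcpy(uid, colon + 1, uid_len + 1); /* includes the terminating NUL */
    return 0;
}
```

With the made-up input "5f8442b5:archive-app", the helper yields the subtenant ID "5f8442b5" and the UID "archive-app".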

To access Atmos CAS, the application calls the API function FPPool_Open() with parameters that specify a comma-separated list of node IP addresses and the path to the PEA file. If the application does not provide the path to the PEA file in the FPPool_Open() call, the SDK checks an environment variable for the location of the PEA file. If that variable does not exist, application authentication fails and access to Atmos CAS is denied.
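The comma-separated node list can be assembled with plain C before it is handed to the SDK. The sketch below covers only the joining step: join_node_list() is a hypothetical helper, the addresses in the usage note are placeholders, and the way the PEA file path is attached to the list is deliberately left out because it is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Joins node addresses with commas into out. Returns 0 on success, or -1 if
 * the list is empty or out is too small to hold the result and its NUL.
 */
int join_node_list(const char *const nodes[], size_t count, char *out, size_t out_size)
{
    assert(nodes != NULL && out != NULL);
    if (count == 0 || out_size == 0) return -1;

    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(nodes[i]);
        size_t need = len + (i > 0 ? 1 : 0); /* one extra byte for the comma */
        if (used + need + 1 > out_size) return -1;
        if (i > 0) out[used++] = ',';
        memcpy(out + used, nodes[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

Joining the placeholder addresses 10.1.1.1, 10.1.1.2, and 10.1.1.3 produces the string "10.1.1.1,10.1.1.2,10.1.1.3".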

In addition to the application authentication, Atmos CAS transfers a capability string to the SDK. This capability string contains Atmos CAS capabilities and authorizes the application to perform certain operations on the Atmos CAS system.

Refer to Figure 12 on page 35 for an illustration of the process.

Figure 12 Application authentication and authorization

Creating PEA files

The system administrator can create a PEA file for each application using the PEA file generator tool. The tool will then generate a PEA file containing a section such as:

<key type="cluster" id="12345-12345-12345-12345" name="MyApp">
    <credential id="csp1.secret" enc="base64">MySpecialApplicationSecretForThisCluster</credential>
</key>

CHAPTER 4 Best Practices

This chapter provides information to facilitate your development work. EMC recommends that you review this information, as it may help prevent problems at a later stage.

The main sections are:

• Programmer’s notes
• Pool functions
• Clip functions
• Tag functions
• Stream functions
• Query functions
• Buffers
• Error handling
• Logging
• Unicode and wide character support
• Synchronization in multithreaded programs
• Retry function
• Content Address calculation
• Performance
• IBM z/OS notes

Programmer’s notes

The following technical notes provide environment-specific information for the application developer:

• All functions are implemented in ANSI C.

• All function calls (except FPEventCallback_RegisterForAllEvents()) must be considered as synchronous function calls. The application developer needs to implement threading for asynchronous behavior.

• The application developer needs to handle synchronization in a multithreaded application. Refer to “Synchronization in multithreaded programs” on page 53 for more information.

• The application must handle application level signals such as SIGPIPE.

• A software upgrade on the Atmos system will affect the performance of the client application because all access nodes will sequentially restart. Performance returns to normal when the upgrade is complete.

• If the network connection to a CAS-enabled node is lost during a read or write operation, the connection times out. The timeout may take a few seconds and can cause a temporary delay of the operation.

• Trying to close an object (such as a pool, C-Clip, or tag) twice or accessing a closed object may cause a system crash. Your code should handle this scenario.

• When multiplying a 64-bit integer with another integer n, typecast n also as a 64-bit integer if you want the result to be a 64-bit integer.
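The note on 64-bit multiplication can be illustrated with two small helpers. The function names are hypothetical. The explicit cast makes the intended width visible and guards against the common mistake of multiplying two narrower operands before the result is stored in a 64-bit variable.

```c
#include <assert.h>
#include <stdint.h>

/* Total size of n objects of object_size bytes each, computed in 64 bits. */
int64_t total_bytes(int64_t object_size, int n)
{
    assert(object_size >= 0 && n >= 0);
    return object_size * (int64_t)n; /* n is widened explicitly before the multiply */
}

/* Converts megabytes to bytes; without the cast, 32-bit int math could overflow. */
int64_t megabytes_to_bytes(int megabytes)
{
    assert(megabytes >= 0);
    return (int64_t)megabytes * 1024 * 1024;
}
```

For example, converting 4096 MB yields 4294967296 bytes, a value that does not fit in a 32-bit integer.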

Application server

The following notes apply to the application server:

• The Atmos CAS API and associated set of protocols are designed, implemented, and tested for an environment where an application server is acting as a gateway between the clients and Atmos CAS. It is important to keep an application server in your design, not only for scalability but also for security and administration.

• If you want to use large files (larger than 2 GB) on your application server, ensure that large files support is enabled on the file system of the server.

Naming conventions

The following notes apply to tag names in C-Clips:

• Do not use xml and eclip at the beginning of a tag name.

• All tag names must be XML compliant.
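A simple guard against the reserved prefixes can be written in a few lines. This sketch checks only the prefix rule, not full XML name compliance. tag_name_is_reserved() is a hypothetical helper, and treating the prefixes as case-insensitive is an assumption made to stay on the safe side.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s begins with prefix, ignoring ASCII case; otherwise 0. */
static int starts_with_ignore_case(const char *s, const char *prefix)
{
    for (; *prefix != '\0'; s++, prefix++) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix)) {
            return 0; /* also reached when s ends before prefix does */
        }
    }
    return 1;
}

/* Returns 1 if the tag name starts with a reserved prefix ("xml" or "eclip"), else 0. */
int tag_name_is_reserved(const char *name)
{
    assert(name != NULL);
    return starts_with_ignore_case(name, "xml") ||
           starts_with_ignore_case(name, "eclip");
}
```

An application could run this check before creating a tag and reject names such as "xmlInvoice" or "eclipData" early, while a name such as "invoice" passes.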

Recommended usage

For recommended and maximum application settings, refer to the CAS API specifications in Chapter 1, “Introduction,” of the EMC Atmos CAS API Reference Guide. The following recommendations also apply:

• For best data availability, place data replicas and metadata close to your access locations.

• Configure MDS remote replication for disaster recovery. In addition, for objects, configure at least 2 local synchronous replicas and 1 remote asynchronous replica.

• Query is intended for disaster recovery and not for regular application operation. For security reasons, EMC recommends the disabling of the query capability (clip-enumeration) during normal operation. The system administrator should enable the query capability only in the event of disaster recovery. Refer to “Query functions” on page 46 for more information.

• Avoid the use of auto-deletion policy or retention policy for CAS objects.

• The supported maximum number of threads per access node is 30. A four-access node system thus supports 120 threads. In this scenario, a single application can use up to 120 threads, or two or more applications can use 120 threads combined. Refer to “Performance” on page 55 for performance details related to the number of threads and object size.

• EMC recommends adding application-specific metadata to the C-Clips that you store on an Atmos CAS system. This practice enables the recovery of the C-Clips in the event that the C-Clip ID database is inaccessible. You can add metadata to C-Clips with FPClip_SetDescriptionAttribute() and retrieve metadata with FPClip_GetDescriptionAttribute(). You can also set metadata by defining the CENTERA_CUSTOM_METADATA environment variable. Refer to the EMC Atmos CAS API Reference Guide for more information on adding metadata.

• Use only the Atmos CAS API to handle CAS objects. Do not manipulate CAS objects using other access methods such as REST, SOAP, NFS, and CIFS interfaces.

• Where possible, use dedicated access nodes and dedicated hardware for better performance.
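The thread limit in the list above reduces to simple arithmetic, sketched below. The helper names are hypothetical, and dividing the budget evenly between applications is only one possible policy, shown here as an assumption.

```c
#include <assert.h>

#define DEMO_THREADS_PER_ACCESS_NODE 30 /* supported maximum stated in the text */

/* Total thread budget across all access nodes. */
int max_total_threads(int access_nodes)
{
    assert(access_nodes >= 0);
    return access_nodes * DEMO_THREADS_PER_ACCESS_NODE;
}

/* Even share per application when several applications use the same system. */
int threads_per_application(int access_nodes, int applications)
{
    assert(applications > 0);
    return max_total_threads(access_nodes) / applications;
}
```

For a four-access node system the budget is 120 threads, so two applications sharing it evenly would each stay at or below 60 threads.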

Capabilities

Capabilities are granted to an application and determine what operations the application is allowed to perform on the Atmos CAS system. After the application has been granted access to Atmos CAS, Atmos CAS sends a string with the capabilities to the SDK.

The following API functions are cancelled if the related capability is not enabled: FPClip_Open(), FPClip_Write(), FPClip_Delete(), FPClip_AuditedDelete(), FPClip_Exists(), FPTag_BlobRead(), FPTag_BlobReadPartial(), FPTag_BlobWrite(), FPTag_BlobWritePartial(), FPPoolQuery_Open(), and FPMonitor_XXX(). Before performing one of these functions, check if the related capability is enabled by using FPPool_GetCapability(). Refer to the EMC Atmos CAS API Reference Guide for more information.

Your code should handle the scenario in which one or more capabilities are not enabled.

Compliance mode

The FP_COMPLIANCE capability allows an application to determine the compliance mode of the Atmos CAS system. As Atmos CAS currently supports only one compliance mode, the returned value is basic.

For a complete list of capabilities, refer to the EMC Atmos CAS API Reference Guide.

Limitations

The following limitations apply to this release of Atmos CAS:

• The FPTag_Copy API is not supported. Using it results in an application error.

• The MoPI API is not supported. Using it returns an error message to applications that the operation is unsupported.

• Pools map to Atmos subtenants; profiles map to Atmos user IDs (UIDs). A virtual pool in Centera is defined in Atmos CAS as a combination of unique subtenant ID and UID. A profile in Centera is defined as a subtenant UID.

• In Atmos, blob deduplication (single instancing) occurs at the access node level per storage server. A single write by an application can result in multiple copies of a blob.

• The Raw write API is not supported.

• Replication read and write operations are not supported.

• Query ordering: While the time ordering of query results is not guaranteed, the completeness of a result set is guaranteed for a given time range.

• No audit pool and audit logging: Because you cannot edit CAS configurations, the audit log does not display CAS-related entries.

• Profile metadata is not supported.

Timestamps

The Atmos CAS API uses the following types of C-Clip timestamps:

• Creation date — The point in time that FPClip_Create() was called to create the C-Clip.

• Modification date — The point in time that an existing C-Clip was changed. For example, a C-Clip is reopened via FPClip_Open() to extend its retention period.

• Query date — The point in time that a C-Clip was physically written to the Atmos CAS system.

Time formats

You can use the following functions to convert between Atmos CAS time strings and integral time values that count from the epoch, 1 January 1970 00:00:00.000. These API calls represent integral units in either seconds or milliseconds:

• FPTime_MillisecondsToString()

• FPTime_SecondsToString()

• FPTime_StringToMilliseconds()

• FPTime_StringToSeconds()

When converting integral values to time strings, the SDK supports two string formats, one for milliseconds and one for seconds. The formats are defined as the options FP_OPTION_MILLISECONDS_STRING and FP_OPTION_SECONDS_STRING, which you pass as a flag in the inOptions argument to produce a time string with or without a milliseconds field.

For more information about these time-specific function calls, refer to the EMC Atmos CAS API Reference Guide.
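The relationship between an integral epoch value and a time string, with or without a milliseconds field, can be sketched with the standard C library alone. This is not the SDK implementation: the helper name format_epoch_ms and the "YYYY.MM.DD hh:mm:ss GMT" layout are assumptions made for illustration, and the Reference Guide remains the authority on the exact string format the FPTime functions use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not an SDK call. Formats milliseconds since
 * 1 January 1970 00:00:00.000 as "YYYY.MM.DD hh:mm:ss[.mmm] GMT".
 * The layout is an assumption for illustration only.
 * Returns 0 on success, -1 if the value or the buffer is unusable. */
int format_epoch_ms(long long epoch_ms, int with_millis, char *out, size_t cap)
{
    static const char suffix[] = " GMT";
    time_t secs;
    const struct tm *utc;
    size_t used;

    assert(out != NULL);
    if (epoch_ms < 0)
        return -1;

    /* Whole seconds drive the calendar fields. */
    secs = (time_t)(epoch_ms / 1000);
    utc = gmtime(&secs);            /* not thread safe; fine for a sketch */
    if (utc == NULL)
        return -1;

    used = strftime(out, cap, "%Y.%m.%d %H:%M:%S", utc);
    if (used == 0)
        return -1;                  /* buffer too small */

    /* The remainder becomes the optional milliseconds field. */
    if (with_millis) {
        int n = snprintf(out + used, cap - used, ".%03d",
                         (int)(epoch_ms % 1000));
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }

    if (used + strlen(suffix) >= cap)
        return -1;
    strcpy(out + used, suffix);
    return 0;
}
```

Passing a nonzero with_millis plays the role that the milliseconds string option plays in the SDK; a seconds-based caller would multiply its value by 1000 first.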

Pool functions

The pool functions operate at the pool level. A pool consists of one or more connections to Atmos CAS-enabled nodes, each with its own IP address or DNS name and port number(s). The default port number is 3218.

All pool functions are thread safe.

Your application must establish a connection to a pool by calling FPPool_Open() before performing any other pool operation, with the exception of FPPool_SetGlobalOption() and FPPool_GetComponentVersion(). You pass a list of nodes with the access role to FPPool_Open(). Refer to “Connection string” on page 41 for more information.

The first CAS-enabled access node to which the SDK connects determines the list of access nodes for the pool. All SDK operations access the CAS-enabled nodes, except in the case of failover.

Once the primary CAS-enabled node is established, the SDK by default continues to make connections to all nodes in the connection string and any replicas. Alternatively, you can configure the SDK to make connections only as needed.

Once a pool is open, load balancing and connection pooling take place automatically. Refer to “Load balancing” and “Connection pooling” for more information.

Always close the connection to a pool that is no longer needed by calling FPPool_Close() to free all associated resources.

Refer to the EMC Atmos CAS API Reference Guide for descriptions of all pool functions.

Connection string

A connection string is the list of Atmos CAS-enabled nodes that you pass to FPPool_Open(). The connection string defines the nodes that make up the pool. You can specify:

• 1 IP address: If the SDK can establish a connection to the IP address, it automatically detects the IP addresses of all available access nodes. The disadvantage of this scenario is that if the SDK cannot connect to the IP address provided, the SDK cannot detect other access nodes and no pool connection is established.

• 2 or more IP addresses of the same Atmos CAS system (recommended scenario): The SDK tries to connect to all IP addresses. Failing to connect to one address does not prevent establishing the pool connection.
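As a rough illustration of the recommended scenario, the sketch below joins several node addresses into one string before it would be handed to FPPool_Open(). The helper name build_connection_string is hypothetical, and the comma separator is an assumption based on Centera-style connection strings; confirm the exact syntax in the Reference Guide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an SDK call. Joins node addresses with commas
 * (assumed separator). Returns 0 on success, -1 if no address is given
 * or the output buffer is too small. */
int build_connection_string(const char *const *nodes, size_t count,
                            char *out, size_t cap)
{
    size_t used = 0, i;

    assert(out != NULL && cap > 0);
    if (count == 0)
        return -1;

    out[0] = '\0';
    for (i = 0; i < count; i++) {
        size_t len = strlen(nodes[i]);
        size_t sep = (i > 0) ? 1 : 0;

        /* Room for separator, address, and the terminating NUL. */
        if (used + sep + len + 1 > cap)
            return -1;
        if (sep)
            out[used++] = ',';
        memcpy(out + used, nodes[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

Listing at least two addresses of the same system keeps a single unreachable node from blocking the pool connection, which is the point of the recommendation above.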

Multiple profiles

You can specify multiple profiles on a connection string, which allows an application to access the pools associated with those profiles. Atmos CAS supports a single profile per pool in this release. Each profile has full access rights to its associated pool.

Load balancing

When the application makes a connection to a pool using FPPool_Open(), the SDK balances the load on all available CAS-enabled nodes in that pool. The system will list the two least loaded access nodes. The nodes on this list are used for pool operations until the pool is closed. When a new Atmos CAS network transaction begins, the same nodes are used if the access node list is not older than 10 seconds; otherwise, the Atmos system creates a new list. For more details on FPPool_Open(), refer to the EMC Atmos CAS API Reference Guide.

Connection pooling

The Atmos CAS pool interface is similar to a database connection pool. Both interfaces require one object with open connections to manage all transactions rather than establishing a new connection for every transaction. Each application server only requires one FPPool object to manage all connections and transactions to that pool.

The FPPool object maintains a list of open pool connections (TCP/IP sockets). Every pool transaction uses a separate connection to the pool. The connection is not closed immediately after the transaction has finished but remains available for other transactions. If a connection is not used for 60 seconds, then the connection is closed automatically.

You can globally specify how many connections can be made to a pool using the function FPPool_SetGlobalOption(). The default setting is 100 and the maximum value is 999. Refer to the EMC Atmos CAS API Reference Guide for more information on FPPool_SetGlobalOption().

Set pool options

You can set options that are specific for a given pool or that are global to every pool. If you want to set options for a given pool, you must open that pool first, then call FPPool_SetIntOption(). With FPPool_SetIntOption(), you can set the following options for a given pool:

• The internal buffer size

• The TCP/IP connection timeout

• Default collision avoidance

• Prefetch buffer size

With FPPool_SetGlobalOption(), you can set the following global pool options:

• The maximum number of open pool connections

• The number of retries

• The time to wait before a retry

• The time threshold for attempting communication with a node with the access role

• The size threshold for embedding content (blobs) in the CDF

Use the static public initializer loadFPLibrary to load the underlying library (dll or shared library), which contains methods defined by this class, if your environment requires this. The library is called FPLibrary.dll on Windows and libFPLibrary.so on a Unix system. Save the library to a location where the virtual machine can find it to load.

Setting options as environment variables

You can set global pool options as environment variables, which lets you affect application behavior without recompiling code. Option settings made by an application take precedence and override those previously set as environment variables.

Profile clips

A profile clip is a C-Clip that associates a Content Address with an access credential. An access credential consists of a name and a shared secret. The name is a combination of subtenant ID and UID. Thus, an application can use a profile clip to access the information in the access credential without having to store that information in an external source such as a database.

Note the following about profile clips:

• The storage and retrieval of profile clips are identical to other cluster parameters.

• Profile clips are not protected from deletion.

• Profile clips persist permission changes.

You can use the Atmos CAS Access API to set and get profile clips by calling the FPPool_SetClipID and the FPPool_GetClipID routines.

Note, however, that before you can set a profile clip with FPPool_SetClipID , you must first do the following:

1. Open a pool with, for example, a call to FPPool_Create or FPPool_Open.

2. Create a C-Clip with a call to FPClip_Create.

3. Write the C-Clip to disk with a call to FPClip_Write.

For more information, see the EMC Atmos CAS API Reference Guide.

Blob-slicing

Blob-slicing provides applications with the ability to use multiple threads to write discrete segments of a single blob to Atmos CAS. Blob-slicing enables increased performance of blob write time by allowing the data segments to be written simultaneously on multiple threads. Sliced blobs retain visibility as a single tag in the C-Clip Descriptor File (CDF).

Applications must partition the data into logically or physically separate segments and provide streams for each segment to be written. Each data segment belonging to the single blob receives a sequence ID. The sequence IDs determine the order in which the data is to be read back from Atmos CAS, from the lowest ID to the highest ID. A single sequence ID cannot be reused within a tag during the same FPClip_Open/FPClip_Close operation. It can, however, be reused in that tag if reopened in another API call.

An inSequenceID must be greater than or equal to zero. Threads that are using the same inTag cannot have duplicate sequence IDs. Each ID must be unique, otherwise, an FPParameterException is thrown. For more information on blob-slicing, refer to FPTag_BlobWritePartial() in the EMC Atmos CAS API Reference Guide.
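The partitioning and sequence ID rules above can be sketched without the SDK. The BlobSlice type and both helpers are hypothetical names used only for illustration; in a real application each slice would be wrapped in its own stream and written from its own thread with FPTag_BlobWritePartial().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one segment of a blob; not an SDK type. */
typedef struct {
    long long sequence_id; /* unique within the tag and >= 0 */
    size_t    offset;      /* start of the segment in the source data */
    size_t    length;      /* number of bytes in the segment */
} BlobSlice;

/* Splits total_len bytes into count contiguous, non-empty slices and
 * numbers them 0..count-1 so that reading back from the lowest ID to the
 * highest restores the original order. Returns count, or 0 if the data
 * cannot be split into that many non-empty slices. */
size_t plan_slices(size_t total_len, size_t count, BlobSlice *out)
{
    size_t base, extra, offset = 0, i;

    assert(out != NULL);
    if (count == 0 || total_len < count)
        return 0;

    base = total_len / count;
    extra = total_len % count;   /* the first `extra` slices get one more byte */
    for (i = 0; i < count; i++) {
        out[i].sequence_id = (long long)i;
        out[i].offset = offset;
        out[i].length = base + (i < extra ? 1 : 0);
        offset += out[i].length;
    }
    return count;
}

/* Returns 1 if every sequence ID is non-negative and distinct, else 0. */
int sequence_ids_valid(const BlobSlice *slices, size_t count)
{
    size_t i, j;

    for (i = 0; i < count; i++) {
        if (slices[i].sequence_id < 0)
            return 0;
        for (j = i + 1; j < count; j++)
            if (slices[i].sequence_id == slices[j].sequence_id)
                return 0;
    }
    return 1;
}
```

Validating the IDs before starting the writer threads surfaces a duplicate as a local logic error rather than as a parameter error from the API.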

Blob-slicing, as shown in Figure 13 on page 44, also allows an application to append data to the end of existing data contained within an existing tag. Blob-slicing does not support embedded options.

Figure 13 Blob-slicing

Clip functions

You must open a C-Clip before you can perform a clip function. When you open a C-Clip, the CDF is read into the memory of the application server.

Note: Always close a C-Clip when it is not needed to avoid unnecessary memory consumption.

Traverse C-Clips

Use FPClip_Open() to open a C-Clip. You can open the C-Clip in two ways: as a tree (read/write) or flat (read-only). You can traverse a C-Clip hierarchically when it has been opened as a tree. C-Clips that have been opened flat only allow sequential access.

To start navigating through a C-Clip in tree mode, you will first have to get the root tag using FPClip_GetTopTag(). You can use FPTag_GetFirstChild() and FPTag_Get(Prev)Sibling to retrieve related tags, or FPClip_FetchNext() to retrieve tags in sequential order.

To start navigating through a C-Clip in flat mode, you get the first tag using FPClip_GetTopTag(). You can then use FPClip_FetchNext() to retrieve other tags.

Check modifications before FPClip_RawRead()

Consider the following code flow:

FPClip_Open()
FPTag_RemoveAttribute()
FPClip_RawRead()
FPClip_RawOpen()

FPClip_Open() opens the C-Clip from the pool and temporarily saves its CDF in memory on the application server. Subsequently the CDF in memory changes because of the tag operation (removing an attribute). The FPClip_RawRead() call reads the changed CDF into a stream. The FPClip_RawOpen() call reads the stream to write it to the pool. This call will return the error FP_BLOBIDMISMATCH_ERR because the ID of the CDF in the stream no longer corresponds to the original C-Clip ID of the C-Clip on the pool. It is best to first check if FPClip_IsModified() returns false before calling FPClip_RawRead().

C-Clip IDs in canonical format

Use the canonical format to store a Content Address contained in a C-Clip ID in a database as a blob and not as a string. This canonical format allows a Content Address to be the same on all platforms, which means the Content Address is independent of character encoding and can be shared in a single database by all platforms. The canonical format also reduces the size of a Content Address from 65 bytes (string format) to 41 bytes and is not NULL-terminated.

For more information on the related function calls, refer to FPClipID_GetCanonicalFormat() and FPClipID_GetStringFormat() in the EMC Atmos CAS API Reference Guide.

Tag functions

To create a new tag, you need a reference to a parent tag. If the tag structure is empty, use FPClip_GetTopTag() to get the top tag. This tag functions as the parent tag to create the first user tag in the tree.

Writing blobs

Use FPTag_BlobWrite() to write blob data to the pool from a stream that the application provides. If multithreaded stream objects are available from the application, you can use FPTag_BlobWritePartial() to write blob data to the pool from multiple threads. For more information on blob-slicing, refer to “Blob-slicing” on page 43.

Either function opens a new blob, reads bytes from the stream object, writes the bytes to the pool, closes the blob, and associates the calculated Content Address with the tag. Whether the function calculates the Content Address before or while sending the data to the pool depends on the given options associated with the function. Note that FPTag_BlobWritePartial() does not operate in conjunction with any of the embedded options. For more information, refer to the EMC Atmos CAS API Reference Guide.

If FPTag_BlobWrite() or FPTag_BlobWritePartial() is used to restore data from a stream to the pool, the data will only be exported to the pool if the tag already has data associated with it and the Content Address is the same as the Content Address of the stream data.

Reading blobs

Use FPTag_BlobRead() or FPTag_BlobReadPartial() to read blob data from the pool to a stream object.

Either function gets the Content Address from the tag, opens the blob, reads the data in chunks of 16 KB, writes the bytes to the stream, and closes the blob.

Stream functions

The API stream functions implement streams generically. For more information on the creation and use of streams, refer to Chapter 2, “Streaming Data,” in this guide and to the EMC Atmos CAS API Reference Guide.

Streaming recommendations

The following suggestions apply to streaming activities:

• Because the server closes an unused connection after 120 seconds, EMC recommends that the prepareBufferProc callback not take longer than 1 minute to execute. Since most of the delay in a data flow occurs during the setup of the data stream’s source, EMC recommends establishing or setting up the data before you call any I/O function within the API, instead of during the prepareBufferProc callback function.

• If FP_OPTION_CLIENT_CALCID is used on a generic input stream, the stream must implement marking. If marking is implemented, the stream will be marked (set) to the first or current position and will then be used to read the data that will be sent to the server. When the data has been read, the Content Address is calculated. The stream is then reset to the marked position and the data is read and streamed to the server.

• Output streams can specify a buffer in which the API can put data. Therefore, the stream is not dependent on the buffer size and does not need to copy data that the API provides. Use this when you have a buffer to the output streaming process and/or when you have overlapping I/O buffers of a fixed size. The generic stream has to specify the buffer and its size during prepareBufferProc. During inCompleteProc, the generic output stream uses the data. This process matches the generic input streaming capability. If the generic stream does not specify a buffer during prepareBufferProc, the API provides a buffer of data to inCompleteProc. In this instance, the API controls the buffer.

• As a best practice when storing data, custom-made streams must pass the size of the stream to the server after the FPStream_CreateGenericStream call is made but before making an FPTag_BlobWrite call.
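The marking requirement described for FP_OPTION_CLIENT_CALCID amounts to reading the same bytes twice: once to calculate the Content Address and once to send the data. The sketch below models that with an in-memory stream. All names are hypothetical, the byte sum is only a placeholder for the real Content Address calculation, and the second pass merely counts bytes where a real stream would transfer them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream with mark/reset support; not an SDK type. */
typedef struct {
    const unsigned char *data;
    size_t len;   /* total bytes available */
    size_t pos;   /* current read position */
    size_t mark;  /* position remembered by ms_mark() */
} MemStream;

void ms_mark(MemStream *s)  { s->mark = s->pos; }
void ms_reset(MemStream *s) { s->pos = s->mark; }

/* Copies up to cap bytes into buf and advances; returns bytes read. */
size_t ms_read(MemStream *s, unsigned char *buf, size_t cap)
{
    size_t n = s->len - s->pos;

    if (n > cap)
        n = cap;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Pass 1 reads from the current position to compute a digest (placeholder
 * byte sum, NOT the real Content Address algorithm). The stream is then
 * reset to the mark and pass 2 reads the same bytes again for sending.
 * Returns the number of bytes "sent". */
size_t two_pass_write(MemStream *s, unsigned long *digest_out)
{
    unsigned char buf[4];   /* deliberately small to force several reads */
    unsigned long digest = 0;
    size_t n, i, sent = 0;

    assert(s != NULL && digest_out != NULL);

    ms_mark(s);
    while ((n = ms_read(s, buf, sizeof buf)) > 0)
        for (i = 0; i < n; i++)
            digest += buf[i];

    ms_reset(s);
    while ((n = ms_read(s, buf, sizeof buf)) > 0)
        sent += n;          /* a real stream would transfer buf here */

    *digest_out = digest;
    return sent;
}
```

A source that cannot rewind, such as a socket or pipe, cannot satisfy the reset step, which is why such streams need their own buffering before they can be used with client-side calculation.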

Query functions

The CAS API provides functions that query C-Clips stored on an Atmos CAS system. Every C-Clip has a timestamp that indicates when the C-Clip was written to the Atmos CAS-enabled node. This timestamp provides fast lookup of C-Clips and allows for queries of C-Clips that were created within a given time frame.

You can also query reflections — C-Clips that have been deleted from the Atmos CAS system. The query timestamp in the case of reflections corresponds to when the C-Clip was deleted. Note that reflections are only exposed in the CAS API through the query functions.

The query feature is intended for backup applications and not as a general-purpose application feature. Applications should expect that the clip-enumeration capability, which is required for query operations, has been disabled by the system administrator for security reasons.

Note: Query operations examine all C-Clips on CAS-enabled nodes, including C-Clips written by other applications.

Executing queries involves three object types:

• FPQueryExpression — Defines the query criteria.

• FPPoolQuery — Represents an in-progress query.

• FPQueryResult — Holds a query result representing a single C-Clip or reflection.

Refer to the EMC Atmos CAS API Reference Guide for more information on the query functions.

Performing a query

The process for querying the Atmos CAS-enabled nodes is as follows:

1. Create a query expression by calling FPQueryExpression_Open(). The query expression is the criteria by which C-Clips and reflections are returned by the query.

2. Define the query expression. You can set the following values:

• Start time — Call FPQueryExpression_SetStartTime() to define the earliest timestamp for which C-Clips are returned. For existing C-Clips, the timestamp is when the C-Clip was written to the CAS-enabled nodes. For reflections, the timestamp is the time when the C-Clip was deleted from the CAS-enabled nodes.

• End time — Call FPQueryExpression_SetEndTime() to define the latest timestamp for which C-Clips and reflections are returned.

• Query type — Call FPQueryExpression_SetType() to specify whether existing C-Clips, reflections, or both are returned by the query.

3. Specify which description attributes you want returned in the query results. Call FPQueryExpression_SelectField() and FPQueryExpression_DeselectField() to add or remove description attributes from the query expression.

Note: Selecting description attributes does not affect the list of C-Clips or reflections returned by the query, only what information is included for each returned C-Clip or reflection.

4. Initiate the query on the CAS-enabled nodes by calling FPPoolQuery_Open().

5. Get the first C-Clip or reflection that matches your query expression by calling FPPoolQuery_FetchResult().

6. Process the first query result by calling the FPQueryResult_GetXXX functions.

7. Close the first query result to free resources by calling FPQueryResult_Close().

8. Call FPPoolQuery_FetchResult() repeatedly, until the query has returned all C-Clips that match your query expression (FPQueryResult_GetResultCode() returns FP_QUERY_RESULT_CODE_END) or you do not need to continue the query.

9. Close the pool query to free resources by calling FPPoolQuery_Close().

10. Close the query expression to free resources by calling FPQueryExpression_Close().

Resuming interrupted queries (incremental queries)

Queries can take significant time to complete. For example, at a typical throughput of 100 results/s, a query over CAS-enabled nodes holding three million C-Clips takes more than eight hours to complete (3,000,000 / 100 = 30,000 seconds). It is therefore not uncommon for a query to be interrupted before it completes. For example, the client itself might interrupt the query, or the network connection between the client and the access node might be lost, in which case the CAS-enabled nodes recognize the problem and abort any ongoing queries.

An application can resume an interrupted query. An application launches a new query specifying the start time, using FPQueryExpression_SetStartTime(), as the value of the timestamp returned by the last valid query result (FPPoolQuery_FetchResult()).

In the case of the client interrupting the query, there might be overlap between the result set of the resumed query and the interrupted query, meaning the same query results are in the two result sets. The handling of result overlap is the application’s responsibility.
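One way an application might handle the overlap is to remember what the interrupted query already delivered and skip those entries when the resumed query returns them again. The sketch below is only an illustration: QueryHit and both helpers are hypothetical, the clip ID buffer size is illustrative, and a production application would likely use an index rather than a linear scan.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one query result; not an SDK type. */
typedef struct {
    long long timestamp_ms;  /* timestamp reported for the result */
    char      clip_id[65];   /* C-Clip ID as a string (illustrative size) */
} QueryHit;

/* Returns 1 if hit was already delivered by the interrupted query. */
int already_seen(const QueryHit *seen, size_t seen_count, const QueryHit *hit)
{
    size_t i;

    for (i = 0; i < seen_count; i++)
        if (seen[i].timestamp_ms == hit->timestamp_ms &&
            strcmp(seen[i].clip_id, hit->clip_id) == 0)
            return 1;
    return 0;
}

/* Appends the results of the resumed query that are not already in the
 * log. Stops early if the log is full. Returns the new log count. */
size_t merge_resumed(QueryHit *log, size_t log_count, size_t log_cap,
                     const QueryHit *resumed, size_t resumed_count)
{
    size_t i, original = log_count;

    assert(log != NULL && log_count <= log_cap);
    for (i = 0; i < resumed_count; i++) {
        if (already_seen(log, original, &resumed[i]))
            continue;            /* overlap with the interrupted query */
        if (log_count == log_cap)
            break;
        log[log_count++] = resumed[i];
    }
    return log_count;
}
```

Comparing both the timestamp and the C-Clip ID matters because several C-Clips can share the timestamp at which the query is resumed.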

Other query information

The following notes provide general information about queries:

• Queries with the same criteria (same QueryExpression) always return the same result set, even if content was added to the CAS-enabled nodes between successive queries. If some content is not returned due to unavailable nodes, the query result is reported as incomplete (FPPoolQuery_FetchResult() returns FP_QUERY_RESULT_CODE_INCOMPLETE). The one exception is if the end time is specified with the "now" value (-1). In this case, content added to the CAS-enabled nodes is included in the query results.

• The default number of simultaneous queries that can run on the CAS-enabled nodes is 10. However, your Atmos CAS system may be configured differently. The number of simultaneous queries is configurable but is not a customer setting.

Buffers

The SDK uses internal buffers in different ways. This section describes the main uses of buffers by the SDK and provides some programming tips.

Memory stream

When opening a C-Clip (using FPClip_Open() or FPClip_RawOpen()) or writing a C-Clip (using FPClip_Write()), a temporary stream is created. The FPClip object stores the CDF in this stream. As shown in Figure 14 on page 49, the stream keeps only the first n bytes in memory and stores the overflow to a file on disk. The default size (n) of the memory buffer is 16 KB. This means that CDFs smaller than 16 KB remain completely in memory.

Figure 14 Temporary CDF stream

The application can change the size of the default memory buffer using FPPool_SetIntOption(FP_OPTION_BUFFERSIZE, n). If, for example, the application mostly uses CDFs of 17 KB, setting the buffer size higher increases performance significantly. If the application mostly uses small CDFs, then the application can save memory by specifying a smaller buffer. (The size of the memory buffer must be greater than 0.)

Error handling

All API calls reset the value of the last error to ENOERR (0) at the beginning of their execution. If an error occurs during the execution, this value is set to an error code. Calling FPPool_GetLastError(), therefore, always returns the last error condition. It is recommended that your application call FPPool_GetLastError() after every function call. You can then call FPPool_GetLastErrorInfo() to get details about the error.

Note: If FPPool_GetLastError() is unable to return the error status of the last FPLibrary function call, the return value is FP_UNABLE_TO_GET_LAST_ERROR. This value indicates that an error was generated by the FPPool_GetLastError() call itself and not the previous function call. The error status of the previous function call is unknown; the call may have succeeded.

The chapter on error codes in the EMC Atmos CAS API Reference Guide provides definitions. There are four types of error codes:

1. Program logic errors — Verify your code.

2. Internal errors — Contact the EMC Customer Support Center.

3. Network errors — Retry the function.

4. Server errors — Check your server log and retry the function if necessary.

If an error occurs during the execution of a function, the value of any output parameter is undefined.

All Java methods return an error condition by throwing an FPLibraryException. The translation from error number to error message happens when an FPLibraryException is thrown.

Although each class contains a finalize() method, you must call the Close() method for each instantiated variable.

The static members of the Java interface FPLibraryErrors have the same names as the errors defined in the C header file FPErrors.h. For example, you can refer to the C error code FP_INVALID_NAME in Java as FPLibraryErrors.FP_INVALID_NAME.

You can also call three FPLibrary error methods on any FPLibraryException that is thrown in Java.

Parameter errors

All API functions check their parameters. The following errors can be returned:

• FP_INVALID_NAME

• FP_UNKNOWN_OPTION

• FP_WRONG_REFERENCE_ERR

• FP_OUT_OF_BOUNDS_ERR

• FP_CLIPCLOSED_ERR

• FP_POOLCLOSED_ERR

• FP_QUERYCLOSED_ERR

• FP_TAG_CLOSED_ERR

• FP_PARAM_ERR

• FP_OBJECTINUSE_ERR

Refer to the Error Codes chapter in the EMC Atmos CAS API Reference Guide for descriptions of these errors.

Logging

You can enable and define SDK logging through API calls, through a configuration file that can optionally be polled, or through logging environment variables.

Note: For application flexibility, EMC recommends using API calls for your logging environment. Chapter 11, Logging Functions, in the EMC Atmos CAS API Reference Guide provides details on logging behavior and settings.

Use cases

Table 3 on page 51 shows examples of SDK logging configurations and expected logging behavior. The use cases assume that environment variables are set prior to application startup time.

Table 3 Logging scenarios

Example 1
Logging configuration: The FP_LOGPATH environment variable is set to C:\MyLog.txt.
Logging behavior: FPLogging, the logging mechanism, reads this environment variable and enables logging to the C:\MyLog.txt file.

Example 2
Logging configuration: FP_LOGPATH is set to C:\MyLog.txt. FP_LOG_STATE_PATH is set to C:\MyLogState.txt, which defines FP_LOGPATH = C:\log.txt.
Logging behavior: Based on the order of precedence, FPLogging gives priority to FP_LOG_STATE_PATH. The log output is directed to C:\log.txt.

Example 3
Logging configuration: Although FP_LOGPATH is set to C:\MyLog.txt, the application creates an FPLOGSTATE object that points to C:\log.txt and calls FPLogging_Start().
Logging behavior: FPLogging detects the FP_LOGPATH on startup and starts logging to C:\MyLog.txt. The FPLogging_Start call closes the C:\MyLog.txt log file and starts logging to C:\log.txt.

Example 4
Logging configuration: Although FP_LOGPATH is set to C:\MyLog.txt, the FPLogState.cfg is in the path and points to C:\log.txt.
Logging behavior: FPLogging directs logging to the C:\log.txt file without checking the environment variables.

Example 5
Logging configuration: FPLogState.cfg is in the path and points to C:\MyLog.txt. The application uses FPLogging_Start with an FPLogState configuration to C:\log.txt.
Logging behavior: FPLogging automatically starts logging to C:\MyLog.txt. The FPLogging_Start call closes the C:\MyLog.txt file and directs logging to the C:\log.txt file.

Example 6
Logging configuration: No environment variables are set. The application calls FPLogging_Start(), using an FPLogState configuration to C:\log.txt.
Logging behavior: Logging is initially disabled. The FPLogging_Start call directs logging output to C:\log.txt.
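The order of precedence implied by the six scenarios in Table 3 can be summarized as a small decision function. This is a reading of the table, not SDK code: LogSources and effective_log_path are hypothetical names, and placing FPLogState.cfg ahead of FP_LOG_STATE_PATH is an inference from Example 4, which says the environment variables are not checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of where a log path can come from; not an SDK type.
 * A NULL member means that source is not configured. */
typedef struct {
    const char *env_logpath;       /* FP_LOGPATH environment variable */
    const char *env_state_logpath; /* FP_LOGPATH defined in the file named
                                      by FP_LOG_STATE_PATH */
    const char *cfg_logpath;       /* FP_LOGPATH from an FPLogState.cfg
                                      found in the path */
    const char *api_logpath;       /* log path of the FPLogState passed to
                                      FPLogging_Start(), once called */
} LogSources;

/* Returns the path that ends up receiving log output, or NULL when
 * logging is disabled. Sources are checked from highest to lowest
 * precedence as inferred from the logging scenarios. */
const char *effective_log_path(const LogSources *src)
{
    assert(src != NULL);
    if (src->api_logpath != NULL)
        return src->api_logpath;        /* Examples 3, 5, and 6 */
    if (src->cfg_logpath != NULL)
        return src->cfg_logpath;        /* Example 4 */
    if (src->env_state_logpath != NULL)
        return src->env_state_logpath;  /* Example 2 */
    return src->env_logpath;            /* Example 1, or NULL: disabled */
}
```

The function reports only the final destination. As Examples 3 and 5 describe, logging may briefly go to a lower-precedence path until FPLogging_Start() is called.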

Use case description

An application administrator wants to be able to toggle logging on and off without having to restart the application. Prior to starting the application, the administrator sets the environment variable FP_LOG_STATE_PATH = C:\CenteraSDKLogSetup.cfg and creates the following log configuration file, as shown:

##################################
# Centera SDK Log Setup
##################################

FP_LOGPATH=<NULL>
FP_LOGLEVEL=4
FP_LOGFORMAT=TAB
FP_LOGFILTER=ALL
FP_LOGKEEP=OVERWRITE
FP_LOG_DISABLE_CALLBACK=FALSE
FP_LOG_STATE_POLL_INTERVAL=0
FP_LOG_MAX_SIZE=FP_LOG_SIZE_UNBOUNDED
FP_LOG_MAX_OVERFLOWS=1

The administrator then enables logging to a file by modifying the FP_LOGPATH field:

FP_LOGPATH=C:\CenteraSDKLog.txt

The next log event results in the creation of the given log file (as polling is configured for all log events). The reverse steps may be taken to disable logging.

Unicode and wide character support

Atmos CAS uses the Unicode Worldwide Character Standard (ISO-10646) to encode and retrieve metadata. Unicode support makes it possible to add metadata in any language. You do not need to use code pages for each different language or set of languages.

Functions that accept string arguments have a wide character and three Unicode variants. Each variant has a suffix indicating its type of encoding support.

For example, the function to open a pool (FPPool_Open (const char *inPoolAddr)) accepts a string argument (the connection string). Therefore, the FPPool_Open function has the following wide character and Unicode variants:

FPPool_OpenW (const wchar_t *inPoolAddr)
FPPool_Open8 (const FPChar8 *inPoolAddr)
FPPool_Open16 (const FPChar16 *inPoolAddr)
FPPool_Open32 (const FPChar32 *inPoolAddr)

Wide character calls support the UCS-2 standard for architectures supporting 2 byte wide characters.

Whereas 7-bit ASCII characters (00-7F) are fully portable to the Unicode standard, extended characters between 80 and FF may have different hex codes in the PC-850/PC-437 codepage and the ISO-8859-1 (Latin-1) codepage, particularly with respect to accented characters.

Thus, EMC does not recommend using locales to specify metadata for Latin-1 character API calls.

For example, relying on locales, if you build an application on a Windows system using the Latin-1 character Atmos CAS API routines and specify data (like an accented French character), a UNIX application will not be able to read the data correctly using the Latin-1 character API calls.

Therefore, EMC recommends that you build non-English applications with Unicode API calls to avoid any possible confusion.

Be aware that some C/C++ compilers do not properly translate Unicode characters. Compiling source code containing hard-coded wide-character strings, for example FPClip_CreateW(vPool, L"àáâãäåçèéêë"), may result in the error FP_PARAM_ERR. EMC strongly recommends against hard-coding wide-character strings in source files. For maximum compatibility, the application should obtain them through a GUI or read them from an external file (preferably encoded in UTF-8). If a hard-coded string is absolutely required, all compilers recognize the following syntax:

const wchar_t wide_string_with_accents[] = {0xE0,..,0x00}

where you use Unicode numbers for each character and add a terminating null character at the end of the string.

Note: A UTF-8 encoded character may take up to 6 bytes, while a UTF-16 encoded character may take up to 4 bytes. Therefore, care should be taken when allocating UTF-8 and UTF-16 related buffers.
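To size a UTF-8 buffer exactly rather than by worst case, an application can add up the encoded length of each code point. The sketch below follows the current UTF-8 definition (RFC 3629), under which a code point occupies at most 4 bytes; the 6-byte figure in the note corresponds to the original ISO 10646 definition and remains a safe upper bound. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of bytes needed to encode one Unicode code point in
 * UTF-8 per RFC 3629, or 0 if the value cannot be encoded (a surrogate
 * or a value above U+10FFFF). Hypothetical helper, not an SDK call. */
size_t utf8_encoded_len(unsigned long cp)
{
    if (cp < 0x80UL)
        return 1;
    if (cp < 0x800UL)
        return 2;
    if (cp >= 0xD800UL && cp <= 0xDFFFUL)
        return 0;                 /* surrogates are not valid code points */
    if (cp < 0x10000UL)
        return 3;
    if (cp <= 0x10FFFFUL)
        return 4;
    return 0;
}

/* Returns the exact buffer size, including the terminating NUL, needed to
 * hold the given code points as UTF-8, or 0 if any of them is invalid. */
size_t utf8_buffer_size(const unsigned long *cps, size_t count)
{
    size_t total = 1, i;          /* 1 for the terminating NUL */

    assert(cps != NULL || count == 0);
    for (i = 0; i < count; i++) {
        size_t n = utf8_encoded_len(cps[i]);

        if (n == 0)
            return 0;
        total += n;
    }
    return total;
}
```

For example, the accented character à (U+00E0) needs 2 bytes in UTF-8 even though it is a single byte in Latin-1, so a buffer sized for the Latin-1 string would be too small.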

Synchronization in multithreaded programs

The Atmos CAS API is thread safe with the exception of stream operations. Refer to the concurrency requirements specified within each function description in the EMC Atmos CAS API Reference Guide.

Increasing performance

To increase performance in a multithreaded environment, have all threads share one FPPoolRef instead of opening one FPPoolRef per thread.

Retry function

API calls can potentially fail due to permanent or circumstantial errors. A permanent error is, for example, a wrong parameter in your application or a file that could not be located. A circumstantial error is, for example, a timeout caused by traffic overload or the server being too busy. The system can resolve circumstantial errors by retrying the operation.

API calls that access the Atmos CAS system can encounter client-server network errors. In this instance, the API retries the failed call. The following API calls automatically retry:

• FPPool_Open()/FPPool constructor

• FPPool_GetPoolInfo/FPPool.getPoolInfo

• FPClip_Open/FPClip constructor

• FPClip_Exists/FPClip.Exists

• FPClip_Delete/FPClip.Delete

• FPClip_AuditedDelete/FPClip.AuditedDelete

• FPClip_Write/FPClip.Write

• FPTag_BlobWrite/FPTag.BlobWrite

• FPTag_BlobWritePartial/FPTag.BlobWritePartial

• FPTag_BlobRead/FPTag.BlobRead

• FPTag_BlobReadPartial/FPTag.BlobReadPartial

• FPPoolQuery_Open()/FPPoolQuery constructor

Changing the retry settings

Use the C function FPPool_SetGlobalOption() or the Java method setGlobalOption() to change the default retry settings or to switch the retry mechanism off. Refer to the EMC Atmos CAS API Reference Guide for more information about these functions. To change the retry settings, use either of the following options:

• FP_OPTION_RETRYCOUNT

This option specifies how many times a retry should be executed. The default value is 6.

• FP_OPTION_RETRYSLEEP


This option specifies how long, in milliseconds, the SDK sleeps before it executes the failed API function call again. The maximum value is 100000 ms. If no retry sleep value has been defined, the SDK uses an exponential backoff scheme: the sleep time starts at 1 second and doubles after each retry.
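To make the default backoff concrete, the following sketch computes the sleep before each retry and the worst-case total. It is an illustration only: the helper names are hypothetical and not part of the SDK, and it assumes the doubling continues unchanged for every retry (this guide does not say whether the 100000 ms maximum also bounds the backoff).

```c
#include <assert.h>

/* Hypothetical helper, not an SDK function: sleep time in milliseconds
 * before the retry with the given zero-based index, assuming the scheme
 * described above (1 second first, doubling after each retry). */
unsigned long backoff_sleep_ms(unsigned int retry_index)
{
    assert(retry_index < 20);   /* keeps the shift inside a 32-bit unsigned long */
    return 1000UL << retry_index;
}

/* Worst-case total sleep if all retry_count retries are used. */
unsigned long backoff_total_ms(unsigned int retry_count)
{
    unsigned long total = 0;
    unsigned int i;

    for (i = 0; i < retry_count; i++)
        total += backoff_sleep_ms(i);
    return total;
}
```

Under these assumptions, the default of 6 retries sleeps for 1, 2, 4, 8, 16, and 32 seconds, so a call that exhausts its retries can block for about 63 seconds in addition to the time spent in the calls themselves.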

Note: Using the retry option might increase the execution time of a function call. This will affect the same thread if it subsequently processes user events or requires a fast response time.

The retry functionality requires marking support in the input stream for FPTag_BlobWrite(). The operation will fail if marking is not supported.

Content Address calculation

When a user presents data to the CAS-enabled node, the system calculates a unique Content Address for that data and then stores the data on the CAS-enabled node using FPTag_BlobWrite(). For every write transaction you can specify when and how the Content Address will be calculated. Refer to the EMC Atmos CAS API Reference Guide for more details.

Note: When writing large files (10 MB or larger) or when using many threads, EMC recommends not calculating Content Addresses before writing to the CAS-enabled nodes for performance reasons.

Content Address collision avoidance

For some applications, it is not acceptable that there is an infinitesimal chance that the same Content Address may be created for different data. The API provides a capability to ensure that a unique Content Address is created for each piece of stored content. Refer to the description of the Content Address Collision Avoidance option with FPPool_SetIntOption() in the EMC Atmos CAS API Reference Guide for more details. When using this option, each piece of content receives a unique identifier that changes whenever that content is stored on the server. The consequence of Content Address collision avoidance is that single-instance storage — sharing blobs by different C-Clips or tags — is not possible anymore.

When collision avoidance is enabled at pool level, using FPPool_SetIntOption(), FPClip_Write() will return: <C-CLIPID><REFID>. For example: 42L0M726P04T2e7QU2445E81QBK7QU2445E81QBK42L0M726P04T2.

Collision avoidance is off by default. FPPool_SetIntOption() lets you turn collision avoidance on and off at pool level. Collision avoidance can also be turned on and off at blob level with FPTag_BlobWrite(). Refer to the EMC Atmos CAS API Reference Guide for more information.


Performance

Fixed overhead and object size

An important characteristic of any architecture involving a network-based storage device is the consideration of a fixed overhead for any given transaction regardless of the size of the data object that is being transferred. Due to this overhead, throughput is highly dependent on the size of the data written to Atmos CAS. For small sizes, you can improve performance by embedding data in the C-Clip Descriptor File (CDF) or aggregating data into a single blob.
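The effect of a fixed per-transaction overhead can be sketched with a simple model: the time for one transaction is the fixed overhead plus the object size divided by the wire rate. The function below is an illustration, not an SDK call, and any overhead or bandwidth figures passed to it are hypothetical values chosen by the caller, not measurements of Atmos CAS.

```c
#include <assert.h>

/* Illustrative model, not an SDK function: effective throughput in
 * bytes/second for one transaction, assuming a fixed per-transaction
 * overhead plus a transfer time proportional to the object size. */
double effective_throughput(double object_bytes,
                            double fixed_overhead_s,
                            double wire_bytes_per_s)
{
    assert(object_bytes > 0.0);
    assert(fixed_overhead_s >= 0.0);
    assert(wire_bytes_per_s > 0.0);
    return object_bytes / (fixed_overhead_s + object_bytes / wire_bytes_per_s);
}
```

With a hypothetical 10 ms overhead and a 100 MB/s wire rate, a 1 KB object achieves roughly 0.1% of the wire rate, while a 10 MB object achieves roughly 91%. This is why embedding or aggregating small objects pays off.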

Because CAS CDF objects are typically small in size (about 1 KB), avoid using GeoParity (erasure code) for CDF objects.

Policy definition

In Atmos, the storage of CAS objects is policy-driven. Policies determine how and where data is stored. A policy defines the number of local (synchronous) and remote (asynchronous) copies (replicas) an Atmos object can have. In addition, a policy refers to specific storage service attributes that determine data placement and data transformation methods. The combination of these storage service attributes affects a system’s overall performance.

For example, data placement can write objects to disks in a round robin order, keep disks active only when written to, use non-parity-based striping by putting blocks of data of the same replica on different disks, or write data to the disk with the most available space.

Similarly, data transformation factors like compression, single instancing (dedup), checksum, or a combination of compression and dedup specify how data is stored on replicas.

The EMC Atmos Administrator’s Guide provides detailed information on creating and managing policies. The “EMC Atmos Planning Considerations” document describes the impact of storage service attributes on system performance.

Embedding data in the CDF

Having Atmos CAS-enabled nodes manage two objects to store your data—a CDF and a blob—has twice the overhead of managing a single object. If your data size is small (generally less than 100 KB), EMC recommends that you embed the data in the CDF.

Note: The maximum data size that you can embed is 100 KB. After encoding the data, the SDK checks whether the total size of the data exceeds 100 KB. If it does, the SDK returns an error.

By default, data is stored as a blob and the CDF stores only the blob’s Content Address. When you embed data, your data is stored directly in the CDF as a tag attribute. The storage method is transparent to the client application—the blob functions (FPTag_BlobRead(), FPTag_BlobWrite(), and so on) behave identically for embedded and linked data.

Your application can define a content size threshold under which data will be embedded in the CDF. Refer to the FPPool_SetGlobalOption() function. You can also explicitly control whether or not to embed data when you call FPTag_BlobWrite().


Note: Embedded data does not benefit from single-instance storage. For linked data, a single copy of identical content is stored on the system. For embedded data, identical content is stored multiple times. Therefore, you should consider the tradeoff between improved performance and the possibility of reduced storage capacity.

Aggregating data in a single blob

Another approach for managing small objects is to aggregate them into a single blob. This technique is called containerization. For example, you can take many Call Detail Records (CDRs) from a telecom switch and store them as a single Atmos CAS object. You can also store the offset and length for each component object as an attribute associated with the blob tag. You can then use FPTag_BlobReadPartial() to retrieve the individual component objects from the container.
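The offset-and-length bookkeeping behind containerization can be sketched as follows. This is a minimal in-memory model with hypothetical names (Container, container_append, container_read); it uses no SDK calls. In a real application the container buffer would be written as one blob, the index entries would be stored as tag attributes, and the memcpy() in container_read() would be replaced by a partial blob read at the recorded offset and length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CONTAINER_BYTES 4096
#define CONTAINER_SLOTS 16

/* Offset and length of one component object inside the container. */
typedef struct {
    size_t offset;
    size_t length;
} Slot;

/* Hypothetical in-memory container; stands in for a single blob. */
typedef struct {
    unsigned char data[CONTAINER_BYTES];
    size_t used;
    Slot index[CONTAINER_SLOTS];
    size_t count;
} Container;

/* Append one record and remember where it landed. Returns 0 or -1. */
int container_append(Container *c, const void *rec, size_t len)
{
    assert(c != NULL && rec != NULL);
    if (c->count == CONTAINER_SLOTS || len > CONTAINER_BYTES - c->used)
        return -1;
    memcpy(c->data + c->used, rec, len);
    c->index[c->count].offset = c->used;
    c->index[c->count].length = len;
    c->used += len;
    c->count++;
    return 0;
}

/* Copy record i into out. Returns its length, or 0 if i is out of range
 * or out is too small. Models a partial read of the container. */
size_t container_read(const Container *c, size_t i, void *out, size_t cap)
{
    assert(c != NULL && out != NULL);
    if (i >= c->count || c->index[i].length > cap)
        return 0;
    memcpy(out, c->data + c->index[i].offset, c->index[i].length);
    return c->index[i].length;
}
```

Keeping the index outside the blob data means a reader can fetch a single record without downloading the whole container.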

There are several drawbacks to aggregating data:

• Aggregated data does not benefit from single-instance storage.

• The retention period applies to the aggregated data; you cannot assign different retention periods to the individual data objects.

Note: In Atmos 2.0, Atmos CAS does not enforce retentions set on CAS objects.

• You cannot delete individual data objects without rewriting the container object.

Threads

To optimize Atmos CAS performance, EMC recommends the use of multithreaded applications to increase the maximum transfer rate. The Atmos CAS architecture is highly parallel and supports multiple parallel activities. A single thread cannot take advantage of multiple nodes. However, it is important to optimize concurrent activity on the system because too much activity can lead to tasks competing for resources. For example, tasks requiring disk resources on storage nodes can create bottlenecks that cause performance degradation.

Because threads are distributed evenly over available nodes, the number of nodes influences the number of threads that can be supported. Typically, a 32-node system performs better than a 16-node system. The number of threads to use depends on how many access nodes are in operation; the supported capacity is 30 threads per access node.

Generally, when you run your application against comparable Atmos CAS configurations, you can increase the throughput of small files by increasing the number of threads. For example, 50 KB files are written at a higher rate (files or objects per second) with 30 concurrent threads than with a single thread. Likewise, if you have 1000 objects that need to be transferred to Atmos CAS, it is more efficient to create several threads and perform write transactions simultaneously than it is to perform all 1000 writes one at a time in the same thread.


Note: EMC recommends that you keep the total number of threads below 30 times the number of nodes with the access role when writing small files (< 500 KB), or below 10 times the number of nodes with the access role when writing large files (> 500 KB).
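The recommendation in the note above can be expressed as a small helper. The function name and constant are hypothetical, 1 KB is assumed to mean 1024 bytes, and files of exactly 500 KB (which the note does not cover) are treated as large; adjust these assumptions to suit your environment.

```c
#include <assert.h>

#define SMALL_FILE_LIMIT_BYTES (500UL * 1024UL)   /* assumes 1 KB = 1024 bytes */

/* Hypothetical helper, not an SDK function: exclusive upper bound on the
 * total number of threads, i.e. the thread count should stay below the
 * returned value. Files of exactly 500 KB are treated as large here. */
unsigned int thread_upper_bound(unsigned int access_nodes,
                                unsigned long typical_file_bytes)
{
    assert(access_nodes > 0);
    return access_nodes *
           (typical_file_bytes < SMALL_FILE_LIMIT_BYTES ? 30u : 10u);
}
```

For example, with 4 access nodes the bound is 120 threads for 50 KB files and 40 threads for 10 MB files.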

Application server

The Atmos CAS architecture is not the only factor in performance; the application and the application server configuration also matter. The memory on an application server is very important, especially if a large number of C-Clips, or even a few very large C-Clips (containing tens of thousands of tags), are in use. A good network stack is also important. Dedicated application servers that operate with 2, 4, or 8 processors also deliver better performance.

The number of application servers writing to a single Atmos CAS system influences the overall number of files stored and bandwidth utilization of the CAS-enabled nodes. In a test using 1, 2, 4, and 6 application servers, with each application using 10 threads to write, the impact of multiple application servers was measured on an Atmos CAS configured with 4 access nodes and 28 storage nodes. Performance improves significantly as additional pairs of application servers are added.

Pluggable SDK MD5 module

MD5 Library API

The pluggable SDK MD5 module provides client applications with an alternative means to implement the MD5 algorithm used by the SDK. This module enables the SDK to communicate with another MD5 library or algorithm to calculate the Content Address of blobs being written to or read from Atmos CAS.

Note: For IBM z/OS, EMC provides a DLL (FPMDZDLL) that utilizes assembler routines to minimize the CPU time for MD5 calculation. This module is unrelated to any IBM Cryptographic hardware or software. As this module is preconfigured for use by the SDK, the application can utilize this DLL without taking additional action.

If an application wants to develop its own MD5 implementation for the SDK to use, the application must implement the function definitions listed in the SDK’s publicly provided FPMD5_Interface.h file. The application library must be placed in the search path. The shared library must be called libFPMD5_Lib32.so (or libFPMD5_Lib64.so).

Note: You can use the pluggable MD5 module only with Atmos CAS API v3.1 or higher.

MD5Lib_CreateContext

Syntax void* MD5Lib_CreateContext(void);

Return value void*

Description This function creates, initializes, and returns a newly allocated context that is used as a parameter for all other MD5 library APIs.


Error handling MD5Lib_CreateContext() returns the newly created context or NULL on error.

MD5Lib_Update

Syntax int MD5Lib_Update(void* outContext, char* inBuf, unsigned long inBufLen);

Return value int

Input parameters void* outContext, char* inBuf, unsigned long inBufLen

Description This function performs the MD5 block update operation. It continues an MD5 message-digest operation, processes another message block, and updates the passed-in context.

Parameters • void* outContext: The returned value (if not NULL) from the MD5Lib_CreateContext() function.

• char* inBuf: The buffer that holds the data to be hashed.

• unsigned long inBufLen: The length of inBuf.

Error handling MD5Lib_Update() returns 0 if successful or a negative value on error.

MD5Lib_Final

Syntax int MD5Lib_Final(unsigned char outDigest[16], void* inContext);

Return value int

Input parameters unsigned char outDigest[16], void* inContext

Description This function finalizes an MD5 operation. It ends an MD5 message-digest operation, writes the message digest, and zeroes the context.

Parameters • unsigned char outDigest[16]: The buffer that is populated with the 128-bit MD5 hash.

• void* inContext: The returned value (if not NULL) from the MD5Lib_CreateContext() function.

Error handling MD5Lib_Final() returns 0 if successful or a negative value on error.

MD5Lib_DeleteContext

Syntax int MD5Lib_DeleteContext(void* inContext);

Return value int

Input parameters void* inContext

Description This function releases the resources associated with the passed-in context.

Parameters • void* inContext: The returned value (if not NULL) from the MD5Lib_CreateContext() function.


Error handling MD5Lib_DeleteContext() returns 0 if successful or a negative value on error.

MD5 code examples

The following code examples implement the SDK MD5 module on Linux and Solaris. The Linux version uses the MD5 routines from OpenSSL, while the Solaris version uses the MD5 library included with the Solaris operating system (/usr/lib/libmd5.so; see "man libmd5" for more information).

OpenSSL

/*
 * To compile:
 *   gcc -o libFPMD5_Lib32.so -Wall -pedantic -Wno-long-long \
 *       -D_GNU_SOURCE -g -DPOSIX -I$SDK/include \
 *       -I/usr/local/ssl/include -L/usr/local/ssl/lib/ \
 *       -shared fpmd5_openssl.o -ldl -lcrypto \
 *       -Wl,--rpath,/usr/local/ssl/lib
 *
 * (where $SDK refers to top level of SDK directory)
 *
 * Then put this shared library in the linker's search path (either
 * by the existing linker config or via LD_LIBRARY_PATH). It must be
 * named libFPMD5_Lib32.so.
 */

#include <stdlib.h>
#include <stdio.h>

#include "FPMD5_Interface.h"
#include "openssl/md5.h"

/* Allocate and initialize an MD5 context; returns NULL on error. */
EXPORT void* DECL MD5Lib_CreateContext(void) {
    MD5_CTX *ctx;
    if ((ctx = malloc(sizeof(MD5_CTX))) == NULL) {
        perror("malloc");
        return NULL;
    }
    MD5_Init(ctx);
    return ctx;
}

/* Hash another block of data into the context. */
EXPORT int DECL MD5Lib_Update(void* ioContext, char* inBuf, unsigned long inBufLen) {
    /* Pass the length through uncast so that it is not truncated to 32 bits. */
    MD5_Update((MD5_CTX *) ioContext, (void *) inBuf, inBufLen);
    return 0;
}

/* Write the 16-byte digest and finish the operation. */
EXPORT int DECL MD5Lib_Final(unsigned char outDigest[16], void* inContext) {
    MD5_Final(outDigest, (MD5_CTX *) inContext);
    return 0;
}

/* Release the context allocated by MD5Lib_CreateContext(). */
EXPORT int DECL MD5Lib_DeleteContext(void* inContext) {
    if (inContext != NULL) {
        free(inContext);
    }
    return 0;
}


Solaris

/*
 * To compile 32 bit version:
 *   cc -DPOSIX -I$SDK/include -o libFPMD5_Lib32.so -G -K pic \
 *      fpmd5_solaris.c -lmd5
 *
 * To compile 64 bit:
 *   cc -xarch=v9 -DPOSIX -I$SDK/include -o libFPMD5_Lib64.so \
 *      -G -K pic fpmd5_solaris.c -lmd5
 *
 * (where $SDK refers to top level of SDK directory)
 *
 * Then put this shared library in the linker's search path (either
 * by the existing linker config or via LD_LIBRARY_PATH). It must be
 * named as shown above.
 */

#include <stdlib.h>
#include <stdio.h>

#include "FPMD5_Interface.h"
#include <md5.h>

/* Allocate and initialize an MD5 context; returns NULL on error. */
EXPORT void* DECL MD5Lib_CreateContext(void) {
    MD5_CTX *ctx;
    if ((ctx = malloc(sizeof(MD5_CTX))) == NULL) {
        perror("malloc");
        return NULL;
    }
    MD5Init(ctx);
    return ctx;
}

/* Hash another block of data into the context. */
EXPORT int DECL MD5Lib_Update(void* ioContext, char* inBuf, unsigned long inBufLen) {
    MD5Update((MD5_CTX *) ioContext, (void *) inBuf, (unsigned) inBufLen);
    return 0;
}

/* Write the digest and finish the operation. */
EXPORT int DECL MD5Lib_Final(unsigned char outDigest[MD5_DIGEST_LENGTH], void* inContext) {
    MD5Final(outDigest, (MD5_CTX *) inContext);
    return 0;
}

/* Release the context allocated by MD5Lib_CreateContext(). */
EXPORT int DECL MD5Lib_DeleteContext(void* inContext) {
    if (inContext != NULL) {
        free(inContext);
    }
    return 0;   /* the function is declared int, so it must return a value */
}

SHA256 algorithm

In v3.2, instead of implementing the MD5 algorithm, client applications can specify the SHA256 hash algorithm for calculation of Content Addresses (CA) by the SDK.

IBM z/OS notes

This section applies specifically to the IBM z/OS SDK.


Note: Mainframe CAS client applications can only be written in IBM C/C++. SAS/C is not supported.

Porting overview

The porting team selected the z/OS IBM C/C++ compiler for the port of the Atmos CAS API to the IBM z/OS because it alone provided the level of open systems compatibility needed to implement the API without substantially rewriting it.

A bridge or glue layer was developed to allow SAS/C applications to execute Atmos CAS API services within the IBM Language Environment (LE). The glue layer creates a persistent LE enclave by attaching a subtask. For simplicity, we chose to attach this subtask when encountering the first FPPool_Open() or FPPool_OpenW() call, and detach it with the final FPPool_Close().

In addition to the LE enclave, the API needs a C runtime environment. This is established by a top-level IBM C/C++ program with a main() function (sometimes called a fetchable main). Such a program may be started with a JCL EXEC statement or via the LINK or ATTACH macros.

For these reasons, top-level assembler programs (even LE assembler) can only use the Atmos CAS API through LINK or ATTACH to a C/C++ main() routine, as does the glue layer.

Branch interfaces not supported

Assembler programs cannot use CALL or any other branch interface to access the API.

No reserved ports for IBM z/OS client

There are no reserved ports for the IBM z/OS client. Atmos CAS-enabled nodes use port 3218.

Memory optimization runtime recommendations for XPLINK

If you are using the XPLINK (XPT) version of the DLLs, IBM recommends that you use the HEAPPOOLS(ON) runtime option. This reduces latch contention in the malloc() and free() functions, which have also been recompiled entirely in XPLINK. If HEAPPOOLS(OFF) is in effect, calls to malloc() and free() from XPLINK applications must pass through glue code. HEAPPOOLS(ON) may increase memory utilization but should result in a net performance benefit.

The XPLINK (XPT library suffix) version of the subtask driver program that supports SAS/C (FPZMAIN) has, as of this release, now been compiled with the following line:

#pragma runopts(heappools(on))

Multiprogramming support

The Atmos CAS IBM z/OS SDK provides multiprogramming support using either subtasks or pthreads. Here are some restrictions on this support:

• Pthreads can only be used when POSIX=ON is set.


• The Atmos CAS IBM z/OS SDK can only execute in multiple subtasks if POSIX=OFF is set.

• When using pthreads, perform only one FPPool_Open() where possible and share its FPPoolRef among all threads. This is recommended but not required.

• When using subtasks, each subtask must perform its own FPPool_Open(). This is required because the Writable Static Areas (WSAs) of the Atmos CAS IBM z/OS SDK DLLs are not shared across subtasks. This differs from pthreads, where the WSA is shared across threads.

• FPStream_CreateFileForOutput() and FPStream_CreateBufferForOutput() are not thread safe.

LRECL, fixed block datasets, and trailing NULLs

When reading BLOBs created and written from datasets using access method parameter RECFM=FB (Fixed Block) format, it is important to use the same logical record length (LRECL) for both datasets.

If, for example, the BLOB is obtained from a dataset defined with an LRECL=40 and it is retrieved to a dataset using LRECL=80, then 40 bytes of every record will be filled with trailing NULLs.

On the other hand, if the original dataset uses LRECL=80 and the restored dataset uses LRECL=60, then two records are written for each original record: one with 60 bytes of data, and the other with 20 bytes of data followed by 40 bytes of trailing NULLs.
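The arithmetic in the two examples above can be checked with a short sketch. The type and function names are hypothetical, and the model assumes that each source record is restored independently into as many target records as it needs, with the remainder padded with NULLs, as the examples describe.

```c
#include <assert.h>

/* Result of restoring one fixed-block source record into a target dataset. */
typedef struct {
    unsigned long records;    /* target records written per source record */
    unsigned long pad_bytes;  /* trailing NULL bytes added per source record */
} RestoreShape;

/* Hypothetical helper, not an SDK function. */
RestoreShape restore_shape(unsigned long source_lrecl, unsigned long target_lrecl)
{
    RestoreShape s;

    assert(source_lrecl > 0 && target_lrecl > 0);
    /* Ceiling division: a partly filled last record still occupies a full record. */
    s.records = (source_lrecl + target_lrecl - 1) / target_lrecl;
    s.pad_bytes = s.records * target_lrecl - source_lrecl;
    return s;
}
```

Under this model, LRECL=40 restored to LRECL=80 gives one record with 40 bytes of padding, and LRECL=80 restored to LRECL=60 gives two records with 40 bytes of padding. Padding is zero only when the target LRECL divides the source LRECL evenly, which is why using the same LRECL for both datasets is the safe choice.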

IBM C/C++

POSIX ON and threads

The POSIX runtime option must be set to ON to use threads. For more information, see “Multiprogramming support.” You can set this option in any of the following ways:

• #pragma statement

• PARM statement

• CEEUOPTS

Note: CEEUOPTS is a CSECT assembled with user options using the CEEXOPT macro. For more information, refer to the IBM publication z/OS V1R6.0 C/C++ Programming Guide, SC09-4765-05.

The following examples illustrate how to set POSIX(ON):

• C Source

#pragma runopts(posix(on))

• JCL Source

PARM='POSIX(ON)/Application_Parameters'


Environment variables

For instructions on how to set Atmos CAS environment variables, see the IBM publication z/OS V1R6.0 C/C++ Programming Guide, SC09-4765-05. With z/OS v1.7, the CEEOPTS DD statement can be allocated to specify LE options, including ENVAR("_CEE_ENVFILE=YOUR.DATASET"), so that the #pragma runopts statement may not be needed.

In an LE multitasking environment, the CEEOPTS specification affects all main tasks, whereas the #pragma statement may be specific to a single main task. The following examples show how to set environment variables with a DD called FPCONFIG:

• C Source:

#pragma runopts(ENVAR("_CEE_ENVFILE=DD:FPCONFIG"))

• JCL Source:

//FPCONFIG DD *
FP_OPTION_OPENSTRATEGY=FP_LAZY_OPEN
CENTERA_CUSTOM_METADATA=ENV1,ENV2
ENV1=ENV1DATA
ENV2=ENV2DATA2
/*

Running an Atmos CAS IBM z/OS application under TSO

When running an Atmos CAS application under TSO, either the TSO PROCLIB must contain a STEPLIB DD statement that includes the Atmos CAS API DLL, or the TSO tsolib command must be executed to include the Atmos CAS API DLL.

Dataset name specification

Depending on the compiler used, dataset names are specified in application programs in several ways. Table 4 shows the dataset names and compiler-style prefixes that are acceptable in the Atmos CAS API, and Table 5 lists the forms that are not supported. Wherever the Atmos CAS API needs a dataset name or a DDName, these specifications apply:

Table 4 Supported dataset name specifications

Dataset name            Description
HLQ.SOMEFILE            Name used without // prefix is taken as a fully qualified dataset name.
'HLQ.SOMEFILE'          Name used with quotes but without // is taken as a fully qualified dataset name.
//'HLQ.SOMEFILE'        IBM/C: name used with // prefix and quotes is a fully qualified dataset name.
//SOMEFILE              IBM/C: name used with // prefix and without quotes is a partial name; the user ID is prefixed as the HLQ (High Level Qualifier).
DD:DDNAME               IBM/C style prefix for DDName.
*DD:DDNAME              IBM/C style prefix for DDName.
//dsn:HLQ.SOMEFILE      SAS/C style prefix. Name must be fully qualified.
dsn:HLQ.SOMEFILE        SAS/C style prefix for dataset name (optional format).
//ddn:DDNAME            SAS/C style prefix for DDName.
ddn:DDNAME              SAS/C style prefix for DDName (optional format).
//tso:SOMEFILE          SAS/C style prefix for TSO file (partial name allowed).
tso:SOMEFILE            SAS/C style prefix for TSO file (optional format; partial name allowed).
//tso:'SOMEFILE.HLQ'    SAS/C style prefix for TSO file (fully-qualified name).
tso:'SOMEFILE.HLQ'      SAS/C style prefix for TSO file (optional format; fully-qualified name).

Table 5 Unsupported file name specifications

Dataset name            Description
///u/cen/file           HFS file name
/u/cen/file             HFS file name
//hfs:/u/cen/file       SAS/C style prefix for hfs file name
hfs:/u/cen/file         SAS/C style prefix for hfs file name (optional format)

Atmos CAS IBM z/OS API memory utilization notes

Table 6 and Table 7 describe memory use by subpool for an application written in IBM C/C++.

In each table, the memory consumed is compared to the memory consumed in the previous release, V2R0M0.

Note that there has been an increase in the use of lower memory (below 16 MB) of 172 KB, or approximately 55%.


There has also been a large increase in the use of upper memory, particularly in subpools 229 and 251.

Table 6 KB of memory used below 16 MB by subpool

Subpool usage   Subpool   V2.01   V2.3
Private         0         12      12
Private         1         4       4
Private         2         0       0
Private         131       0       0
LSQA            205       0       0
LSQA            215       0       0
LSQA            225       0       0
Private         229       0       0
Private         230       84      92
Private         236       52      52
Private         237       72      72
Private         249       0       60
Private         251       60      60
Private         252       0       0
LSQA            255       28      132
Total:                    312     484

Table 7 KB of memory used above 16 MB by subpool

Subpool usage   Subpool   V2.01   V2.3
Private         0         16      16
Private         1         552     804
Private         2         32      1784
Private         131       4       4
LSQA            205       488     1420
LSQA            215       24      32
LSQA            225       52      52
Private         229       528     38692
Private         230       616     2300
Private         236       96      96
Private         237       200     200
Private         249       8       0
Private         251       6284    17732
Private         252       0       532
LSQA            255       8576    8972
Total:                    17476   72636

Note: These tables show the subpool environment as it exists within the EMC IBM z/OS systems; subpool assignments and utilization may vary at vendor and customer sites. Private subpool usage, other than subpool 0, is determined by IBM/C, LE, and z/OS.

APF-authorization support

The Atmos CAS API supports but does not require APF-Authorization. If you are creating APF-Authorized applications, make sure that the Atmos CAS API DLLs and the IBM LE runtime libraries, CEE.SCEERUN and CEE.SCEERUN2, are loaded from APF-Authorized libraries.

Invoking the Atmos CAS API from assembler

In the SAMPLIB PDS dataset, there is an example program, CB1ASM, that illustrates how to implement an application from a driver written in assembler.

When implementing the Atmos CAS API from a program written in assembler, note the following:

• The Atmos CAS SDK is written in IBM C/C++ and supports a C language interface. As a result, you must establish an LE enclave. In the CB1ASM example program, it is assumed that a C executable has been implemented which performs the Atmos CAS API calls.

• The value of the POSIX flag must be set to off when executing in a multiple subtask-shared address space environment.

• Limiting the number of inter-language calls (that is, assembler to C) can result in more efficient usage of CPU.

The CB1ASM example contains the following logic points:

• Accepts an input parameter.

• Attaches a C executable, passing it the input parameter it received.

• Waits for the C executable to terminate.

• Issues a WTO message with the return code received from the C executable.

• Issues a Detach.

• Terminates.


• You may want to consider the following when implementing an application based on the CB1ASM example program:

• Passing two or more ECBs to the C executable and waiting on multiple ECBs: the attach ECB, which is posted when the C executable returns, and another ECB that is posted when a work unit completes. With this setup, the C executable can stay in a wait state until there are work units to process. Also, the LE enclave for the C executable is set up once and does not require re-initialization for each work unit.

• Passing an address as a parameter to the C executable.

In this case, an address must be formatted into a Zoned/displayable hex value when passed as a parameter and then rebuilt by the C executable into a binary address. When this technique is used, multiple data elements can be passed that are mapped to a C structure and assembler DSECT.
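The C side of this round trip can be sketched as follows. The function names are hypothetical, and the sketch assumes a compiler that provides the C99 header inttypes.h; it shows only the conversion between a binary address and its displayable hex form, not how the string is carried in the parameter list passed on ATTACH.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format an address as displayable hex text.
 * Returns 0 on success, or -1 if buf is too small. */
int address_to_hex(const void *addr, char *buf, size_t buf_len)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buf_len, "%" PRIXPTR, (uintptr_t) addr);
    return (n > 0 && (size_t) n < buf_len) ? 0 : -1;
}

/* Hypothetical helper: rebuild the binary address from its hex text. */
void *hex_to_address(const char *hex)
{
    assert(hex != NULL);
    return (void *) (uintptr_t) strtoumax(hex, NULL, 16);
}
```

A buffer of 2 * sizeof(void *) + 1 bytes is always large enough for the text. Once the C executable has rebuilt the address, it can cast it to a pointer to the C structure that mirrors the assembler DSECT.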

