
Performance and Scalability in PowerPlay Applications

    This document outlines several known facts and theoretical aspects of PowerCube creation in Transformer. It also discusses scalability in PowerPlay Enterprise Server.

Table of Contents

Part I ZEN and the Art of PowerPlay Cube Building 7

CHAPTER 1 Choosing Your Hardware 9
  How Transformer Uses Memory 10
    Setting the PowerPlay DataServer Cache Size 11
    Setting the Memory Available for Data Sort 12
    Operating System Memory Settings 13
    Using Multiple CPUs 14
  How Transformer Uses Disk Space 16
    Configuring Work Locations for Transformer 16
    Handling Very Large Work Files 17
    Disk Configurations 18

CHAPTER 2 PowerPlay Partitioning 19
  When To Focus On Partitioning 20
  Environmental Constraints 21
    Network Bandwidth 21
    Cube Production Window 22
    Acceptable Query Response Time 22
    Server And End User Hardware 22
    After Environmental Constraints 22
  What's New In PowerPlay 6.0 Partitioning 24
    Terminology Refresh 24
    The New Partitioning Scheme 25
    Other Tips 28

CHAPTER 3 Using Server Transformer 31
  The Basics 32
    Installation Checklist 32
    Step-wise Testing 35
    Common Pitfalls 37
  Creating an Effective Working Environment 43
    Directory Structures 43
    Configuration Files 43
    Automation Scripts 47

CHAPTER 4 Cube Build Samples 53
  Cube Build Results 54
    Environmental Settings and Methodology 54
    Row Dominant Model 55
    Category Dominant Model 56
    Row and Category Dominant Model 57

Part II Zen and the Art of PowerPlay Enterprise Server Configuration 59

CHAPTER 5 PowerPlay Web Stress Testing 61
  Test Tools and Environment 62
  Test Results (Windows NT Server) 63
  Enterprise Server Concurrency Test Results 66
  Test Results (Solaris) 67
  PowerPlay Web Sizing Guidelines (NT) 68
    Basic Server Specifications 69
    Memory Requirements 71
    Deciding on the Number of Processes 72

CHAPTER 6 Enterprise Server Tips and Techniques 75
  Restricting Query Processing Time 76
  Setting the Query Processor Threshold 77
  Registering Multiple Cubes by Directory Name 78
  Reducing the amount of HTML Passed to the Web Browser 80
  Passing Security Information to PowerPlay Web from a Web Front End 81
  File Descriptors for PowerPlay Enterprise Server 83


    Part I ZEN and the Art of PowerPlay Cube Building

Nothing can improve either build performance or runtime performance in a PowerPlay application better than taking the time to design your OLAP model and cubes well. However, there are some things that can impact performance for nearly any PowerPlay application. This chapter describes the things you can do to optimize your PowerPlay cube builds. In addition, this section describes in detail how to use Transformer successfully in a UNIX environment.


    CHAPTER 1 Choosing Your Hardware

Choosing the right hardware to build PowerPlay cubes depends on the following things:

    1. Memory. This is probably the single most important aspect of a cube building machine.

    2. CPUs. Transformer can take advantage of two CPUs during the data read phase, thereby achieving faster builds.

3. Disk Space and Configuration. Transformer is I/O intensive as it reads source data, builds temporary work files, and creates PowerCubes. Tuning your environment to provide Transformer the best possible I/O performance will speed up your cube builds.

    4. Networking Considerations. Ensuring fast access between Transformer and its data sources will provide optimal cube build times.

    5. Partitioning. PowerPlay versions 6.0 and 6.5 include a new autopartitioning algorithm that typically builds cubes faster than earlier versions.

    This chapter discusses these issues.


How Transformer Uses Memory

One of the keys to building big cubes is to understand how Transformer uses memory during the various phases of cube building. There are three major phases of cube building:

Category/Work File Generation. Includes reading the data source and processing the rows to create categories and compressed working files.

    Meta-Data Update. Deals with specifying the structure of the cube. Cube structure is some subset of the dimensions, categories and measures available in the model, and it reflects what end-users see in PowerPlay.

    Data Update. Represents (normally) the dominant stage of cube processing in terms of time. This stage includes consolidation, partitioning, and updating the cube with the set of records that apply to the cube.

    The last two phases (Metadata Update and Data Update) most often represent the bulk of cube build time.

    The main variables for controlling how Transformer uses memory are as follows:

UNIX: PPDS_WRITE_MEMORY environment variable. Default: 16MB
Windows NT: WriteCacheSize entry in the [PowerPlay DataServer] section of cognos.ini. Default: 4MB

UNIX: SORTMEMORY environment variable
Windows NT: SortMemory entry in the [Services] section of cognos.ini
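As a concrete sketch, the UNIX variables might be set in rsserver.sh or .profile before a build, as shown below. The values are illustrative only, and this assumes the PPDS_WRITE_MEMORY value is expressed in MB, as its 16MB default suggests; on Windows NT you would set the corresponding cognos.ini entries instead.

    # Illustrative memory settings for a UNIX cube build (tune to your system)
    PPDS_WRITE_MEMORY=32    # PPDS write cache (default 16MB; assumption: value is in MB)
    SORTMEMORY=5120         # sort memory in 2K blocks: 5120 x 2K = 10MB (default 512 = 1MB)
    export PPDS_WRITE_MEMORY SORTMEMORY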


Setting the PowerPlay DataServer Cache Size

The value set for the PowerPlay DataServer Write Cache can affect cube build time in a negative or positive way, depending on how much memory is available. In order to better understand this, you have to be aware of what processes are using memory during a cube build. Assuming that the server on which cubes are being built is dedicated to Transformer, the following processes compete for memory:

the Operating System
the Disk Cache, which is maintained by the Operating System
Transformer, and the Transformer Object Repository (Transformer RP)
Cognos Data Management Services (DMS)
the PowerPlay Data Server (PPDS)

The MetaData Update and Cube Update phases most often represent the bulk of processing time for Transformer. During these phases, PPDS, Disk Cache, and to some extent Transformer RP compete for memory. The amount of physical memory available during these times is critical to better build times.

As a rule of thumb, the best performance is achieved when enough physical memory is available so that the Disk Cache can grow to be as large as the final size of the PowerCube (MDC). The reason is that building an MDC results in many inserts, updates and deletes by PPDS. When the PowerCube is entirely in memory (Disk Cache), cube build performance will be optimal since there is no need for PPDS to access information on the hard disk.

Assigning more memory to PPDS (via the PPDS_WRITE_MEMORY or WriteCacheSize entries) can benefit build time since this will reduce the number of disk I/Os required by PPDS. However, this is beneficial only if it does not reduce the memory available to the Disk Cache. On systems with lots of memory, there is no reason why the user should not consider increasing the PPDS Write Cache from the default setting. However, increasing it indiscriminately can actually cause problems. Increasing the cache size to 32MB or 64MB on large systems can provide performance improvements. Increasing it to a very large setting (hundreds of megabytes) is not generally advised.


A recent cube build for a 65MB cube on Windows NT, on a PC with 512 MB of RAM, yielded a build time improvement of approximately 15%:

WriteCacheSize   Build Time
4 MB             162 Minutes
32 MB            137 Minutes

Setting the Memory Available for Data Sort

The SORTMEMORY variable sets the amount of physical memory that's available to DMS when it's sorting data. Transformer sorts data for consolidation, and sorting is required for auto-partitioning. The number you specify represents the number of 2K blocks used when sorting data. For example, setting SORTMEMORY=5120 provides 5120 x 2K = 10MB of memory for data sorting. The default setting for SORTMEMORY is 512, or 1MB.

As with the PPDS Write Cache, you should only increase the setting for SORTMEMORY if your system's overall physical memory allows it, and if your source data is large.

When sorting data, the data services component of Transformer creates a temporary sort file. You will obtain the best performance (and possibly avoid running out of disk space during cube builds) if you control the location of this temporary file. See the section entitled Configuring Work Locations for Transformer on page 16.



Operating System Memory Settings

Most UNIX systems are configured to provide the best possible sharing of system resources between competing processes. This is not an optimal setting for Transformer, which requires as much physical memory as possible.

For example, an HP/UX server might have 2GB of physical memory, but be configured so that no single process can ever obtain more than 67MB. In such cases, Transformer will never obtain as much memory as it needs to perform large cube builds in an efficient way.

    To ensure that Transformer can obtain the memory it requires, make sure that your UNIX server is configured to grant unlimited resources to the rsserver process.

    For HP/UX: You can increase the MAXDSIZ kernel setting.

For Solaris and IBM/AIX, consult your system administrator or your operating system documentation to determine how to tune your kernel settings for Transformer.
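As an illustrative check on HP/UX, you can inspect the current per-process data-segment limit before attempting a large build. The kmtune command shown here is a sketch; the exact kernel-tuning utility varies by HP-UX release, so consult your HP-UX documentation.

    # HP/UX sketch: query the current maxdsiz kernel parameter
    kmtune -q maxdsiz
    # Raising maxdsiz (via SAM or kmtune) requires root privileges and
    # typically a kernel rebuild and reboot.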


Using Multiple CPUs

Transformer versions 6.0 and 6.5 have an option to use multiple CPUs during the data read phase of a cube build. For large data sources, this can significantly improve the overall performance of your cube builds. This feature is called Enable Multi Processing, and it is set on a query-by-query basis in the Query Properties dialog box.

    Note: The Enable multi-processing feature is supported only for the following query types:

Impromptu Query Definition (IQD)
Delimited Field Text
Delimited Field Text with Column Titles

The Enable multi-processing feature allows Transformer to use multiple processors (if they're available) during Transformer's data read phase.


    Running Parallel Instances of Transformer

A second case in which a multi-CPU system can be used to advantage with Transformer is running multiple instances in parallel. This can be required when a large number of cubes must be built in parallel, in a specific time period.

    When running multiple Transformer instances, keep the following in mind:

Each distinct Transformer process should have its own dedicated CPU. If you're using the Enable Multi-Processing feature, then each instance of Transformer should have 2 dedicated CPUs.

Each Transformer instance will use system resources independently of all other instances. Ensure that you have sufficient memory, disk space, and I/O bandwidth to support all instances.

Each Transformer instance will require its own set of configuration files. Do not share directories for temporary files between Transformer instances. A minimal launch script is sketched below.
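The following sketch launches two independent command-line builds under these rules. It is illustrative only: the per-instance HOME trick for giving each instance its own resource file, and the invocation of rsserver with a model file argument, are assumptions — see Chapter 5 of the Transformer for UNIX Guide for the actual command-line syntax.

    # Sketch: two parallel Server Transformer builds, each with its own
    # working environment (paths and invocation syntax are illustrative)
    HOME=/home/cognos/inst1 rsserver /home/cognos/inst1/models/sales.mdl &
    HOME=/home/cognos/inst2 rsserver /home/cognos/inst2/models/finance.mdl &
    wait    # block until both builds complete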


How Transformer Uses Disk Space

Transformer creates several temporary files as it builds cubes. Most of these are documented in the Transformer documentation, with the exception of the DMS temporary sort file. You can find detailed information about temporary file size estimates in the PowerPlay Administrator's Guide.

One file that is not documented in the PowerPlay Administrator's Guide is the DMS temporary sort file. This file is created whenever Transformer has to perform a sort operation, which happens by default when you build cubes. The entry that controls the location of the DMS temporary sort file is:

UNIX: TEMPFILEDIRS environment variable
Windows NT: Temporary entry in the [Services] section of cognos.ini

If you do not set this entry, then Transformer uses whatever location is set up as the temporary file location in either UNIX or NT. Most often this is /usr/tmp (for UNIX) or c:\temp (for NT). If you have not explicitly set up the location for the temporary sort file, it's possible to run out of disk space during a cube build. Set the TEMPFILEDIRS (UNIX) or Temporary (NT) entry so that it points to a disk with space equal to or greater than the size of the cube. You can specify multiple directories separated by semicolons.

    Note: You cannot set this entry for Windows NT using the Transformer UI; you can only set it by editing the cognos.ini file manually.
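For example, a minimal sketch (paths are illustrative; note the quotes on UNIX, since the semicolon separator is a shell metacharacter):

    # UNIX: point the DMS sort file at two large volumes
    TEMPFILEDIRS="/disk3/cognos/tmp;/disk4/cognos/tmp"
    export TEMPFILEDIRS

    # Windows NT equivalent, edited manually in cognos.ini:
    # [Services]
    # Temporary=d:\cognos\tmp;e:\cognos\tmp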

Configuring Work Locations for Transformer

A common mistake is to point all of your file locations to the same physical folder. For example, you might set your DataWorkDirectory, ModelWorkDirectory, CubeSaveDirectory, and so on all to /usr/cognos/work, or something similar. The problem with this approach is that it introduces contention for the same physical drive among multiple processes. All phases of the cube build process will read data from and write data to the same physical device. This does not provide an optimal use of system resources.

A better approach is to specify different locations for each of the major areas in which Transformer creates files, preferably in physical locations associated with different disk controllers. For example, assume that on a Solaris system, there are four physical drives, each with several GB of space, and each controlled by a separate disk controller (Controller1 through Controller4). A good approach might be to define the following locations:

ModelWorkDirectory: /cognos/model/work, on Controller1
DataWorkDirectory: /cognos/data/work, on Controller2
TEMPFILEDIRS: /usr/cognos/temp, on Controller3
All other file locations (CubeSave, ModelSave) in various folders on Controller4
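Expressed as settings, that layout might look like the following sketch. The Directory entries use the resource-file form shown later under Configuration Files; the paths are illustrative, and the comments are for exposition only.

    # Work locations spread across four controllers (illustrative)
    ModelWorkDirectory=/cognos/model/work
    DataWorkDirectory=/cognos/data/work
    CubeSaveDirectory=/cognos/cubes
    ModelSaveDirectory=/cognos/models
    # TEMPFILEDIRS is an environment variable rather than a resource-file
    # entry; set it in rsserver.sh or .profile:
    # TEMPFILEDIRS=/usr/cognos/temp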

Handling Very Large Work Files

In the case of very large cubes (millions of rows of input data and hundreds of thousands of categories), it's possible for Transformer to exceed the maximum allowable size for a file when building its temporary work file. It's also possible to completely fill a volume or disk.

In such cases, you can specify multiple locations for your temporary work files. For NT, you specify these in the Data Temporary Files box in the Directories tab of the Preferences dialog box. For UNIX, you specify multiple paths for the DataWorkDirectory setting. For example, you might specify the following:

DataWorkDirectory=/opt/cognos/work1;/opt/cognos/work2

    Being able to use multiple drives eliminates size limitations set by the operating system. As Transformer creates cubes, it writes temporary files to the specified drives or directories, in the order that you listed them. It moves to the next location when the current one is full. These files are concatenated into one logical file, regardless of which drive they are in.


Disk Configurations

Transformer is I/O intensive. For a large cube build, Transformer can perform thousands of reads and writes to and from its temporary work file, the sort file, and the PowerCube (MDC file). To ensure that your cubes are built as optimally as possible with respect to disk I/O, it's important that your environment be optimized for fast reads and writes. Consider the following when configuring your environment for optimal Transformer cube builds:

    The speed of the physical devices available. When deciding on hardware for use in building large cubes, you will realize the best performance if you have very fast physical disk access times.

The RAID level of your physical storage devices can have an effect on overall cube build time. Which RAID level you choose will have a big effect on the throughput for Transformer. The following table describes the effects of some (but not all) widely used RAID levels:

RAID Level 0 (striping): Optimizes performance at the expense of data redundancy. Data is distributed among disks for performance, with no provision for redundancy. As a result, a disk crash can cause data to be lost across several disks.

RAID Level 1 (mirroring): Emphasizes data redundancy at the expense of performance. Mirroring maintains multiple copies of data to ensure that, in the event of a disk failure, no data is lost.

RAID Level 5 (striping with parity): Attempts to balance performance and data redundancy. Data is distributed across disks along with parity information about the data. In the event of a disk crash, the parity information allows for the reconstruction of data. As a result, system-wide risk is lowered, as no individual disk can cause widespread loss of the data. Although RAID Level 5 is generally faster than mirroring, it can significantly slow down data reads, as it must deal with (ignore) parity information when performing reads.

Because PowerPlay cubes should not be used as a primary storage place for data, RAID Level 0 (striping) is often most appropriate for large cube builds. This will provide optimal cube builds. In the event of a disk failure during a cube build, the cube can be rebuilt from the source data. If data redundancy is a requirement once the cube is built, the cube can be copied to another location if required.


    CHAPTER 2 PowerPlay Partitioning

The goal of partitioning has always been to achieve a satisfactory end user query response time without exceeding the cube production window. Partitioning is a strategy that improves query response time at the expense of cube build time. In PowerPlay 6.x, partitioning occurs automatically using a best-guess approach that can be fine-tuned by the user. Choosing the right partitioning strategy is, however, subject to environmental constraints that include hardware power and the nature of the end user queries performed. This means the default auto-partitioning cannot always pick the right partitioning strategy for every situation.


When To Focus On Partitioning

PowerPlay 6.x partitioning is effective only if the proper OLAP architecture and supporting hardware are in place. No amount of partitioning can correct deficiencies in these areas. The final tuning of query response time then comes down to what you can do with partitioning strategies. The auto-partitioning capability in PowerPlay 6.x does most of the work for you, provided that you supply some simple information concerning partition size. Partitioning is a big-cube philosophy, which is only applicable if you are dealing with millions of rows of source data; partitioning is of little value for small cubes.


Environmental Constraints

Environmental constraints are limitations in the BI architecture that presently cannot be changed.

These constraints impact the range of partition sizes that can be employed and, in general, apply to all cubes. Environmental constraints include:

1. Network bandwidth
2. Cube production window
3. Acceptable query response time
4. Server and end user hardware

    Network Bandwidth

    As a rule of thumb, avoid using large partitions when the PowerCube does not reside on the same machine as PowerPlay Server.

The impact of network bandwidth constraints comes from the new architecture of PowerPlay's Enterprise Server. Communication of information between the PowerCube and the end user can involve multiple networks and hosts. In PowerPlay 6.5, it is also possible to network PowerPlay servers to distribute requests to the right OLAP data source. The effect of partitioning on network traffic is most notable at the PowerPlay Server closest to the cube. Between this server and the cube, the volume of network traffic is proportional to the partition size and number of requests. Using a smaller partition size will support a larger number of queries. Another example of this is end users who employ PowerPlay Personal Server to access a cube shared on the LAN. Query processing is performed by each individual user's Personal Server. Here again the volume of network traffic is proportional to the partition size and number of requests on the LAN.

Where the cube and PowerPlay Server are resident on the same host, network constraints are exchanged for the higher volume limits associated with hardware buses and back-planes, which permit greater partition sizes. These higher limits may shift the constraint to the capability of the CPU to perform query calculations within the required response time.


Cube Production Window

When considering cube production windows, decreasing partition size generally improves query response time by reducing the volume of data required for query calculation. This improvement is at the expense of increasing cube build time. Average query response time reflects the response time needed to keep users happy and the throughput required to service the user community.

Acceptable Query Response Time

This is normally expressed in terms of an average since it varies both with distance from the cube and with the complexity of the reports being executed against the cube. Other factors that can affect query response time include the number of simultaneous requests being processed and network capacity. It is important to understand the fluctuations in workload, like variances in user behavior during the course of the day, in order to verify that the right OLAP architecture and hardware have been employed.

For information about how to gauge the average response time at your site, see Determining Average Query Time on page 70.

Server And End User Hardware

If an OLAP architecture is to support a given number of users, it must be able to execute as many requests per second as it receives. Estimating volumes of incoming requests is where capacity planning should start, so that systems can be designed with a combination of hardware capacity and cube structure to support the request load. It follows that large partitions are only practical on servers that are capable of performing a large number of calculations per second.

    For information about sizing your server, see Basic Server Specifications on page 69.

After Environmental Constraints

Having defined the environmental constraints and determined that the average required response time is still outside acceptable criteria, several steps can be taken to refine cube partitioning strategies. There are two critical strategies that guide successful cube partitioning:

1. Answer the majority of queries from the first or upper partitions (called Summary Partitions).

    2. Try to ensure that information sought by a query can be found within a single partition.


What's New In PowerPlay 6.0 Partitioning

Consider the following when you partition models in PowerPlay 6.x:

PowerPlay 5.x: Autopartitioning not optimized to produce a balance between access time and cube build time.
PowerPlay 6.x: Autopartitioning is quicker and will usually produce a smaller, faster cube than 5.x manual partitioning.

PowerPlay 5.x: No default partitioning.
PowerPlay 6.x: Partitioning is ON by default. The new auto-partitioning feature determines the way to partition any given cube.

PowerPlay 5.x: Manual partitioning only.
PowerPlay 6.x: Supports both manual and automatic partitioning.

PowerPlay 5.x: The only partitioning controls were partition numbers.
PowerPlay 6.x: Partition numbers are assigned by a user when manually partitioning only. This facility will be less commonly used. Instead, an extra tab on the Cube Properties dialog box provides controls that guide the auto-partitioning algorithm, as described below.

PowerPlay 5.x: Consolidation of input records was off by default.
PowerPlay 6.x: Consolidation with Sort is ON by default.

PowerPlay 5.x: Only 2 partition levels were recommended.
PowerPlay 6.x: Up to 5 levels are commonly used.

Terminology Refresh

It is important to remember that a partition level is not the same as a level in a dimension. A partition level is a set of categories that receive the same partition number. These categories can and probably will come from more than one level in a dimension when using the 6.x auto-partitioning algorithm.


The New Partitioning Scheme

PowerPlay version 6.x partitioning (both Manual and Automatic) requires the new cube build algorithm. If you use a feature that forces Transformer to use the default 6.0 cube optimization, then you will not be able to get the advantages of 6.x partitioning. The features that do not utilize the new cube build algorithm are:

Cube Optimization is not set to Auto-Partition
Consolidation set to either NO or YES with Presort
Using externally rolled up measures
Using before-rollup calculated measures (use calculated columns instead)

    Auto-Partitioning

There are two controls used by the auto-partitioning algorithm. In order of importance, they are:

The desired partition size (based on the estimated number of consolidated records)
The number of partitioning passes

The desired partition size is set by a slide control that is designed to help users select sensible values. It makes the trade-off between cube build time and query response time very clear. However, it does not show the effect that a change in partition size has on the partitioning strategy employed. The Desired Partition Size is the main control in determining the partitioning strategy.

Transformer selects groups of categories that satisfy the partition size criteria you specify. It does this in a way that optimizes ad-hoc query performance for that partition size. This means that users should try several different partition sizes that satisfy their environmental constraints. Users will arrive at a suitable partition strategy quickest if they start with larger partition sizes and work toward smaller ones (rather than the other way around).

The number of passes can be adjusted at the end to find the best build time/query performance trade-off. It becomes a fine-tuning tool that safeguards against over-partitioning. Transformer doesn't necessarily use all the passes that you allow it, but it will not use more. Transformer requires one pass for every partition level created. The smaller the partition size, the greater the number of partition levels created (and the longer Transformer will generally take to build the cube). This is why it is recommended that users start with larger partition sizes and work toward smaller ones. During this phase it is important to allow Transformer to use a large number of passes.

Dragging the slider to the right (towards Faster Cube Access) decreases the Desired Partition Size. Dragging the slider to the left increases the Desired Partition Size. Similarly, editing the Estimated Number of Consolidated Records changes the Desired Partition Size accordingly.


    Cutting back the number of passes will not prevent Transformer from producing a partitioned cube. It simply controls the amount of effort Transformer will expend on a given partition size. Transformer always stops partitioning when the number of records in the summary partition drops below the specified partition size. The key point is that later passes may not produce significant improvements in query response time. This is where the number of passes may be refined to get the right query performance/cube build time trade-off.

A useful tip to know is that the last pass that Transformer performs is summary partition consolidation. If an additional pass does not yield a reduction in summary partition size, then this pass can be skipped. (If both the summary partition and the level one partition finish with the same number of records, then it is likely that Transformer would use more passes if you allowed it. It shows that the summary partition has not been consolidated.)

    PowerPlay 6.x Manual Partitioning

There are reasons why you might wish to use manual partitioning in 6.x. Some of these are:

Deciding whether to adopt 6.x auto-partitioning over your 5.x partitioning scheme
Trying to improve on the auto-partitioning strategy
Dealing with large cubes or unusually structured cubes
Tuning specifically for the top 10 reports

Let's consider reasons for using auto-partitioning instead of a 5.x partitioning scheme. If the 6.x auto-partitioned cubes build faster and answer queries at least as quickly as 5.x cubes, then there is no need to compare the actual partitioning strategies. Some time should be spent with auto-partitioning under 6.x since it generally performs very well. If necessary, your 5.x partitioning scheme can be loaded into 6.x, but you must open the model in 6.x and change the default optimization setting in cube properties to auto-partitioning. In addition, you must remove all the manual partition information from your old model; otherwise the auto-partitioning algorithm will not be used. Also, if you want optimal performance, ensure that you don't use any features that may prevent Transformer from using the new cube build engine. See The New Partitioning Scheme on page 25.

Experience has shown that it is possible to improve on auto-partitioning in some situations. This might be achieved by starting with the partition points already selected by the auto-partitioning feature. Note that the moment partitioning level numbers are assigned manually, auto-partitioning is disabled.

Most models have dimensions whose lowest level details are frequently accessed (e.g., product codes). In these dimensions, it is important to manage categories with a high parent-to-child ratio and partition them accordingly. Normally auto-partitioning will do a good job provided you pick a partition size that is big enough. Too many partition levels will adversely affect lowest level detail reports. This is another reason why you should first auto-partition with a large partition size and work down in size to get acceptable query response.

After dealing with large detail dimensions, consider dimensions that are frequently summarized. These dimensions may be less obvious than the detail dimensions. If 80 percent of your user community creates summary reports with categories from a given dimension, consider partitioning at a high level in that dimension. Auto-partitioning will do this automatically so long as the partition size is small enough for Transformer to consider the dimension as a candidate for partitioning. The partition status information will show the partitions chosen by Transformer. Using a small partition size may go against advice offered earlier. This is true if you have both frequently summarized dimensions and dimensions containing large category counts used primarily to report lowest level details. Unfortunately, it is not possible to use different partition sizes on different parts of the cube. The best one can do is to favor either summary reports or lowest level detail reports.

    Excluding Dimensions

The auto-partitioning algorithm does permit the exclusion of dimensions. Dimensions might be considered for exclusion if they are used only for lowest level detail reports. By excluding detail dimensions (which are often large) it will be possible to set the partition size smaller. Cube build time will be faster and summary queries will be serviced quickly. If the same model included partitioning on detail dimensions, lowest level queries would be slower since they are reporting across multiple partitions.

Other Tips

The moment any partitioning level numbers are assigned to either categories or levels, auto-partitioning is turned off and only the categories and levels specified manually are partitioned.

When manually defining partitions in 6.x, the number of auto-partition passes must be set to the number of partition levels assigned manually, or greater. Whether in Automatic or Manual mode, Transformer 6.x requires one pass for each partition level defined.

In 5.x, we made the recommendation that dimensions with alternate drill-downs were poor choices for partitioning. The categories in alternate paths will likely come from multiple partitions, causing slow drill-down response times. In 6.x, auto-partitioning is more equitable in choosing candidate categories for partitioning. (As a result it will frequently partition categories from dimensions that have alternate drill paths, without detrimentally affecting performance.) Also, only the primary drill path is used for partitioning.

Manually defined 6.x partition level numbers don't necessarily define the order in which partitions are placed in the cube. Partition level numbers may be re-ordered by Transformer.

When partitioning manually, rules similar to 5.x partitioning apply. If performance is not adequate after partitioning, repeat the process using a level (in some other dimension) that adequately divides the number of records in the summary partition. For each new level of partitioning, increase the partition number by 1. Try to select categories across the entire dimension and assign these categories the same partition number. Once complete, you should not be able to traverse a path from the root category to any leaf without encountering a category with an assigned partition number.


    CHAPTER 3 Using Server Transformer

In production environments, there is often a desire to leverage the CPU power and resources of a UNIX server to update PowerPlay PowerCubes. Processing production PowerCubes in this fashion enables you to centralize administration and cube storage, and allows PowerCube updates to become a part of the existing data warehouse update process. After the warehouse updates, the PowerCubes are updated. Also, the UNIX server can often access the database faster because of location or network priority.

PowerPlay Server enables you to use Transformer's graphical user interface on the PC to model your two-dimensional data into multidimensional PowerCubes. However, the Administrator Server Edition also enables you to offload the creation of models and PowerCubes to the rsserver program (PowerPlay Server Transformer), which is a version of Transformer that runs on supported UNIX server platforms. This chapter addresses some common issues faced when using PowerPlay Server Transformer, and has two major sections:

Part I covers the basics: an installation checklist, step-wise testing, and some common pitfalls.
Part II discusses how to create an effective working environment by configuring your working areas, customizing where files go, and automating cube builds.


The Basics

Often, users are confused about what the components of PowerPlay Server are and how they interact. Understanding what the pieces do is vital to successfully using the product. The main components of PowerPlay Server Transformer are:

PowerGrid. PowerGrid provides the basic networking services between client and server Transformer. The PowerGrid network daemon (netd) must be started and left running to respond to periodic requests from the client. PowerGrid uses the script rsserver.sh, which sets the necessary environment variables and starts rsserver, to launch Server Transformer. An additional utility, called NetInfo, is installed on the client machine, and can be used to test your defined client/server connections. NetInfo verifies that messages are being sent to netd, and that netd is responding to them.

Server Transformer (rsserver). The rsserver program is the executable for PowerPlay Server Transformer that runs on the UNIX server. The rsserver program provides the engine to create PowerCubes on any of the supported server platforms, such that the processing of data during category generation and cube creation takes place on the server.

    All PowerCube prototyping is still done on the client; however, you must install the PowerPlay Administration Server edition on the client in order to have the options available to process the cubes on the server.

    Installation Checklist

Before installing and working with the product, you should read the Installation Guide and the Transformer for UNIX Guide. The latter focuses wholly on client/server Transformer, and contains explanations of all the environment variables that can be used to customize your working area. Reading these will also familiarize you with where specific information is located so you can reference them later.

  • 33 Performance and Scalability in PowerPlay Applications

    The Basics

Ensure you have the required hardware and software. The supported operating systems and databases are:

Oracle 7.3 (base) and 8 (current):
HP-UX 10.20 (base) and 11.x (current); Sun Solaris 2.5 (base) and 7 (2.7) (current); IBM AIX 4.2 (base) and 4.3 (current); Digital UNIX 4.0 (base) and 4.0E (current); SINIX 5.43 (current)

Sybase SQL Server 11.2 (base), Adaptive Server 11.5 (current), and Sybase Adaptive Server 11.9.2:
HP-UX 10.20 (base) and 11.x (current); Sun Solaris 2.5 (base) and 7 (2.7) (current); IBM AIX 4.2 (base) and 4.3 (current); Digital UNIX 4.0 (base) and 4.0E (current)

Informix Dynamic Server 7.2 (base) and 7.3 (current):
HP-UX 10.20 (base) and 11.x (current); Sun Solaris 2.5 (base) and 7 (2.7) (current); IBM AIX 4.2 (base) and 4.3 (current); Digital UNIX 4.0 (base) and 4.0E (current); SINIX 5.43 (current)

Informix XPS 8.21 (current):
HP-UX 10.20 (base) and 11.x (current); Sun Solaris 2.5 (base) and 7 (2.7) (current); IBM AIX 4.2 (base) and 4.3 (current); Digital UNIX 4.0 (base) and 4.0E (current); SINIX 5.43 (current)

IBM DB2 Common Server 2.1 (base) and its successor, Universal Database 5.x (current):
HP-UX 10.20 (base) and 11.x (current); Sun Solaris 2.5 (base) and 7 (2.7) (current); IBM AIX 4.2 (base) and 4.3 (current)

Consult the README help file accompanying PowerPlay for any changes to the list of supported databases.

Also, before beginning your installation, you should ensure you have everything on the following checklist:

UNIX root password
UNIX login and password for the user who will store server models and cubes
PowerPlay Server CD-ROM
Host name or IP address of the UNIX server
UNIX shell (i.e., csh, sh, ksh). Echo $SHELL to determine this.
You should be able to successfully TELNET from the PC to the UNIX server
If possible, make sure a DBA will be available for questions, especially if you are planning to store cubes in the database. Many DBAs do not like to give out create and drop privileges; they want to be involved in this step.


If possible, have a UNIX System Administrator available.
Double-check the version of UNIX you are using; the command uname -a at the UNIX command line should tell you this. Ensure the version is supported.
Make sure you know what database(s) are used and that the version is supported.
Cognos Customer number. This will enable you to get to the Customer Support area and Knowledge Base on the Cognos Web Site to get common solutions to configuration issues.

    For Oracle

    Ensure that:

the environment variables ORACLE_HOME, TNS_ADMIN, and ORACLE_SID are set up and pointing to the correct locations. For example:

ORACLE_HOME=/train/oracle7.3/products
TNS_ADMIN=$ORACLE_HOME/network/admin
ORACLE_SID=ORAUX73

    the bin directory under Oracle is in the path.

the Oracle login for the schema where you will place your database PowerCubes has create table, create index, and drop table privileges.

OCI via SQL*Net (Oracle 7.3 and 8) or Net8 (Oracle 8) is installed on the PC.

the same TNSNAME (case sensitive) is defined in the client and server TNSNAMES.ORA files.

    you can successfully ping the Oracle Instance using TNSPING from the client PC and the server.
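A quick pre-flight sketch for this checklist, run on the server (the instance name is the ORAUX73 sample from above; tnsping ships with the Oracle networking software):

    # Verify the Oracle environment before building from .IQD sources
    echo "$ORACLE_HOME $TNS_ADMIN $ORACLE_SID"
    tnsping ORAUX73    # should report OK if TNSNAMES.ORA is set up correctly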

    For Sybase

    Ensure that:

the environment variables SYBASE and DSQUERY are set up and pointing to the correct locations. For example:

SYBASE=/train/sybase11/server
DSQUERY=SYBTRAIN (the Sybase Server Name would be listed here)

    the bin directory under Sybase is in the path


    the Sybase login has create database privileges

CT-Lib (the Sybase 11 Client Library) is installed on the client

the same Server Name (case sensitive) is defined in the server INTERFACES file (stored in $SYBASE) and on the client PC (stored in the c:\sybase\ini\sql.ini file)

you can successfully ping the Sybase SQL Server using WSYPING or DSEDIT

    For Informix

    Ensure that:

the environment variables INFORMIXDIR and INFORMIXSERVER are set up and pointing to the correct locations. For example:

INFORMIXDIR=/train/informix
INFORMIXSERVER=INFSOL72 (the Informix Server Name would be listed here)

the Informix login has create privileges

ESQL/C 7.23 for Win32 (Informix Net 7.2 or higher) is installed on the client

the same Server Name is defined in the server.

    you can successfully access Informix using Ilogin.

Step-wise Testing

As discussed earlier, several components make up the PowerPlay Administrator Server environment. With step-wise testing, you ensure that each piece is working before you start using the tool. This way, if an environment setting or configuration step was left out, you can find it right away and have a better chance of isolating the proper area if a problem occurs.

The following steps are provided as guidelines to make sure each piece works individually, and build up to using all of the pieces together:

1. Install the server software and make sure PowerGrid is working.
Install the software on the UNIX server and the PC.
Configure the environment settings on the server. For client/server operations, set the database environment variables and source the setpya.sh script in the rsserver.sh file that resides in powergrid/tcpip/netbin. For command line operations, make the same modifications to the .profile or the .login.
Verify the cognos.ini on the PC is set up correctly to run Transformer Server by ensuring that the Transformer Server service is defined. The entry looks like this:

[Service - Transformer Server]
NETWORK=rsserver.sh

Configure and start the PowerGrid service (netd) on the UNIX server. You may set up logging for netd (see Automation Scripts on page 47 for information about how to do this). Do not forget to assign a port number to PowerGrid in the /etc/services file on the server.
Define a PowerGrid connection and test to ensure PowerGrid is working correctly via netinfo.exe from the PC. See Chapter 2 of the Transformer for UNIX Guide for information about PowerGrid and NetInfo.

2. Make the environment cleaner and nicer to work in (see Creating an Effective Working Environment on page 43).
Create directory structures on the UNIX side for cubes, models, logs, and work directories.
Modify the Transformer Server startup script (rsserver.sh) to add preferences for logging, and so on.
Create a server preferences file (.trnsfrmr.rc).

3. Create a server PowerCube using Transformer Server with an ASCII file.
Create a simple model with an ASCII file (the national.asc sample) and set it up to work in client/server mode.
Move the model file to the server using client Transformer (preferred) or use the FTP method.
Build this on the server.
Copy the PowerCube to the PC and open the cube with PowerPlay Client.

4. Create a server PowerCube using Transformer Server with .IQD files.
Take a model that uses .IQD files and set it up to work in client/server mode.
Move the model file to the server using client Transformer or the FTP method.
Build this on the server.
Copy the PowerCube to the PC and open the cube with PowerPlay Client.

5. Optional (only if you are storing cubes in your database): Create database cubes from the server.
Take a model and build it as a standard cube first (or use the cube from step 4).
Create the tables in the database and allocate space, using double the size of the standard cube as a rough guideline.
Modify the model to build the cube in the database.
Move the model file to the server. This is optional; you can build the model from the client first to make sure it works, then move it to the server.
Recreate the cube as a database cube.

    Common Pitfalls

There are some common pitfalls that users encounter when using PowerPlay Server. Most of these concern environment and configuration settings. The sections below discuss different areas and what to look out for in each.

    Environment settings are everything in UNIX

When using .IQD files to build cubes, you must make sure the client and server database information is in sync. For example, the same TNSNAME must be defined (case sensitive) in the client and server TNSNAMES.ora files for Oracle. Similarly, the same server name (case sensitive) must be defined in the client PC sql.ini file and the server INTERFACES file for Sybase.

PowerGrid uses rsserver.sh when it launches Server Transformer. The rsserver.sh file sets up environment variables to control preferences for the Server Transformer process, and it sets up the necessary environment variables to run Server Transformer. If you are using .IQD files to build your cubes, you will need the appropriate database environment variables set in this file as well. Set these database variables up before the line that calls the setpya.sh script. See rsserver.sh on page 45.

Setpya.sh (or setpya.csh if using csh) is used to set up the PowerPlay environment variables and should be sourced in both the user's .profile and rsserver.sh. This should be done after all the other environment variables are set, as it may depend on specific variables being set first (for example, database-specific environment variables).

    Make sure the TNSNAMES.ora, INTERFACES, or other database specific files contain connection information for all .IQD data sources. If you build cubes with new data sources, the connection information for these new sources needs to be added to these server files.


    A few words about Root

PowerPlay Server Transformer installation files need to be owned by root. PowerGrid not only needs to be owned by root, but the process must be started as root.

    If client/server processes such as model synchronization fail suddenly, make sure that the owner of the rsserver.sh file has not been changed from root to another user after the install.

    PowerGrid Needs its Own Port

Make sure there is an entry for PowerGrid in the /etc/services file or, if NIS is used in your system, you must define the communications service port for the PowerGrid netd in the Services database.

    Verify that the port number used in the connection between client and server matches. The default port number set for PowerGrid is 1526. If this is not the port number configured for PowerGrid in the /etc/services file on your UNIX server, you must change your local port setting to match. See Chapter 2 of the Transformer for UNIX Guide.

    When communicating between client and server, PowerGrid must be running. As a rule of thumb, you should modify the system reboot script so that the PowerGrid netd is started after each reboot. This way it is certain to always be running.
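A sketch of the two pieces involved follows. The service name and the netd startup line are illustrative assumptions; consult Chapter 2 of the Transformer for UNIX Guide for the exact names and options on your platform.

    # /etc/services entry -- the name shown is illustrative; the port
    # (default 1526) must match the PowerGrid port configured on the client
    powergrid       1526/tcp

    # Fragment for the system reboot (rc) script: start netd as root
    /opt/cognos/powergrid/tcpip/netbin/netd &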

    PowerPlay Administrator Server PC Requirements

    You must install the PowerPlay Administration Server client piece on any PC that needs to access the server. This edition enables the options to build cubes on the server.

The cognos.ini file should have the Transformer Server service defined with NETWORK=rsserver.sh (upon install the product should add this line, but double-check this if you are having problems).

    The 32-bit database connection software is needed to run Transformer 6.x and Impromptu.


    Passwords

When working with models, it is often a good idea to save the model as an .MDL file as well as a .PYx file. Passwords are embedded in .PYx models, which are binary files. However, they are not embedded in .MDL models, which are text files. This is correct behavior of the product and is documented in the PowerPlay Administrator's Guide. Therefore, if you have embedded your password into your model, saved it as both .PYx and .MDL, closed the tool and then decided to work with the .MDL model, Transformer will not save the password information in this version of the model. Instead, it will default to prompting for a password in the sign-on. If you try to restore this model from the client to the server and run it on the server, the process will fail because the server cannot prompt for the password.

To avoid this problem, either use the -k option (see below) on the UNIX command line to build the cube, or add the password information to the model, save it as a .PYx file, and use this format to synchronize the server cubes.

One method for moving models from the client to the server and building on the server is to ftp the .MDL version of the model, process the .MDL model on the server into a .PYx, and then build from the UNIX command line. When using this method, since passwords are not embedded in the .MDL file, you should use the -k option of rsserver to explicitly set the password. All rsserver command line options are documented in Chapter 5 of the Transformer for UNIX Guide.
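As a sketch of that method, a scheduled build script might look like the following. Only the -k option is documented above; the exact invocation (passing a model file to rsserver and the password placement) is an assumption — confirm it against Chapter 5 of the Transformer for UNIX Guide.

    #!/bin/sh
    # Hypothetical nightly build of a model delivered by ftp
    cd /home/cognos/models
    # Supply the database password explicitly, since .MDL files do not
    # embed passwords and the server cannot prompt for one:
    rsserver -k"$DB_PASSWORD" nightly.mdl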

If a model suddenly stops working, make sure the passwords used are still valid, since they may change periodically.

Before any communication between the client and server can take place, Transformer will prompt you to connect to the UNIX server with a user ID and password. By default, security is enforced with netd, and the user ID and password from login are checked in the /etc/passwd file OR another password file for TRUSTED systems. Examples of password files for trusted systems are:

AIX: /etc/security/passwd
HP-UX (10 and higher): /tcb/files/auth/
Sun Solaris: /etc/shadow

    Therefore, in order for netd to allow you to connect to the server, the user ID and password information should be located in either the /etc/passwd or one of the above files for authentication, otherwise you will not be allowed to connect.


    Some Server file information

When Transformer Server builds a cube, it creates several temporary files. One is a .QYx file, which is a temporary work file, and another is a .LCK file, which guards the model from being used by other processes at the same time. After work with a model is finished, these files are removed. However, if the Transformer process is killed or otherwise fails abruptly, these files may not be cleaned out. If a .LCK file exists for a model, Transformer sees this as a locked model and does not allow you to use it. If this occurs and no processes are using the model, you can remove the .LCK file to unlock it and continue.
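For example, a cautious cleanup sketch (the model name and the case of the lock-file extension are illustrative; verify first that no live process is actually using the model):

    # Confirm no rsserver process still owns the model, then remove the lock
    ps -ef | grep rsserver
    rm /home/cognos/models/sales.lck    # stale .LCK file left by a killed build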

Note the supported data sources. Server-processed cubes cannot use Impromptu HotFiles or PC format files, such as dBase files. Supported data sources are:

    .IQD files to get data from databases that can be accessed from the UNIX server

    delimited-field text files

    fixed-field text files

There needs to be enough temporary space on the server to build PowerCubes. You can set DataWorkDirectory and ModelWorkDirectory to places that are large enough to force Transformer to use these as work areas. This space should be larger than the expected size of the cube. See Configuring Work Locations for Transformer on page 16.

    Special needs of Database Cubes

    If you are storing your cube in a database, but your database option is not available in Transformer, make sure to update the Database Support section in Cognos.ini by putting 1 after the = for the database you would like to store the cube in. Then the option should be available in Transformer.
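For example, the relevant cognos.ini fragment might look like this; the section name is as described above, but the exact key name for each database is an assumption:

    [Database Support]
    Oracle=1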

While trying to update a database cube, Transformer might return the following error:

(TR1102) Transformer Server detected a model error at line 2 in file
(TR1901) PowerPlay Data Server is already in use in an incompatible mode

This error might indicate another user is connected to the database cube. Have the other user disconnect from the database cube until the cube update is complete. Transformer attempts to apply an exclusive lock to the database when performing a database cube operation; if someone else is already using that database, the lock cannot be applied and an error is returned.

    Space Requirements

If you are building cubes in an RDBMS container, the space requirements are different from those required to build the equivalent Cognos standard cube. This section describes the results of an internal test performed using Oracle. The test is based on a specific model, and actual sizes will vary depending on the specific structure of the client model and the size of the data source:

As a general rule, an Oracle PowerCube will occupy approximately twice the space of its equivalent standard PowerCube. For example, if you build a standard cube that produces an 80MB MDC file, you would need approximately 160MB of Oracle tablespace to build the equivalent RDBMS cube, including indexes. Of the 160MB, Transformer would use approximately 10% for indexes and the rest for data. The end result would be a 144MB data tablespace and a 16MB index tablespace.
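The arithmetic, as a small sketch using the figures above (the 2x and 10% factors come from this one internal test and are rules of thumb, not guarantees):

    # Rough Oracle tablespace sizing from a standard-cube MDC size
    MDC_MB=80
    TOTAL_MB=`expr $MDC_MB \* 2`            # ~2x the MDC size   -> 160
    INDEX_MB=`expr $TOTAL_MB / 10`          # ~10% for indexes   -> 16
    DATA_MB=`expr $TOTAL_MB - $INDEX_MB`    # remainder for data -> 144
    echo "data=${DATA_MB}MB index=${INDEX_MB}MB"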

There are 17 tables used to store a PowerPlay 6.0 cube in an Oracle database. Only a few of these tables store significant amounts of data, with the XXXX_BITMAP table occupying the bulk of the space. The following are suggested extent sizes for PowerPlay 6.x:

Table Name            Initial Extent   Next Extent
XXXX_ACCESS           5K               5K
XXXX_ALLOCATION       5K               5K
XXXX_APPLICATION      5K               5K
XXXX_APPLICATION60    5K               5K
XXXX_BITMAP           1M               1M
XXXX_CAT_INDEX        1M               1M
XXXX_CATEGORY         1M               1M
XXXX_CELL             5K               5K
XXXX_CLASS            5K               5K
XXXX_CURRENCY         5K               5K
XXXX_DIMENSION        5K               5K
XXXX_DRILL            1M               1M
XXXX_EXCHANGE         5K               5K
XXXX_LEVEL            5K               5K
XXXX_MEASURE          5K               5K
XXXX_PARTITION        5K               5K
XXXX_TEST             1M               1M


With very large cubes, it's advisable to change the INITIAL and NEXT extents for the XXXX_BITMAP table to be much larger, depending on the number of times the table can extend. Also, the XXXX_CELL table is only used if crosstab caching is enabled in the PowerCube property sheet (Processing tab). If this option is selected, then you would want to change its extent sizes to 1 MB.

    The default table creation scripts supplied with Transformer do not create the tables and indexes in separate tablespaces. You can modify these scripts to use different tablespaces for the indexes and tables if necessary or desired.

    RDBMS Tunable Parameters

    When creating cubes in an RDBMS, there are certain parameters you can set that may speed up Transformer's performance.


    Oracle: The Fetch Number of Rows entry in the [Misc] section of cogdmor.ini controls how many rows are retrieved during a fetch operation during data read. The default setting is 10; try bumping it up to 50 or 100. For larger cubes, try creating your indexes and tables in separate tablespaces. Note that some of Oracle's default configuration settings conflict with Transformer's requirements.

    Sybase: The NUMOFROWS setting in the [Misc] section of cogdmct.ini controls how many rows the server returns to Client Library for each client fetch request. Bumping it up (for example, to 50 or 100) can have a positive impact on some systems.

    DB/2: In the cognos.ini file, the following entry controls the level of locking for DB/2 during cube updates:

    [PowerPlay Server - DB2]
    security=0/1

    The default setting is 1 (security enabled). When you disable security, you will not be able to tell if someone is using the cube, and a cube build will overwrite an existing cube. This won't necessarily be apparent to the client, but data could change during a client session. You should ensure that no users are connected when you build cubes.
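    As a concrete sketch, the Oracle and Sybase entries might look as follows in their respective .ini files (the value 100 is just a suggested starting point from the tips above, not a tuned recommendation):

    In cogdmor.ini:
    [Misc]
    Fetch Number of Rows=100

    In cogdmct.ini:
    [Misc]
    NUMOFROWS=100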



    Creating an Effective Working Environment

    Directory Structures

    For the purposes of discussion, assume that PowerPlay Server Transformer is installed in the directory /opt/cognos. This directory contains all of the files installed with the product.

    Also, assume that the corresponding working directory is located in /home/cognos. This directory contains all of the models, log files, cubes, scripts, and temporary workspace used by PowerPlay Server Transformer. We could have put this working area in the same directory where the product is installed. However, most UNIX applications keep work areas separate from the installation, so it's best to follow that convention. The sub-directories that you could create are:

    Location                  Description
    /home/cognos/cubes        The directory where the cubes are saved.
    /home/cognos/models       The directory where the models are saved.
    /home/cognos/log          The directory where log files are saved.
    /home/cognos/scripts      The directory where all scripts are saved.
    /home/cognos/scripts/bin  The directory where script utilities (buildscript) are kept.
    /home/cognos/work         The directory used for work or temporary space.

    Configuration Files

    The following is a listing of files used by PowerPlay Server Transformer that have either been created or edited after installation, along with explanations. The only change that must be made to ensure proper usage is to the file ".profile". The changes to the other files are not absolutely necessary to run PowerPlay Server Transformer, but they either make the product easier to use or add more logging capabilities.

    .trnsfrmr.rc

    /home/cognos/.trnsfrmr.rc: This is a resource file that was created after installation. This file contains default settings for PowerPlay Server Transformer to use when


    executed on the UNIX command line (not when executed via PowerGrid from the PC):

    FilenameVariables=true
    ModelWorkDirectory=/home/cognos/work
    DataWorkDirectory=/home/cognos/work
    DataSourceDirectory=/home/cognos/work
    CubeSaveDirectory=/home/cognos/cubes
    ModelSaveDirectory=/home/cognos/models
    LogFileDirectory=/home/cognos/log
    LogDetailLevel=6

    .profile (or .login)

    /home/cognos/.profile: The UNIX system administrator created this sample login file when the user "cognos" was added to the system. The items shown below were added to ensure the proper environment variables are set for PowerPlay Server Transformer when the user logs in to the system. The directories /home/cognos/scripts and /home/cognos/scripts/bin are added to the path; these directories contain the scripts created to automate the building process. Also, the setpya.sh file is called to set up the PowerPlay environment variables. It is usually safest to add this call as the last line of the file, since it may depend on other environment variables being set first (for example, database-specific environment variables).

    TERM=vt100; export TERM
    ORACLE_TERM=vt100; export ORACLE_TERM
    ORACLE_HOME=/fin030601/oramtmo/7.3.2.2; export ORACLE_HOME
    ORACLE_SID=sin3dev1; export ORACLE_SID
    . /usr/local/bin/oraenv

    #PATH=$PATH:.
    PATH=/usr/local/bin:/usr/bin:/fin030601/oramtmo/7.3.2.2/bin:/home/cognos/scripts/bin:/home/cognos/scripts
    EDITOR=vi; export EDITOR

    # Set up the shell environment:
    # Set -u
    trap "echo logout" 0

    stty erase "^H" kill "^U" intr "^C" eof "^D"

    . /opt/cognos/pya65_278/setpya.sh


    rsserver.sh

    /opt/cognos/powergrid/tcpip/netbin/rsserver.sh: This file is installed with PowerPlay Server Transformer and is used by PowerGrid to set the necessary environment variables and start the rsserver process. It is only used to start the rsserver process when executing tasks from the PC (via PowerGrid) on the server. The items shown below contain default settings for PowerPlay Server Transformer to use when executing tasks from the PC. Also note that the necessary database environment variables are set near the top of the file.

    #!/bin/sh
    # $Header: /High Sierra/UNIX/setup_ux/RSSERVER.SH 4 3/11/97 11:01a Taos $
    # $Author: Taos $

    # rsserver.sh - Bourne-Shell script to launch Transformer via PowerGrid

    # set environment variable for Oracle
    ORACLE_HOME=/fin030601/oramtmo/7.3.2.2 ; export ORACLE_HOME
    . /usr/local/bin/oraenv

    # set environment variables required to run rsserver
    PYA_USR=${PYA_USR:=/opt/cognos/pya65_278}
    . $PYA_USR/setpya.sh

    if [ x"$DSQUERY" = x ] ; then DSQUERY=SYBHPUX491 ; export DSQUERYfiif [ x"$PLANG" = x ] ; then PLANG=english ; export PLANGfi

    # You may modify values and remove the comment indicator # to use
    # these environment variables to modify Transformer's behaviour.
    # Preference values established here override any set by setpya.sh above.
    # The documentation describes these and many additional controls.
    DataSourceDirectory=/home/cognos/work ; export DataSourceDirectory
    LogFileName=/home/cognos/log/$$.log ; export LogFileName
    ModelSaveDirectory=/home/cognos/models ; export ModelSaveDirectory
    CubeSaveDirectory=/home/cognos/cubes ; export CubeSaveDirectory
    DataWorkDirectory=/home/cognos/work ; export DataWorkDirectory

    # launch Transformer in PowerGrid mode...
    exec $PYA_USR/bin/rsserver -w


    netpgrc

    /opt/cognos/powergrid/tcpip/network/netpgrc: This file is installed with PowerPlay Server Transformer and is used to set the necessary environment variables and start the netd daemon process (PowerGrid). The line "env > $NPPATH/netd_env.log" redirects the output of the env command (which displays the current environment variables) to a log file named "netd_env.log" in the netbin directory. The last addition (the -d -f flags on the netd launch line near the end of the file) tells PowerGrid to log detailed actions and write the information to a log file called "netd.log" in the netbin directory. This is especially useful when troubleshooting PowerGrid. This file should be used to start the PowerGrid netd process because it ensures the proper environment variables are set and enables logging.

    #!/bin/sh
    ##############################################################################
    # COGNOS INC. PowerGrid Network Daemon rc script for all UNIX platforms
    ##############################################################################

    COGNOS=/opt/cognos
    export COGNOS

    PG_VER_DISPLAY=2.0.6

    #
    # Sourcing all Cognos client-server products Environment variables
    #
    if [ "`echo /opt/cognos/powergrid/tcpip/netbin/net??rc`" != "/opt/cognos/powergrid/tcpip/netbin/net??rc" ] ; then
    for script in $COGNOS/powergrid/tcpip/netbin/net??rc
    do
        . $script
        echo "sourced Cognos Products environment through $script"
    done
    fi

    #
    NPPATH=$COGNOS/powergrid/tcpip/netbin
    export NPPATH

    #
    # The following sample lines are used by NETD for a non-default comm. port.
    # If the env. variable NPNETD is set up here, NETD will NOT pick up
    # any port number from /etc/services. Please take out the "#" on both lines
    # and change yyyy below to the actual port number before re-starting NETD.
    #
    #NPNETD=ISyyyy
    #export NPNETD

    #
    LANG=C
    export LANG

    SRVCMSGS=$COGNOS/powergrid/tcpip/network/srvcmsgs.cat
    export SRVCMSGS
    #

    #
    COGNLSTAB=$COGNOS/powergrid/tcpip/network/coglangtab
    export COGNLSTAB
    #

    env > $NPPATH/netd_env.log

    if [ `uname -m` = i386 ]; then
        echo "SCO PowerGrid Network Daemon Version $PG_VER_DISPLAY"
        su root -c "$COGNOS/powergrid/tcpip/network/netd &"
    else
        echo "UNIX PowerGrid Network Daemon Version $PG_VER_DISPLAY"
        $COGNOS/powergrid/tcpip/network/netd -d -f &

    fi

    Automation Scripts

    The following are examples of scripts that can be used to automate the scheduling of PowerPlay Transformer cube builds. These scripts assume the same installation and work directories as mentioned earlier. The work directories must be in place for these scripts to work.

    File 1, buildscript, is the shell program that can be called to build a script to run any model. To use buildscript:

    At the UNIX command prompt, type "buildscript". When prompted, type the name of the model you are building the script for. An example:

    fin_hp3:/home/cognos-> buildscript
    Please enter the name of the model to create the run script for.


    ddbcube
    Building /home/cognos/scripts/run_ddbcube.sh
    ------ Finished ------
    fin_hp3:/home/cognos->

    In the above case, the script "run_ddbcube.sh" can now be added to a crontab to be scheduled to run automatically.

    File 2, generic, is the template that buildscript uses.

    File 3, run_ddbcube.sh, is a sample script created using buildscript.

    File 4, crontab, is a sample of a crontab that can be used to schedule tasks in UNIX. Run the "crontab -e" command in UNIX to create this file. Note that when more than one task needs to be run, you add the tasks to the same crontab using the "crontab -e" command. Refer to your UNIX documentation for more information on cron.

    File Locations

    For the automation scripts to work, they must be kept in the directory locations shown below.

    /home/cognos/scripts/bin/buildscript

    /home/cognos/scripts/bin/generic

    All scripts built should reside in:

    /home/cognos/scripts

    To run the scripts, the user should log into UNIX with a user ID set up to work with Server Transformer.

    File 1: /home/cognos/scripts/bin/buildscript

    This file uses /home/cognos/scripts/bin/generic as a template and builds scripts to run models with PowerPlay Server Transformer. Before running it, you must give the file execute permissions.

    #! /bin/sh


    # Edit "workdir" to point to the directory where the sub-directories for# models, cubes, logs, etc are locatedworkdir=/home/cognos

    echo "Please enter the name of the model to create the run script for. "

    read modelname

    echo "Building $workdir/scripts/run_$modelname.sh"

    # Replace all occurrences of "model_name_here" with the input name
    sed "s/model_name_here/$modelname/g" $workdir/scripts/bin/generic > $workdir/scripts/run_$modelname.sh

    # Make the shell script executable
    chmod +x $workdir/scripts/run_$modelname.sh

    echo "------ Finished ------"

    File 2: /home/cognos/scripts/bin/generic

    This file is used by /home/cognos/scripts/bin/buildscript as a template to build scripts that run models with PowerPlay Server Transformer.

    #! /bin/sh

    # This script runs the given model file.

    # Edit workdir to point to the directory where the sub-directories for
    # models, cubes, logs, etc. are located
    # Edit installdir to point to the appropriate pya* directory where the install files are located
    workdir=/home/cognos
    installdir=/opt/cognos/pya65_278

    . $HOME/.profile

    . $installdir/setpya.sh

    # Save the previous log file (if it exists)
    if [ -e $workdir/log/model_name_here.log ]; then
        mv $workdir/log/model_name_here.log $workdir/log/model_name_here.bak
    fi

    # Run model_name_here and output to a log file
    rsserver -c -i $workdir/models/model_name_here.pye > $workdir/log/model_name_here.log 2>&1 &

  • Using Server Transformer

    Performance and Scalability in PowerPlay Applications 50

    # Wait for the background build to finish so the exit status reflects the build result
    wait $!
    ret_stat=$?
    exit $ret_stat

    File 3: /home/cognos/scripts/run_ddbcube.sh

    This is a SAMPLE of a script created by /home/cognos/scripts/bin/buildscript.

    In this example, the name of the model is ddbcube.

    #! /bin/sh

    # This script runs the given model file

    # Edit workdir to point to the directory where the sub-directories for
    # models, cubes, logs, etc. are located
    # Edit installdir to point to the appropriate pya* directory where the install files are located
    workdir=/home/cognos
    installdir=/opt/cognos/pya65_278

    . $HOME/.profile

    . $installdir/setpya.sh

    # Save the previous log file (if it exists)
    if [ -e $workdir/log/ddbcube.log ]; then
        mv $workdir/log/ddbcube.log $workdir/log/ddbcube.bak
    fi

    # Run ddbcube and output to a log file
    rsserver -c -i $workdir/models/ddbcube.pye > $workdir/log/ddbcube.log 2>&1 &

    # Wait for the background build to finish so the exit status reflects the build result
    wait $!
    ret_stat=$?
    exit $ret_stat

    File 4: Sample crontab with entry to run script: run_ddbcube.sh

    CRON is a UNIX scheduling utility that uses a crontab (like the one below) to list and track the scheduled tasks it needs to run.

    # ===================================================================
    # crontab input file to run cognos
    # ===================================================================
    # flags (, between mult values, - for range, * for all)
    # minute       = 0 to 59
    # hour         = 0 to 23
    # day of month = 1 to 31
    # month        = 1 to 12
    # day of week  = 0 to 6 (0=Sun, 1=Mon, 2=Tue, 3=Wed, 4=Thu, 5=Fri, 6=Sat)
    # ===================================================================

    # ------- ddbcube --------------
    # This will run daily at 4:00 am
    00 4 * * * /home/cognos/scripts/run_ddbcube.sh

    # ====end of crontab input file ====


    CHAPTER 4 Cube Build Samples

    This section describes the results of a series of Cognos internal tests conducted using PowerPlay version 6.5.

    All results were obtained on the following server:

    Digital Prioris Model ZX-109
    Dual 200MHz Pentium Pro processors
    512 MB memory
    Digital Storage Array (DS-SWXRA-HA) configured with 130GB of storage (RAID-5)
    Drives in the array are DS-RZ1DB-VW (9.1 GB Ultra SCSI drives)


    Cube Build Results

    The following test results outline Transformer cube build times for three different types and sizes of cube (Row Dominant, Category Dominant, Row & Category Dominant):

    Two of the three models used in the benchmark tests originated from actual clients (Row Dominant and Category Dominant). The Row and Category Dominant model was manufactured internally but its design was based on actual client models.

    Environmental Settings and Methodology

    For all Transformer runs, the following environmental settings were used:

    In the [PowerPlay Server] section of cognos.ini, the value for WriteCacheSize was set to 32000.

    In the [Services] section of cognos.ini, the value for Temporary was set to a directory that had a large amount of available disk space.

    In the [Services] section of cognos.ini, the value for SortMemory was set to 51200. This is the equivalent of 100MB (the setting is expressed in 2KB blocks).
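    Collected into a single sketch, the relevant cognos.ini entries would look roughly like this (the temporary directory path is illustrative):

    [PowerPlay Server]
    WriteCacheSize=32000

    [Services]
    Temporary=/bigvol/cognos/tmp
    SortMemory=51200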

    The following steps were repeated 4 times for each of the Transformer models:

    All unneeded processes on the test machine were halted. An MDL script was then used to run the following sequence:

    Open the test model.
    Change the name of the PowerCube (using a numbered sequence).
    Build the PowerCube.
    Save the populated model to PY.

    Stated results are an average of the 4 test run timings.

                            Row Dominant  Category Dominant  Row & Category Dominant
    Cube Build Time (Min.)  41            31                 336
    Cube Size (MB)          60            170                410


    Row Dominant Model

    The row dominant model was used to create a Cognos Standard cube with the following settings:

    auto-partitioning with 5 passes maximum and a maximum partition size of 500,000 records

    crosstab caching enabled

    Dimension Map and Model Attributes

    Model Attribute                      Description
    Number of Categories                 32,000
    Number of Dimensions                 5 (measures not counted as a dimension)
    Number of Measures                   4 (one calculated): 2 x 64-bit, 1 x 32-bit, 1 after-rollup calculated
    Source Data Format                   ASCII (comma-delimited)
    Number of source files               14 (11 provide structure, 3 are transactional)
    Number of transaction input records  10 million
    Size (in MB) of all source files     570

    Enable Multi Processing was set for the 4 largest queries.


    Category Dominant Model

    The category dominant model was used to create a Cognos Standard cube with the following settings:

    auto-partitioning with 5 passes maximum and a maximum partition size of 50,000 records

    crosstab caching enabled

    Dimension Map and Model Attributes

    Model Attribute                      Description
    Number of Categories                 146,238
    Number of Dimensions                 5 (measures not counted as a dimension)
    Number of Measures                   15, all 64-bit
    Source Data Format                   ASCII (comma and ~ delimited)
    Number of source files               6 (5 provide structure, 1 is transactional)
    Number of transaction input records  1 million
    Size (in MB) of all source files     224

    Enable Multi Processing was enabled for all but the smallest query.


    Row and Category Dominant Model

    The row and category dominant model was used to create a Cognos Standard cube with the following settings:

    auto-partitioning with 5 passes maximum and a maximum partition size of 500,000 records

    crosstab caching enabled

    Dimension Map and Model Attributes

    Model Attribute                      Description
    Number of Categories                 492,152
    Number of Dimensions                 5 (measures not counted as a dimension)
    Number of Measures                   5 (two calculated): 3 x 64-bit, 2 after-rollup calculated
    Source Data Format                   ASCII (~-delimited)
    Number of source files               9 (6 provide structure, 3 are transactional)
    Number of transaction input records  50 million
    Size (in MB) of all source files     2,280 (2.28 GB)

    Enable Multi Processing was set for the 4 largest queries.


    Part II Zen and the Art of PowerPlay Enterprise Server Configuration

    The most common question we receive with respect to PowerPlay Enterprise Server is:

    How many servers of type A do I require to service requests for n users against m cubes?

    This question is difficult to answer without some understanding of the following things:

    How well are the target cubes designed? This will impact how long the average query takes to complete, and how much CPU will be eaten up in servicing requests.

    How large and complex are the average queries? Is the average query a 50 row report, or a 10,000 row report with nested levels, all measures, and zero suppression?

    How many concurrent users (peak) will be accessing the cube(s)?

    Note: At this time, sizing guidelines are for NT Server only. A later version of this paper will include information on UNIX servers.


    CHAPTER 5 PowerPlay Web Stress Testing

    This chapter describes results from a set of internal tests run against PowerPlay Enterprise Server. Based on the results, it makes recommendations for server sizing.


    Test Tools and Environment

    This section outlines results to date using an internally developed stress testing tool for PowerPlay Web. This tool occupies the same space in the PowerPlay architecture as the cgi gateway. The tool (called CAbuse, for Cube Abuser) submits random requests to a given cube on a given PowerPlay Server. These requests are submitted one after the other, with no pauses whatsoever. Several instances of CAbuse running against a particular cube (or mapped cube) on a server simulate a very large number of concurrent requests, and hence a large number of concurrent users.

    A couple of things to note:

    These test results do not include the overhead of sending HTML query results from a request back to the cgi gateway, through the web server, and ultimately back to the web browser. The results are based strictly on observed times for the query processor to handle requests. Under normal circumstances, this should be the bulk of processing time. In high volume client installations, similar throughput will be achieved on the PowerPlay Enterprise Server machine, provided that the cgi gateway and web server are on a different machine. The machine that runs the web server must be large enough to handle the volume of HTML files coming from PowerPlay Enterprise Server.

    Because CAbuse generates random queries very quickly, it simulates about the worst possible load imaginable. The randomness of the queries ensures that very little advantage is derived from the server's PPDS cache. Also, because it's truly random in its choice of links to follow, CAbuse periodically generates a phenomenally large query -- a hundred thousand rows nested, by several columns, displayed as bar charts, for example. The tests, therefore, do include some very large queries (as well as, admittedly, some very small ones).

    We have not (to date) run many tests against UNIX servers. The results in this document are primarily for NT Server. UNIX tests are planned and will be documented as soon as they're available.


    Test Results (Windows NT Server)

    The following table outlines test results derived from running multiple instances of the Cube Abuser against the Row Dominant cube (see Row Dominant Model on page 55) on two different NT Servers.

    The above results are average times over several repeated tests, with varying levels of incoming queries. The tests were run with PPDS Read Cache settings ranging between 32MB and 48MB -- no significant difference was discerned between these settings. However, tests run with a PPDS Read Cache setting of only 8MB yielded far lower overall throughput. For example, with an 8MB PPDS Read Cache setting, the dual-processor NT server (Server 2) achieved a maximum throughput of about 7 or 8 requests per second, far lower than was achieved with a higher PPDS Read Cache setting, indicating that the cache does have a positive effect.

    The tests indicate that a single PPDSWEB process will handle slightly better than 50 concurrent users (5 to 7 requests per second), and that, on a dual processor PC, two processes will handle more than 100 concurrent users (10 to 12 requests per second). The results confirm that multiple CPU machines can dramatically increase the throughput possible with PowerPlay Web, up to the maximum number of fully loaded PPDSWEB processes.

    Maximum    Server1(a)    Server2(b)
    Processes  Requests/Sec  Requests/Sec
    1          5.74          6.55
    2          7.80          11.3
    3          7.11          12.42

    (a) Compaq DeskPro 6300. 1 x 300 MHz CPU, 320MB RAM.
    (b) IBM Netfinity Server. 2 x 350MHz CPU, 512MB RAM.


    The following charts show output from NT Performance Monitor (% CPU usage for each CPU) and the average request time over the duration of one instance of the above test for the IBM NetFinity Server (Server2):

    [Chart: Percent CPU Usage]

    [Chart: Average Request Time (ms)]


    The following PowerPlay report, derived from the NT Event Log, shows the number of requests completed per minute during a test run on Server2:

    [Chart: Queries per Time Period -- average queries per minute, Minute 01 through Minute 12, with mean]

    The following graph shows the average query time in milliseconds for the same test run:

    [Chart: Query Time (Average) -- average query time (ms), Minute 01 through Minute 12, with mean]

    As you can see, the NetFinity server handled an average of nearly 600 requests per minute, with an average query time of under half a second. If we assume a concurrent user level of 10% (of licensed users), and a concurrent request level of 10% (of concurrent users), that implies this server would handle 100 concurrent users, or 1000 licensed users, with very fast average query times, provided users were executing average queries.
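    The sizing arithmetic here can be made explicit with a quick sketch (the 10%/10% concurrency figures are the stated assumptions above, not measured values):

    # ~600 requests/minute observed, i.e. ~10 requests/second
    REQ_PER_SEC=10
    # 10% of concurrent users submit a request at any instant
    CONCURRENT_USERS=`expr $REQ_PER_SEC \* 10`
    # 10% of licensed users are concurrent
    LICENSED_USERS=`expr $CONCURRENT_USERS \* 10`
    echo "$CONCURRENT_USERS concurrent users, $LICENSED_USERS licensed users"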



    Detailed logging was enabled for the duration of the test run. With logging disabled, slightly greater throughput should be achievable.

    Enterprise Server Concurrency Test Results

    To validate that the PowerPlay Enterprise Server does scale to multiple users, a set of tests was run with 40 PowerPlay clients connected to the PowerPlay Enterprise Server (NT). During this period, a tester submitted a series of benchmark queries against a very large cube, including drill down, drill up, nest rows and columns, and suppress zeros.

    While the PowerPlay Enterprise Server was under load, the impact on performance was minimal. The following table outlines the impact on performance:

    Operation       Non-Load  Load   Impact %
    Open Cube       2.37      2.97   25%
    Drill Down      5.31      24.85  368%
    Drill Up        7.34      8.67   18%
    Nest Rows       2.47      2.00   -19%
    Nest Columns    4.59      2.00   -56%
    Suppress Zeros  7.50      19.32  158%

    The decrease in time under load for nesting is due to the effect of a populated cache on the PowerPlay Enterprise Server.


    Test Results (Solaris)

    A test run against the same cube on a quad processor Solaris server yielded sustained throughput of 800 to 900 requests per minute, with peaks of over 1000 requests per minute. For the test, 10 processes were set up (as Startup Processes) for the cube, and 64MB of PPDS Read Cache was allocated. For the duration of the run, CPU utilization rested at or above 90%, with no significant processes other than ppserver and the 10 ppdsweb instances on the machine.

    On average, the server handled approximately 15 requests per second. If we assume a concurrent user level of 10% (of licensed users), and concurrent request level of 10% (of concurrent users), that implies this server would handle 150 concurrent users, or 1500 licensed users, with very fast average query times, provided users were executing average queries.

    Server Characteristics:
    SUN Enterprise 3500 Server
    4 x 360 MHz CPUs
    1GB RAM
    4MB external cache

    Cube Characteristics:
    Input Records: 10 million
    Dimensions: 5
    Categories: 32,000
    Measures: 4
    Cube Size: 62MB


    PowerPlay Web Sizing Guidelines (NT)

    Given the results in the preceding sections, it's reasonable to assume that each CPU in a single or dual CPU machine will handle about 5 requests per second for average queries against an average cube. For larger servers (quad, for example), assume about 4 requests per second per processor. To ensure that you achieve an acceptable level of throughput, follow these guidelines when setting up your system:

    1. For large installations, strive for an average query response time that is short -- a few seconds at most -- with the server not loaded. A single user should see results back very quickly for nearly all queries. If this is not possible, then setting client expectations becomes critical. One thing to ask clients who insist on generating reports with a hundred thousand rows and 10 columns is: "When was the last time you opened a table like that on a regular HTML page?" You cannot expect blindingly fast performance if your average report contains a million cells with zeros suppressed.

    2. For multi-processor systems on NT, the scalability from one processor to two has been found to be fairly linear for PP Web requests. That is, if a single CPU P300 server will handle n concurrent requests per second (on average), then a dual-CPU P300 server should handle 2 x n concurrent requests of a similar nature, provided you set up 2 or more processes for the cube. This indicates that PP Web is relatively CPU-bound: the more CPUs the better, for large installations.

    Note: Although performance improvements have been found to be linear from 1 to 2 processors, scaling is unlikely to be linear from 2 to 4 processors. Larger servers expend additional effort in managing system resources. Further tests in this area are pending.

    3. Tests to date for load balanced servers are inconclusive. However, preliminary tests indicate that load balancing across 2 identical servers does not yield a doubling of throughput. Rather, each additional server adds approximately 50% additional capacity. For example, suppose Server A will handle n requests/second. A mapped cube on Server A that is mapped onto physical cubes on Server A and identical Server B will handle approximately n * 1.5 requests/second.

    4. The network can become the bottleneck in load-balanced environments. If two back-end PowerPlay servers can normally handle a set number of requests per