©2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Wendy Bartlett and Jim Smullen Distinguished Technologists, HP 6 October 2011 OPEN SYSTEM SERVICES AND NONSTOP OS UPDATE


  • ©2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

    Wendy Bartlett and Jim Smullen

    Distinguished Technologists, HP

    6 October 2011

    OPEN SYSTEM SERVICES AND NONSTOP OS UPDATE

  • FORWARD-LOOKING STATEMENTS

    This document contains forward looking statements regarding future operations,

    product development, product capabilities and availability dates. This

    information is subject to substantial uncertainties and is subject to change at any

    time without prior notification. Statements contained in this document

    concerning these matters only reflect Hewlett Packard's predictions and / or

    expectations as of the date of this document and actual results and future plans

    of Hewlett-Packard may differ significantly as a result of, among other things,

    changes in product strategy resulting from technological, internal corporate,

    market and other changes. This is not a commitment to deliver any material,

    code or functionality and should not be relied upon in making purchasing

    decisions.

    This is a rolling (up to three year) Statement of Direction and is subject to change without notice.


  • OS AND OSS UPDATE


    • Recent enhancements

    – 64 Partition Enscribe File System

    – POSIX User Threads

    – FLOSS and Samba

    – OSS restricted-access filesets

    – OSS as-delivered file security

    – OSS shared memory extension

  • H06.22 / J06.11

    64 Partition Enscribe Key-Sequenced Files

    • Enscribe now supports up to 64 partitions. New terminology:

    – Legacy key-sequenced (LKS) files have 0 – 15 secondary partitions.

    – Enhanced key-sequenced (EKS) files have 16 – 63 secondary partitions.

    • BACKUP/RESTORE, FUP, and SMF all support EKS files.

    • Alternate key files may be EKS files.

    • Benefits:

    – Increased scalability.

    – Higher aggregate throughput.
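
    The LKS/EKS terminology above can be summarized in a few lines of Python. This is only an illustrative sketch: the function name `classify_key_sequenced` is hypothetical and is not part of Enscribe, FUP, or any HP API; the ranges are the ones stated on the slide.

```python
def classify_key_sequenced(secondary_partitions: int) -> str:
    """Return 'LKS' or 'EKS' for a key-sequenced file.

    Hypothetical helper: the ranges (0-15 and 16-63 secondary partitions)
    come from the slide; the function itself is not an HP interface.
    """
    if not 0 <= secondary_partitions <= 63:
        raise ValueError("a key-sequenced file has 0 to 63 secondary partitions")
    return "LKS" if secondary_partitions <= 15 else "EKS"
```

    For example, a file with 15 secondary partitions is still a legacy (LKS) file, while adding one more makes it an enhanced (EKS) file.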


  • H06.22 / J06.11

    64 Partition Enscribe Key-Sequenced Files

    • Minimal application coding changes are required to create EKS files.

    • Because the file label does not have enough room to store information on up to 64 partitions, EKS file metadata is stored in the primary partition.

    – All user data is stored in secondary partitions.

    • Fallback is possible.

    – Data must be retrieved one partition at a time.

    – The fallback procedure is documented.


  • H06.21/J06.10

    POSIX User Model Thread Library (PUT)

    • Overview

    – PUT is a user-space implementation of the IEEE Std 1003.1, 2004, POSIX System

    Application Program Interface for use by native C and C++ TNS/E applications in the

    OSS environment

    – A thread-aware version of NSK public DLLs allows multi-threaded programs to safely

    execute non-blocking library I/O functions and functions that share context.

    – Scheduling is performed by the PUT library, not by the OS; there is no preemption.

    – Stack overflow detection provides protected memory: process signal stack, and protected

    thread stacks.

    • It allows an OSS process to enable a signal handler to catch stack overflow traps (SIGSTK).
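
    To make "scheduling by the library, with no preemption" concrete, here is a conceptual sketch of cooperative user-level scheduling written with Python generators. It does not use or model the PUT API; `worker`, `run_cooperative`, and `demo` are hypothetical names, and each `yield` stands in for a point where a thread would voluntarily give up control (for example, while waiting for I/O).

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary yield point; nothing can interrupt the task before this


def run_cooperative(tasks):
    """Round-robin over tasks; a task runs until it yields or finishes."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished; do not requeue it
        ready.append(task)      # still has work; go to the back of the queue


def demo():
    log = []
    run_cooperative([worker("A", 2, log), worker("B", 3, log)])
    return log
```

    A task that never reaches a yield point would starve the others, which is the practical consequence of having no preemption.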


  • H06.21/J06.10

    POSIX User Model Thread Library (PUT)

    • Benefits

    – Improved performance for threaded applications

    – Improved robustness in applications (stack overflow detection)

    – Improved POSIX standard compliance

    – Threading support enhancements will be made for PUT rather than SPT


  • POSIX User Threads (PUT): Usage

    • Use the PUT model instead of SPT for new development or when porting open source.

    • Convert code from SPT to PUT in cases where the existing code is to be bound into programs written using the PUT model.

    • Existing standalone SPT code does not need to be converted.

    – Consider converting it to obtain improved thread performance and stack overflow protection.

    – Consider converting it if you expect to be making ongoing enhancements, as future enhancements will be made to PUT rather than SPT.

    • Migration details are documented in the Open System Services Programmer’s Guide.

  • H06.22 / J06.11

    FLOSS

    • The FLOSS package is a set of command-line utilities and libraries that aid in porting packages to NonStop systems.

    • The FLOSS package is used when rebuilding any of the open source packages that are posted on the ITUGLIB website.

    – It is not needed when installing and using prebuilt ITUGLIB packages.

    • It also can be used to assist in porting your own software package.

    • FLOSS is now a supported product.


  • H06.22 / J06.11

    FLOSS features

    • FLOSS provides open source compatible build scripts that perform mappings to corresponding NonStop build tools and options.

    • During execution, FLOSS libraries that are statically linked with the open source software transform some of the specific open source system and library calls to OSS compatible calls.

    • FLOSS helper utilities, such as findno and findcall_floss_oss, enable you to verify that the configuration and build are performed properly.

    • For details, see the FLOSS on NonStop User Manual.

  • H06.22 / J06.11

    Samba

    • Samba provides file shares on NonStop platforms accessible from Windows.

    • It does not support print shares.

    • There are a number of security considerations that require your attention when

    configuring Samba.

    • For details, see the Samba on NonStop User Manual.


  • H06.22 / J06.11

    OSS restricted access filesets

    – For many years, it has been possible to prevent SUPER.SUPER from

    accessing specific Guardian files by using the “DENY” option in

    Safeguard Access Control Lists.

    – OSS restricted access fileset support makes it possible to also prevent

    the super ID from accessing files in specified OSS filesets unless

    explicitly granted access.

    – Advance warning: this is a complex topic. Read the manuals and

    decide which operations staff need which privileges before attempting

    to use this feature.


  • RestrictedAccess filesets

    • A RestrictedAccess fileset is an OSS fileset in which the

    super-user is denied special access privileges. Super-user is

    required to follow the fileset’s file and directory

    permissions (standard UNIX permissions and OSS ACLs).

    • A new RestrictedAccess fileset attribute has been added to

    OSS filesets to distinguish between normal OSS filesets

    and RestrictedAccess filesets.

    • Restricted access is not the default, and must be explicitly

    enabled on a fileset basis.

    • This feature is only supported on Version 3 filesets.


  • RestrictedAccess filesets: considerations

    • The super-user can explicitly be granted access to its files or directories

    as appropriate.

    • The super-user cannot evade access controls by running a program that uses

    setuid() or similar functions that allow it to switch IDs to one that does have

    access without providing that ID’s password.

    • It still is possible to perform administrative actions such as backing up and

    restoring filesets using HP’s standard programs while protecting against abuse

    by a security administrator.

    • Customers have the ability to identify specific programs as being allowed access

    even though they have switched to a non-super ID without providing a password.


  • H06.23 / J06.12

    OSS as-delivered file security

    • OSS file and directory permissions as shipped have not been consistent across products and, in some cases, have offered a lower level of security than many customers find acceptable.

    • HP has altered many of these permissions to improve out-of-the-box security and consistency; see the individual softdocs for details.

    • The default value for umask has been changed from 0 to 022 in osh and pinstall.

    • As always, customers are encouraged to modify these security settings where needed after installation to match their own security policies.
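
    The practical effect of moving the default umask from 0 to 022 can be worked out with the standard UNIX rule that permission bits set in the umask are cleared from the mode a program requests. The sketch below only does that arithmetic; `effective_mode` is a hypothetical helper name, not an OSS utility.

```python
def effective_mode(requested_mode: int, umask: int) -> int:
    """Permission bits a new file actually gets: requested bits minus umask bits."""
    return requested_mode & ~umask & 0o777


# A program that asks for rw-rw-rw- (0o666) when creating a file:
old_default = effective_mode(0o666, 0o000)   # umask 0: group/other write stays on
new_default = effective_mode(0o666, 0o022)   # umask 022: group/other write is cleared
```

    With the old default the file is created as 0o666 (world-writable); with the new default it is created as 0o644.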

  • H06.23 / J06.12

    OSS Shared Memory Segment Extensions

    • Allow Guardian processes to create, share, and remove OSS Shared Memory

    segments (shm).

    – An shm can be shared between Guardian and OSS processes.

    – Permissions operate the same way for both kinds of process.

    • Limit the ability to create shm only by available processor resources.

    – Remove the limitation of 13 OSS shm per process.

    – Remove the limitation of 128 MB per segment.

    • Relax the 32-MB alignment restriction on user-specified shm addresses.

    – Rounding is to 4 MB, if requested.

    – Minimum alignment is 16 KB.
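
    The alignment rules above can be illustrated with simple address arithmetic. This is a sketch under assumptions: the function name `resolve_attach_address` is hypothetical, and the choice to round *down* to the 4 MB boundary is an assumption made by analogy with the POSIX `SHM_RND` behavior of `shmat()`; the slide states only the 4 MB rounding unit and the 16 KB minimum alignment.

```python
KB = 1024
MB = 1024 * KB

MIN_ALIGNMENT = 16 * KB   # minimum alignment stated on the slide
ROUNDING_UNIT = 4 * MB    # rounding unit stated on the slide


def resolve_attach_address(requested: int, round_requested: bool) -> int:
    """Return the address that would be used for a user-specified shm address.

    Assumption: rounding goes down to the nearest 4 MB multiple, as POSIX
    SHM_RND does with SHMLBA. The slide does not state the direction.
    """
    if round_requested:
        return requested - (requested % ROUNDING_UNIT)
    if requested % MIN_ALIGNMENT != 0:
        raise ValueError("address must be aligned to at least 16 KB")
    return requested
```

    Under these assumptions, an address of 5 MB + 123 bytes with rounding requested resolves to 4 MB, while an unrounded address of 48 KB is accepted as-is.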


  • OS AND OSS UPDATE


    • Upcoming enhancements

    – 64-bit OSS Processes

    – Open Source Tools and Utilities

    – Additional OSS Limits Lifting


  • Target delivery date TBD

    64 Bit OSS Processes

    • Will be supported for C/C++ programs.

    • Will allow OSS programs access to a much larger virtual address space for

    data through support of 64-bit addressing.

    – SQL/MX access supported from processes using 64-bit addressing.

    • Will place run-time heap in 64-bit addressable virtual memory.

    – Default limit is 12 GB.

    – NonStop OS limit is 0.5 TB for heap plus 64-bit flat segments.

    – Practical limit determined by the amount of available physical memory and disk

    space for swap files.

    • Benefits:

    – Increased scalability


  • Open Source Tools and Utilities


    • HP will continue to add selected open source tools and utilities as

    supported products.

    – What are your top candidates for support, and why?

    • HP also will continue to update other tools and utilities on ITUGLIB:

    – BASH, Archival tools, Core utilities, etc.

    – RPM and RPM-based packages.

    • Benefits:

    – Enhance portability of open source packages.

    – Supply more of the ecosystem that is expected by developers and system administrators.


  • Target delivery date TBD

    Additional OSS Limits Lifting

    • More OSS processes per system:

    – 128,000 simultaneous OSS processes on a 16-processor system.

    • More OSS file opens (128,000 opens per CPU):

    – 96,000 disk opens; 32,000 sockets; 32,000 pipes; 32,000 ttys.

    • More memory for file I/O operations:

    – 120 MB disk cache; 64 MB sockets; 64 MB pipes.

    • Resource monitoring capabilities with SCF commands and with Measure.


  • MULTICORE OS AGENDA

    • Bottom Line for the NB54000c

    • Multicore Architecture Terms

    • NB54000c Physical View

    • NB54000c Logical View

    • J06.11 Software changes

    – More J-series versions of products

    – Locks

    – Process Scheduler

    – Performance

    • J06.12 Software changes


  • Bottom Line

    • The NB54000c has 1.85x the performance of the NB50000c

    – running Order Entry SQL/MP

    – Same footprint as NB50000c

    • Your mileage will vary by application: some will do better, some worse.

    • The J06.11 release was all about getting the best performance from the

    NB54000c system


  • Multicore Architecture Terms (1 of 2)

    • NSMA: NonStop Multicore Architecture, the J-series

    • IPU (Instruction Processing Unit): a core

    – If we used Itanium Hyperthreading then an IPU would be a thread but NonStop does not

    use this Intel feature

    • CPU: logical processor

    – The traditional NonStop logical CPU extended to be a multiprocessor

    – A set of IPUs (cores) sharing the same memory

    • Exception: a small per-IPU area for use by low-level software

    – One X and one Y ServerNet interface per CPU


  • Multicore Architecture Terms (2 of 2)

    • n-Way: traditional indication of the number of IPUs in a multiprocessor

    – 2-way means 2 IPUs per CPU

    – 4-way means 4 IPUs per CPU

    • Process Scheduler (PS): the NonStop OS subsystem that distributes and

    redistributes processes among the IPUs of a CPU

    • Monarch: the initial IPU that begins execution upon power on

    – Intel calls it the boot processor

    – Every CPU has one monarch


  • NB54000c Physical View: Blade chassis

    • c-Class enclosure

    • Two to eight BL860 i2 blades per chassis

    • ServerNet double-wide switch modules

    • Ethernet single-wide switch modules (maintenance connections)

    • NEW: Supports use of two c7000s with 8 or fewer CPUs.

    • Allows more I/O connectivity for such a system

    • Called “Flex Processor Bay Configuration”

    [Figure: c-Class enclosure, with labels for blades, storage CLIM, network CLIM, and SAS]

  • NB54000c - BL860c i2 Blade View: Logical processors/blades

    • One 1.73 GHz quad core Intel 9340 Itanium (Tukwila) microprocessor (one logical CPU)

    • ServerNet Mezzanine card

    • 16, 24, 32, 48 GB main memory per logical CPU

    • The NB50000c minimum was 8GB.

    • Cannot be mixed with NB50000c CPUs in the same system

    • So a system upgrade is offline and entails replacing all NB50000c blades

    [Figure: c-Class enclosure, with labels for blades, storage CLIM, network CLIM, and SAS]

  • NB54000c Logical System View

    [Figure: 4-processor, 4-way system. Four blocks, each showing memory, an SNet interface, and a socket/chip containing IPU 0 through IPU 3, plus an SNet switch.]

  • NB54000c: 4-way software

    J06.11 changes: Performance


  • J06.11 changes

    • Much finer granularity of locking

    • Before J06.11 most locks were global, that is, one lock per CPU for each required task

    • With J06.11, there are many more locks and a significant number are embedded (per control block), or in some cases there are multiple locks per control block

    – T9050, NSK

    • Process control

    • Memory management

    • Message System

    – TNet (aka ServerNet) services

    – Measure

    – Millicode, multiple spin locks

    • Much greater use of atomic update (no software locking required)

  • J06.11 changes

    • More Guardian processes (12K vs. 8K), more OSS processes

    • More sophisticated Process Scheduler

    • A larger set of J-series (vs. H-series) products

    – Prior to J06.11

    • T9050 (NSK)

    • Measure

    – As of J06.11 we add the following

    • File system

    • TNet services

    • DP2

    • SQL/MP

    • TMF

    • I/O Drivers

    – This allows for both compile-time platform tests (H vs. J) and compiler optimizations only applicable to J-series platforms

  • Multicore = Multiple IPUs = New Paradigms

    • Relative priorities between processes can no longer be depended on for synchronization

    – The lower-priority process may be running at the same time in another IPU

    • Resource contention is expensive

    – Waiting for locks hurts performance

    – Increasing the granularity of locks reduces contention

    • That is, reduce the scope of what each lock controls

    – Mutex becomes an awfully big hammer: good for a uniprocessor; bad news for a multiprocessor

    • Reducing resource contention is the key to better performance, which means more locks

    – When we introduced the NB50000c we had a dozen or so locks

    – With J06.11 we have thousands of locks
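
    The idea of embedding a lock in each control block, rather than guarding everything with one global lock, can be sketched in Python. This is a generic illustration of fine-grained locking, not NSK code: `ControlBlock`, `bump`, and `run_demo` are hypothetical names, and Python threads stand in for processes running on different IPUs.

```python
import threading


class ControlBlock:
    """A resource with its own embedded lock (one lock per control block)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.counter = 0


def bump(block, times):
    """Update one control block; only users of the same block contend."""
    for _ in range(times):
        with block.lock:
            block.counter += 1


def run_demo(num_blocks=4, threads_per_block=2, times=1000):
    blocks = [ControlBlock() for _ in range(num_blocks)]
    threads = [
        threading.Thread(target=bump, args=(block, times))
        for block in blocks
        for _ in range(threads_per_block)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [block.counter for block in blocks]
```

    With a single global lock, all eight threads in the default demo would queue behind one another; with per-block locks, only the two threads sharing a block ever wait on each other.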


  • NSK Locks

    • Introduced in the H series

    • Used for inter-process synchronization within a CPU

    • Constructed as binary semaphores

    • When introduced

    – Non-priority inverting, that is the priority of the lock holder is raised to that of the highest

    contender

    • Provide an opportunity to replace use of Mutex with finer-grained

    control over individual resources


  • NSK Locks, updated for Multicore Architecture and further updated for J06.11

    • A static priority associated with each lock

    – The holder runs at the higher of its own priority and the static priority

    – A number of locks (including Mutex) run at maximal priority

    • If found to be contended, the new requester spins for a short period before being linked as a contender

    • Multicore Architecture makes much greater use of non-Mutex locks than the H series did, for example:

    – NSK timers, memory management, message system, and process control

    – DP2

    – TMF

    – TNet services

    – Measure
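
    The "spin briefly, then queue as a contender" behavior described above is a common pattern, sketched here generically in Python. It is not the NSK lock implementation: `acquire_spin_then_block` and the `spin_attempts` parameter are hypothetical, and the fallback to a blocking acquire stands in for being linked onto the lock's contender list.

```python
import threading


def acquire_spin_then_block(lock, spin_attempts=100):
    """Try a bounded number of non-blocking acquires, then block.

    Returns "spin" if the lock was obtained during the spin phase,
    or "blocked" if the caller had to wait as a contender.
    """
    for _ in range(spin_attempts):
        if lock.acquire(blocking=False):
            return "spin"
    lock.acquire()  # give up spinning and wait until the holder releases
    return "blocked"
```

    Spinning pays off when the holder is running on another IPU and is about to release the lock; the bound keeps a requester from burning cycles when the wait turns out to be long.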


  • Multicore Architecture Process Scheduling: Monarch IPU Considerations

    • Early CPU initialization runs only on the Monarch

    • In normal operation, any IPU can initiate I/O

    • Only the Monarch (aka IPU 0) receives I/O interrupts

    • The Interrupt Process that services the interrupt typically runs on IPU 0 but can run on any IPU

    • A side effect of this is that on a very busy CPU, IPU 0 is

    largely consumed by system work (usually the Message System

    Interrupt Process)


  • What is a Process Scheduler?

    • The Process Scheduler (PS) picks which IPU a process is to run on

    – Each IPU has a distinct ready list

    – The PS decides which other list a process gets placed upon

    – The PS is composed of sampling logic to see how the system runs and decision logic to

    rearrange the contents of the ready lists

    – The PS uses the recent past history to guide it towards optimizing use of all of the IPUs

    • The Dispatcher is the logic that unloads the current process and loads

    the process at the top of the ready list.

    – Prior to the Multicore Architecture the NonStop OS only needed a dispatcher since the

    ready list was priority ordered and there was only one IPU to run processes

    And what is a dispatcher


  • IPU-Level Scheduling Considerations

    • A process may have cache affinity with the IPU in which it has been running

    – Instructions currently in its instruction cache

    – Data currently in its data cache

    • If a process that has data or instructions in one IPU’s cache is moved to another IPU, that data or block of instructions has to be pulled over to the other IPU’s cache as it is accessed

    – Note: the Montvale and Tukwila Itanium processors have dedicated per-core caches.

    • In some cases, there are additional factors affecting where an individual process should be scheduled to run (its IPU affinity)


  • IPU Affinity

    • Some types of system processes are not load balanced by the Process Scheduler (PS):

    – Dynamic: IPU selected by dispatcher millicode when the process is made ready to run

    • Currently used only for selected Interrupt Processes and Auxiliary Processes

    – Hard: locked to an IPU for the process’ lifetime

    • Currently used only for performance IPs (processH data collection) and the Idle AP

    • The Process Scheduler (PS) currently load balances processes with the following types of IPU affinity:

    – Group: entire process group kept on a single IPU and load-balanced together

    • Used for DP2 process groups

    – Soft: IPU selected by PS based on load


  • When to Schedule?

    • When a process comes out of wait (e.g. waiting for a message completion)

    – If the process was recently run then it will typically run best on the IPU that it was running on (i.e. it had data in the various HW caches)

    – If the process has not run for a while then a more lightly loaded IPU would be a good choice

    • Periodically compare the load among the IPUs

    – If out of balance then move a process from a heavily loaded IPU to a lightly loaded one.

    – Usually do NOT move from the front of the list as those processes have high cache affinity

    • When an IPU goes idle
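
    The wake-up case above amounts to a small placement decision, sketched below. This is an illustration of the heuristic as the slide describes it, not the actual Process Scheduler algorithm: `choose_ipu_on_wake` is a hypothetical function, the load values are arbitrary numbers, and the 100 ms "recently run" threshold is an assumed parameter.

```python
def choose_ipu_on_wake(last_ipu, idle_ms, ipu_loads, recent_ms=100):
    """Pick an IPU for a process that has just come out of wait.

    last_ipu  -- index of the IPU the process last ran on
    idle_ms   -- how long ago the process last ran, in milliseconds
    ipu_loads -- one load figure per IPU (higher means busier)
    recent_ms -- assumed threshold below which cache contents are likely warm
    """
    if idle_ms < recent_ms:
        return last_ipu  # caches are probably still warm: stay put
    # Otherwise cache affinity is likely lost, so prefer the lightest IPU.
    return min(range(len(ipu_loads)), key=lambda i: ipu_loads[i])
```

    For example, with loads of 0.1, 0.2, 0.9, and 0.3, a process that last ran 5 ms ago on IPU 2 stays on IPU 2 despite the load, while one that last ran 500 ms ago is placed on IPU 0.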


  • Process Scheduler (PS) change History

    • In J06.03 and J06.04, the PS balances only the DP2 workload, rearranging DP2 process groups as needed

    • In J06.05, in addition to continuing to balance the DP2 workload, the PS also balances the complete workload across the IPUs by rearranging Soft Affinity processes

    – To minimize the effect of cache affinity, the PS only moves Soft Affinity processes that haven’t run recently (currently defined as 100 ms)

    • In J06.11 the PS balances the DP2 workload and includes Interrupt Processes (IPs) as part of the equation

    – On a 4-way CPU the system load, represented by the IPs, can be substantial and can consume an entire IPU

    – Keep in mind that one IPU is only 25% of the total CPU in an NB54000c

    • In J06.11 the PS has special handling of special situations

    – For example, during RELOAD on a very busy system it takes action to ensure forward progress is made.

    • The algorithms for assignment of both Group and Soft Affinity processes to IPUs are subject to release-to-release variation as the scheduler is enhanced
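
    The J06.05 rule (move only Soft Affinity processes that have not run within the last 100 ms) can be sketched as a selection function. This is a simplified illustration, not the real scheduler, whose algorithms the slide says vary from release to release: `pick_move`, the tuple layout, and the use of summed "cost" as the load measure are all assumptions.

```python
def pick_move(ready_lists, now_ms, recent_ms=100):
    """Choose one process to move from the busiest IPU to the lightest.

    ready_lists -- one list per IPU of (name, cost, last_run_ms) tuples
    Returns (name, source_ipu, target_ipu), or None if nothing should move.
    """
    loads = [sum(cost for _, cost, _ in procs) for procs in ready_lists]
    src = max(range(len(loads)), key=loads.__getitem__)
    dst = min(range(len(loads)), key=loads.__getitem__)
    if src == dst:
        return None  # loads are equal; nothing to balance
    for name, _, last_run_ms in ready_lists[src]:
        # Skip processes that ran recently: their cache contents are still useful.
        if now_ms - last_run_ms >= recent_ms:
            return (name, src, dst)
    return None
```

    In the test data below, process "a" on the busy IPU ran 10 ms ago and is left alone, while "b", idle for 500 ms, is the one selected to move.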


  • When is IPU imbalance a problem?

    • Sometimes the workload balance across IPUs can become skewed

    • This in and of itself is not a problem

    – And moving processes to fully balance IPUs may result in even loads but not necessarily

    improved throughput or reduced latency (and could make them worse)

    • It is important to see the forest for the trees, more throughput and lower

    latency are the goals, not IPU balance


  • Where Are IPUs User Visible?

    • IPUs are system resources that make more CPU cycles available

    • Individual IPUs are externalized only selectively:

    – Measure’s CPU entity shows IPU level busy/idle

    – Measure’s Process entity shows (as of J06.09)

    • The current IPU number

    • The number of IPU switches in the interval

    • If a switch was made, the previous IPU number (or the most recent one, if there was more than one switch)

    – PEEK shows the number of IPUs in a CPU

    – PROCESSOR_GETINFOLIST_ has an attribute that returns the number of IPUs in a given CPU

    – Multiprocessor CPU model numbers are distinct from uniprocessor model numbers and do not denote the number of IPUs in the CPU

    – The IPU can be displayed/selected in the debugger for memory dump analysis


  • Performance Data

    • Where to get it and how to view it?


  • Measure

    • CPU entity includes IPU-level information

    – Rate on: 100% means all IPUs are 100% busy

    – Rate off: Total IPU seconds

    • So 4 CPU seconds per elapsed second on the NB54000c

    • Process Entity

    – Rate on: 100% means 1 IPU-second of processing

    – As of J06.09 the following are also displayed

    • The current IPU number

    • The number of IPU switches in the interval

    • If a switch was made, the previous IPU number (or the most recent one, if there was more than one switch)

    • Must use ZMS style records to retrieve this data, as legacy records don’t include enough information (i.e., the IPU count)


  • Retrieving IPU-Level Performance Data

    • Raw data related to process execution time can be obtained from the

    system via:

    • PROCESSOR_GETINFOLIST_ attribute #74

    – Total number of IPUs contained in the CPU (IPU-count)

    • PROCESS_GETINFOLIST_ attribute:

    – #30: Total process time, not a rate

    – #137: The IPU number the process was last run on

    • Any rate calculation to determine CPU busy percentage or IPU busy

    percentage needs to be done by the caller


  • Retrieving IPU-level performance data

    • The percentage of time a process consumed an IPU for a specified duration can be calculated by:

    ( process-time / elapsed-time ) × 100

    • The percentage of time a process consumed a CPU for a specified duration can be calculated by:

    ( ( process-time / elapsed-time ) / IPU-count ) × 100

    • Process-time is defined as the delta of the values returned in attribute

    #30 (process time) from two calls to PROCESS_GETINFOLIST_
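
    The two formulas can be worked through in a few lines. This sketch does not call PROCESS_GETINFOLIST_ or PROCESSOR_GETINFOLIST_; it assumes the caller has already obtained two readings of attribute #30 and the IPU count, and that the process times and the elapsed time are expressed in the same unit. The function names are hypothetical, and the results are multiplied by 100 to express them as percentages.

```python
def ipu_busy_percent(process_time_start, process_time_end, elapsed):
    """Percent of one IPU consumed by a process over the interval."""
    process_time = process_time_end - process_time_start  # delta of two readings
    return 100.0 * process_time / elapsed


def cpu_busy_percent(process_time_start, process_time_end, elapsed, ipu_count):
    """Percent of the whole logical CPU (all of its IPUs) consumed."""
    return ipu_busy_percent(process_time_start, process_time_end, elapsed) / ipu_count


# Example: the process accumulated 500,000 time units during a
# 1,000,000-unit interval on a 4-IPU CPU.
ipu_pct = ipu_busy_percent(1_000_000, 1_500_000, 1_000_000)
cpu_pct = cpu_busy_percent(1_000_000, 1_500_000, 1_000_000, 4)
```

    Here the process used 50% of one IPU, which is 12.5% of a 4-way CPU. This is why a single fully busy process shows as only 25% CPU busy on an NB54000c.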


  • Retrieving IPU-level performance data

    • The process busy time can also be obtained by calling:

    – PROCESS_GETINFO_

    – PROCESSTIME

    – MYPROCESSTIME

    • PROCESS_GETINFOLIST_ attribute #137 returns the IPU number the

    process was last run on

    – Applications should not assume IPU numbers are sequential

    – They should not assume that it is the “current” IPU (things change)

  • Explicit User Control over process to IPU mapping


    • J06.12 release introduces this feature

    • It includes both a command and programmatic interface to force processes to

    only run on a designated IPU

    – Includes many system processes such as DP2 and the ServerNet Interrupt processes.

    • It also allows a degree of control over Process Scheduler features

    • All controls are on running processes, not at process launch

    – So you map processes after they are created

    • This feature has been long asked for

    – but can be a very dangerous mechanism since it overrides the system’s control

    • See the More Information slide for the manuals to consult

  • More Information:

    • Important Support Notes:

    – S08093: NSMA Process Busy Instrumentation Overview

    – S11044: Tuning Guidelines For NB54000c NonStop Servers

    – S11008: Workload Imbalance in Multicore NonStop Processors

    • Explicit Control over IPU Affinity Manuals (J06.12)

    – Guardian Programming Guide

    – Guardian Procedure Calls Manual

    – TACL Reference Manual

    – NonStop Operating System Event Management Programming Manual


  • THANK YOU

  • BACKUP SLIDES

  • MORE ON THE POSIX USER THREADS (PUT) LIBRARY


  • Thread-Aware I/O Availability

    Pre-POSIX User Thread

    | Called directly from | Thread-aware system I/O on non-disk file | Thread-aware system I/O on disk file | Thread-aware C I/O on non-disk file | Thread-aware C I/O on disk file | Thread-aware C++ I/O streaming |
    | SPT-based application | √ | √ | √ | X | X |
    | Public DLL linked to SPT-based applications | X | X | X | X | X |

    Post-POSIX User Thread

    | Called directly from | Thread-aware system I/O on non-disk file | Thread-aware system I/O on disk file | Thread-aware C I/O on non-disk file | Thread-aware C I/O on disk file | Thread-aware C++ I/O streaming | Thread-aware public DLL functions |
    | SPT-based application | √ | √ | √ | X | X | X |
    | Public DLL linked to SPT-based applications | X | X | X | X | X | X |
    | PUT-based application | √ | √ | √ | √ | √ | √ |
    | Public DLL linked to PUT-based applications | √ | √ | √ | √ | √ | √ |

  • Migrating from the SPT library to the PUT library

    • PUT-based applications must statically link with the PUT library (ZPUTDLL, located in the sysnn subvolume).

    • You may not mix the existing POSIX threads library (SPT) and the PUT library in the same process.

    – Attempts to do so fail with runtime error 019.

    • Thread-aware signals are always enabled in the PUT library.

    • SPT-specific pthread_attr_default, pthread_mutexattr_default and pthread_condattr_default global variables do not exist in the PUT model library.

    – These symbols are no longer defined in the final version of the IEEE standard.

    – For compatibility, they are provided as macros (i.e. “#define”) in put_extensions.h.

    • The PUT library functions return -1 and set errno rather than returning the errno value.

    • unistd.h has a set of macros that influence HP’s thread implementation.

    – See the Open System Services Programmer’s Guide for details.

    POSIX User Threads migration considerations


  • Migrating from the SPT (T1248) library to the PUT library (T1280) – Compiler defines

    POSIX User Threads

    • Include pthread.h in /usr/include instead of spthread.h

    – #include <pthread.h>

    • _PUT_MODEL_

    – This define is required for threaded applications using the PUT library.

    – It may be specified in the source code as a #define or on the compiler command line

    with the –D switch.

    • _PUT_SELECT_SINGLE_, if a faster select() algorithm is desired

    – This is equivalent to SPT_SELECT_SINGLE in the SPT library (supply a single file

    descriptor).


  • Migrating from the SPT library to the PUT library – Environment variables

    POSIX User Threads

    • PUT_THREAD_AWARE_REGULAR_IO_DISABLE

    – Setting this environment variable before running an application that uses the PUT library

    causes thread-aware APIs to have process blocking behavior.

    – Disables thread-aware regular I/O behavior.

    – Thread-aware APIs ignore the O_NONBLOCK flag for regular files.

    • PUT_PROTECTED_STACK_DISABLE

    – Setting this environment variable causes the thread stack to be allocated from the

    process’ heap instead of being allocated on a separate protected stack.

    – The default is to allocate on a separate protected stack.


  • Migrating from the SPT library to the PUT library – linking

    POSIX User Threads

    • PUT_THREAD_AWARE_REGULAR_IO_DISABLE

    – Disables thread-aware regular I/O behavior.

    – Thread-aware APIs ignore the O_NONBLOCK flag for regular files.

    – Setting this environment variable before running an application that uses the PUT library

    causes thread-aware APIs to have process blocking behavior.

    – See the Open System Services Programmer’s Guide for details.

    • PUT_PROTECTED_STACK_DISABLE

    – Setting this environment variable causes the thread stack to be allocated from the

    process’ heap instead of being allocated on a separate protected stack.

    – The default is to allocate on a separate protected stack.


  • Signal handling topics (1 of 3)

    POSIX User Threads migration considerations

    • PUT model applications cannot call sigaltstack() to establish a signal stack.

    – However, they can call sigaltstack() to query information about the signal stack.

    • signal() is deprecated; use sigaction() instead.

    • See the Open System Services Programmer’s Guide for a list of APIs that support thread-aware signal handling.

    • Several aspects are similar to the SPT library:

    – The Open System Services Programmer’s Guide lists a set of functions that are not async-signal safe and should not be used in signal handlers.

    – Job control signals and their corresponding actions are not supported.

    – The Real-time Signals Extension option is not supported.
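A minimal sketch of installing a handler with sigaction() rather than signal() follows. The handler only sets a flag and returns. The names on_signal(), install_handler(), and signal_was_seen() are illustrative and not from the slides.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for sigaction() */
#endif
#include <signal.h>
#include <string.h>

/* Flag written by the handler; sig_atomic_t keeps the store signal-safe. */
static volatile sig_atomic_t got_signal = 0;

/* Handler does only async-signal-safe work and then simply returns. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs on_signal() for signo using sigaction().
 * Returns 0 on success, -1 on failure. */
int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);   /* block no extra signals in the handler */
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 once the handler has run, 0 before that. */
int signal_was_seen(void)
{
    return got_signal != 0;
}
```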


  • Signal handling topics (2 of 3)

    Additional aspects that are similar to the SPT library:

    • Some signals affect the entire process.

    • If a signal is delivered to a thread waiting on a condition variable, then upon return from the signal handler, the thread will resume waiting for the condition variable as if it were not interrupted.

    • Do not use the raise() API for an exception, and do not use the longjmp() API to exit a synchronous signal handler. Simply returning from a signal handler is safe and is the recommended practice.

    • Use exception handlers instead of the setjmp() and longjmp() APIs for threaded OSS applications.

    • Do not use the option of the sigsetjmp() and siglongjmp() APIs that saves and restores the signal mask. Doing so can cause signals enabled by another thread to be masked.


  • Signal handling topics (3 of 3)

    • Do not deliver signals like SIGALRM and SIGCHLD using the kill command or the raise() function, because these do not provide enough information about the process IDs for the parent and child processes.

    – Instead, use the PUT Model library version of the alarm() API to raise alarms.
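The alarm() recommendation can be sketched as follows, assuming the PUT version of alarm() behaves like the standard POSIX call. The function wait_for_alarm() is a hypothetical helper: it arms the timer with alarm() and waits until the SIGALRM handler has run.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for sigaction(), pthread_sigmask() */
#endif
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

/* Handler only records that SIGALRM arrived, then returns. */
static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Arms a timer with alarm() and waits for SIGALRM in the calling thread.
 * Returns 0 once the alarm has fired, -1 if setup failed. */
int wait_for_alarm(unsigned int seconds)
{
    struct sigaction sa;
    sigset_t block, previous, wait_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Block SIGALRM so it cannot arrive before sigsuspend() is waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (pthread_sigmask(SIG_BLOCK, &block, &previous) != 0)
        return -1;

    /* Mask used while waiting: the previous mask with SIGALRM unblocked. */
    wait_mask = previous;
    sigdelset(&wait_mask, SIGALRM);

    alarm_fired = 0;
    alarm(seconds);               /* raise the alarm via alarm(), not kill/raise */
    while (!alarm_fired)
        sigsuspend(&wait_mask);   /* atomically unblock SIGALRM and wait */

    pthread_sigmask(SIG_SETMASK, &previous, NULL);
    return 0;
}
```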


  • APIs in the SPT library that do not exist in the PUT library

    • The PUT model library does not provide analogs to spt_*x() or spt_*z() APIs, such as spt_readx() or spt_readz(), whose purpose is to be non-blocking alternatives to standard system library functions.

    – Instead, use the standard API names.

    • The PUT model library does not provide analogs to spt_*() functions, such as spt_fprintf(), whose purpose is to be non-blocking alternatives to the standard C run-time library functions.

    – Instead, use the standard API names.

    – The C run-time library functions are thread-aware if called from a PUT-based application.
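In migrated code, this means the thread bodies call the plain read() and write() names. The sketch below passes a short message through a pipe from a worker thread to the calling thread using only standard names. The functions writer() and pipe_round_trip() are illustrative, not part of either library.

```c
#ifndef _PUT_MODEL_
#define _PUT_MODEL_
#endif
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Thread body: writes "hello" (including the trailing NUL) to the pipe
 * with the standard write(), not an spt_*x() or spt_*z() variant. */
static void *writer(void *arg)
{
    int fd = *(int *)arg;
    const char msg[] = "hello";

    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg)
        return (void *)1;
    return NULL;
}

/* Hypothetical helper: starts writer() and reads its message with the
 * standard read(). Returns the byte count read, or -1 on setup failure. */
long pipe_round_trip(char *buf, size_t buflen)
{
    int fds[2];
    pthread_t tid;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;
    if (pthread_create(&tid, NULL, writer, &fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = read(fds[0], buf, buflen);   /* waits until the writer has written */
    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    return (long)n;
}
```

Including unistd.h here matters: it supplies the explicit declarations of read() and write().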


  • Pitfalls

    • See the Open System Services Programmer’s Guide for lists of thread-aware APIs, non-thread-safe APIs, non-thread-safe cancellation points, and APIs that are potential cancellation points.

    • Do not mix modules compiled to use the PUT library with non-thread-aware modules when the modules perform I/O to the same files.

    • Look out for compiler warnings that a function was declared implicitly, or reports that no explicit definition of a function appeared and therefore the compiler assumed defaults for all function attributes.

    – Consequently, calls to that function are not aliased to the non-blocking variant and cause an unneeded and unwanted suspension of the entire process.


  • MORE ON OSS RESTRICTED FILESETS


  • RestrictedAccess fileset attribute

    • Can be modified only by users who are members of both the new SECURITY-PRV-ADMIN (SPA) group and the SUPER group.

    • “DISABLED”: normal, unrestricted OSS fileset (default).

    • “ENABLED”: a RestrictedAccess fileset which can only be accessed remotely by systems that have full RestrictedAccess fileset support.

    • “LOCAL”: a RestrictedAccess fileset which can be accessed remotely by systems without full RestrictedAccess fileset support.

    – This option is intended only for use during migration of multiple systems to H06.22 / J06.11 or newer RVUs.


  • Access is granted to RestrictedAccess filesets for:

    • Authenticated users (those who logged on with a password), subject to the usual file permission checks.

    – Super-user is treated like a normal user ID.

    • HP-authorized programs such as BACKUP2 and RESTORE2 that are required for fileset maintenance.

    • Customer-authorized programs that perform privileged operations to switch IDs (e.g., setuid()) and then access a RestrictedAccess fileset as that new ID.

    • NFS clients, including an NFS client whose ID maps to the super-user.


  • Access is unconditionally denied to RestrictedAccess filesets for:

    • Unauthenticated users (no logon password).

    – Stops super-user from logging on as another user ID without a password to gain access as that user ID.

    • Unauthorized programs that perform privileged operations to switch IDs (e.g., setuid()) and then access a fileset as that new ID.

    – Stops super-user from running an unauthorized program to switch to another ID and gain access.
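The grant and deny rules above can be summarized as a small decision function. This is a toy model for illustration only, not the actual OSS implementation: the structure, field names, and function name are all hypothetical. It leaves out the HP-authorized maintenance programs, and a "true" result still has to pass the usual file permission checks.

```c
#include <stdbool.h>

/* Hypothetical description of one access attempt. */
typedef struct {
    bool fileset_restricted;   /* fileset is a RestrictedAccess fileset */
    bool is_nfs_client;        /* request comes from an NFS client */
    bool authenticated;        /* user logged on with a password */
    bool switched_id;          /* current ID came from a privileged switch, e.g. setuid() */
    bool program_authorized;   /* program is authorized to make that switch */
} access_request;

/* Returns true if the request passes the RestrictedAccess gate.
 * There is no super-user shortcut: the super-user is treated like
 * any other user ID. */
bool passes_restricted_access_gate(const access_request *r)
{
    if (!r->fileset_restricted)
        return true;                    /* unrestricted fileset: no extra gate */
    if (r->is_nfs_client)
        return true;                    /* NFS clients are granted access */
    if (r->switched_id)
        return r->program_authorized;   /* ID switch needs an authorized program */
    return r->authenticated;            /* otherwise a password logon is required */
}
```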


  • Separation of duties

    • Super-user

    – Treated as a normal user in RestrictedAccess filesets.

    – Can run authorized programs to switch to another ID (e.g., setuid()) and then access RestrictedAccess filesets.

    • SECURITY-OSS-ADMIN members (SOAs)

    – Have some special access privileges for fileset maintenance (change owner and file permissions).

    – Can run authorized programs which can directly open, read, write, rename, and create files in RestrictedAccess filesets (e.g., BR2).

    • SECURITY-PRV-ADMIN members (SPAs) – new

    – Have special ability to set OSS file privileges on Guardian and OSS programs.

    – Can modify the RestrictedAccess fileset attribute (if also a member of the SUPER group).

  • Special access privileges for RestrictedAccess fileset maintenance

    • Members of the SECURITY-OSS-ADMIN security group (SOAs) can perform the following commands on any file, just as for unrestricted filesets:

    cd(1), ls(1), chown(1), chmod(1)*, and setacl(1)

    • Members of the new SECURITY-PRV-ADMIN security group (SPAs) can perform the following commands on any directory or disk file:

    cd(1), ls(1), setfilepriv(1)

    * chmod(1) cannot set the set-uid or set-gid mode bits.


  • New OSS file privileges

    • PRIVSETID

    Authorizes a process to perform a privileged set-ID operation to switch to another ID without providing a password, and then access a RestrictedAccess fileset as that ID.

    • PRIVSOARFOPEN

    Authorizes a process to perform the open, create, remove, rename, and set permissions operations needed to perform a BR2 backup and restore of a RestrictedAccess fileset.

    These privileges are set on individual Guardian and OSS program files by a SPA member using setfilepriv(1).


  • OSS file privilege rules

    • OSS file privileges only apply to executables, user libraries, and ordinary DLLs.

    • Public and implicit DLLs do not need OSS file privileges; they are assumed.

    • If the main executable has an OSS file privilege, then any user libraries or ordinary DLLs loaded by that process must also have the same or greater OSS file privileges, or a load error is reported.

    • If a file with an OSS file privilege is modified, any file privileges are removed, and a Safeguard audit record and EMS event are generated.

    – Only a SPA member can set the file privilege back on a file that was changed.
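The "same or greater" load rule amounts to a superset check on the privilege sets. In the sketch below, the bit values and the function name are hypothetical; the slides do not give the real encoding of the file privileges.

```c
#include <stdbool.h>

/* Hypothetical bit values standing in for the two OSS file privileges. */
enum {
    PRIV_SETID      = 1 << 0,   /* stands in for PRIVSETID */
    PRIV_SOARFOPEN  = 1 << 1    /* stands in for PRIVSOARFOPEN */
};

/* A user library or ordinary DLL may be loaded only if it carries every
 * privilege that the main executable carries. */
bool library_load_allowed(unsigned exe_privs, unsigned lib_privs)
{
    return (lib_privs & exe_privs) == exe_privs;
}
```

With these stand-in values, an executable holding PRIV_SETID can load a library holding both privileges, but not a library holding none.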


  • SECURITY-PRV-ADMIN group

    • Members of the SPA group can:

    – Set/modify file privileges.

    – Set/modify the value of the RestrictedAccess OSS fileset attribute, if the user is also a member of the SUPER group.

    • Membership in the SPA security group is marked in the user’s environment during logon.

    • Any user or alias (including SPA members) executing a SETUID or PROGID program owned by a SPA security group member is not granted any SPA privileges.


  • SAFECOM commands

    Examples

    • ADD SECURITY-GROUP SECURITY-PRV-ADMINISTRATOR, ACCESS SECGRP.* (E)

    • ALTER SECURITY-GROUP SECURITY-PRV-ADMINISTRATOR, AUDIT-MANAGE ALL

    • INFO SECURITY-GROUP SECURITY-PRV-ADMINISTRATOR

                                 LAST-MODIFIED   OWNER         STATUS
    SECURITY-PRV-ADMINISTRATOR   1MAY10, 13:20   SUPER.SUPER   THAWED
       GROUP \*.SECGRP   E
       AUDIT-ACCESS-PASS = NONE   AUDIT-MANAGE-PASS = ALL
       AUDIT-ACCESS-FAIL = NONE   AUDIT-MANAGE-FAIL = ALL

  • New/Changed OSS system calls


    • setfilepriv(2) – sets file privileges on a file

    #include

    int setfilepriv(const char *path, const unsigned char *fileprivs);

    PARAMETERS

    path       Points to the OSS pathname of the executable file.

    fileprivs  Points to the bit pattern that determines the file privileges.

    • stat(2) family of system calls – returns the OSS file privileges on a file

  • New/Changed OSS commands


    • getfilepriv(1)

    – used to display the file privileges on a file

    • setfilepriv(1)

    – used to set and remove OSS file privileges on a file

    • find(1)

    – used to locate files with file privileges

    • ls(1)

    – may optionally display a flag indicating a file privilege on a file (e.g., ls -lP)

  • Updated SCF Commands


    • add fileset

    • alter fileset

    • info fileset

    • status fileset

    • Example:

    > alter fileset ABC, restrictedaccess enabled

    > control fileset ABC, sync
