IBM System p Compiler Roadmap

Roch Archambault, IBM Toronto Laboratory, [email protected]

Compilation Technology, IBM Software Group

SCICOMP-13 | July 19, 2007 | © 2007 IBM Corporation


Agenda

• Overall Roadmap
• The System p Compiler Products
• Detailed Roadmaps
    - Common Features & Compiler Architecture
    - XL Fortran
    - XL C/C++
    - XL Compilers for Blue Gene
    - XL Compilers for Cell
    - XL UPC Compiler
• Customer Requirements
• Online documentation
• Performance Comparison
• Q&A


Roadmap of XL Compiler Releases

[Figure: release timeline along the development line for 2006, 2007 and 2008. Releases shown: V8.0 & V10.1 PTFs for AIX, Linux (LNX) and BG/L; V8.0.1 & V10.1.1 for Linux; V9.0 & V11.1 for AIX, Linux, BG/P and BG/L; C/C++ V0.8 for CELL on alphaWorks; C/C++ V9.0 for CELL; Fortran V11.1 for CELL; V10.0 & V12.1 for AIX and Linux. Linux distribution labels on the chart: SLES 9, SLES 10, SLES 10 & RHEL4, SLES 10 & RHEL5.]

All information subject to change without notice


The System p Compiler Products: Latest Versions

• All POWER4, POWER5, POWER5+ and PPC970 enabled
    - XL C/C++ Enterprise Edition V8.0 for AIX
    - XL Fortran Enterprise Edition V10.1 for AIX
    - XL C/C++ Advanced Edition V8.0 for Linux (SLES 9 & RHEL4)
    - XL Fortran Advanced Edition V10.1 for Linux (SLES 9 & RHEL4)
    - XL C/C++ Advanced Edition V8.0.1 for Linux (SLES 10 & RHEL4)
    - XL Fortran Advanced Edition V10.1.1 for Linux (SLES 10 & RHEL4)
    - XL C/C++ Enterprise Edition for AIX, V9.0 (POWER6 enabled)
    - XL Fortran Enterprise Edition for AIX, V11.1 (POWER6 enabled)
• Blue Gene/L (BG/L) enabled
    - XL C/C++ Advanced Edition V8.0 for BG/L (PRPQ)
    - XL Fortran Advanced Edition V10.1 for BG/L (PRPQ)


The System p Compiler Products: Latest Versions

• Technology Preview currently available from alphaWorks
    - IBM XL C/C++ Alpha Edition for Cell Broadband Engine Processor on Linux, V0.8.1 (supports SDK 2.0 on FC5 x86 and PPC)
    - IBM XL C/C++ Alpha Edition for Cell Broadband Engine Processor on Linux, V0.8.2 (supports SDK 2.1 on FC6 x86 and PPC)
      Download: http://www.alphaworks.ibm.com/tech/cellcompiler
    - IBM XL Fortran Alpha Edition for Cell Broadband Engine Processor on Linux, V0.11
      Download: http://www.alphaworks.ibm.com/tech/cellfortran
    - XL UPC language support on AIX and Linux
      Download: http://www.alphaworks.ibm.com/tech/upccompiler


The System p Compiler Products: End Of Service

• The following compilers went out of service in April 2007:
    - VisualAge C++ Version 6.0 for AIX
    - VisualAge C++ Version 6.0 for Linux
    - C for AIX V6.0
    - XL Fortran Version 8.1 for AIX
    - XL Fortran Version 8.1 for Linux


The System p Compiler Products: Future Versions

• POWER6 enabled Linux releases
    - XL C/C++ Advanced Edition for Linux, V9.0
    - XL Fortran Advanced Edition for Linux, V11.1
• Blue Gene/P (BG/P) enabled
    - XL C/C++ Advanced Edition for BG/P, V9.0
    - XL Fortran Advanced Edition for BG/P, V11.1
• Blue Gene/L (BG/L) enabled
    - XL C/C++ Advanced Edition for BG/L, V9.0
    - XL Fortran Advanced Edition for BG/L, V11.1

All information subject to change without notice


The System p Compiler Products: Future Versions

• Cell/B.E. cross compiler products:
    - XL C/C++ for Cell SDK 3.0 Linux, V9.0 (x86 and PPC)
    - XL Fortran for Cell SDK 3.0 Linux, V11.1 (PPC only)
• Cell/B.E. cross compilers from alphaWorks:
    - XL C/C++ for Cell SDK 3.0 Linux, V0.9.1 (x86 and PPC)
      (Tech preview for single source compiler)
• Extra POWER6 performance
    - XL C/C++ Enterprise Edition for AIX, V10.0
    - XL Fortran Enterprise Edition for AIX, V12.1
    - XL C/C++ Advanced Edition for Linux, V10.0
    - XL Fortran Advanced Edition for Linux, V12.1

All information subject to change without notice


Common Fortran, C and C++ Features

• Linux (SLES and RHEL) and AIX, 32 and 64 bit
• Debug support
    - Debuggers on AIX: TotalView (TotalView Technologies), DDT (Allinea), IBM Debugger and DBX
    - Debuggers on Linux: TotalView, DDT and GDB
• Full support for debugging of OpenMP programs (TotalView)
• Snapshot directive for debugging optimized code
• Portfolio of optimizing transformations
    - Instruction path length reduction
    - Whole program analysis
    - Loop optimization for parallelism, locality and instruction scheduling
    - Use profile directed feedback (PDF) in most optimizations
• Tuned performance on POWER3, POWER4, POWER5, PPC970, PPC440, POWER6 and CELL systems
• Optimized OpenMP


IBM XL Compiler Architecture

[Figure: compiler architecture diagram. Front ends: C FE, C++ FE and FORTRAN FE, which emit Wcode. Optimizer: TPO, used for Compile Step Optimization (O3, O4 and O5) and Link Step Optimization (O4 and O5), exchanging Wcode and Wcode+. Back end: TOBEY, which receives Wcode directly at noopt and O2 and receives Partitions from TPO otherwise, and produces Optimized Objects. Inputs to the link step: IPA Objects, Libraries and PDF info gathered from instrumented runs. The System Linker combines Optimized Objects with Other Objects to produce the EXE or DLL.]


XL Fortran Roadmap: Strategic Priorities

• Superior Customer Service
    - Continue to work closely with key ISVs and customers in scientific and technical computing industries
• Compliance to Language Standards and Industry Specifications
    - OpenMP API V2.5
    - Fortran 77, 90 and 95 standards
    - Fortran 2003 Standard
• Exploitation of Hardware
    - Committed to maximum performance on POWER4, PPC970, POWER5, POWER6, PPC440 and successors
    - Continue to work very closely with processor design teams


XL Fortran Version 11.1 for AIX/Linux – Spring/Summer 2007

• AIX Announcement Letter: http://www.ibm.com/common/ssi/fcgi-bin/ssialias?infotype=an&subtype=ca&appname=Demonstration&htmlfid=897/ENUS207-125
• Continued rollout of Fortran 2003
• Compliant to OpenMP V2.5
• Perform subset of loop transformations at -O3 optimization level
• Tuned BLAS routines (DGEMM and DGEMV) are included in compiler runtime (libxlopt)
• Recognize matrix multiply and replace with call to DGEMM
• Runtime check for availability of ESSL
• Support for auto-simdization and VMX intrinsics (and data types) on AIX
• Inline MASS library functions (math functions)
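The matrix multiply recognition mentioned above targets loop nests of the following general shape. This is only an illustrative sketch written in C: the function name `matmul_naive` and the row-major layout are assumptions made for the example, and the slides do not state exactly which source forms the compiler recognizes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: C = A * B for row-major matrices.
   A is m x k, B is k x n, C is m x n.
   A plain triple loop like this is the textbook form of matrix multiply,
   the kind of pattern a compiler could replace with a tuned DGEMM call. */
void matmul_naive(size_t m, size_t n, size_t k,
                  const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];  /* dot product of row i and column j */
            c[i * n + j] = sum;
        }
    }
}
```

Calling `matmul_naive(2, 2, 2, a, b, c)` on two 2 x 2 matrices fills `c` with their product; a tuned library routine computes the same result with blocking and vectorization that the naive loop lacks.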


XLF 11.1 delivers most of the remaining F2003 standard

• Full support of procedure pointers and allocatable object semantics
• Object-oriented Fortran programming with constructs similar to C++ classes, methods, and destructors
• User-defined derived type I/O
• Derived Type Parameters (similar to C++ templates) will be the only major F2003 feature not available in 11.1


XL C/C++ Roadmap: Strategic Priorities

• Superior Customer Service
• Compliance to Language Standards and Industry Specifications
    - ANSI / ISO C and C++ Standards
    - OpenMP API V2.5
• Exploitation of Hardware
    - Committed to maximum performance on POWER4, PPC970, POWER5, PPC440, POWER6 and successors
    - Continue to work very closely with processor design teams
• Exploitation of OS and Middleware
    - Synergies with operating system and middleware ISVs (performance, specialized function)
    - Committed to AIX Linux affinity strategy and to Linux on pSeries
• Reduced Emphasis on Proprietary Tooling
    - Affinity with GNU toolchain


XL C/C++ Version 9.0 for AIX/Linux – Spring/Summer 2007

• AIX Announcement Letter: http://www.ibm.com/common/ssi/fcgi-bin/ssialias?infotype=an&subtype=ca&appname=Demonstration&htmlfid=897/ENUS207-124
• Compliant to OpenMP V2.5
• Perform subset of loop transformations at -O3 optimization level
• Tuned BLAS routines (DGEMM and DGEMV) are included in compiler runtime (libxlopt)
• Recognize matrix multiply and replace with call to DGEMM
• Runtime check for availability of ESSL
• Support for auto-simdization and VMX intrinsics on AIX
• Inline MASS library functions (math functions)
• Exploit "restrict" keyword in C 1999
• Partial compliance to C++ TR1 libraries and Boost 1.34.0
• Support for -qtemplatedepth which allows the user to control number of recursive template instantiations allowed by the compiler.
• Exploit DFP and VMX on Power6.
• Improved inline assembler support
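As a reminder of what the C 1999 "restrict" keyword expresses, here is a minimal sketch. The function `axpy` is a hypothetical example, not taken from the slides; the point is only that `restrict` promises the compiler the two arrays do not overlap, which gives an optimizer more freedom to reorder or vectorize the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] += alpha * x[i].
   The restrict qualifiers state that x and y never alias, so loads from x
   cannot be affected by stores to y. The caller must honor that promise. */
void axpy(size_t n, double alpha,
          const double *restrict x, double *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++)
        y[i] += alpha * x[i];
}
```

Passing overlapping arrays to such a function is undefined behavior, so the qualifier belongs only on interfaces where the no-overlap contract genuinely holds.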


Blue Gene Compilers

XL C/C++ Advanced Edition V8.0 for BG/L and XL Fortran Advanced Edition V10.1 for BG/L

• Performance tuning of SPEC2000FP, DDCMD Kernels, NAS 3.2 Serial and sPPM.
• Performance tuning of MASS library
• Exploit 440D instructions for complex arithmetic
• BG/L compiler white paper (Exploiting the Dual FPU in BG/L): http://www.ibm.com/support/docview.wss?uid=swg27007511
• June 2006 PTF (compiler refresh):
    - Support Blue Gene software release 3
    - Overall SPEC2000FP faster for 440D than 440
    - Updated white paper to reflect June 2006 PTF performance improvements
• December 2006 PTF (compiler refresh):
    - Continue to improve 440D performance of benchmarks listed above
    - Updated white paper to reflect December 2006 PTF performance improvements


Blue Gene Compilers

XL C/C++ Advanced Edition for BG/P, V9.0 and XL Fortran Advanced Edition for BG/P, V11.1

• Support for OpenMP, automatic parallelization and dynamic linking
• Performance improvements: SIMD and other general optimizations
• MASS/MASSV performance improvements
• FEN (Front End Node) is SLES 10

XL C/C++ Advanced Edition for BG/L, V9.0 and XL Fortran Advanced Edition for BG/L, V11.1

• Same code base as BG/P release except FEN is SLES 9
• GA is one month after BG/P GA

All information subject to change without notice


Cell/B.E. Compilers

• Currently available on alphaWorks
• IBM XL C/C++ Alpha Edition for Cell Broadband Engine Processor on Linux, V0.8.1
    - Hosted on Linux x86 and Linux PPC (FC5)
    - Support SDK 2.0 interfaces
    - Targets QS20 Blade (Cell Blade 1 hardware)
• IBM XL C/C++ Alpha Edition for Cell Broadband Engine Processor on Linux, V0.8.2
• IBM XL Fortran Alpha Edition for Cell Broadband Engine Processor on Linux, V0.11
    - Hosted on Linux x86 (C/C++ only) and Linux PPC (FC6)
    - Support SDK 2.1 interfaces
    - Targets QS20 Blade
    - Support QS22 (Cell Blade 2 hardware) via simulator


Cell/B.E. Compilers

• Future cross compiler products:
    - Hosted on RHEL5U1 (Red Hat) and F7 (Fedora)
    - Hosted on x86 and PPC (separate products)
    - Support SDK 3.0 interfaces
    - Targets QS20 and QS21 Blades
    - Targets QS22 Blade (Soma Hardware or Cell Blade 2)
    - IBM XL C/C++ Advanced Edition for Multicore Acceleration for Linux, V9.0
    - IBM XL Fortran Advanced Edition for Multicore Acceleration for Linux, V11.1
• Future tech previews on alphaWorks:
    - Hosted on RHEL5U1 and F7
    - Hosted on x86 and PPC
    - Support SDK 3.0 interfaces
    - User directed single source compiler (using OpenMP)
    - IBM XL C/C++ alphaWorks Edition for Multicore Acceleration for Linux, V0.9.2

All information subject to change without notice


XL UPC Compiler

• Future tech previews on alphaWorks
    - Based on XL C V9.0 compiler
    - Compiler generated interface to the runtime system is identical for shared and distributed memory implementations
    - Optimizations take advantage of system architecture knowledge
• On AIX
    - Shared Memory (pthreads)
    - Distributed (LAPI)
• On Linux
    - Shared Memory (pthreads)
    - Distributed (LAPI)
• On BG/L
    - BG Message Layer
• Using approximately 1000 test scenarios:
    - GWU UPC test suite
    - UPC version of NAS benchmarks
    - Berkeley UPC test suite
    - MTU UPC test suite
    - HPC Challenge suite

All information subject to change without notice


XL UPC Compiler: Optimization Improvements

[Figure: log-scale chart of throughput in million adds/sec versus number of THREADS, with the data labels below.]

THREADS             1        2        4        8       15
Original         0.81     1.63     3.26     6.42    12.17
Aggregation      1.21     2.41     4.88     9.50    18.22
Privatization    2.29     4.59     9.09    18.34    34.19
All             33.34    71.06   141.64   284.16   528.52

Original loop:

    shared [BF] int RES[N], V1[N], V2[N];

    upc_forall (i = 0; i < N; i++; &RES[i]) {
        RES[i] = V1[i+BF] + V2[i+BF];
    }

Optimized loop:

    for (i = BF*MYTHREAD, blk = 0; i < N; i += BF*THREADS, blk++) {
        if (i % BF == 0) {   // aggregate V1 and V2
            upc_memget(&lv1[0], &V1[i+BF], BF*sizeof(int));
            upc_memget(&lv2[0], &V2[i+BF], BF*sizeof(int));
        }
        for (j = 0; j < BF; j++)   // privatize RES
            lptr[(blk*BF)+j] = lv1[j] + lv2[j];
    }

Throughput gains relative to the original code: 50% gain (Aggregation), 180% gain (Privatization), 43x gain (All).

All information subject to change without notice

Working on the following optimizations (some will be available in tech preview):

• Parallel loop reshaping
• Reduce the overhead of local-shared accesses
• Aggregate fine grained communication in bulk transfers
• Schedule remote shared accesses to hide latency
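The effect of aggregating fine grained communication into bulk transfers can be illustrated in plain C, without a UPC runtime. The sketch below is an assumption-laden model, not the compiler's actual transformation: `remote_get`, `remote_get_block`, `transfer_count` and `BLOCK` are hypothetical stand-ins that simply count how many "transfers" each version of the loop would issue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 4   /* hypothetical blocking factor, playing the role of BF */

/* Counts simulated remote transfers; a real UPC runtime would do communication here. */
static size_t transfer_count = 0;

/* Fine grained access: one transfer per element. */
static int remote_get(const int *remote, size_t idx)
{
    transfer_count++;
    return remote[idx];
}

/* Bulk access: one transfer per block, copied into a private buffer. */
static void remote_get_block(int *local, const int *remote, size_t idx, size_t n)
{
    transfer_count++;
    memcpy(local, remote + idx, n * sizeof(int));
}

/* Element-wise version: two transfers for every result element. */
void add_fine_grained(int *res, const int *v1, const int *v2, size_t n)
{
    for (size_t i = 0; i < n; i++)
        res[i] = remote_get(v1, i) + remote_get(v2, i);
}

/* Aggregated version: two transfers per block, then purely local arithmetic. */
void add_aggregated(int *res, const int *v1, const int *v2, size_t n)
{
    int lv1[BLOCK], lv2[BLOCK];
    for (size_t i = 0; i < n; i += BLOCK) {
        size_t len = (n - i < BLOCK) ? n - i : BLOCK;
        remote_get_block(lv1, v1, i, len);
        remote_get_block(lv2, v2, i, len);
        for (size_t j = 0; j < len; j++)
            res[i + j] = lv1[j] + lv2[j];
    }
}
```

For 8 elements and a block size of 4, the fine grained loop issues 16 simulated transfers while the aggregated loop issues 4, and both produce identical results. The real benefit depends on the cost of each transfer in the runtime, which this model does not capture.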


Customer Requirements Shipped In 2006

• Preserve User's CFG file (SPXXL 3.10):
    - Compiler reads a user config file specified using an environment variable (XLC_USR_CONFIG)

        # /etc/vac.cfg
        xlc:   use = DEFLT
               options = A B
        DEFLT: options = C

        # ./my.cfg
        xlc:   use = xlc
               options = D

    - Compiler would use "D A B C" as options for the xlc stanza.
• Separate environment variables for C/C++ (XLC_USR_CONFIG) and Fortran (XLF_USR_CONFIG)
• Shipped in August PTF for XL C/C++ V8.0 and XL Fortran V10.1
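The "D A B C" result follows from walking the chain of `use` references: the user stanza contributes its options first, then the stanza it uses, and so on. The following C sketch models only that ordering. The `struct stanza` type and `resolve_options` function are hypothetical illustrations and do not reflect how the compiler driver is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one config stanza. */
struct stanza {
    const char *name;           /* stanza label, e.g. "xlc" */
    const char *options;        /* options listed in this stanza */
    const struct stanza *use;   /* stanza named by "use", or NULL */
};

/* Concatenate options along the "use" chain, this stanza first.
   Returns 0 on success, -1 if the output buffer is too small. */
int resolve_options(const struct stanza *s, char *out, size_t outsz)
{
    size_t used = 0;
    if (outsz == 0)
        return -1;
    out[0] = '\0';
    for (; s != NULL; s = s->use) {
        size_t len = strlen(s->options);
        size_t need = len + (used > 0 ? 1 : 0);   /* plus a separating space */
        if (used + need + 1 > outsz)
            return -1;
        if (used > 0)
            out[used++] = ' ';
        memcpy(out + used, s->options, len + 1);  /* copies the terminator too */
        used += len;
    }
    return 0;
}
```

Modeling the slide's example as a user `xlc` stanza (options D) that uses the system `xlc` stanza (options A B), which in turn uses DEFLT (options C), the function yields the string "D A B C".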


Customer Requirements Shipped In 2007

• Extended Support for -qversion option (SPXXL 3.9):

                    +--- noversion ---------+
                    |                       |
    >>-- -q version-+-----------------------+----><
                    |                       |
                    +--- = verbose ---------+

    - The default is -qnoversion. When -qversion is specified, the default is no suboption. If the option is specified more than once on the command line, the setting of the last one is taken.
    - The verbose suboption asks the compiler to display the build level of each compiler phase. The information may look like the following:

        XL Fortran Enterprise Edition for AIX, V11.1
        Version: 11.01.0000.0001
        Driver Version: 11.01(Fortran) Level: 060414
        Fortran Transformer Version: 11.01(Fortran) Level: 060419
        Fortran Front End Version: 11.01(Fortran) Level: 060420
        High Level Optimizer Version: 09.00(C/C++) and 11.01(Fortran) Level: 060411
        Low Level Optimizer Version: 09.00(C/C++) and 11.01(Fortran) Level: 060418

    - The -qsaveopt option will be extended to save the above information to the .o file.
    - Shipped in XL C/C++ V9.0 and XL Fortran V11.1
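The "last one wins" rule for repeated -qversion options can be sketched as a simple left-to-right scan. This is an illustrative model only: the enum and the function `parse_version_mode` are hypothetical names, and the real driver's option handling is not described in the slides.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical representation of the three possible settings. */
enum version_mode { VERSION_OFF, VERSION_BASIC, VERSION_VERBOSE };

/* Scan the arguments left to right; each matching option overwrites the
   previous setting, so the last one on the command line takes effect. */
enum version_mode parse_version_mode(int argc, const char *argv[])
{
    enum version_mode mode = VERSION_OFF;   /* default is -qnoversion */
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-qnoversion") == 0)
            mode = VERSION_OFF;
        else if (strcmp(argv[i], "-qversion") == 0)
            mode = VERSION_BASIC;           /* no suboption */
        else if (strcmp(argv[i], "-qversion=verbose") == 0)
            mode = VERSION_VERBOSE;
    }
    return mode;
}
```

With this model, `-qversion=verbose -qnoversion` resolves to the default (off), while `-qversion -qversion=verbose` resolves to verbose output.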


Customer Requirements Shipped In 2007

• The following features are shipped in XL C/C++ V9.0 and XL Fortran V11.1:
    - Provide Filename and Line Number in ALLOC/DEALLOC Failure (Fortran)
    - Provide Filename and Line Number in NAMELIST Failure (Fortran)
    - Little-Endian Data I/O Support (Fortran)
    - Thread Number in Standard Error output (Fortran)
    - Detect a thread's stack going beyond its limit (Fortran and C/C++). Implemented with -qsmp=stackcheck.
    - Initialize Allocatable Arrays with NaNs (Fortran). This will be done in AIX 6.1 via extension to malloc.
    - Improve performance of critical codes on BG/L
    - Support for -qtune=balanced for good performance on POWER6 without causing major degradation on POWER5
    - Compile time improvements for WRF (OpenMP with array section parms)
    - Improve reporting of automatic SIMDization (-qreport=hotlist)


New Compiler Options and Directives Shipped In 2007

• New suboptions to -qfloat:
    - -qfloat=fenv: asserts that FPSCR may be accessed (default is nofenv)
    - -qfloat=hscmplx: better performance for complex divide/abs (default is nohscmplx)
    - -qfloat=nosingle: does not generate single precision float operations (default is single)
    - -qfloat=norngchk: does not generate range check for software divide and (default is rngchk)
• -qoptdebug for debugging optimized code
• -qxlf90=nosignedzero now the default when -qnostrict (improves max/min perf)
• Builtin functions for new Power6 instructions:
    - dcbfl (local flush)
    - new dcbt variant (prefetch depth)
    - dcbst (store stream)
• Expected value directive for function arguments.
• All the above has been shipped in XL C/C++ V9.0 and XL Fortran V11.1


Feature Request

• Request for a feature to be supported by our compilers
• C/C++ feature request page: http://www.ibm.com/support/docview.wss?uid=swg27005811
• Fortran feature request page: http://www.ibm.com/support/docview.wss?uid=swg27005812
• Or send e-mail to [email protected]


Documentation

• An information center containing the documentation for the XL Fortran V10.1 and XL C/C++ V8.0 versions of the AIX compilers is available at: http://publib.boulder.ibm.com/infocenter/comphelp/v8v101/index.jsp
• An information center containing the documentation for the XL Fortran V11.1 and XL C/C++ V9.0 versions of the AIX compilers is available at: http://publib.boulder.ibm.com/infocenter/comphelp/v9v111/index.jsp
• Optimization and Tuning Guide for XLF V10.1 and XLF V11.1 is now available online at: http://publib.boulder.ibm.com/infocenter/comphelp/v9v111/index.jsp
• New whitepaper "Overview of the IBM XL C/C++ and XL Fortran Compiler Family" available at: http://www.ibm.com/support/docview.wss?uid=swg27005175
• This information center contains all the html documentation shipped with the compilers. It is completely searchable.
• Please send any comments or suggestions on this information center or about the existing C, C++ or Fortran documentation shipped with the products to [email protected].


SPEC2006 FP Comparison Between Power6, Itanium-2 And Core Duo

[Figure: bar chart of relative SPEC2006 FP performance per benchmark, vertical axis from -100% to 100%. Benchmarks: bwaves, gamess, milc, zeusmp, gromacs, cactusADM, leslie3d, namd, dealII, soplex, povray, calculix, gems, tonto, lbm, wrf, sphinx3, Overall. Series: P6 vs IT2 (6), P6 vs DUO (7).]

Using base options from spec.org

Overall 11% faster than DUO and 8% faster than IT2


SPEC2006 FP Comparison Between AIX and Linux on Power6

[Figure: bar chart of the AIX versus Linux difference in SPEC2006 FP performance per benchmark on Power6, vertical axis from -15% to 50%. Benchmarks: bwaves, gamess, milc, zeusmp, gromacs, cactusADM, leslie3d, namd, dealII, soplex, povray, calculix, gems, tonto, lbm, wrf, sphinx3, Overall. Series: AIXvsLinux.]

Using peak options from spec.org

Overall < 1%


BACKUP SLIDES


History Of Compiler Improvement On Power4

Note: SPEC2000 base options improvements from www.spec.org

Compilers    2001        2002      2003        2004      2005       Compound       CAGR
             V5/V7.1.1   V6/V8.1   V6/V8.1.1   V7/V9.1   V8/V10.1   Over 4 Years   Rate
SpecINT      baseline    21%       0%          3%        7%         34%            7.6%
SpecFLOAT    baseline    12%       5%          18%       5%         46%            9.9%
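The SpecFLOAT figures can be checked by hand, assuming each yearly percentage is the gain over the previous year's compiler (an assumption; the slide does not define the columns). Compounding the four yearly gains and then taking the fourth root gives the reported totals:

```latex
\begin{align*}
\text{Compound gain} &= (1.12)(1.05)(1.18)(1.05) - 1 \approx 1.457 - 1 \approx 46\% \\
\text{CAGR} &= 1.457^{1/4} - 1 \approx 1.099 - 1 \approx 9.9\%
\end{align*}
```

The same calculation on the rounded SpecINT figures lands within about a point of the reported values, which suggests the slide's totals were computed from unrounded yearly results.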


Installation of Multiple Compiler Versions

• Installation of multiple compiler versions is supported
• The vacppndi and xlfndi scripts shipped with VisualAge C++ 6.0 and XL Fortran 8.1 and all subsequent releases allow the installation of a given compiler release or update into a non-default directory
• The configuration file can be used to direct compilation to a specific version of the compiler
    - Example: xlf_v8r1 -c foo.f
    - May direct compilation to use components in a non-default directory
• Care must be taken when multiple runtimes are installed on the same machine (details on next slide)


Coexistence of Multiple Compiler Runtimes

• Backward compatibility
    - C, C++ and Fortran runtimes support backward compatibility.
    - Executables generated by an earlier release of a compiler will work with a later version of the run-time environment.
• Concurrent installation
    - Multiple versions of a compiler and runtime environment can be installed on the same machine
    - Full support in xlfndi and vacppndi scripts is now available
• Limited support for coexistence
    - LIBPATH must be used to ensure that a compatible runtime version is used with a given executable
    - Only one runtime version can be used in a given process.
    - Renaming a compiler library is not allowed.
    - Take care in statically linking compiler libraries or in the use of dlopen or load.
    - Details in the compiler FAQ:
      http://www.ibm.com/software/awdtools/fortran/xlfortran/support/
      http://www.ibm.com/software/awdtools/xlcpp/support/


A Unified Simdization Framework

[Figure: block diagram of the simdization framework, split into an architecture independent part and an architecture specific part. Stages and their boxes: Global information gathering (Pointer Analysis, Alignment Analysis); General Transformation for SIMD (Dependence Elimination, Data Layout Optimization); Simdization (Straightline-code Simdization, Loop-level Simdization); SIMD Intrinsic Generator. Other labeled boxes: Constant Propagation, Idiom Recognition, Diagnostic output. Targets shown: VMX, CELL, FPU.]


Blue Gene Compilers: Performance Results

Overall Improvement with -O5: V8/10.1 GA, PTF1, PTF2 vs. V7/9.1 Compilers

[Figure: bar chart, vertical axis from 0% to 50%. Benchmark groups: Spec2000FP, NAS 3.2 Serial, sPPM, ddcmd, uKernels. Series: V8/10.1 GA 440, V8/10.1 GA 440d, V8/10.1 PTF1 440, V8/10.1 PTF1 440d, V8/10.1 PTF2 440, V8/10.1 PTF2 440d.]

Note that NAS and ddCMD actually improved with 440d, but bars are smaller due to 440 improvements


Blue Gene Compilers: Performance Results

Overall Improvement with -O5: -qarch=440d vs. -qarch=440

[Figure: bar chart, vertical axis from -60% to 80%. Benchmark groups: Spec2000FP, NAS 3.2 Serial, sPPM, ddcmd, uKernels. Series: V7/9.1, V8/10.1 GA, V8/10.1 PTF1, V8/10.1 PTF2.]