32
7/30/2019 81320112 iNtel Fortran http://slidepdf.com/reader/full/81320112-intel-fortran 1/32 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials Document Number: 323650-001US World Wide Web: http://developer.intel.com Legal Information

81320112 iNtel Fortran

Embed Size (px)

Citation preview

Page 1: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 1/32

Intel ® Visual Fortran Composer XE 2011Getting Started Tutorials

Document Number: 323650-001US

World Wide Web: http://developer.intel.com

Legal Information

Page 2: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 2/32

Page 3: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 3/32

ContentsLegal Information......................... ..................... .................... ................5Introducing the Intel ® Visual Fortran Composer XE 2011 ......................7Prerequisites.........................................................................................9

Chapter 1: Navigation Quick StartGetting Started with the Intel ® Visual Fortran Composer XE 2011.....................11Starting the Intel ® Parallel Debugger Extension..............................................11

Chapter 2: Tutorial: Intel ® Fortran CompilerUsing Auto Vectorization.............................................................................13

Introduction to Auto-vectorization.......................................................13Establishing a Performance Baseline.....................................................14Generating a Vectorization Report........................................................16Improving Performance by Aligning Data..............................................17Improving Performance with Interprocedural Optimization......................19Additional Exercises...........................................................................19

Using Guided Auto-parallelization.................................................................20Introduction to Guided Auto-parallelization...........................................20Preparing the Project for Guided Auto-parallelization..............................20Running Guided Auto-parallelization.....................................................21Analyzing Guided Auto-parallelization Reports.......................................24Implementing Guided Auto-parallelization Recommendations..................25

Using Coarry Fortran..................................................................................28Introduction to Coarray Fortran...........................................................28Compiling the Sample Program...........................................................28Controlling the Number of Images.......................................................31

iii

Page 4: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 4/32

iv

Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 5: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 5/32

Legal InformationINFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL(R) PRODUCTS. NO LICENSE,EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTEDBY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS,INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY,RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TOFITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHTOR OTHER INTELLECTUAL PROPERTY RIGHT.UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDEDFOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHEREPERSONAL INJURY OR DEATH MAY OCCUR.Intel may make changes to specifications and product descriptions at any time, without notice. Designers mustnot rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intelreserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilitiesarising from future changes to them. The information here is subject to change without notice. Do not finalize adesign with this information.The products described in this document may contain design defects or errors known as errata which may causethe product to deviate from published specifications. Current characterized errata are available on request.Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing yourproduct order.Copies of documents which have an order number and are referenced in this document, or other Intel literature,may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within eachprocessor family, not across different processor families. See http://www.intel.com/products/processor_numberfor details.

Centrino, Cilk, Intel, Intel Atom, Intel Core, Intel NetBurst, Itanium, MMX, Pentium, Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

* Other names and brands may be claimed as the property of others.

Portions Copyright (C) 2001, Hewlett-Packard Development Company, L.P.

Microsoft product screen shot(s) reprinted with permission from Microsoft Corporation.

Copyright © 2011, Intel Corporation. All rights reserved.

5

Page 6: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 6/32

6

Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 7: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 7/32

Introducing the Intel ® Visual FortranComposer XE 2011This guide shows you how to start the Intel ® Visual Fortran Composer XE 2011 and begin debugging code usingthe Intel ® Parallel Debugger Extension. The Intel(R) Visual Fortran Composer XE 2011 is a comprehensive setof software development tools that includes the following components:

• Intel ® Fortran Compiler

• Intel®

Math Kernel Library• Intel ® Parallel Debugger Extension

Check http://software.intel.com/en-us/articles/intel-software-product-tutorials/ for the following:

• ShowMe video for using Intel ® Visual Fortran Composer XE with Microsoft Visual Studio*

7

Page 8: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 8/32

8

Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 9: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 9/32

PrerequisitesYou need the following tools, skills, and knowledge to effectively use these tutorials.

NOTE. Although the instructions and screen captures in these tutorials refer to the Visual Studio*2005 integrated development environment (IDE), you can use these tutorials with later versionsof Visual Studio.

Required ToolsYou need the following tools to use these tutorials:

• Microsoft Visual Studio 2005 or later.

• Intel ® Visual Fortran Composer XE 2011.

• Sample code included with the Intel ® Visual Fortran Composer XE 2011.

NOTE.

• Samples are non-deterministic. Your results may vary from the examples shown throughoutthese tutorials.

• Samples are designed only to illustrate features and do not represent best practices for creatingmultithreaded code.

Required Skills and Knowledge

These tutorials are designed for developers with a basic understanding of Microsoft Visual Studio, including howto:

• open a project/solution.

• access the Document Explorer . (valid in Microsoft Visual Studio 2005 /2008 )

• display the Solution Explorer .

• compile and link a project.• ensure a project compiled successfully.

9

Page 10: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 10/32

10

Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 11: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 11/32

1Navigation Quick Start

Getting Started with the Intel ® Visual Fortran Composer XE2011The Intel ® Visual Fortran Composer XE 2011 integrates into the following versions of the Microsoft Visual Studio*Integrated Development Environment (IDE):

• Microsoft Visual Studio 2010*

• Microsoft Visual Studio 2008*

• Microsoft Visual Studio 2005*

If you do not have one of these Microsoft products on your system, the Intel ® Visual Fortran Composer XE 2011installation can install Microsoft Visual Studio 2008 Shell and Libraries*.

To start the Intel ® Visual Fortran Compiler XE 12.1 from Microsoft Visual Studio* IDE, perform the followingsteps:

1. Launch Microsoft Visual Studio*.2. Select File > New > Project .

3. In the New Project window select a project type under Intel ® Visual Fortran .

4. Select the desired template.

5. Click OK .

Setting Compiler Options

1. Select Project > Properties . The Property Pages for your solution display.

2. Locate Fortran in the list and expand the heading.

3. Step through the available properties to select your configuration.

The results of the compilation display in the Output window.

Starting the Intel ® Parallel Debugger ExtensionThe Intel ® Parallel Debugger Extension for Microsoft Visual Studio* is a debugging add-on for the Intel ® Compiler'sparallel code development features. It facilitates developing parallelism into applications based on the Intel ®

OpenMP* runtime environment.

11

Page 12: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 12/32

Page 13: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 13/32

2Tutorial: Intel ® Fortran Compiler

Using Auto Vectorization

Introduction to Auto-vectorizationFor the Intel ® Fortran Compiler, vectorization is the unrolling of a loop combined with the generation of packedSIMD instructions. Because the packed instructions operate on more than one data element at a time, the loopcan execute more efficiently. It is sometimes referred to as auto-vectorization to emphasize that the compilerautomatically identifies and optimizes suitable loops on its own.

Using the -vec (Linux* OS) or the /Qvec (Windows* OS) option enables vectorization at default optimizationlevels for both Intel ® microprocessors and non-Intel microprocessors. Vectorization may call library routines thatcan result in additional performance gain on Intel microprocessors than on non-Intel microprocessors. Thevectorization can also be affected by certain options, such as /arch or /Qx or -m or -x (Linux and Mac OS X).

Vectorization is enabled with the Intel Fortran Compiler at optimization levels of /O2 and higher. Many loops arevectorized automatically, but in cases where this doesn't happen, you may be able to vectorize loops by makingsimple code modifications. In this tutorial, you will:

• establish a performance baseline

• generate a vectorization report

• improve performance by aligning data

• improve performance using Interprocedural Optimization

Locating the Samples

To begin this tutorial, open the vec_samples.zip archive in the product's Samples directory:<install-dir>\Samples\<locale>\Fortran\vec_samples.zip

Use these files for this tutorial:

• matrix_vector_multiplication_f.sln

• matrix_vector_multiplication_f.vcproj

• driver.f90

• matvec.f90

Open the Microsoft Visual Studio solution file, matrix_vector_multiplication_f.sln ,

13

Page 14: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 14/32

and follow the steps below to prepare the project for the vectorization exercises in this tutorial:

1. Change the Active solution configuration to Release using Build > Configuration Manager .

2. Clean the solution by selecting Build > Clean Solution .

Establishing a Performance Baseline

To set a performance baseline for the improvements that follow in this tutorial, build your project with thesesettings:

1. Select Project > Properties > Fortran > Optimization > Optimization > Minimum Size(/O1)

14

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 15: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 15/32

2. Select Project > Properties > Fortran > Data > Default Real KIND > 8(/real_size:64)

3. Rebuild the project, then run the executable ( Debug > Start Without Debugging ) and record the executiontime reported in the output. This is the baseline against which subsequent improvements will be measured.

15

Tutorial: Intel ® Fortran Compiler 2

Page 16: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 16/32

Generating a Vectorization ReportA vectorization report tells you whether the loops in your code were vectorized, and if not, explains why not.

Add the /Qvec-report1 option by selecting Project > Properties > Fortran > Diagnostics > VectorizerDiagnostic Level > Loops Successefully Vectorized(1)(/Qvec-report1) .

Because vectorization is off at /O1 , the compiler does not generate a vectorization report, so recompile at /O2(default optimization):

Select Fortran > Optimization > Optimization > Maximize Speed

Record the new execution time. The reduction in time is mostly due to auto-vectorization of the inner loop atline 32 noted in the vectorization report:

matvec.f90(32) (col. 3): remark: LOOP WAS VECTORIZED. matvec.f90(38) (col. 6): remark: LOOPWAS VECTORIZED. driver.f90(59) (col. 5): remark: LOOP WAS VECTORIZED. driver.f90(61) (col. 5):remark: LOOP WAS VECTORIZED. driver.f90(80) (col. 29): remark: LOOP WAS VECTORIZED.

The /Qvec-report2 option returns a list that also includes loops that were not vectorized, along with the reasonwhy the compiler did not vectorize them.

Change /Qvec-report1 to /Qvec-report2 .

16

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 17: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 17/32

Also, for Linker > Command Line > Additional Options , add /Qvec-report2 :

Rebuild your project.

The vectorization report indicates that the loop at line 33 in matvec.f90 did not vectorize because it is not theinnermost loop of the loop nest.

matvec.f90(32) (col. 3): remark: LOOP WAS VECTORIZED. matvec.f90(33) (col. 3): remark: loopwas not vectorized: not inner loop. matvec.f90(38) (col. 6): remark: LOOP WAS VECTORIZED.driver.f90(59) (col. 5): remark: loop was not vectorized: not inner loop. driver.f90(59) (col. 5):remark: loop was not vectorized: vectorization possible but seems inefficient. driver.f90(59) (col. 5):remark: loop was not vectorized: not inner loop. driver.f90(59) (col. 5): remark: loop was notvectorized: subscript too complex. driver.f90(59) (col. 5): remark: loop was not vectorized: not innerloop. driver.f90(59) (col. 5): remark: LOOP WAS VECTORIZED. driver.f90(61) (col. 5): remark: loopwas not vectorized: vectorization possible but seems inefficient. driver.f90(61) (col. 5): remark: LOOPWAS VECTORIZED. driver.f90(80) (col. 29): remark: LOOP WAS VECTORIZED. driver.f90(74) (col.7): remark: loop was not vectorized: nonstandard loop is not a vectorization candidate.

NOTE. For more information on the /Qvec-report compiler option, see the Compiler Optionssection in the Compiler User and Reference Guide.

Improving Performance by Aligning Data

The vectorizer can generate faster code when operating on aligned data. In this activity you will improve thevectorizer performance by aligning the arrays a , b , and c in driver.f90 on a 16-byte boundary so the vectorizercan use aligned load instructions for all arrays rather than the slower unaligned load instructions and can avoidruntime tests of alignment. Using the ALIGNED macro will insert an alignment directive for a , b , and c indriver.f90 with the following syntax:

!dir$attributes align : 16 :: a,b,c

This instructs the compiler to create arrays that it are aligned on a 16-byte boundary, which should facilitate theuse of SSE aligned load instructions.

17

Tutorial: Intel ® Fortran Compiler 2

Page 18: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 18/32

In addition, the column height of the matrix a needs to be padded out to be a multiple of 16 bytes, so that eachindividual column of a maintains the same 16-byte alignment. In practice, maintaining a constant alignmentbetween columns is much more important than aligning the start of the arrays.

To derive the maximum benefit from this alignment, we also need to tell the vectorizer it can safely assume thatthe arrays in matvec.f90 are aligned by using the directive

!dir$ vector aligned

NOTE. If you use !dir$ vector aligned , you must be sure that all the arrays or subarrays inthe loop are 16-byte aligned. Otherwise, you may get a runtime error. Aligning data may still givea performance benefit even if !dir$ vector aligned is not used. See the code under the ALIGNEDmacro in matvec.f90

If your compilation targets the Intel ® AVX instruction set, you should try to align data on a 32-byteboundary. This may result in improved performance. In this case, !dir$ vector aligned advises

the compiler that the data is 32-byte aligned.

Rebuild the program after adding the ALIGNED Preprocessor Definition to ensure consistently aligned data:

Fortran > Preprocessor > Preprocessor Definitions

Rebuild your project.

matvec.f90(32) (col. 3): remark: LOOP WAS VECTORIZED. matvec.f90(33) (col. 3): remark: loopwas not vectorized: not inner loop. matvec.f90(38) (col. 6): remark: LOOP WAS VECTORIZED.driver.f90(59) (col. 5): remark: loop was not vectorized: not inner loop. driver.f90(59) (col. 5):remark: loop was not vectorized: vectorization possible but seems inefficient. driver.f90(59) (col. 5):remark: loop was not vectorized: not inner loop. driver.f90(59) (col. 5): remark: loop was notvectorized: subscript too complex. driver.f90(59) (col. 5): remark: loop was not vectorized: not innerloop. driver.f90(59) (col. 5): remark: LOOP WAS VECTORIZED. driver.f90(61) (col. 5): remark: loopwas not vectorized: vectorization possible but seems inefficient. driver.f90(61) (col. 5): remark: LOOPWAS VECTORIZED. driver.f90(63) (col. 21): remark: loop was not vectorized: not inner loop.driver.f90(63) (col. 21): remark: LOOP WAS VECTORIZED. driver.f90(80) (col. 29): remark: LOOPWAS VECTORIZED. driver.f90(74) (col. 7): remark: loop was not vectorized: nonstandard loop is nota vectorization candidate.

18

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 19: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 19/32

Improving Performance with Interprocedural OptimizationThe compiler may be able to perform additional optimizations if it is able to optimize across source line boundaries.These may include, but are not limited to, function inlining. This is enabled with the /Qipo option.

Rebuild the program using the /Qipo option to enable interprocedural optimization.

Select Optimization > Interprocedural Optimization > Multi-file(/Qipo)

Note that the vectorization messages now appear at the point of inlining in driver.f90 (line 70).

driver.f90(59) (col. 5): remark: loop was not vectorized: not inner loop. driver.f90(59) (col. 5):remark: loop was not vectorized: vectorization possible but seems inefficient. driver.f90(59) (col. 5):remark: loop was not vectorized: not inner loop. driver.f90(59) (col. 5): remark: loop was notvectorized: subscript too complex. driver.f90(59) (col. 5): remark: LOOP WAS VECTORIZED.driver.f90(61) (col. 5): remark: loop was not vectorized: vectorization possible but seems inefficient.driver.f90(61) (col. 5): remark: LOOP WAS VECTORIZED. driver.f90(63) (col. 21): remark: loop wasnot vectorized: not inner loop. driver.f90(63) (col. 21): remark: LOOP WAS VECTORIZED. driver.f90(73)(col. 16): remark: loop was not vectorized: not inner loop. driver.f90(70) (col. 14): remark: LOOPWAS VECTORIZED. driver.f90(70) (col. 14): remark: loop was not vectorized: not inner loop.driver.f90(70) (col. 14): remark: LOOP WAS VECTORIZED. driver.f90(80) (col. 29): remark: LOOPWAS VECTORIZED.

Now, run the executable and record the execution time.

Additional ExercisesThe previous examples made use of double precision arrays. They may be built instead with single precisionarrays by changing the command-line option /real-size:64 to /real-size:32 The non-vectorized versionsof the loop execute only slightly faster the double precision version; however, the vectorized versions aresubstantially faster. This is because a packed SIMD instruction operating on a 16-byte vector register operateson four single precision data elements at once instead of two double precision data elements.

19

Tutorial: Intel ® Fortran Compiler 2

Page 20: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 20/32

NOTE. In the example with data alignment, you will need to set ROWBUF=3 to ensure 16-bytealignment for each row of the matrix a . Otherwise, the directive !dir$ vector aligned will causethe program to fail.

This completes the tutorial for auto-vectorization, where you have seen how the compiler can optimize performancewith various vectorization techniques.

Using Guided Auto-parallelization

Introduction to Guided Auto-parallelizationGuided Auto-parallelization (GAP) is a feature of the Intel ® Fortran Compiler that offers selective advice and,when correctly applied, results in auto-vectorization or auto-parallelization for serially-coded applications. Using

the /Qguide option with your normal compiler options at /O2 or higher is sufficient to enable the GAP technologyto generate the advice for auto-vectorization. Using /Qguide in conjunction with /Qparallel will enable thecompiler to generate advice for auto-parallelization.

In this tutorial, you will:

1. prepare the project for Guided Auto-parallelization.

2. run Guided Auto-parallelization.

3. analyze Guided Auto-parallelization reports.

4. implement Guided Auto-parallelization recommendations.

Preparing the Project for Guided Auto-parallelizationTo begin this tutorial, open the GuidedAutoParallel.zip archive located in the product's Samples directorylocated at:\Samples\<Locale>\Fortran\The following Visual Studio* 2005 project files and source files are included:

• GAP-f.sln

• GAP-f.vfproj

• main.f90

• scalar_dep.f90

Open the Microsoft Visual Studio Solution file, GAP-f.sln ,

20

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 21: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 21/32

and follow the steps below to prepare the project for Guided Auto-parallelization (GAP).

1. Clean the Solution by selecting Build > Clean Solution .

2. Since GAP is enabled only with option /O2 or higher, you will need to change the build configuration to Releaseusing Build > Configuration Manager .

Running Guided Auto-parallelization

There are several ways to run GAP analysis in Visual Studio, depending on whether you want analysis for thewhole solution, the project, a single file, a function, or a range of lines in your source code. In this tutorial, wewill use single-file analysis. Follow the steps below to run a single-file analysis on scalar_dep.f90 in the GAP-f project:

1. In the GAP-f project, right-click on scalar_dep.f90 .

2. Select Intel Visual Fortran Composer XE > Guided Auto Parallelism > Run Analysis on file"scalar_dep.f90"

3. If the /Qipo option is enabled, the Analysis with Multi-file optimization dialog appears. Click Run Analysis .

4. On the Configure Analysis dialog, click Run Analysis using the choices shown here:

21

Tutorial: Intel ® Fortran Compiler 2

Page 22: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 22/32

NOTE. If you select Send remarks to a file , GAP messages will not be available in the Outputwindow or Error List window.

See the GAP Report in the Output window. GAP reports in the standard Output window are encapsulated withGAP REPORT LOG OPENED and END OF GAP REPORT LOG .

22

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 23: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 23/32

Also, see the GAP Messages in the Error List window:

23

Tutorial: Intel ® Fortran Compiler 2

Page 24: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 24/32

Analyzing Guided Auto-parallelization Reports

Analyze the output generated by GAP analysis and determine whether or not the specific suggestions areappropriate for the specified source code. For this sample tutorial, GAP generates output for the loop inscalar_dep.f90 :

do i = 1, nif (a(i) >= 0) then

t = iend ifif (a(i) > 0) then

a(i) = t * (1 / (a(i) * a(i)))end if

end do

In this example, the GAP Report generates a recommendation (remark #30761) to add the /Qparallel option

to improve auto-parallelization. Remark #30515 indicates if variable t can be unconditionally assigned, thecompiler will be able to vectorize the loop.

24

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 25: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 25/32

Implementing Guided Auto-parallelization Recommendations

The GAP Report in this example recommends using the /Qparallel option to enable parallelization. Follow thesesteps to enable this option:

1. Right-click on the GAP-f project and select Properties

2. On the Property Pages dialog, expand the Fortran heading and select Optimization .

3. In the right-hand pane under, select Parallelization , then choose Yes (/Qparallel) and click OK .

Now, run the GAP Analysis again and review the GAP Report:

25

Tutorial: Intel ® Fortran Compiler 2

Page 26: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 26/32

Apply the necessary changes after verifying that the GAP recommendations are appropriate and do not changethe semantics of the program.

For this loop, the conditional compilation enables parallelization and vectorization of the loop as recommendedby GAP:

do i = 1, n!dir$ if defined(test_gap)

t = i!dir$else

if (a(i) >= 0) thent = i

end if!dir$ endif

if (a(i) > 0) then

a(i) = t * (1 / (a(i) * a(i)))end ifend do

To verify that the loop is parallelized and vectorized:

1. Add the options /Qdiag-enable:par /Qdiag-enable:vec to the Command Line > Additional Optionsdialog.

26

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 27: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 27/32

2. Add the preprocessor definition test_gap to compile the appropriate code path.

3. Rebuild the GAP-f project and note the reports in the output window:

27

Tutorial: Intel ® Fortran Compiler 2

Page 28: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 28/32

For more information on using the -guide , -vec-report , and -par-report compiler options, see the CompilerOptions section in the Compiler User Guide and Reference.

This completes the tutorial for Guided Auto-parallelization, where you have seen how the compiler can guideyou to an optimized solution through auto-parallelization.

Using Coarry Fortran

Introduction to Coarray FortranThe Intel ® Fortran Compiler XE supports parallel programming using coarrays as defined in the Fortran 2008standard. As an extension to the Fortran language, coarrays offer one method to use Fortran as a robust andefficient parallel programming language. Coarray Fortran uses a single-program, multi-data programming model(SPMD).

Coarrays are supported in the Intel ® Fortran Compiler XE for Linux* and Intel ® Visual Fortran Compiler XE forWindows*.

This tutorial demonstrates how to compile a simple coarray Fortran application using the Intel Fortran CompilerXE, and how to control the number of images (processes) for the application.

Locating the Sample

To begin this tutorial, locate the source file in the product's Samples directory:<install-dir>\Samples\<locale>\Fortran\coarray_samples.zip

Extract the Visual Studio project files from the .zip archive to a working directory:

• coarray_samples.sln

• coarray_samples.vfproj

• hello_image.f90

NOTE. The Intel Fortran Compiler implementation of coarrays follows the standard provided in adraft version of the Fortran 2008 Standard. Not all features present in the Fortran 2008 Standardmay be implemented by Intel. Consult the Release Notes for a list of supported features.

Compiling the Sample ProgramThe hello_image.f90 sample is a hello world program. Unlike the usual hello world, this coarray Fortran program

will spawn multiple images, or processes, that will run concurrently on the host computer. Examining the sourcecode for this application shows a simple Fortran program:

program hello_image

write(*,*) "Hello from image ", this_image(), &

28

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 29: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 29/32

"out of ", num_images()," total images"

end program hello_image

Note the function calls to this_image() and num_images( ). These are new Fortran 2008 intrinsic functions.The num_images() function returns the total number of images or processes spawned for this program. Thethis_image() function returns a unique identifier for each image in the range 1 to N, where N is the total numberof images created for this program.

After installing the Intel ® Visual Fortran Composer XE 2011, start Microsoft Visual Studio* and open thecoarray_samples.sln file.

To build the project using coarrays, select:Project > Properties > Fortran > Command Line > /Qcoarray

29

Tutorial: Intel ® Fortran Compiler 2

Page 30: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 30/32

Now, build the solution ( Build > Build Solution ), then run the executable ( Debug > Start Without Debugging ).Your output should be similar to this:

Hello from image 1 out of 8 total images Hello from image 6 out of 8 total images Hello from image7 out of 8 total images Hello from image 2 out of 8 total images Hello from image 5 out of 8 totalimages Hello from image 8 out of 8 total images Hello from image 3 out of 8 total images Hello fromimage 4 out of 8 total images

By default, when a Coarray Fortran application is compiled with the Intel Fortran Compiler, the invocation createsas many images as there are processor cores on the host platform. The example shown above was run on a dualquad-core host system with eight total cores. As shown, each image is a separately spawned process on thesystem and executes asynchronously.

NOTE. The /Qcoarray option cannot be used in conjunction with /Qopenmp options. One cannotmix Coarray Fortran language extensions with OpenMP extensions.

30

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials

Page 31: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 31/32

Controlling the Number of ImagesThere are two methods to control the number of images created for a Coarray Fortran application. First, you canuse the /Qcoarray-num-images=N compiler option to compile the application, where N is the number of images.This option sets the number of images created for the application during run time. For example, use the/Qcoarray-num-images=2 option to the limit the number of images of the hello_image.f90 program to exactlytwo:

To use the /Qcoarray-num-images=N option, select:Project > Properties > Fortran > Command Line > /Qcoarray-num-images=N

In this example, we use /Qcoarray-num-images=2 to generate the following output:

Hello from image 2 out of 2 total images Hello from image 1 out of 2 total images

The second way to control the number of images is to use the environment variable FOR_COARRAY_NUM_IMAGES ,setting this to the number of images you want to spawn.

As an example, recompile hello_image.f90 without the /Qcoarray-num-images option. Before running theexecutable, set the environment variable FOR_COARRAY_NUM_IMAGES to the number of images you want createdduring the program run.

31

Tutorial: Intel ® Fortran Compiler 2

Page 32: 81320112 iNtel Fortran

7/30/2019 81320112 iNtel Fortran

http://slidepdf.com/reader/full/81320112-intel-fortran 32/32

To set an environment variable in Visual Studio, select Project Properties > Configuration Properties >Debugging > Environment . Then set FOR_COARRAY_NUM_IMAGES=N where N is the number of images youwant to create at runtime.

Hello from image 3 out of 3 total images Hello from image 2 out of 3 total images Hello from image1 out of 3 total images

NOTE. Setting FOR_COARRAY_NUM_IMAGES=N overrides the /Qcoarray_num_images compileroption.

2 Intel ® Visual Fortran Composer XE 2011 Getting Started Tutorials