• OpenMP for Fortran





    • OpenMP Directive
    • Syntax of OpenMP compiler directive for Fortran:
       !$OMP  DirectiveName Optional_CLAUSES...
         ...
         ... Program statements between the !$OMP lines
         ... are executed in parallel by all threads      
         ...
       !$OMP  END DirectiveName 
      
    • Program statements between the two !$OMP lines are executed by multiple threads
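      Because every directive starts with "!", a compiler without OpenMP support reads it as an ordinary comment. The sketch below relies on the related "!$" conditional-compilation sentinel from the OpenMP specification; the program name sentinel_demo and the variable nthreads are illustrative choices, not part of the original notes.

```fortran
program sentinel_demo
   !$ use omp_lib            ! compiled only when OpenMP is enabled
   implicit none
   integer :: nthreads

   nthreads = 1              ! value used by a build without OpenMP
   !$ nthreads = omp_get_max_threads()
   print *, "Threads available: ", nthreads
end program sentinel_demo
```

      Built without the OpenMP flag, the two "!$" lines are skipped as comments and the program reports 1.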


    • Setting the level of parallelism in OpenMP programs
    • The number of threads that will be created to execute parallel sections in an OpenMP program is controlled by the environment variable OMP_NUM_THREADS
    • To set this environment variable use:
        export OMP_NUM_THREADS=...            
      
      Example:
      
        export OMP_NUM_THREADS=8
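      The environment variable sets the default team size. A single parallel section can also request its own size with the NUM_THREADS clause, as in this minimal sketch. The program name team_size_demo and the variable team are hypothetical, and the runtime is allowed to provide fewer threads than requested.

```fortran
program team_size_demo
   use omp_lib
   implicit none
   integer :: team

   team = 0
   !$omp parallel num_threads(2)     ! request 2 threads for this section only
   !$omp single
   team = omp_get_num_threads()      ! recorded by exactly one thread
   !$omp end single
   !$omp end parallel
   print *, "Team size inside the section: ", team
end program team_size_demo
```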
      


    • Compiling OpenMP programs
      • Fortran
        • Compile:
            f90 -O -c -xopenmp -stackvar Prog.f90    
          
        • Link:
            f90 -O -o Executable -xopenmp -stackvar Prog1.o Prog2.o ....
          


    • Introductory Example
      • Parallel "Hello World" OpenMP program:
           PROGRAM  Main
        
           !$OMP PARALLEL
        
           print *, "Hello World !"                 
        
           !$OMP END PARALLEL
        
           END
        

      • Example Program: (Demo above code)                                                
      • Compile with:
            f90 -O -xopenmp -stackvar openMP01.f90
      • Run with:
        • export OMP_NUM_THREADS=8
        • a.out

        Make sure you do it on compute.

        You will see "Hello World !" printed eight times. (Remove the !$OMP lines and it is printed only once.)



    • Defining shared and private (non-shared) variables in parallel section
    • Recall:
      • Fortran traditionally has no block scopes in which a thread-local variable could be declared

      Instead, Fortran uses clauses (option keywords) on the directive to declare variables private (non-shared) or shared.


    • Defining shared and private variables in a PARALLEL section
      • A variable is by default shared among all threads
      • A private variable in a PARALLEL section must be specified using the PRIVATE clause

    • Fortran example of SHARED variable:
         PROGRAM  Main
         IMPLICIT NONE
      
         integer :: N         ! Shared
      
         N = 1001
         print *, "Before parallel section: N = ", N            
      
         !$OMP PARALLEL
         N = N + 1
         print *, "Inside parallel section: N = ", N
         !$OMP END PARALLEL
      
         print *, "After parallel section: N = ", N
         END
      

    • Example Program: (Demo above code)                        
      • Prog file: (Shared variable in OpenMP)
    • Compile with:
          f90 -O -xopenmp -stackvar openMP02a.f90
    • Run a few times with:
      • export OMP_NUM_THREADS=8
      • a.out

      The value of N at the end is not always 1009; it can be less. The threads read and update the shared N without synchronization (a race condition), so some increments are lost.
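      One way to make the shared update safe is the ATOMIC directive, which applies to the single update statement that follows it. This is a minimal sketch of that option, not the fix used in the original notes; the program name shared_atomic is illustrative.

```fortran
program shared_atomic
   implicit none
   integer :: n                      ! shared by default

   n = 1001
   !$omp parallel
   !$omp atomic
   n = n + 1                         ! performed as one indivisible update
   !$omp end parallel
   print *, "After parallel section: N = ", n
end program shared_atomic
```

      With a team of 8 threads every increment is now counted, so the final value is 1001 plus the team size.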



    • Fortran example of NON-SHARED (private) variable:
         PROGRAM  Main
         IMPLICIT NONE
      
         integer :: N         ! Program variable; made PRIVATE below
      
         N = 1001
         print *, "Before parallel section: N = ", N
      
         !$OMP PARALLEL PRIVATE(N)
         N = N + 1
         print *, "Inside parallel section: N = ", N
         !$OMP END PARALLEL
      
         print *, "After parallel section: N = ", N
         END
      

    • Example Program: (Demo above code)                        
      • Prog file: (Private variable in OpenMP)
    • Compile with:
          f90 -O -xopenmp -stackvar openMP02b.f90
    • Run a few times with:
      • export OMP_NUM_THREADS=8
      • a.out

    • Output:
          Before parallel section: N =  1001            
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          Inside parallel section: N =  1
          After parallel section: N =  1001
      

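      Each thread has its own copy of the variable N.

      This private copy is different from the "program" variable N defined in the main program, which keeps its value 1001. The private copy is not initialized on entry to the parallel section, so the value 1 shown above (an initial 0 plus 1) is not guaranteed; use FIRSTPRIVATE(N) to start each copy at the program variable's value.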
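      The difference between PRIVATE and FIRSTPRIVATE can be sketched as follows. The program name firstprivate_demo is illustrative; FIRSTPRIVATE itself is a standard OpenMP clause.

```fortran
program firstprivate_demo
   implicit none
   integer :: n

   n = 1001
   !$omp parallel firstprivate(n)    ! each private copy starts at 1001
   n = n + 1
   print *, "Inside parallel section: N = ", n
   !$omp end parallel
   print *, "After parallel section: N = ", n
end program firstprivate_demo
```

      Every thread now prints 1002, and the program variable still holds 1001 afterwards because only the private copies were incremented.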



    • OpenMP Support Functions
    • Most useful support functions in OpenMP:
      Function name                          Effect
      CALL omp_set_num_threads(nthreads)     Set size of thread team (nthreads is an INTEGER)
      INTEGER omp_get_num_threads()          Return size of the current thread team
      INTEGER omp_get_max_threads()          Return max size of thread team (typically equal to the number of processors unless OMP_NUM_THREADS is set)
      INTEGER omp_get_thread_num()           Return thread ID of the thread that calls this function
      INTEGER omp_get_num_procs()            Return number of processors
      LOGICAL omp_in_parallel()              Return TRUE if currently in a PARALLEL section
    • Here is a simple OMP program in Fortran:
         PROGRAM  Main
         IMPLICIT NONE
      
         INTEGER :: nthreads, myid
         INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
      
      
         !$OMP PARALLEL private(nthreads, myid)
      
      
         myid = OMP_GET_THREAD_NUM()
      
         print *, "Hello I am thread ", myid
      
         if (myid == 0) then
            nthreads = OMP_GET_NUM_THREADS()
            print *, "Number of threads = ", nthreads
         end if
      
         !$OMP END PARALLEL
      
         END
      
    • Example Program: (OpenMP Fortran program)
    • Compile using the following command:
          f90 -O -xopenmp -stackvar hello.f90
    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out

    • Output:
        Hello I am thread  7
        Hello I am thread  5
        Hello I am thread  1
        Hello I am thread  0
        Hello I am thread  2
        Number of threads =  8
        Hello I am thread  4
        Hello I am thread  3
        Hello I am thread  6
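      As an alternative to declaring each function with INTEGER, EXTERNAL, the standard omp_lib module supplies the interfaces for all support functions. The sketch below queries several of them inside and outside a parallel section; the program name support_demo is illustrative.

```fortran
program support_demo
   use omp_lib                       ! interfaces for the OpenMP support functions
   implicit none

   print *, "Processors:          ", omp_get_num_procs()
   print *, "Max threads:         ", omp_get_max_threads()
   print *, "In parallel (before):", omp_in_parallel()

   !$omp parallel
   !$omp single
   print *, "Team size:           ", omp_get_num_threads()
   print *, "In parallel (inside):", omp_in_parallel()
   !$omp end single
   !$omp end parallel
end program support_demo
```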
      


    • Caveat with Fortran
      • Recall:
        • Array indices in Fortran by default start with 1 (ONE)
      • Observed from "Hello" program:
        • Thread IDs start with 0 (ZERO)
      • Caveat:
        • Use ThreadID+1 as index to an array in Fortran !!!
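      Another option, sketched here as an alternative to ThreadID+1, is to declare the array with a lower bound of 0 so that the thread ID can be used as the index directly. The names zero_based, slot and nmax are hypothetical.

```fortran
program zero_based
   use omp_lib
   implicit none
   integer, allocatable :: slot(:)
   integer :: nmax, id

   nmax = omp_get_max_threads()      ! upper bound on the team size
   allocate(slot(0:nmax-1))          ! indices 0 .. nmax-1 match the thread IDs
   slot = -1

   !$omp parallel private(id)
   id = omp_get_thread_num()
   slot(id) = id                     ! no +1 needed
   !$omp end parallel

   print *, slot
end program zero_based
```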


    • Example OpenMP Program: Find minimum in an array
      • A sequential version of this program is available in C++
      • We will write this program using OpenMP in Fortran
      • Parallel Find Min program in Fortran:
          PROGRAM Min
           IMPLICIT NONE
        
           INTEGER, PARAMETER :: MAX = 10000000
        
           DOUBLE PRECISION, DIMENSION(MAX) :: x
           DOUBLE PRECISION, DIMENSION(10)  :: my_min
           DOUBLE PRECISION :: rmin
        
           INTEGER :: num_threads
           INTEGER :: i, n
           INTEGER :: id, start, stop
        
           ! ===========================================================
           ! Declare the OpenMP functions
           ! ===========================================================     
           INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
        
        
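          ! ===================================
          ! Fill x with test data (an assumption for this demo:
          ! the listing does not show how x is initialized)
          ! ===================================
           CALL RANDOM_NUMBER(x)
        
          ! ===================================
          ! Parallel section: Find local minima
          ! (my_min has 10 slots: use at most 10 threads)
          ! ===================================
        !$OMP  PARALLEL  PRIVATE(i, id, start, stop, n)
        
           ! num_threads is left SHARED so that it is still defined after
           ! the parallel section. One thread sets it; the implicit barrier
           ! at END SINGLE makes the value visible to every thread.
        !$OMP  SINGLE
           num_threads = omp_get_num_threads()
        !$OMP  END SINGLE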
           n = MAX/num_threads
        
           id = omp_get_thread_num()
        
           ! ----------------------------------
           ! Find my own starting index
           ! ----------------------------------
           start = id * n + 1          !! Array start at 1
        
           ! ----------------------------------
           ! Find my own stopping index
           ! ----------------------------------
           if ( id /= (num_threads-1) ) then
              stop = start + n - 1
           else
              stop = MAX
           end if
        
           ! ----------------------------------
           ! Find my own min
           ! ----------------------------------
           my_min(id+1) = x(start)
        
           DO i = start+1, stop
              IF ( x(i) < my_min(id+1) ) THEN
                 my_min(id+1) = x(i)
              END IF
           END DO
        
        !$OMP END PARALLEL
        
        
          ! ===================================
          ! Find min over the local minima
          ! ===================================
           rmin = my_min(1)
        
           DO i = 2, num_threads
              IF ( my_min(i) < rmin ) THEN
                 rmin = my_min(i)
              END IF
           END DO
        
           print *, "min = ", rmin
           END PROGRAM
        
      • Example Program: (Demo above code)
      • Compile with:
            f90 -O -xopenmp -stackvar min-mt1.f90
      • Run with:
        • export OMP_NUM_THREADS=8
        • a.out
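      The index arithmetic of the Find Min program can be checked in isolation with a small sequential sketch. The values total = 10 and num_threads = 3, and the names partition_demo and last, are illustrative only.

```fortran
program partition_demo
   implicit none
   integer, parameter :: total = 10, num_threads = 3   ! illustrative sizes
   integer :: id, n, start, last

   n = total / num_threads           ! elements per thread, rounded down
   do id = 0, num_threads - 1
      start = id * n + 1             ! Fortran arrays start at 1
      if (id /= num_threads - 1) then
         last = start + n - 1
      else
         last = total                ! last thread also takes the remainder
      end if
      print *, "thread", id, "handles", start, "to", last
   end do
end program partition_demo
```

      For these sizes the three ranges are 1 to 3, 4 to 6 and 7 to 10, so every element is covered exactly once.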


    • Mutual exclusion synchronization primitives
    • Mutual exclusion is achieved in OpenMP for Fortran using the following directive:
      
         !$OMP CRITICAL
      
             ... statements are guaranteed to be executed
             ... by ONE thread at any one time
      
      
         !$OMP END CRITICAL
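      A minimal sketch of a CRITICAL section protecting a shared sum is shown below. Each thread adds its own ID, so the result can be checked against nt*(nt-1)/2. The names critical_demo, total and nt are hypothetical.

```fortran
program critical_demo
   use omp_lib
   implicit none
   integer :: total, nt, id

   total = 0
   nt = 1
   !$omp parallel private(id)
   id = omp_get_thread_num()

   !$omp single
   nt = omp_get_num_threads()        ! recorded once for the check below
   !$omp end single

   !$omp critical
   total = total + id                ! one thread at a time
   !$omp end critical
   !$omp end parallel

   print *, "Sum of thread IDs:", total, " expected:", nt*(nt-1)/2
end program critical_demo
```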
      


    • Example OpenMP program with synchronization: compute Pi
    • Example:
        PROGRAM Compute_PI
         IMPLICIT NONE
      
      
         INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS     
      
         INTEGER           N, i
         INTEGER           id, num_threads
         DOUBLE PRECISION  w, x, sum
         DOUBLE PRECISION  pi, mypi
      
      
         N = 50000000         !! Number of intervals
         w = 1.0d0/N          !! width of each interval
      
         pi = 0.0d0           !! Shared accumulator, updated in the CRITICAL section
      
      !$OMP    PARALLEL PRIVATE(i, id, num_threads, x, mypi)
      
         num_threads = omp_get_num_threads()
         id = omp_get_thread_num()
      
         mypi = 0.0d0
      
         DO i = id,   N-1,   num_threads
           x = w * (i + 0.5d0)
           mypi = mypi + w*f(x)
         END DO
      
      
      !$OMP CRITICAL
         pi = pi + mypi
      !$OMP END CRITICAL
      
      
      !$OMP    END PARALLEL
      
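         PRINT *, "Pi = ", pi
      
         CONTAINS
      
         !! The listing does not show f. Assumed here: f(a) = 4/(1+a*a),
         !! whose integral over [0,1] equals Pi.
         DOUBLE PRECISION FUNCTION f(a)
            DOUBLE PRECISION, INTENT(IN) :: a
            f = 4.0d0 / (1.0d0 + a*a)
         END FUNCTION f
      
         END PROGRAM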
      
      
    • Example Program: (OpenMP compute Pi)
    • Compile with:
          f90 -O -xopenmp -stackvar openMP_compute_pi2.f90
    • Run a few times with:
      • export OMP_NUM_THREADS=8
      • a.out



    • Parallel For Loop in OpenMP

      The division of labor (splitting the iterations of a loop among the threads) can be done in OpenMP through a special Parallel LOOP construct.

    • A Parallel Loop construct MUST appear within a Parallel region of the program !
    • The syntax of a Parallel LOOP construct in Fortran is:
      
         !$OMP    DO
      
            DO  index = ....
                ....            ! Division of labor is taken care of
                                ! by the Fortran compiler
            END DO
      
         !$OMP    END DO
      
    • The meaning of this Parallel LOOP construct is to distribute the iterations of the do-loop (the Fortran counterpart of a for-loop) among the threads.

      Each iteration of the loop is executed exactly once, by exactly one of the threads.

      The loop variable used in the Parallel LOOP construct is by default PRIVATE (other variables are still by default SHARED)
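      A minimal sketch of the construct, filling an array of squares: each element is written by exactly one thread, so no synchronization is needed. The names do_demo and sq are illustrative.

```fortran
program do_demo
   implicit none
   integer, parameter :: n = 16
   integer :: i, sq(n)

   !$omp parallel
   !$omp do
   do i = 1, n                       ! iterations are divided among the threads
      sq(i) = i * i                  ! i is private, sq is shared
   end do
   !$omp end do
   !$omp end parallel

   print *, sq
end program do_demo
```

      When the parallel section contains nothing but the loop, the two directives can be merged into the combined form !$OMP PARALLEL DO ... !$OMP END PARALLEL DO.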


    • Example: compute Pi with parallel DO loop
        PROGRAM Compute_PI
         IMPLICIT NONE
      
         INTEGER           N, i, num_threads
         DOUBLE PRECISION  w, x, sum
         DOUBLE PRECISION  pi, mypi
      
      
         N = 50000000         !! Number of intervals
         w = 1.0d0/N          !! width of each interval
      
         pi = 0.0d0           !! Shared accumulator, updated in the CRITICAL section
      
      !$OMP    PARALLEL PRIVATE(x, mypi)
      
         mypi = 0.0d0
      
      !$OMP    DO
         DO i = 0, N-1                !! Parallel Loop
           x = w * (i + 0.5d0)
           mypi = mypi + w*f(x)
         END DO
      !$OMP    END DO
      
      
      !$OMP CRITICAL
         pi = pi + mypi
      !$OMP END CRITICAL
      
      
      !$OMP    END PARALLEL
      
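         PRINT *, "Pi = ", pi
      
         CONTAINS
      
         !! The listing does not show f. Assumed here: f(a) = 4/(1+a*a),
         !! whose integral over [0,1] equals Pi.
         DOUBLE PRECISION FUNCTION f(a)
            DOUBLE PRECISION, INTENT(IN) :: a
            f = 4.0d0 / (1.0d0 + a*a)
         END FUNCTION f
      
         END PROGRAM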
      
      
    • Example Program: (OpenMP compute Pi)
    • Compile with:
          f90 -O -xopenmp -stackvar openMP_compute_pi3.f90
    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out
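      The pattern of a private partial sum followed by a CRITICAL update can also be written with the REDUCTION clause, which makes OpenMP create and combine the per-thread copies itself. This is a sketch of that alternative, not the version from the notes. The program name pi_reduction is illustrative, and the integrand f(a) = 4/(1+a*a) is an assumption because the original listing does not show f.

```fortran
program pi_reduction
   implicit none
   integer, parameter :: n = 50000000          ! number of intervals
   integer :: i
   double precision :: w, x, pi

   w = 1.0d0 / n                               ! width of each interval
   pi = 0.0d0

   !$omp parallel do private(x) reduction(+:pi)
   do i = 0, n - 1
      x = w * (i + 0.5d0)                      ! midpoint of interval i
      pi = pi + w * f(x)                       ! added to this thread's copy of pi
   end do
   !$omp end parallel do

   print *, "Pi = ", pi

contains

   ! Assumed integrand: its integral over [0,1] equals Pi
   double precision function f(a)
      double precision, intent(in) :: a
      f = 4.0d0 / (1.0d0 + a*a)
   end function f

end program pi_reduction
```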


    • Final Notes
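    • The stack size of each thread can be controlled by setting another environment variable (csh syntax shown; in bash use export STACKSIZE=nBytes). Since OpenMP 3.0 the portable equivalent is OMP_STACKSIZE:
        setenv   STACKSIZE    nBytes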
      
    • For more information on OpenMP, see: http://www.openmp.org





  • Original article: https://www.cnblogs.com/China3S/p/3500515.html