MLIR

Multi-Level IR Compiler Framework

'omp' Dialect

Operations 

omp.atomic.capture (omp::AtomicCaptureOp) 

Performs an atomic capture

Syntax:

operation ::= `omp.atomic.capture` oilist(`memory_order` `(` custom<ClauseAttr>($memory_order_val) `)`
              |`hint` `(` custom<SynchronizationHint>($hint_val) `)`)
              $region attr-dict

This operation performs an atomic capture.

hint is the value of hint (as used in the hint clause). It is a compile time constant. As the name suggests, this is just a hint for optimization.

memory_order indicates the memory ordering behavior of the construct. It can be one of seq_cst, acq_rel, release, acquire or relaxed.

The region has the following allowed forms:

  omp.atomic.capture {
    omp.atomic.update ...
    omp.atomic.read ...
    omp.terminator
  }

  omp.atomic.capture {
    omp.atomic.read ...
    omp.atomic.update ...
    omp.terminator
  }

  omp.atomic.capture {
    omp.atomic.read ...
    omp.atomic.write ...
    omp.terminator
  }
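As a sketch of the first allowed form (update followed by read), the fragment below captures the value of x after adding to it. The function name, the SSA names and the choice of `memref<i32>` are assumptions made for illustration; only the op syntax comes from the grammars on this page.

```mlir
// Hypothetical example: atomically add %expr to the value at %x and
// capture the updated value into %v.
func.func @capture_example(%v : memref<i32>, %x : memref<i32>, %expr : i32) {
  omp.atomic.capture memory_order(seq_cst) {
    omp.atomic.update %x : memref<i32> {
    ^bb0(%xval : i32):
      %newval = arith.addi %xval, %expr : i32
      omp.yield(%newval : i32)
    }
    omp.atomic.read %v = %x : memref<i32>, i32
    omp.terminator
  }
  return
}
```

Here the memory_order clause is placed on the enclosing omp.atomic.capture rather than on the nested operations.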

Traits: RecursiveMemoryEffects, SingleBlockImplicitTerminator<TerminatorOp>, SingleBlock

Interfaces: AtomicCaptureOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| hint_val | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| memory_order_val | ::mlir::omp::ClauseMemoryOrderKindAttr | MemoryOrderKind Clause |

Enum cases:

  • seq_cst (Seq_cst)
  • acq_rel (Acq_rel)
  • acquire (Acquire)
  • release (Release)
  • relaxed (Relaxed)

omp.atomic.read (omp::AtomicReadOp) 

Performs an atomic read

Syntax:

operation ::= `omp.atomic.read` $v `=` $x
              oilist( `memory_order` `(` custom<ClauseAttr>($memory_order_val) `)`
              | `hint` `(` custom<SynchronizationHint>($hint_val) `)`)
              `:` type($x) `,` $element_type attr-dict

This operation performs an atomic read.

The operand x is the address from where the value is atomically read. The operand v is the address where the value is stored after reading.

hint is the value of hint (as specified in the hint clause). It is a compile time constant. As the name suggests, this is just a hint for optimization.

memory_order indicates the memory ordering behavior of the construct. It can be one of seq_cst, acquire or relaxed.
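A minimal sketch of an atomic read with both optional clauses, following the Syntax block above. The function and value names and the `memref<i32>` type are illustrative assumptions.

```mlir
// Hypothetical example: atomically read the i32 at %x and store it to %v.
func.func @atomic_read_example(%x : memref<i32>, %v : memref<i32>) {
  omp.atomic.read %v = %x memory_order(acquire) hint(contended) : memref<i32>, i32
  return
}
```

The trailing `i32` is the element_type attribute, which names the type of the value being read.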

Interfaces: AtomicReadOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| element_type | ::mlir::TypeAttr | any type attribute |
| hint_val | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| memory_order_val | ::mlir::omp::ClauseMemoryOrderKindAttr | MemoryOrderKind Clause |

Enum cases:

  • seq_cst (Seq_cst)
  • acq_rel (Acq_rel)
  • acquire (Acquire)
  • release (Release)
  • relaxed (Relaxed)

Operands: 

| Operand | Description |
| --- | --- |
| x | OpenMP-compatible variable type |
| v | OpenMP-compatible variable type |

omp.atomic.update (omp::AtomicUpdateOp) 

Performs an atomic update

Syntax:

operation ::= `omp.atomic.update` oilist( `memory_order` `(` custom<ClauseAttr>($memory_order_val) `)`
              | `hint` `(` custom<SynchronizationHint>($hint_val) `)`)
              $x `:` type($x) $region attr-dict

This operation performs an atomic update.

The operand x is exactly the same as the operand x in the OpenMP Standard (OpenMP 5.0, section 2.17.7). It is the address of the variable that is being updated. x is atomically read/written.

hint is the value of hint (as used in the hint clause). It is a compile time constant. As the name suggests, this is just a hint for optimization.

memory_order indicates the memory ordering behavior of the construct. It can be one of seq_cst, release or relaxed.

The region describes how to update the value of x. It takes the value at x as an input and must yield the updated value. Only the update to x is atomic. Generally the region must have only one instruction, but it can potentially have more than one instruction. The update is semantically similar to a compare-exchange loop based atomic update.

The syntax of atomic update operation is different from atomic read and atomic write operations. This is because only the host dialect knows how to appropriately update a value. For example, while generating LLVM IR, if there are no special atomicrmw instructions for the operation-type combination in atomic update, a compare-exchange loop is generated, where the core update operation is directly translated like regular operations by the host dialect. The front-end must handle semantic checks for allowed operations.
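The region-based form can be sketched as follows, assuming an integer accumulator held in a `memref<i32>`; the function and value names are hypothetical. The region receives the current value at x as a block argument and yields the new value.

```mlir
// Hypothetical example: atomically perform x = x + expr.
func.func @atomic_increment(%x : memref<i32>, %expr : i32) {
  omp.atomic.update memory_order(relaxed) %x : memref<i32> {
  ^bb0(%xval : i32):
    // The core update is expressed with the host dialect (arith here).
    %newval = arith.addi %xval, %expr : i32
    omp.yield(%newval : i32)
  }
  return
}
```

Whether such an update lowers to a single atomicrmw instruction or to a compare-exchange loop depends on the operation and type, as described above.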

Traits: RecursiveMemoryEffects, SingleBlockImplicitTerminator<YieldOp>, SingleBlock

Interfaces: AtomicUpdateOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| hint_val | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| memory_order_val | ::mlir::omp::ClauseMemoryOrderKindAttr | MemoryOrderKind Clause |

Enum cases:

  • seq_cst (Seq_cst)
  • acq_rel (Acq_rel)
  • acquire (Acquire)
  • release (Release)
  • relaxed (Relaxed)

Operands: 

| Operand | Description |
| --- | --- |
| x | OpenMP-compatible variable type |

omp.atomic.write (omp::AtomicWriteOp) 

Performs an atomic write

Syntax:

operation ::= `omp.atomic.write` $x `=` $expr
              oilist( `hint` `(` custom<SynchronizationHint>($hint_val) `)`
              | `memory_order` `(` custom<ClauseAttr>($memory_order_val) `)`)
              `:` type($x) `,` type($expr)
              attr-dict

This operation performs an atomic write.

The operand x is the address to where the expr is atomically written w.r.t. multiple threads. The evaluation of expr need not be atomic w.r.t. the write to address. In general, the type(x) must dereference to type(expr).

hint is the value of hint (as specified in the hint clause). It is a compile time constant. As the name suggests, this is just a hint for optimization.

memory_order indicates the memory ordering behavior of the construct. It can be one of seq_cst, release or relaxed.
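A minimal sketch of an atomic write, following the Syntax block above. The names and the `memref<i32>` type are illustrative assumptions; note that type(x) is the address type while type(expr) is the value type.

```mlir
// Hypothetical example: atomically store %val to the address %addr.
func.func @atomic_write_example(%addr : memref<i32>, %val : i32) {
  omp.atomic.write %addr = %val memory_order(release) : memref<i32>, i32
  return
}
```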

Interfaces: AtomicWriteOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| hint_val | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| memory_order_val | ::mlir::omp::ClauseMemoryOrderKindAttr | MemoryOrderKind Clause |

Enum cases:

  • seq_cst (Seq_cst)
  • acq_rel (Acq_rel)
  • acquire (Acquire)
  • release (Release)
  • relaxed (Relaxed)

Operands: 

| Operand | Description |
| --- | --- |
| x | OpenMP-compatible variable type |
| expr | any type |

omp.barrier (omp::BarrierOp) 

Barrier construct

Syntax:

operation ::= `omp.barrier` attr-dict

The barrier construct specifies an explicit barrier at the point at which the construct appears.

omp.cancel (omp::CancelOp) 

Cancel directive

Syntax:

operation ::= `omp.cancel` `cancellation_construct_type` `(`
              custom<ClauseAttr>($cancellation_construct_type_val) `)`
              ( `if` `(` $if_expr^ `)` )? attr-dict

The cancel construct activates cancellation of the innermost enclosing region of the type specified.
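As an illustration, the sketch below requests cancellation of an enclosing parallel region when a condition holds. The function and value names are hypothetical, and the example assumes the op is nested inside the construct type it names.

```mlir
// Hypothetical example: cancel the innermost enclosing parallel region
// when %cond is true.
func.func @cancel_example(%cond : i1) {
  omp.parallel {
    omp.cancel cancellation_construct_type(parallel) if(%cond)
    omp.terminator
  }
  return
}
```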

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| cancellation_construct_type_val | ::mlir::omp::ClauseCancellationConstructTypeAttr | CancellationConstructType Clause |

Enum cases:

  • parallel (Parallel)
  • loop (Loop)
  • sections (Sections)
  • taskgroup (Taskgroup)

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |

omp.cancellation_point (omp::CancellationPointOp) 

Cancellation point directive

Syntax:

operation ::= `omp.cancellation_point` `cancellation_construct_type` `(`
              custom<ClauseAttr>($cancellation_construct_type_val) `)`
              attr-dict

The cancellation point construct introduces a user-defined cancellation point at which implicit or explicit tasks check if cancellation of the innermost enclosing region of the type specified has been activated.
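A short sketch of a cancellation point placed inside a parallel region; the function name is hypothetical and the example assumes the op is nested inside the construct type it names.

```mlir
// Hypothetical example: check here whether cancellation of the enclosing
// parallel region has been activated.
func.func @cancellation_point_example() {
  omp.parallel {
    omp.cancellation_point cancellation_construct_type(parallel)
    omp.terminator
  }
  return
}
```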

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| cancellation_construct_type_val | ::mlir::omp::ClauseCancellationConstructTypeAttr | CancellationConstructType Clause |

Enum cases:

  • parallel (Parallel)
  • loop (Loop)
  • sections (Sections)
  • taskgroup (Taskgroup)

omp.critical (omp::CriticalOp) 

Critical construct

Syntax:

operation ::= `omp.critical` (`(` $name^ `)`)? $region attr-dict

The critical construct imposes a restriction on the associated structured block (region) to be executed by only a single thread at a time.

Interfaces: SymbolUserOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| name | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |

omp.critical.declare (omp::CriticalDeclareOp) 

Declares a named critical section.

Syntax:

operation ::= `omp.critical.declare` $sym_name oilist(`hint` `(` custom<SynchronizationHint>($hint_val) `)`)
              attr-dict

Declares a named critical section.

The name can be used in critical constructs in the dialect.
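The relationship between the declaration and its uses can be sketched as below. The symbol name `@mutex` and the function name are assumptions chosen for illustration.

```mlir
// Hypothetical example: declare a named critical section once at module
// level, then reference it by symbol from omp.critical.
omp.critical.declare @mutex hint(contended)

func.func @critical_example() {
  // Named critical section: refers to the declaration above.
  omp.critical(@mutex) {
    omp.terminator
  }
  // Unnamed critical section: no declaration is needed.
  omp.critical {
    omp.terminator
  }
  return
}
```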

Interfaces: Symbol

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sym_name | ::mlir::StringAttr | string attribute |
| hint_val | ::mlir::IntegerAttr | 64-bit signless integer attribute |

omp.declare_reduction (omp::DeclareReductionOp) 

Declares a reduction kind

Syntax:

operation ::= `omp.declare_reduction` $sym_name `:` $type attr-dict-with-keyword `init` $initializerRegion `combiner` $reductionRegion custom<AtomicReductionRegion>($atomicReductionRegion) custom<CleanupReductionRegion>($cleanupRegion)

Declares an OpenMP reduction kind. This requires two mandatory and two optional regions.

  1. The initializer region specifies how to initialize the thread-local reduction value. This is usually the neutral element of the reduction. For convenience, the region has an argument that contains the value of the reduction accumulator at the start of the reduction. It is expected to omp.yield the new value on all control flow paths.
  2. The reduction region specifies how to combine two values into one, i.e. the reduction operator. It accepts the two values as arguments and is expected to omp.yield the combined value on all control flow paths.
  3. The atomic reduction region is optional and specifies how two values can be combined atomically given local accumulator variables. It is expected to store the combined value in the first accumulator variable.
  4. The cleanup region is optional and specifies how to clean up any memory allocated by the initializer region. The region has an argument that contains the value of the thread-local reduction accumulator. This will be executed after the reduction has completed.

Note that the MLIR type system does not allow for type-polymorphic reductions. Separate reduction declarations should be created for different element and accumulator types.

For initializer and reduction regions, the operand to omp.yield must match the parent operation’s results.
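A minimal sketch of a declaration with only the two mandatory regions, assuming a floating-point sum reduction; the symbol name `@add_f32` is hypothetical. The initializer yields the neutral element (0.0 for addition) and the combiner yields the sum of its two arguments.

```mlir
// Hypothetical example: an f32 "add" reduction declaration.
omp.declare_reduction @add_f32 : f32
init {
^bb0(%initial : f32):
  // Neutral element of addition.
  %zero = arith.constant 0.0 : f32
  omp.yield (%zero : f32)
}
combiner {
^bb0(%lhs : f32, %rhs : f32):
  %sum = arith.addf %lhs, %rhs : f32
  omp.yield (%sum : f32)
}
```

Because reductions are not type-polymorphic, an f64 sum would need its own declaration with a different symbol name.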

Traits: IsolatedFromAbove

Interfaces: Symbol

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sym_name | ::mlir::StringAttr | string attribute |
| type | ::mlir::TypeAttr | any type attribute |

omp.distribute (omp::DistributeOp) 

Distribute construct

Syntax:

operation ::= `omp.distribute` oilist(`dist_schedule_static` $dist_schedule_static
              |`chunk_size` `(` $chunk_size `:` type($chunk_size) `)`
              |`order` `(` custom<ClauseAttr>($order_val) `)`
              |`allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              ) $region attr-dict

The distribute construct specifies that the iterations of one or more loops (optionally specified using the collapse clause) will be executed by the initial teams in the context of their implicit tasks. The loops that the distribute op is associated with start with the outermost loop enclosed by the distribute op region and continue down the loop nest toward the innermost loop. The iterations are distributed across the initial threads of all initial teams that execute the teams region to which the distribute region binds.

The distribute loop construct specifies that the iterations of the loop(s) will be executed in parallel by threads in the current context. These iterations are spread across threads that already exist in the enclosing region.

The body region can only contain a single block which must contain a single operation and a terminator. The operation must be another compatible loop wrapper or an omp.loop_nest.

The dist_schedule_static attribute specifies the schedule for this loop, determining how the loop is distributed across the parallel threads. The optional chunk_size operand associated with it further controls this distribution.

omp.distribute <clauses> {
  omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
    %a = memref.load %arrA[%i1, %i2] : memref<?x?xf32>
    %b = memref.load %arrB[%i1, %i2] : memref<?x?xf32>
    %sum = arith.addf %a, %b : f32
    memref.store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
    omp.yield
  }
  omp.terminator
}

// TODO: private_var, firstprivate_var, lastprivate_var, collapse

Traits: AttrSizedOperandSegments, RecursiveMemoryEffects, SingleBlockImplicitTerminator<TerminatorOp>, SingleBlock

Interfaces: LoopWrapperInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| dist_schedule_static | ::mlir::UnitAttr | unit attribute |
| order_val | ::mlir::omp::ClauseOrderKindAttr | OrderKind Clause |

Enum cases:

  • concurrent (Concurrent)

Operands: 

| Operand | Description |
| --- | --- |
| chunk_size | integer or index |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |

omp.flush (omp::FlushOp) 

Flush construct

Syntax:

operation ::= `omp.flush` ( `(` $varList^ `:` type($varList) `)` )? attr-dict

The flush construct executes the OpenMP flush operation. This operation makes a thread’s temporary view of memory consistent with memory and enforces an order on the memory operations of the variables explicitly specified or implied.
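Both forms allowed by the Syntax block can be sketched as follows; the function and value names and the `memref<i32>` types are illustrative assumptions.

```mlir
// Hypothetical example: a flush with no list, and a flush restricted to
// two explicitly listed variables.
func.func @flush_example(%flag : memref<i32>, %data : memref<i32>) {
  omp.flush
  omp.flush(%flag, %data : memref<i32>, memref<i32>)
  return
}
```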

Operands: 

| Operand | Description |
| --- | --- |
| varList | variadic of OpenMP-compatible variable type |

omp.loop_nest (omp::LoopNestOp) 

Rectangular loop nest

This operation represents a collapsed rectangular loop nest. For each rectangular loop of the nest represented by an instance of this operation, lower and upper bounds, as well as a step variable, must be defined.

The lower and upper bounds specify a half-open range: the range includes the lower bound but does not include the upper bound. If the inclusive attribute is specified then the upper bound is also included.

The body region can contain any number of blocks. The region is terminated by an omp.yield instruction without operands. The induction variables, represented as entry block arguments to the loop nest operation’s single region, match the types of the lowerBound, upperBound and step arguments.

omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
  %a = memref.load %arrA[%i1, %i2] : memref<?x?xf32>
  %b = memref.load %arrB[%i1, %i2] : memref<?x?xf32>
  %sum = arith.addf %a, %b : f32
  memref.store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
  omp.yield
}
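To make the half-open versus inclusive distinction concrete, the sketch below shows the same bounds with and without the inclusive keyword. The function name is hypothetical, and wrapping each loop nest in omp.simd is an assumption made so that the nest sits inside a loop wrapper.

```mlir
// Hypothetical example: bounds 0 to 10 with step 1.
func.func @inclusive_example() {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c10 = arith.constant 10 : index
  omp.simd {
    // Half-open range: %i takes the values 0 through 9 (10 iterations).
    omp.loop_nest (%i) : index = (%c0) to (%c10) step (%c1) {
      omp.yield
    }
    omp.terminator
  }
  omp.simd {
    // Inclusive range: %j takes the values 0 through 10 (11 iterations).
    omp.loop_nest (%j) : index = (%c0) to (%c10) inclusive step (%c1) {
      omp.yield
    }
    omp.terminator
  }
  return
}
```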

This is a temporary simplified definition of a loop based on existing OpenMP loop operations intended to serve as a stopgap solution until the long-term representation of canonical loops is defined. Specifically, this operation is intended to serve as a unique source for loop information during the transition to making omp.distribute, omp.simd, omp.taskloop and omp.wsloop wrapper operations. It is not intended to help with the addition of support for loop transformations, non-rectangular loops and non-perfectly nested loops.

Traits: RecursiveMemoryEffects, SameVariadicOperandSize

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| inclusive | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| lowerBound | variadic of integer or index |
| upperBound | variadic of integer or index |
| step | variadic of integer or index |

omp.map.bounds (omp::MapBoundsOp) 

Represents normalized bounds information for map clauses.

Syntax:

operation ::= `omp.map.bounds` oilist(
              `lower_bound` `(` $lower_bound `:` type($lower_bound) `)`
              | `upper_bound` `(` $upper_bound `:` type($upper_bound) `)`
              | `extent` `(` $extent `:` type($extent) `)`
              | `stride` `(` $stride `:` type($stride) `)`
              | `start_idx` `(` $start_idx `:` type($start_idx) `)`
              ) attr-dict

This operation is a variation on the OpenACC dialect's DataBoundsOp. Within the OpenMP dialect it stores the bounds/range of data to be mapped to a device specified by map clauses on target directives. Within the OpenMP dialect, the MapBoundsOp is associated with MapInfoOp, helping to store bounds information for the mapped variable.

It is used to support OpenMP array sectioning, Fortran pointer and allocatable mapping, and pointer/allocatable members of derived types. In all cases the MapBoundsOp holds information on the section of data to be mapped, such as its upper and lower bounds. This information is currently utilised by the LLVM-IR lowering to help generate instructions to copy data to and from the device when processing target operations.

The example below copies a section of a 10-element array (all except the first element), utilising OpenMP array sectioning syntax where array subscripts are provided to specify the bounds to be mapped to the device. To simplify the examples, constants are used directly; in reality they will be MLIR SSA values.

C++:

int array[10];
#pragma omp target map(array[1:9])

=>

omp.map.bounds lower_bound(1) upper_bound(9) extent(9) start_idx(0)

Fortran:

integer :: array(1:10)
!$omp target map(array(2:10))

=>

omp.map.bounds lower_bound(1) upper_bound(9) extent(9) start_idx(1)

For Fortran pointers and allocatables (as well as those that are members of derived types) the bounds information is provided by the Fortran compiler and runtime through descriptor information.

A basic pointer example can be found below (constants again provided for simplicity, where in reality SSA values will be used, in this case that point to data yielded by Fortran’s descriptors):

Fortran:

integer, pointer :: ptr(:)
allocate(ptr(10))
!$omp target map(ptr)

=>

omp.map.bounds lower_bound(0) upper_bound(9) extent(10) start_idx(1)

This operation records the bounds information in a normalized fashion (zero-based). This works well with the PointerLikeType requirement in data clauses - since a lower_bound of 0 means looking at data at the zero offset from pointer.

This operation must have an upper_bound or extent (or both are allowed - but not checked for consistency). When the source language’s arrays are not zero-based, the start_idx must specify the zero-position index.
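The simplified C++ example above (`array[1:9]`) can be written with SSA values roughly as follows. The function and value names and the use of `index` constants are assumptions for illustration; the clause syntax follows the Syntax block.

```mlir
// Hypothetical example: normalized bounds for elements 1..9 of a
// zero-based 10-element array.
func.func @map_bounds_example() {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c9 = arith.constant 9 : index
  %bounds = omp.map.bounds lower_bound(%c1 : index) upper_bound(%c9 : index) extent(%c9 : index) start_idx(%c0 : index)
  return
}
```

The result %bounds would then be passed to the bounds clause of an omp.map.info operation.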

Traits: AttrSizedOperandSegments

Interfaces: NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| stride_in_bytes | ::mlir::BoolAttr | bool attribute |

Operands: 

| Operand | Description |
| --- | --- |
| lower_bound | integer or index |
| upper_bound | integer or index |
| extent | integer or index |
| stride | integer or index |
| start_idx | integer or index |

Results: 

| Result | Description |
| --- | --- |
| result | Type for representing omp map clause bounds information |

omp.map.info (omp::MapInfoOp) 

Syntax:

operation ::= `omp.map.info` `var_ptr` `(` $var_ptr `:` type($var_ptr) `,` $var_type `)`
              oilist(
              `var_ptr_ptr` `(` $var_ptr_ptr `:` type($var_ptr_ptr) `)`
              | `map_clauses` `(` custom<MapClause>($map_type) `)`
              | `capture` `(` custom<CaptureType>($map_capture_type) `)`
              | `members` `(` $members `:` custom<MembersIndex>($members_index) `:` type($members) `)`
              | `bounds` `(` $bounds `)`
              ) `->` type($omp_ptr) attr-dict

The MapInfoOp captures information relating to individual OpenMP map clauses that are applied to certain OpenMP directives such as Target and Target Data.

For example, it captures the map type modifier (such as from, tofrom and to), the variable being captured, or the bounds of an array section being mapped.

It can be used to capture both explicit and implicit map information: explicit when an argument is directly specified in an OpenMP map clause, and implicit when a variable is utilised in a target region but defined outside it.

This map information is later used to aid the lowering of the target operations it is attached to, providing argument input and output context for the generated kernels or for the target data mapping environment.

Example (Fortran):

integer :: index
!$omp target map(to: index)

=>

omp.map.info var_ptr(%index_ssa) map_type(to) map_capture_type(ByRef)
  name(index)

Description of arguments:

  • var_ptr: The address of the variable to copy.
  • var_type: The type of the variable to copy.
  • var_ptr_ptr: Used when the variable copied is a member of a class, structure or derived type and refers to the originating struct.
  • members: Used to indicate mapped child members for the current MapInfoOp, represented as other MapInfoOps, utilised in cases where a parent structure type and members of the structure type are being mapped at the same time. For example: map(to: parent, parent->member, parent->member2[:10])
  • members_index: Used to indicate the ordering of members within the containing parent (generally a record type such as a structure, class or derived type), e.g. for struct {int x, float y, double z}, x would be 0, y would be 1, and z would be 2. This aids the mapping.
  • bounds: Used when copying slices of arrays, pointers or pointer members of objects (e.g. derived types or classes); indicates the bounds of the variable to be copied. For an array slice, the bounds are in rank order, where rank 0 is the innermost dimension.
  • map_clauses: OpenMP map type for this map capture, for example: from, to and always. It is a bitfield composed of the OpenMP runtime flags stored in OpenMPOffloadMappingFlags.
  • map_capture_type: Capture type for the variable, e.g. This, ByRef, ByCopy or VLAType; this can affect how the variable is lowered.
  • name: Holds the name of the variable as specified in the user clause (including bounds).
  • partial_map: The record type being mapped will not be mapped in its entirety; it may, however, be used in a mapping to bind its mapped components together.
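Written out in the assembly format given in the Syntax block, mapping a 10-element array section might look like the sketch below. The function and value names, the use of `!llvm.ptr` and `!llvm.array<10 x i32>`, and the name string are assumptions for illustration.

```mlir
// Hypothetical example: map elements 1..9 of a 10-element i32 array
// to and from the device, by reference.
func.func @map_info_example(%array : !llvm.ptr) {
  %c0 = arith.constant 0 : i64
  %c1 = arith.constant 1 : i64
  %c9 = arith.constant 9 : i64
  %bounds = omp.map.bounds lower_bound(%c1 : i64) upper_bound(%c9 : i64) extent(%c9 : i64) start_idx(%c0 : i64)
  %map = omp.map.info var_ptr(%array : !llvm.ptr, !llvm.array<10 x i32>) map_clauses(tofrom) capture(ByRef) bounds(%bounds) -> !llvm.ptr {name = "array[1:9]"}
  return
}
```

The var_type (`!llvm.array<10 x i32>`) follows the var_ptr type inside the same parentheses, and name is supplied through the attribute dictionary.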

Traits: AttrSizedOperandSegments

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| var_type | ::mlir::TypeAttr | any type attribute |
| members_index | ::mlir::DenseIntElementsAttr | integer elements attribute |
| map_type | ::mlir::IntegerAttr | 64-bit unsigned integer attribute |
| map_capture_type | ::mlir::omp::VariableCaptureKindAttr | variable capture kind |
| name | ::mlir::StringAttr | string attribute |
| partial_map | ::mlir::BoolAttr | bool attribute |

Enum cases for map_capture_type:

  • This (This)
  • ByRef (ByRef)
  • ByCopy (ByCopy)
  • VLAType (VLAType)

Operands: 

| Operand | Description |
| --- | --- |
| var_ptr | OpenMP-compatible variable type |
| var_ptr_ptr | OpenMP-compatible variable type |
| members | variadic of OpenMP-compatible variable type |
| bounds | variadic of Type for representing omp map clause bounds information |

Results: 

| Result | Description |
| --- | --- |
| omp_ptr | OpenMP-compatible variable type |

omp.master (omp::MasterOp) 

Master construct

Syntax:

operation ::= `omp.master` $region attr-dict

The master construct specifies a structured block that is executed by the master thread of the team.
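A small sketch combining omp.master with omp.barrier inside a parallel region; the function name is hypothetical. The barrier is explicit because the example does not rely on any implied synchronization at the end of the master region.

```mlir
// Hypothetical example: only the master thread executes the master
// region; all threads then synchronize at the explicit barrier.
func.func @master_example() {
  omp.parallel {
    omp.master {
      omp.terminator
    }
    omp.barrier
    omp.terminator
  }
  return
}
```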

omp.ordered (omp::OrderedOp) 

Ordered construct without region

Syntax:

operation ::= `omp.ordered` ( `depend_type` `` $depend_type_val^ )?
              ( `depend_vec` `(` $depend_vec_vars^ `:` type($depend_vec_vars) `)` )?
              attr-dict

The ordered construct without region is a stand-alone directive that specifies cross-iteration dependences in a doacross loop nest.

The depend_type_val attribute refers to either the DEPEND(SOURCE) clause or the DEPEND(SINK: vec) clause.

The num_loops_val attribute specifies the number of loops in the doacross nest.

The depend_vec_vars is a variadic list of operands that specifies the index of the loop iterator in the doacross nest for the DEPEND(SOURCE) clause or the index of the element of “vec” for the DEPEND(SINK: vec) clause. It contains the operands in multiple “vec” when multiple DEPEND(SINK: vec) clauses exist in one ORDERED directive.

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| depend_type_val | ::mlir::omp::ClauseDependAttr | depend clause |
| num_loops_val | ::mlir::IntegerAttr | 64-bit signless integer attribute whose minimum value is 0 |

Enum cases for depend_type_val:

  • dependsource (dependsource)
  • dependsink (dependsink)

Operands: 

| Operand | Description |
| --- | --- |
| depend_vec_vars | variadic of any type |

omp.ordered.region (omp::OrderedRegionOp) 

Ordered construct with region

Syntax:

operation ::= `omp.ordered.region` ( `simd` $simd^ )? $region attr-dict

The ordered construct with region specifies a structured block in a worksharing-loop, SIMD, or worksharing-loop SIMD region that is executed in the order of the loop iterations.

The simd attribute corresponds to the SIMD clause specified. If it is not present, it behaves as if the THREADS clause is specified or no clause is specified.

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| simd | ::mlir::UnitAttr | unit attribute |

omp.parallel (omp::ParallelOp) 

Parallel construct

Syntax:

operation ::= `omp.parallel` oilist(
              `if` `(` $if_expr_var `:` type($if_expr_var) `)`
              | `num_threads` `(` $num_threads_var `:` type($num_threads_var) `)`
              | `allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              | `proc_bind` `(` custom<ClauseAttr>($proc_bind_val) `)`
              ) custom<ParallelRegion>($region, $reduction_vars, type($reduction_vars),
              $reduction_vars_byref, $reductions, $private_vars,
              type($private_vars), $privatizers) attr-dict

The parallel construct includes a region of code which is to be executed by a team of threads.

The optional $if_expr_var parameter specifies a boolean result of a conditional check. If this value is 1 or is not provided, the parallel region runs as normal; if it is 0, the parallel region is executed with one thread.

The optional $num_threads_var parameter specifies the number of threads which should be used to execute the parallel region.

The $allocators_vars and $allocate_vars parameters are a variadic list of values that specify the memory allocator to be used to obtain storage for private values.

Reductions can be performed in a parallel construct by specifying reduction accumulator variables in reduction_vars, symbols referring to reduction declarations in the reductions attribute, and whether the reduction variable should be passed into the reduction region by value or by reference in reduction_vars_byref. Each reduction is identified by the accumulator it uses and accumulators must not be repeated in the same reduction. The reduction declaration specifies how to combine the values from each thread into the final value, which is available in the accumulator after all the threads complete.

The optional $proc_bind_val attribute controls the thread affinity for the execution of the parallel region.
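The clauses described above can be combined as in the sketch below, which follows the Syntax block. The function and value names and the `i32` thread count type are illustrative assumptions.

```mlir
// Hypothetical example: run the region in parallel only if %cond is
// true, request %nthreads threads, and bind threads close to the parent.
func.func @parallel_example(%cond : i1, %nthreads : i32) {
  omp.parallel if(%cond : i1) num_threads(%nthreads : i32) proc_bind(close) {
    omp.terminator
  }
  return
}
```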

Traits: AttrSizedOperandSegments, AutomaticAllocationScope, RecursiveMemoryEffects

Interfaces: LoopWrapperInterface, OutlineableOpenMPOpInterface, ReductionClauseInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| reduction_vars_byref | ::mlir::DenseBoolArrayAttr | i1 dense array attribute |
| reductions | ::mlir::ArrayAttr | symbol ref array attribute |
| proc_bind_val | ::mlir::omp::ClauseProcBindKindAttr | ProcBindKind Clause |
| privatizers | ::mlir::ArrayAttr | symbol ref array attribute |

Enum cases for proc_bind_val:

  • primary (Primary)
  • master (Master)
  • close (Close)
  • spread (Spread)

Operands: 

| Operand | Description |
| --- | --- |
| if_expr_var | 1-bit signless integer |
| num_threads_var | integer or index |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |
| reduction_vars | variadic of OpenMP-compatible variable type |
| private_vars | variadic of any type |

omp.private (omp::PrivateClauseOp) 

Provides declaration of [first]private logic.

Syntax:

operation ::= `omp.private` $data_sharing_type $sym_name `:` $type
              `alloc` $alloc_region
              (`copy` $copy_region^)?
              (`dealloc` $dealloc_region^)?
              attr-dict

This operation provides a declaration of how to implement the [first]privatization of a variable. The dialect users should provide information about how to create an instance of the type in the alloc region, how to initialize the copy from the original item in the copy region, and if needed, how to deallocate allocated memory in the dealloc region.

Examples:

  • private(x) would be emitted as:
    omp.private {type = private} @x.privatizer : !fir.ref<i32> alloc {
    ^bb0(%arg0: !fir.ref<i32>):
      %0 = ... allocate proper memory for the private clone ...
      omp.yield(%0 : !fir.ref<i32>)
    }
  • firstprivate(x) would be emitted as:
    omp.private {type = firstprivate} @x.privatizer : !fir.ref<i32> alloc {
    ^bb0(%arg0: !fir.ref<i32>):
      %0 = ... allocate proper memory for the private clone ...
      omp.yield(%0 : !fir.ref<i32>)
    } copy {
    ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
      // %arg0 is the original host variable. Same as for `alloc`.
      // %arg1 represents the memory allocated in `alloc`.
      ... copy from host to the privatized clone ....
      omp.yield(%arg1 : !fir.ref<i32>)
    }
  • private(x) for “allocatables” would be emitted as:
    omp.private {type = private} @x.privatizer : !some.type alloc {
    ^bb0(%arg0: !some.type):
      %0 = ... allocate proper memory for the private clone ...
      omp.yield(%0 : !some.type)
    } dealloc {
    ^bb0(%arg0: !some.type):
      ... deallocate allocated memory ...
      omp.yield
    }

There are no restrictions on the body except for:

  • The alloc & dealloc regions have a single argument.
  • The copy region has 2 arguments.
  • All three regions are terminated by omp.yield ops.

The above restrictions and other obvious restrictions (e.g. verifying the type of yielded values) are verified by the custom op verifier. The actual contents of the blocks inside all regions are not verified.

Instances of this op would then be used by ops that model directives that accept data-sharing attribute clauses.

The $sym_name attribute provides a symbol by which the privatizer op can be referenced by other dialect ops.

The $type attribute is the type of the value being privatized.

The $data_sharing_type attribute specifies whether privatizer corresponds to a private or a firstprivate clause.

Traits: IsolatedFromAbove

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sym_name | ::mlir::StringAttr | string attribute |
| type | ::mlir::TypeAttr | type attribute of any type |
| data_sharing_type | ::mlir::omp::DataSharingClauseTypeAttr | Type of a data-sharing clause |

Enum cases:

  • private (Private)
  • firstprivate (FirstPrivate)

omp.section (omp::SectionOp) 

Section directive

Syntax:

operation ::= `omp.section` $region attr-dict

A section operation encloses a region which represents one section in a sections construct. A section op should always be surrounded by an omp.sections operation.

Traits: HasParent<SectionsOp>

omp.sections (omp::SectionsOp) 

Sections construct

Syntax:

operation ::= `omp.sections` oilist( `reduction` `(`
              custom<ReductionVarList>(
              $reduction_vars, type($reduction_vars), $reductions
              ) `)`
              | `allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              | `nowait` $nowait
              ) $region attr-dict

The sections construct is a non-iterative worksharing construct that contains omp.section operations. The omp.section operations are to be distributed among and executed by the threads in a team. Each omp.section is executed once by one of the threads in the team in the context of its implicit task.

Reductions can be performed in a sections construct by specifying reduction accumulator variables in reduction_vars and symbols referring to reduction declarations in the reductions attribute. Each reduction is identified by the accumulator it uses and accumulators must not be repeated in the same reduction. The reduction declaration specifies how to combine the values from each section into the final value, which is available in the accumulator after all the sections complete.

The $allocators_vars and $allocate_vars parameters are variadic lists of values that specify the memory allocators to be used to obtain storage for private values.

The nowait attribute, when present, signifies that there should be no implicit barrier at the end of the construct.
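As a minimal sketch of the structure described above, the following shows a sections construct with two sections and the nowait attribute. The function name and the comments standing in for the section bodies are illustrative assumptions.

```mlir
func.func @sections_example() {
  omp.parallel {
    // Each omp.section is executed once by one thread of the team.
    // `nowait` drops the implicit barrier at the end of the construct.
    omp.sections nowait {
      omp.section {
        // Work for the first section would go here.
        omp.terminator
      }
      omp.section {
        // Work for the second section would go here.
        omp.terminator
      }
      omp.terminator
    }
    omp.terminator
  }
  return
}
```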

Traits: AttrSizedOperandSegments

Interfaces: ReductionClauseInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| reductions | ::mlir::ArrayAttr | symbol ref array attribute |
| nowait | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| reduction_vars | variadic of OpenMP-compatible variable type |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |

omp.simd (omp::SimdOp) 

Simd construct

Syntax:

operation ::= `omp.simd` oilist(`aligned` `(`
              custom<AlignedClause>($aligned_vars, type($aligned_vars),
              $alignment_values) `)`
              |`if` `(` $if_expr `)`
              |`nontemporal` `(`  $nontemporal_vars `:` type($nontemporal_vars) `)`
              |`order` `(` custom<ClauseAttr>($order_val) `)`
              |`simdlen` `(` $simdlen  `)`
              |`safelen` `(` $safelen  `)`
              ) $region attr-dict

The simd construct can be applied to a loop to indicate that the loop can be transformed into a SIMD loop (that is, multiple iterations of the loop can be executed concurrently using SIMD instructions).

The body region can only contain a single block which must contain a single operation and a terminator. The operation must be another compatible loop wrapper or an omp.loop_nest.

The alignment_values attribute additionally specifies alignment of each corresponding aligned operand. Note that $aligned_vars and alignment_values should contain the same number of elements.

When an if clause is present and evaluates to false, the preferred number of iterations to be executed concurrently is one, regardless of whether a simdlen clause is specified.

The optional nontemporal attribute specifies variables which have low temporal locality across the iterations where they are accessed.

The optional order attribute specifies the order in which the iterations of the associated loops are executed. Currently the only option for this attribute is “concurrent”.

When a simdlen clause is present, the preferred number of iterations to be executed concurrently is the value provided to the simdlen clause.

The safelen clause specifies that no two concurrent iterations within a SIMD chunk can have a distance in the logical iteration space that is greater than or equal to the value given in the clause.

omp.simd <clauses> {
  omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
    %a = memref.load %arrA[%i1, %i2] : memref<?x?xf32>
    %b = memref.load %arrB[%i1, %i2] : memref<?x?xf32>
    %sum = arith.addf %a, %b : f32
    memref.store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
    omp.yield
  }
  omp.terminator
}
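For a version with concrete clauses, a sketch along the following lines may help. The function signature and value names are assumptions made for illustration; the simdlen value is chosen to be no larger than the safelen value.

```mlir
func.func @simd_example(%src: memref<?xf32>, %dst: memref<?xf32>,
                        %n: index, %cond: i1) {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  // Prefer 4 concurrent iterations; iterations that run concurrently
  // must be fewer than 8 apart in the logical iteration space.
  // If %cond is false, the preferred number of concurrent iterations is one.
  omp.simd if(%cond) simdlen(4) safelen(8) {
    omp.loop_nest (%i) : index = (%c0) to (%n) step (%c1) {
      %v = memref.load %src[%i] : memref<?xf32>
      memref.store %v, %dst[%i] : memref<?xf32>
      omp.yield
    }
    omp.terminator
  }
  return
}
```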

Traits: AttrSizedOperandSegments, RecursiveMemoryEffects, SingleBlockImplicitTerminator<TerminatorOp>, SingleBlock

Interfaces: LoopWrapperInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| alignment_values | ::mlir::ArrayAttr | 64-bit integer array attribute |
| order_val | ::mlir::omp::ClauseOrderKindAttr | OrderKind Clause |
| simdlen | ::mlir::IntegerAttr | 64-bit signless integer attribute whose value is positive |
| safelen | ::mlir::IntegerAttr | 64-bit signless integer attribute whose value is positive |

Enum cases for order_val:

  • concurrent (Concurrent)

Operands: 

| Operand | Description |
| --- | --- |
| aligned_vars | variadic of OpenMP-compatible variable type |
| if_expr | 1-bit signless integer |
| nontemporal_vars | variadic of OpenMP-compatible variable type |

omp.single (omp::SingleOp) 

Single directive

Syntax:

operation ::= `omp.single` oilist(`allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              |`nowait` $nowait
              |`copyprivate` `(`
              custom<CopyPrivateVarList>(
              $copyprivate_vars, type($copyprivate_vars), $copyprivate_funcs
              ) `)`
              ) $region attr-dict

The single construct specifies that the associated structured block is executed by only one of the threads in the team (not necessarily the master thread), in the context of its implicit task. The other threads in the team, which do not execute the block, wait at an implicit barrier at the end of the single construct unless a nowait clause is specified.

If copyprivate variables and functions are specified, then each thread variable is updated with the variable value of the thread that executed the single region, using the specified copy functions.
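A minimal sketch of a single construct inside a parallel region, using nowait so the other threads do not wait at the end of the construct. The function name is an assumption, and the copyprivate clause is deliberately left out.

```mlir
func.func @single_example() {
  omp.parallel {
    omp.single nowait {
      // Executed by exactly one thread of the team.
      omp.terminator
    }
    omp.terminator
  }
  return
}
```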

Traits: AttrSizedOperandSegments

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| copyprivate_funcs | ::mlir::ArrayAttr | symbol ref array attribute |
| nowait | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |
| copyprivate_vars | variadic of OpenMP-compatible variable type |

omp.target (omp::TargetOp) 

Target construct

Syntax:

operation ::= `omp.target` oilist( `if` `(` $if_expr `)`
              | `device` `(` $device `:` type($device) `)`
              | `thread_limit` `(` $thread_limit `:` type($thread_limit) `)`
              | `nowait` $nowait
              | `is_device_ptr` `(` $is_device_ptr `:` type($is_device_ptr) `)`
              | `has_device_addr` `(` $has_device_addr `:` type($has_device_addr) `)`
              | `map_entries` `(` custom<MapEntries>($map_operands, type($map_operands)) `)`
              | `private` `(` custom<PrivateList>($private_vars, type($private_vars), $privatizers) `)`
              | `depend` `(` custom<DependVarList>($depend_vars, type($depend_vars), $depends) `)`
              ) $region attr-dict

The target construct includes a region of code which is to be executed on a device.

The optional $if_expr parameter specifies a boolean result of a conditional check. If this value is 1 or is not provided, the target region runs on a device; if it is 0, the target region is executed on the host device.

The optional $device parameter specifies the device number for the target region.

The optional $thread_limit specifies the limit on the number of threads.

The optional $nowait eliminates the implicit barrier so the parent task can make progress even if the target task is not yet completed.

The depends and depend_vars arguments are variadic lists of values that specify the dependencies of this particular target task in relation to other tasks.

The optional $is_device_ptr indicates list items are device pointers.

The optional $has_device_addr indicates that list items already have device addresses, so they may be directly accessed from the target device. This includes array sections.

The optional $map_operands maps data from the task’s environment to the device environment.
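The scalar clauses above can be combined as in the following sketch. The function signature and value names are assumptions for illustration; map_entries, private and depend are omitted to keep the example small.

```mlir
func.func @target_example(%cond: i1, %dev: i32, %limit: i32) {
  // Runs on device %dev when %cond is true, otherwise on the host.
  omp.target if(%cond) device(%dev : i32) thread_limit(%limit : i32) {
    // Device code would go here. The op is IsolatedFromAbove, so the
    // region cannot directly use values defined outside of it.
    omp.terminator
  }
  return
}
```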

TODO: defaultmap, in_reduction

Traits: AttrSizedOperandSegments, IsolatedFromAbove

Interfaces: MapClauseOwningOpInterface, OutlineableOpenMPOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| depends | ::mlir::ArrayAttr | depend clause in a target or task construct array |
| nowait | ::mlir::UnitAttr | unit attribute |
| privatizers | ::mlir::ArrayAttr | symbol ref array attribute |

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| device | integer |
| thread_limit | integer |
| depend_vars | variadic of OpenMP-compatible variable type |
| is_device_ptr | variadic of OpenMP-compatible variable type |
| has_device_addr | variadic of OpenMP-compatible variable type |
| map_operands | variadic of any type |
| private_vars | variadic of any type |

omp.target_data (omp::TargetDataOp) 

Target data construct

Syntax:

operation ::= `omp.target_data` oilist(`if` `(` $if_expr `:` type($if_expr) `)`
              | `device` `(` $device `:` type($device) `)`
              | `map_entries` `(` $map_operands `:` type($map_operands) `)`
              | `use_device_ptr` `(` $use_device_ptr `:` type($use_device_ptr) `)`
              | `use_device_addr` `(` $use_device_addr `:` type($use_device_addr) `)`)
              $region attr-dict

Map variables to a device data environment for the extent of the region.

The omp target data directive maps variables to a device data environment, and defines the lexical scope of the data environment that is created. The omp target data directive can reduce data copies to and from the offloading device when multiple target regions are using the same data.

The optional $if_expr parameter specifies a boolean result of a conditional check. If this value is 1 or is not provided, the target region runs on a device; if it is 0, the target region is executed on the host device.

The optional $device parameter specifies the device number for the target region.

The optional $use_device_ptr specifies the device pointers to the corresponding list items in the device data environment.

The optional $use_device_addr specifies the address of the objects in the device data environment.

The $map_operands specifies the locator-list operands of the map clause.

The $map_types specifies the types and modifiers for the map clause.
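A hedged sketch of a target data region follows. The map entry is assumed to be produced by the dialect's map info operation; the `omp.map.info` spelling and its `var_ptr`/`map_clauses`/`capture` syntax are recalled from the dialect at a comparable revision and may differ in other versions. The function signature is likewise an assumption.

```mlir
func.func @target_data_example(%cond: i1, %dev: i32, %a: memref<?xi32>) {
  // Assumed map info syntax: describes how %a is mapped (to and from).
  %map_a = omp.map.info var_ptr(%a : memref<?xi32>, tensor<?xi32>) map_clauses(tofrom) capture(ByRef) -> memref<?xi32>
  omp.target_data if(%cond : i1) device(%dev : i32) map_entries(%map_a : memref<?xi32>) {
    // Target regions that reuse the mapped data would go here.
    omp.terminator
  }
  return
}
```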

TODO: depend clause and map_type_modifier values iterator and mapper.

Traits: AttrSizedOperandSegments

Interfaces: MapClauseOwningOpInterface

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| device | integer |
| use_device_ptr | variadic of OpenMP-compatible variable type |
| use_device_addr | variadic of OpenMP-compatible variable type |
| map_operands | variadic of any type |

omp.target_enter_data (omp::TargetEnterDataOp) 

Target enter data construct

Syntax:

operation ::= `omp.target_enter_data` oilist(`if` `(` $if_expr `:` type($if_expr) `)`
              | `device` `(` $device `:` type($device) `)`
              | `nowait` $nowait
              | `map_entries` `(` $map_operands `:` type($map_operands) `)`
              | `depend` `(` custom<DependVarList>($depend_vars, type($depend_vars), $depends) `)`
              ) attr-dict

The target enter data directive specifies that variables are mapped to a device data environment. The target enter data directive is a stand-alone directive.

The optional $if_expr parameter specifies a boolean result of a conditional check. If this value is 1 or is not provided, the target region runs on a device; if it is 0, the target region is executed on the host device.

The optional $device parameter specifies the device number for the target region.

The optional $nowait eliminates the implicit barrier so the parent task can make progress even if the target task is not yet completed.

The $map_operands specifies the locator-list operands of the map clause.

The $map_types specifies the types and modifiers for the map clause.

The depends and depend_vars arguments are variadic lists of values that specify the dependencies of this particular target task in relation to other tasks.

TODO: map_type_modifier values iterator and mapper.

Traits: AttrSizedOperandSegments

Interfaces: MapClauseOwningOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| depends | ::mlir::ArrayAttr | depend clause in a target or task construct array |
| nowait | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| device | integer |
| depend_vars | variadic of OpenMP-compatible variable type |
| map_operands | variadic of any type |

omp.target_exit_data (omp::TargetExitDataOp) 

Target exit data construct

Syntax:

operation ::= `omp.target_exit_data` oilist(`if` `(` $if_expr `:` type($if_expr) `)`
              | `device` `(` $device `:` type($device) `)`
              | `nowait` $nowait
              | `map_entries` `(` $map_operands `:` type($map_operands) `)`
              | `depend` `(` custom<DependVarList>($depend_vars, type($depend_vars), $depends) `)`
              ) attr-dict

The target exit data directive specifies that variables are unmapped from a device data environment. The target exit data directive is a stand-alone directive.

The optional $if_expr parameter specifies a boolean result of a conditional check. If this value is 1 or is not provided, the target region runs on a device; if it is 0, the target region is executed on the host device.

The optional $device parameter specifies the device number for the target region.

The optional $nowait eliminates the implicit barrier so the parent task can make progress even if the target task is not yet completed.

The $map_operands specifies the locator-list operands of the map clause.

The $map_types specifies the types and modifiers for the map clause.

The depends and depend_vars arguments are variadic lists of values that specify the dependencies of this particular target task in relation to other tasks.
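The stand-alone enter and exit data operations are typically paired, as in the following sketch. The `omp.map.info` syntax and the `exit_release_or_enter_alloc` map type are recalled from the dialect at a comparable revision and should be treated as assumptions; the function signature is illustrative.

```mlir
func.func @enter_exit_data_example(%a: memref<?xi32>) {
  // Assumed map info syntax; this map type is accepted by both the
  // enter (alloc) and the exit (release) operation.
  %map_a = omp.map.info var_ptr(%a : memref<?xi32>, tensor<?xi32>) map_clauses(exit_release_or_enter_alloc) capture(ByRef) -> memref<?xi32>
  // Map %a into the device data environment.
  omp.target_enter_data map_entries(%map_a : memref<?xi32>)
  // ... target regions using %a would go here ...
  // Remove %a from the device data environment.
  omp.target_exit_data map_entries(%map_a : memref<?xi32>)
  return
}
```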

TODO: map_type_modifier values iterator and mapper.

Traits: AttrSizedOperandSegments

Interfaces: MapClauseOwningOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| depends | ::mlir::ArrayAttr | depend clause in a target or task construct array |
| nowait | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| device | integer |
| depend_vars | variadic of OpenMP-compatible variable type |
| map_operands | variadic of any type |

omp.target_update (omp::TargetUpdateOp) 

Target update construct

Syntax:

operation ::= `omp.target_update` oilist(`if` `(` $if_expr `:` type($if_expr) `)`
              | `device` `(` $device `:` type($device) `)`
              | `nowait` $nowait
              | `motion_entries` `(` $map_operands `:` type($map_operands) `)`
              | `depend` `(` custom<DependVarList>($depend_vars, type($depend_vars), $depends) `)`
              ) attr-dict

The target update directive makes the corresponding list items in the device data environment consistent with their original list items, according to the specified motion clauses. The target update construct is a stand-alone directive.

The optional $if_expr parameter specifies a boolean result of a conditional check. If this value is 1 or is not provided, the target region runs on a device; if it is 0, the target region is executed on the host device.

The optional $device parameter specifies the device number for the target region.

The optional $nowait eliminates the implicit barrier so the parent task can make progress even if the target task is not yet completed.

We use MapInfoOp to model the motion clauses and their modifiers. Even though the spec differentiates between map-types & map-type-modifiers vs. motion-clauses & motion-modifiers, the motion clauses and their modifiers are a subset of map types and their modifiers. The subset relation is checked during verification to make sure the restrictions for target update are respected.

The depends and depend_vars arguments are variadic lists of values that specify the dependencies of this particular target task in relation to other tasks.
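A sketch of a target update that uses a map info value as a motion entry, as described above. The `omp.map.info` syntax is recalled from the dialect at a comparable revision and is an assumption, as is the function signature.

```mlir
func.func @target_update_example(%a: memref<?xi32>) {
  // A `to` map type here models a `to` motion clause.
  %map_a = omp.map.info var_ptr(%a : memref<?xi32>, tensor<?xi32>) map_clauses(to) capture(ByRef) -> memref<?xi32>
  // Make the device copy of %a consistent with the host copy.
  omp.target_update motion_entries(%map_a : memref<?xi32>)
  return
}
```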

Traits: AttrSizedOperandSegments

Interfaces: MapClauseOwningOpInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| depends | ::mlir::ArrayAttr | depend clause in a target or task construct array |
| nowait | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| device | integer |
| depend_vars | variadic of OpenMP-compatible variable type |
| map_operands | variadic of OpenMP-compatible variable type |

omp.task (omp::TaskOp) 

Task construct

Syntax:

operation ::= `omp.task` oilist(`if` `(` $if_expr `)`
              |`final` `(` $final_expr `)`
              |`untied` $untied
              |`mergeable` $mergeable
              |`in_reduction` `(`
              custom<ReductionVarList>(
              $in_reduction_vars, type($in_reduction_vars), $in_reductions
              ) `)`
              |`priority` `(` $priority `)`
              |`allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              |`depend` `(`
              custom<DependVarList>(
              $depend_vars, type($depend_vars), $depends
              ) `)`
              ) $region attr-dict

The task construct defines an explicit task.

For definitions of “undeferred task”, “included task”, “final task” and “mergeable task”, see the OpenMP Specification.

When an if clause is present on a task construct, and the value of if_expr evaluates to false, an “undeferred task” is generated, and the encountering thread must suspend the current task region, for which execution cannot be resumed until execution of the structured block that is associated with the generated task is completed.

When a final clause is present on a task construct and the final_expr evaluates to true, the generated task will be a “final task”. All task constructs encountered during execution of a final task will generate final and included tasks.

If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. The untied clause is ignored if a final clause is present on the same task construct and the final_expr evaluates to true, or if a task is an included task.

When the mergeable clause is present on a task construct, the generated task is a “mergeable task”.

The in_reduction clause specifies that this particular task (among all the tasks in current taskgroup, if any) participates in a reduction.

The priority clause is a hint for the priority of the generated task. The priority is a non-negative integer expression that provides a hint for task execution order. Among all tasks ready to be executed, higher priority tasks (those with a higher numerical value in the priority clause expression) are recommended to execute before lower priority ones. The default priority-value when no priority clause is specified should be assumed to be zero (the lowest priority).

The depends and depend_vars arguments are variadic lists of values that specify the dependencies of this particular task in relation to other tasks.

The allocators_vars and allocate_vars arguments are variadic lists of values that specify the memory allocators to be used to obtain storage for private values.
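The following sketch combines several of the clauses above on two dependent tasks. The function signature is an assumption, and the `taskdependout`/`taskdependin` spellings inside depend are recalled from the dialect's custom depend syntax at a comparable revision.

```mlir
func.func @task_example(%cond: i1, %prio: i32, %x: memref<i32>) {
  // Producer: undeferred if %cond is false; %prio is only a scheduling hint.
  omp.task if(%cond) untied priority(%prio) depend(taskdependout -> %x : memref<i32>) {
    // Code that writes through %x would go here.
    omp.terminator
  }
  // Consumer: ordered after the producer through the dependence on %x.
  omp.task depend(taskdependin -> %x : memref<i32>) {
    // Code that reads through %x would go here.
    omp.terminator
  }
  return
}
```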

Traits: AttrSizedOperandSegments, AutomaticAllocationScope

Interfaces: OutlineableOpenMPOpInterface, ReductionClauseInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| untied | ::mlir::UnitAttr | unit attribute |
| mergeable | ::mlir::UnitAttr | unit attribute |
| in_reductions | ::mlir::ArrayAttr | symbol ref array attribute |
| depends | ::mlir::ArrayAttr | depend clause in a target or task construct array |

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| final_expr | 1-bit signless integer |
| in_reduction_vars | variadic of OpenMP-compatible variable type |
| priority | 32-bit signless integer |
| depend_vars | variadic of OpenMP-compatible variable type |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |

omp.taskgroup (omp::TaskgroupOp) 

Taskgroup construct

Syntax:

operation ::= `omp.taskgroup` oilist(`task_reduction` `(`
              custom<ReductionVarList>(
              $task_reduction_vars, type($task_reduction_vars), $task_reductions
              ) `)`
              |`allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              ) $region attr-dict

The taskgroup construct specifies a wait on completion of child tasks of the current task and their descendent tasks.

When a thread encounters a taskgroup construct, it starts executing the region. All child tasks generated in the taskgroup region and all of their descendants that bind to the same parallel region as the taskgroup region are part of the taskgroup set associated with the taskgroup region. There is an implicit task scheduling point at the end of the taskgroup region. The current task is suspended at the task scheduling point until all tasks in the taskgroup set complete execution.

The task_reduction clause specifies a reduction among tasks. For each list item, the number of copies is unspecified. Any copies associated with the reduction are initialized before they are accessed by the tasks participating in the reduction. After the end of the region, the original list item contains the result of the reduction.

The allocators_vars and allocate_vars arguments are variadic lists of values that specify the memory allocators to be used to obtain storage for private values.
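A minimal sketch of a taskgroup that waits for two child tasks. The function name is an assumption, and the task_reduction and allocate clauses are omitted.

```mlir
func.func @taskgroup_example() {
  omp.taskgroup {
    omp.task {
      // First child task.
      omp.terminator
    }
    omp.task {
      // Second child task.
      omp.terminator
    }
    // The current task is suspended at the end of the region until
    // both child tasks and their descendants have completed.
    omp.terminator
  }
  return
}
```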

Traits: AttrSizedOperandSegments, AutomaticAllocationScope

Interfaces: ReductionClauseInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| task_reductions | ::mlir::ArrayAttr | symbol ref array attribute |

Operands: 

| Operand | Description |
| --- | --- |
| task_reduction_vars | variadic of OpenMP-compatible variable type |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |

omp.taskloop (omp::TaskloopOp) 

Taskloop construct

Syntax:

operation ::= `omp.taskloop` oilist(`if` `(` $if_expr `)`
              |`final` `(` $final_expr `)`
              |`untied` $untied
              |`mergeable` $mergeable
              |`in_reduction` `(`
              custom<ReductionVarList>(
              $in_reduction_vars, type($in_reduction_vars), $in_reductions
              ) `)`
              |`reduction` `(`
              custom<ReductionVarList>(
              $reduction_vars, type($reduction_vars), $reductions
              ) `)`
              |`priority` `(` $priority `:` type($priority) `)`
              |`allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              |`grain_size` `(` $grain_size `:` type($grain_size) `)`
              |`num_tasks` `(` $num_tasks `:` type($num_tasks) `)`
              |`nogroup` $nogroup
              ) $region attr-dict

The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using explicit tasks. The iterations are distributed across tasks generated by the construct and scheduled to be executed.

The body region can only contain a single block which must contain a single operation and a terminator. The operation must be another compatible loop wrapper or an omp.loop_nest.

omp.taskloop <clauses> {
  omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
    %a = memref.load %arrA[%i1, %i2] : memref<?x?xf32>
    %b = memref.load %arrB[%i1, %i2] : memref<?x?xf32>
    %sum = arith.addf %a, %b : f32
    memref.store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
    omp.yield
  }
  omp.terminator
}

For definitions of “undeferred task”, “included task”, “final task” and “mergeable task”, see the OpenMP Specification.

When an if clause is present on a taskloop construct, and if the if clause expression evaluates to false, undeferred tasks are generated. The use of a variable in an if clause expression of a taskloop construct causes an implicit reference to the variable in all enclosing constructs.

When a final clause is present on a taskloop construct and the final clause expression evaluates to true, the generated tasks will be final tasks. The use of a variable in a final clause expression of a taskloop construct causes an implicit reference to the variable in all enclosing constructs.

If the untied clause is specified, all tasks generated by the taskloop construct are untied tasks.

When the mergeable clause is present on a taskloop construct, each generated task is a mergeable task.

Reductions can be performed in a loop by specifying reduction accumulator variables in reduction_vars or in_reduction_vars and symbols referring to reduction declarations in the reductions or in_reductions attribute. Each reduction is identified by the accumulator it uses and accumulators must not be repeated in the same reduction. The reduction declaration specifies how to combine the values from each iteration into the final value, which is available in the accumulator after the loop completes.

If an in_reduction clause is present on the taskloop construct, the behavior is as if each generated task was defined by a task construct on which an in_reduction clause with the same reduction operator and list items is present. Thus, the generated tasks are participants of a reduction previously defined by a reduction scoping clause.

If a reduction clause is present on the taskloop construct, the behavior is as if a task_reduction clause with the same reduction operator and list items was applied to the implicit taskgroup construct enclosing the taskloop construct. The taskloop construct executes as if each generated task was defined by a task construct on which an in_reduction clause with the same reduction operator and list items is present. Thus, the generated tasks are participants of the reduction defined by the task_reduction clause that was applied to the implicit taskgroup construct.

When a priority clause is present on a taskloop construct, the generated tasks use the priority-value as if it was specified for each individual task. If the priority clause is not specified, tasks generated by the taskloop construct have the default task priority (zero).

The allocators_vars and allocate_vars arguments are variadic lists of values that specify the memory allocators to be used to obtain storage for private values.

If a grainsize clause is present on the taskloop construct, the number of logical loop iterations assigned to each generated task is greater than or equal to the minimum of the value of the grain-size expression and the number of logical loop iterations, but less than two times the value of the grain-size expression.

If num_tasks is specified, the taskloop construct creates as many tasks as the minimum of the num-tasks expression and the number of logical loop iterations. Each task must have at least one logical loop iteration.

By default, the taskloop construct executes as if it was enclosed in a taskgroup construct with no statements or directives outside of the taskloop construct. Thus, the taskloop construct creates an implicit taskgroup region. If the nogroup clause is present, no implicit taskgroup region is created.
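A sketch of a taskloop with the grain_size and nogroup clauses. The function signature is an assumption; num_tasks is left out because the sketch only illustrates the grain size rule.

```mlir
func.func @taskloop_example(%n: index, %grain: i64) {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  // With N logical iterations, each generated task receives at least
  // min(%grain, N) and fewer than 2 * %grain iterations.
  // `nogroup` suppresses the implicit taskgroup region.
  omp.taskloop grain_size(%grain : i64) nogroup {
    omp.loop_nest (%i) : index = (%c0) to (%n) step (%c1) {
      // Loop body would go here.
      omp.yield
    }
    omp.terminator
  }
  return
}
```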

Traits: AttrSizedOperandSegments, AutomaticAllocationScope, RecursiveMemoryEffects, SingleBlockImplicitTerminator<TerminatorOp>, SingleBlock

Interfaces: LoopWrapperInterface, ReductionClauseInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| untied | ::mlir::UnitAttr | unit attribute |
| mergeable | ::mlir::UnitAttr | unit attribute |
| in_reductions | ::mlir::ArrayAttr | symbol ref array attribute |
| reductions | ::mlir::ArrayAttr | symbol ref array attribute |
| nogroup | ::mlir::UnitAttr | unit attribute |

Operands: 

| Operand | Description |
| --- | --- |
| if_expr | 1-bit signless integer |
| final_expr | 1-bit signless integer |
| in_reduction_vars | variadic of OpenMP-compatible variable type |
| reduction_vars | variadic of OpenMP-compatible variable type |
| priority | integer or index |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |
| grain_size | integer or index |
| num_tasks | integer or index |

omp.taskwait (omp::TaskwaitOp) 

Taskwait construct

Syntax:

operation ::= `omp.taskwait` attr-dict

The taskwait construct specifies a wait on the completion of child tasks of the current task.
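A minimal sketch of taskwait following a child task; the function name is an assumption.

```mlir
func.func @taskwait_example() {
  omp.task {
    // Child task of the current task.
    omp.terminator
  }
  // Wait for the child task generated above to complete.
  omp.taskwait
  return
}
```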

omp.taskyield (omp::TaskyieldOp) 

Taskyield construct

Syntax:

operation ::= `omp.taskyield` attr-dict

The taskyield construct specifies that the current task can be suspended in favor of execution of a different task.

omp.teams (omp::TeamsOp) 

Teams construct

Syntax:

operation ::= `omp.teams` oilist(
              `num_teams` `(` ( $num_teams_lower^ `:` type($num_teams_lower) )? `to`
              $num_teams_upper `:` type($num_teams_upper) `)`
              | `if` `(` $if_expr `)`
              | `thread_limit` `(` $thread_limit `:` type($thread_limit) `)`
              | `reduction` `(`
              custom<ReductionVarList>(
              $reduction_vars, type($reduction_vars), $reductions
              ) `)`
              | `allocate` `(`
              custom<AllocateAndAllocator>(
              $allocate_vars, type($allocate_vars),
              $allocators_vars, type($allocators_vars)
              ) `)`
              ) $region attr-dict

The teams construct defines a region of code that triggers the creation of a league of teams. Once created, the number of teams remains constant for the duration of its code region.

The optional $num_teams_upper and $num_teams_lower specify the limit on the number of teams to be created. Together they define a closed range, in which both the lower and upper bounds are included. If only the upper bound is specified, it acts as if the lower bound was set to the same value. Setting $num_teams_lower without $num_teams_upper is not supported.

If the $if_expr is present and it evaluates to false, the number of teams created is one.

The optional $thread_limit specifies the limit on the number of threads.

The $allocators_vars and $allocate_vars parameters are variadic lists of values that specify the memory allocators to be used to obtain storage for private values.
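The two forms of the num_teams clause can be sketched as follows, following the syntax shown above. The function signature and value names are assumptions.

```mlir
func.func @teams_example(%lb: i32, %ub: i32, %limit: i32, %cond: i1) {
  // Closed range: between %lb and %ub teams, both bounds included.
  omp.teams num_teams(%lb : i32 to %ub : i32) thread_limit(%limit : i32) {
    omp.terminator
  }
  // Upper bound only: behaves as if the lower bound were also %ub.
  // If %cond is false, a single team is created.
  omp.teams num_teams( to %ub : i32) if(%cond) {
    omp.terminator
  }
  return
}
```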

Traits: AttrSizedOperandSegments, RecursiveMemoryEffects

Interfaces: ReductionClauseInterface

Attributes: 

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| reductions | ::mlir::ArrayAttr | symbol ref array attribute |

Operands: 

| Operand | Description |
| --- | --- |
| num_teams_lower | integer |
| num_teams_upper | integer |
| if_expr | 1-bit signless integer |
| thread_limit | integer |
| allocate_vars | variadic of any type |
| allocators_vars | variadic of any type |
| reduction_vars | variadic of OpenMP-compatible variable type |

omp.terminator (omp::TerminatorOp) 

Terminator for OpenMP regions

Syntax:

operation ::= `omp.terminator` attr-dict

A terminator operation for regions that appear in the body of an OpenMP operation. These regions are not expected to return any value, so the terminator takes no operands. The terminator op returns control to the enclosing op.

Traits: AlwaysSpeculatableImplTrait, Terminator

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

omp.threadprivate (omp::ThreadprivateOp) 

Threadprivate directive

Syntax:

operation ::= `omp.threadprivate` $sym_addr `:` type($sym_addr) `->` type($tls_addr) attr-dict

The threadprivate directive specifies that variables are replicated, with each thread having its own copy.

The current implementation uses the OpenMP runtime to provide thread-local storage (TLS). Using the TLS feature of the LLVM IR will be supported in the future.

This operation takes in the address of a symbol that represents the original variable and returns the address of its TLS. All occurrences of threadprivate variables in a parallel region should use the TLS returned by this operation.

The sym_addr refers to the address of the symbol, which is a pointer to the original variable.
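A sketch of how the operation might be used with an LLVM dialect global. The global `@counter`, the function name and the stored value are illustrative assumptions.

```mlir
llvm.mlir.global internal @counter() : i32

func.func @threadprivate_example() {
  %c1 = arith.constant 1 : i32
  // Address of the original variable.
  %addr = llvm.mlir.addressof @counter : !llvm.ptr
  omp.parallel {
    // Each thread obtains the address of its own copy.
    %tls = omp.threadprivate %addr : !llvm.ptr -> !llvm.ptr
    // Accesses in the parallel region go through the TLS address.
    llvm.store %c1, %tls : i32, !llvm.ptr
    omp.terminator
  }
  return
}
```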

Operands: 

| Operand | Description |
| --- | --- |
| sym_addr | OpenMP-compatible variable type |

Results: 

| Result | Description |
| --- | --- |
| tls_addr | OpenMP-compatible variable type |

omp.wsloop (omp::WsloopOp) 

Worksharing-loop construct

Syntax:

operation ::= `omp.wsloop` oilist(`linear` `(`
              custom<LinearClause>($linear_vars, type($linear_vars),
              $linear_step_vars) `)`
              |`schedule` `(`
              custom<ScheduleClause>(
              $schedule_val, $schedule_modifier, $simd_modifier,
              $schedule_chunk_var, type($schedule_chunk_var)) `)`
              |`nowait` $nowait
              |`ordered` `(` $ordered_val `)`
              |`order` `(` custom<ClauseAttr>($order_val) `)`
              ) custom<Wsloop>($region, $reduction_vars, type($reduction_vars),
              $reduction_vars_byref, $reductions) attr-dict

The worksharing-loop construct specifies that the iterations of the loop(s) will be executed in parallel by threads in the current context. These iterations are spread across threads that already exist in the enclosing parallel region.

The body region can only contain a single block which must contain a single operation and a terminator. The operation must be another compatible loop wrapper or an omp.loop_nest.

omp.wsloop <clauses> {
  omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
    %a = memref.load %arrA[%i1, %i2] : memref<?x?xf32>
    %b = memref.load %arrB[%i1, %i2] : memref<?x?xf32>
    %sum = arith.addf %a, %b : f32
    memref.store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
    omp.yield
  }
  omp.terminator
}

The linear_step_vars operand additionally specifies the step for each associated linear operand. Note that the linear_vars and linear_step_vars variadic lists should contain the same number of elements.

Reductions can be performed in a worksharing-loop by specifying reduction accumulator variables in reduction_vars, symbols referring to reduction declarations in the reductions attribute, and whether the reduction variable should be passed by reference or value in reduction_vars_byref. Each reduction is identified by the accumulator it uses and accumulators must not be repeated in the same reduction. A private variable corresponding to the accumulator is used in place of the accumulator inside the body of the worksharing-loop. The reduction declaration specifies how to combine the values from each iteration into the final value, which is available in the accumulator after the loop completes.
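The accumulator and private-copy behavior described above corresponds to the reduction clause on a source-level worksharing loop. The C++ sketch below is illustrative only; reduce_sum is a hypothetical helper, and the snippet does not show how any front end produces reduction_vars or the reductions attribute.

```cpp
#include <vector>

// Sums `values` with a worksharing-loop reduction. `sum` plays the role of
// the accumulator: inside the loop each thread updates a private copy, and
// the copies are combined with `+` once the loop completes.
long reduce_sum(const std::vector<int>& values) {
  long sum = 0;
  const int n = static_cast<int>(values.size());
  #pragma omp parallel
  {
    #pragma omp for reduction(+ : sum)
    for (int i = 0; i < n; ++i) {
      sum += values[i];
    }
  }
  return sum;  // final combined value, available after the loop
}
```

Integer addition is used so that the result does not depend on the order in which the private copies are combined.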

The optional schedule_val attribute specifies the loop schedule for this loop, determining how the loop is distributed across the parallel threads. The optional schedule_chunk_var associated with it further controls this distribution.
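To make the effect of a chunk size concrete, the sketch below models the OpenMP static schedule with an explicit chunk size: chunks of consecutive iterations are handed to threads round-robin in thread-number order. The function static_chunk_owners is a hypothetical model written for this illustration, not part of MLIR or the OpenMP runtime, and it assumes chunk and num_threads are both positive.

```cpp
#include <cstddef>
#include <vector>

// For each of `num_iterations` logical iterations, returns the thread number
// that would execute it under schedule(static, chunk).
// Assumes chunk > 0 and num_threads > 0.
std::vector<int> static_chunk_owners(std::size_t num_iterations,
                                     std::size_t chunk, int num_threads) {
  std::vector<int> owners(num_iterations);
  const std::size_t threads = static_cast<std::size_t>(num_threads);
  for (std::size_t i = 0; i < num_iterations; ++i) {
    // Chunk index, then round-robin over the team.
    owners[i] = static_cast<int>((i / chunk) % threads);
  }
  return owners;
}
```

With 8 iterations, a chunk size of 2 and 3 threads, thread 0 receives iterations 0, 1, 6 and 7, thread 1 receives 2 and 3, and thread 2 receives 4 and 5. Other schedule kinds such as dynamic or guided are decided at run time and cannot be modeled this simply.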

Collapsed loops are represented by the nested omp.loop_nest having a list of indices, bounds and steps, where the size of the list is equal to the collapse value.
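Collapsing a loop nest forms a single logical iteration space whose order follows the sequential execution of the original loops. For a two-deep nest such as the 10 x 10 example above, the mapping from a logical iteration number back to the pair of indices can be sketched as follows. The helper collapsed_index is hypothetical and assumes both loops start at zero with unit step.

```cpp
#include <utility>

// Maps logical iteration `k` of a collapsed two-loop nest back to (i, j),
// where the inner loop runs `inner_trip_count` times per outer iteration.
// Assumes zero-based loops with step 1 and inner_trip_count > 0.
std::pair<int, int> collapsed_index(int k, int inner_trip_count) {
  return {k / inner_trip_count, k % inner_trip_count};
}
```

For the 10 x 10 nest, logical iteration 23 corresponds to the index pair (2, 3), and the last logical iteration, 99, to (9, 9).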

The nowait attribute, when present, signifies that there should be no implicit barrier at the end of the loop.

The optional ordered_val attribute specifies how many loops are associated with the worksharing-loop construct. A value of zero corresponds to an ordered clause specified without a parameter.

The optional order attribute specifies the order in which the iterations of the associated loops are executed. Currently the only option for this attribute is “concurrent”.

Traits: AttrSizedOperandSegments, RecursiveMemoryEffects, SingleBlockImplicitTerminator<TerminatorOp>, SingleBlock

Interfaces: LoopWrapperInterface, ReductionClauseInterface

Attributes: 

Attribute | MLIR Type | Description
reduction_vars_byref | ::mlir::DenseBoolArrayAttr | i1 dense array attribute
reductions | ::mlir::ArrayAttr | symbol ref array attribute
schedule_val | ::mlir::omp::ClauseScheduleKindAttr | ScheduleKind Clause. Enum cases: static (Static), dynamic (Dynamic), guided (Guided), auto (Auto), runtime (Runtime)
schedule_modifier | ::mlir::omp::ScheduleModifierAttr | OpenMP Schedule Modifier. Enum cases: none (none), monotonic (monotonic), nonmonotonic (nonmonotonic), simd (simd)
simd_modifier | ::mlir::UnitAttr | unit attribute
nowait | ::mlir::UnitAttr | unit attribute
ordered_val | ::mlir::IntegerAttr | 64-bit signless integer attribute whose minimum value is 0
order_val | ::mlir::omp::ClauseOrderKindAttr | OrderKind Clause. Enum cases: concurrent (Concurrent)

Operands: 

Operand | Description
linear_vars | variadic of any type
linear_step_vars | variadic of 32-bit signless integer
reduction_vars | variadic of OpenMP-compatible variable type
schedule_chunk_var | any type

omp.yield (omp::YieldOp) 

Loop yield and termination operation

Syntax:

operation ::= `omp.yield` ( `(` $results^ `:` type($results) `)` )? attr-dict

“omp.yield” yields SSA values from the OpenMP dialect op region and terminates the region. The semantics of how the values are yielded is defined by the parent operation.

Traits: AlwaysSpeculatableImplTrait, HasParent<AtomicUpdateOp, DeclareReductionOp, LoopNestOp, PrivateClauseOp>, ReturnLike, Terminator

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), RegionBranchTerminatorOpInterface

Effects: MemoryEffects::Effect{}

Operands: 

Operand | Description
results | variadic of any type

Attributes 

ClauseCancellationConstructTypeAttr 

CancellationConstructType Clause

Syntax:

#omp.cancellationconstructtype<
  ::mlir::omp::ClauseCancellationConstructType   # value
>

Enum cases:

  • parallel (Parallel)
  • loop (Loop)
  • sections (Sections)
  • taskgroup (Taskgroup)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseCancellationConstructType | an enum of type ClauseCancellationConstructType

ClauseDependAttr 

depend clause

Syntax:

#omp.clause_depend<
  ::mlir::omp::ClauseDepend   # value
>

Enum cases:

  • dependsource (dependsource)
  • dependsink (dependsink)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseDepend | an enum of type ClauseDepend

ClauseRequiresAttr 

requires clauses

Syntax:

#omp.clause_requires<
  ::mlir::omp::ClauseRequires   # value
>

Enum cases:

  • none (none)
  • reverse_offload (reverse_offload)
  • unified_address (unified_address)
  • unified_shared_memory (unified_shared_memory)
  • dynamic_allocators (dynamic_allocators)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseRequires | an enum of type ClauseRequires

ClauseTaskDependAttr 

depend clause in a target or task construct

Syntax:

#omp.clause_task_depend<
  ::mlir::omp::ClauseTaskDepend   # value
>

Enum cases:

  • taskdependin (taskdependin)
  • taskdependout (taskdependout)
  • taskdependinout (taskdependinout)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseTaskDepend | an enum of type ClauseTaskDepend

DataSharingClauseTypeAttr 

Type of a data-sharing clause

Syntax:

#omp.data_sharing_type<
  ::mlir::omp::DataSharingClauseType   # value
>

Enum cases:

  • private (Private)
  • firstprivate (FirstPrivate)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::DataSharingClauseType | an enum of type DataSharingClauseType

DeclareTargetAttr 

Syntax:

#omp.declaretarget<
  DeclareTargetDeviceTypeAttr,   # device_type
  DeclareTargetCaptureClauseAttr   # capture_clause
>

Parameters: 

Parameter | C++ type | Description
device_type | DeclareTargetDeviceTypeAttr |
capture_clause | DeclareTargetCaptureClauseAttr |

DeclareTargetCaptureClauseAttr 

capture clause

Syntax:

#omp.capture_clause<
  ::mlir::omp::DeclareTargetCaptureClause   # value
>

Enum cases:

  • to (to)
  • link (link)
  • enter (enter)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::DeclareTargetCaptureClause | an enum of type DeclareTargetCaptureClause

DeclareTargetDeviceTypeAttr 

device_type clause

Syntax:

#omp.device_type<
  ::mlir::omp::DeclareTargetDeviceType   # value
>

Enum cases:

  • any (any)
  • host (host)
  • nohost (nohost)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::DeclareTargetDeviceType | an enum of type DeclareTargetDeviceType

FlagsAttr 

Syntax:

#omp.flags<
  uint32_t,   # debug_kind
  bool,   # assume_teams_oversubscription
  bool,   # assume_threads_oversubscription
  bool,   # assume_no_thread_state
  bool,   # assume_no_nested_parallelism
  bool,   # no_gpu_lib
  uint32_t   # openmp_device_version
>

Parameters: 

Parameter | C++ type | Description
debug_kind | uint32_t |
assume_teams_oversubscription | bool |
assume_threads_oversubscription | bool |
assume_no_thread_state | bool |
assume_no_nested_parallelism | bool |
no_gpu_lib | bool |
openmp_device_version | uint32_t |

ClauseGrainsizeTypeAttr 

GrainsizeType Clause

Syntax:

#omp.grainsizetype<
  ::mlir::omp::ClauseGrainsizeType   # value
>

Enum cases:

  • strict (Strict)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseGrainsizeType | an enum of type ClauseGrainsizeType

ClauseMemoryOrderKindAttr 

MemoryOrderKind Clause

Syntax:

#omp.memoryorderkind<
  ::mlir::omp::ClauseMemoryOrderKind   # value
>

Enum cases:

  • seq_cst (Seq_cst)
  • acq_rel (Acq_rel)
  • acquire (Acquire)
  • release (Release)
  • relaxed (Relaxed)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseMemoryOrderKind | an enum of type ClauseMemoryOrderKind

ClauseNumTasksTypeAttr 

NumTasksType Clause

Syntax:

#omp.numtaskstype<
  ::mlir::omp::ClauseNumTasksType   # value
>

Enum cases:

  • strict (Strict)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseNumTasksType | an enum of type ClauseNumTasksType

ClauseOrderKindAttr 

OrderKind Clause

Syntax:

#omp.orderkind<
  ::mlir::omp::ClauseOrderKind   # value
>

Enum cases:

  • concurrent (Concurrent)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseOrderKind | an enum of type ClauseOrderKind

ClauseProcBindKindAttr 

ProcBindKind Clause

Syntax:

#omp.procbindkind<
  ::mlir::omp::ClauseProcBindKind   # value
>

Enum cases:

  • primary (Primary)
  • master (Master)
  • close (Close)
  • spread (Spread)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseProcBindKind | an enum of type ClauseProcBindKind

ClauseScheduleKindAttr 

ScheduleKind Clause

Syntax:

#omp.schedulekind<
  ::mlir::omp::ClauseScheduleKind   # value
>

Enum cases:

  • static (Static)
  • dynamic (Dynamic)
  • guided (Guided)
  • auto (Auto)
  • runtime (Runtime)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ClauseScheduleKind | an enum of type ClauseScheduleKind

ScheduleModifierAttr 

OpenMP Schedule Modifier

Syntax:

#omp.sched_mod<
  ::mlir::omp::ScheduleModifier   # value
>

Enum cases:

  • none (none)
  • monotonic (monotonic)
  • nonmonotonic (nonmonotonic)
  • simd (simd)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::ScheduleModifier | an enum of type ScheduleModifier

VariableCaptureKindAttr 

variable capture kind

Syntax:

#omp.variable_capture_kind<
  ::mlir::omp::VariableCaptureKind   # value
>

Enum cases:

  • This (This)
  • ByRef (ByRef)
  • ByCopy (ByCopy)
  • VLAType (VLAType)

Parameters: 

Parameter | C++ type | Description
value | ::mlir::omp::VariableCaptureKind | an enum of type VariableCaptureKind

VersionAttr 

Syntax:

#omp.version<
  uint32_t   # version
>

Parameters: 

Parameter | C++ type | Description
version | uint32_t |

Types 

MapBoundsType 

Type for representing omp map clause bounds information

Syntax: !omp.map_bounds_ty