-1 votes
0 answers
38 views

I am running a straightforward parallelized computation using RcppEigen and OpenMP. Here is a minimal reproducible example of the code: // [[Rcpp::depends(RcppEigen)]] // [[Rcpp::plugins(openmp)]] #...
Ari.stat's user avatar
  • 469
2 votes
1 answer
147 views

Consider the following code: #pragma omp parallel for (int run = 0; run < 10; run++) { std::vector<int> out; #pragma omp for for (int i = 0; i < 1'000'000; i++) { ... } } ...
F.X.'s user avatar
  • 7,515
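A minimal sketch of the pattern described in the excerpt above (the loop body and the vector usage are placeholders, not the asker's actual code): an outer parallel region in which every thread declares its own out vector per run, and an inner omp for that splits the iterations of the inner loop among the team.

#include <vector>

int main() {
    #pragma omp parallel
    {
        for (int run = 0; run < 10; ++run) {
            // declared inside the parallel region, so each thread
            // gets its own vector for this run
            std::vector<int> out;
            // the inner worksharing loop shares the 1'000'000
            // iterations among the threads of the enclosing team,
            // with an implicit barrier at its end
            #pragma omp for
            for (int i = 0; i < 1'000'000; ++i) {
                out.push_back(i);   // placeholder per-thread work
            }
        }
    }
    return 0;
}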
Best practices
0 votes
3 replies
96 views

I'm struggling to finalise the design of my C++17 library. One of the primary goals is to use runtime polymorphism to allow users to extend or rewrite default features of the library for their own use ...
josh_eime's user avatar
  • 196
2 votes
1 answer
117 views

I'm one of the developers of the Lumen code: https://www.lumen-code.org/. It is a computational code for condensed matter physics simulations. We are replacing FORALL with DO CONCURRENT, since FORALL ...
attacc's user avatar
  • 21
0 votes
1 answer
95 views

The code below crashes with terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >' Aborted ...
user1407220's user avatar
1 vote
1 answer
114 views

So I'm working on some homework related to fixing some buggy code when I ran into an interesting problem. I don't think it was one of the intended bugs, because the lecturer was confused by it as well. ...
Tuned Rockets's user avatar
-1 votes
0 answers
34 views

Overview - simple actuarial model to perform a stochastic simulation on a group of individuals. How it works - In a nutshell, the sequential version works as follows: in each of the nRip iterations, ...
AM A's user avatar
  • 1
3 votes
2 answers
176 views

I am optimising code that is already parallelised but not very well optimised because of the very different durations of the operations among the threads, even though all threads have a similar task to do. So I ...
martinit18's user avatar
0 votes
0 answers
79 views

I'm working on porting the CAMB (Cosmological Boltzmann code) to run with hybrid CPU+GPU parallelization using OpenMP and OpenACC with NVIDIA HPC SDK compilers. The code works perfectly when compiled ...
Sbomba Sbomba's user avatar
0 votes
0 answers
82 views

I'm developing an Android app using the NDK that plays MIDI files with FluidSynth, integrated with a Java frontend. The app crashes with a native crash in libomp.so at __kmpc_barrier, and I suspect it'...
3 votes
2 answers
130 views

I have written some programs with the OMP reduction directive in Fortran and in C. But the data types were simple (int, float, fixed-size arrays, ...) and the reduction-identifiers used were implicitly ...
Stef1611's user avatar
  • 2,525
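For a user-defined type, OpenMP lets you declare your own reduction identifier with a combiner and an initializer. A minimal C++ sketch (the Accum struct, the accplus identifier, and the loop body are illustrative, not taken from the question):

#include <cstdio>

struct Accum { double sum = 0.0; long count = 0; };

// user-defined reduction: how two partial results are combined, and
// how each thread-private copy is initialized
#pragma omp declare reduction(accplus : Accum : \
        omp_out.sum += omp_in.sum, omp_out.count += omp_in.count) \
    initializer(omp_priv = Accum())

int main() {
    Accum a;
    #pragma omp parallel for reduction(accplus : a)
    for (int i = 0; i < 1000; ++i) {
        a.sum += i * 0.5;
        a.count += 1;
    }
    std::printf("sum=%f count=%ld\n", a.sum, a.count);
    return 0;
}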
2 votes
3 answers
110 views

We are working with the following code: int i, j, k; for (i = 2; i < n; i++){ // S1 for (j = 3; j < n - 3; j++){ // S2 for (k = 4; k < n - 4; k++){ // S3 A[...
user31223185's user avatar
1 vote
1 answer
138 views

I am currently trying to port a big portion of a Fortran code to GPU devices with OpenMP. I have a working version for AMD, specifically for the MI300A, which features unified shared memory. I ...
Giorgio Daneri's user avatar
1 vote
1 answer
166 views

My program needs to perform some heavy calculations on all widgets in the box. The calculations are repeated an appreciable number of times processing multiple variations of each widget. All of the ...
Mikhail T.'s user avatar
  • 4,266
0 votes
0 answers
102 views

I'm implementing a bitwise Radix Sort to sort an array of 64-bit unsigned integers (key_t is uint_fast64_t) that represent encoded points. I have two versions of the algorithm: One using OpenMP only ✅...
Padibel's user avatar
3 votes
1 answer
98 views

I am reading the OMP documentation (6.0) about atomic operations. As I went through the clauses, I read about write and update. I understand the difference between the two: write is used to atomically ...
GabrijelOkorn's user avatar
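A minimal illustration of the two clauses (generic variables, not from the question): atomic write stores a value atomically, while atomic update performs a read-modify-write of the same variable as one atomic operation.

int main() {
    int flag = 0;
    int counter = 0;
    #pragma omp parallel
    {
        // write: the stored value must not depend on the variable
        // being written
        #pragma omp atomic write
        flag = 1;

        // update: the variable is read and modified atomically
        #pragma omp atomic update
        counter += 1;
    }
    return 0;
}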
0 votes
2 answers
133 views

I am working on a project which performs extensive matrix operations and takes advantage of GPU offloading via OpenMP. Things work well for most of the code, except when I try to run some target-...
Giovanni La Mura's user avatar
0 votes
0 answers
78 views

I have a function that contains an OpenMP-parallelized for loop, which calls a callback at each iteration, similar to this: template<class Callback> void iterate(const Callback& callback, ...
tmlen's user avatar
  • 9,230
2 votes
1 answer
88 views

I have a function that uses an OpenMP-parallelized for loop in its implementation. It should be possible to turn on/off the parallelization at runtime. Currently it is like this: void iterate(bool ...
tmlen's user avatar
  • 9,230
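One common way to make the parallelization switchable at runtime without duplicating the loop is the if clause on the parallel construct; a sketch under the assumption that the iterations are independent (the function signature and the doubling are placeholders):

#include <cstddef>

void iterate(bool parallel, double* data, std::size_t n) {
    // if(parallel) is evaluated at runtime: when false, the region is
    // executed by a team of one thread, so the same code path serves
    // both the serial and the parallel case
    #pragma omp parallel for if(parallel)
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(n); ++i) {
        data[i] *= 2.0;   // placeholder per-element work
    }
}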
1 vote
0 answers
120 views

I'm working on a C code application, running on Windows 10; my CPU is an i7-7500u with 2 cores. My C compiler is anaconda3/Library/bin/gcc.exe and I use the following flags: -O2 -march=native -...
AmnonJW's user avatar
  • 11
1 vote
2 answers
72 views

I'm trying to implement Conway's Game of Life in C, showing that it can scale with OpenMP. The results in the following picture are from running it on a socket of a machine with 64 cores equally ...
pettepiero's user avatar
0 votes
2 answers
44 views

I'm currently dealing with some benchmarks that are highly dependent on OpenMP settings. I noticed that the results can vary greatly when setting values for variables such as OMP_PLACES, OMP_PROC_BIND ...
Tara Tara's user avatar
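To see what a given combination of OMP_PLACES and OMP_PROC_BIND actually does, it can help to print the binding the runtime chose. A small sketch using the OpenMP 4.5 place-query routines (run it, for example, with OMP_PLACES=cores OMP_PROC_BIND=close):

#include <cstdio>
#include <omp.h>

int main() {
    #pragma omp parallel
    {
        // report which place each thread was bound to
        #pragma omp critical
        std::printf("thread %d of %d on place %d of %d\n",
                    omp_get_thread_num(), omp_get_num_threads(),
                    omp_get_place_num(), omp_get_num_places());
    }
    return 0;
}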
8 votes
0 answers
131 views

The OpenMP specification names the worksharing-loop construct with syntax #pragma omp for and the loop construct with syntax #pragma omp loop The description in the linked pages sounds a bit like ...
fuz's user avatar
  • 94.8k
0 votes
1 answer
89 views

I've been trying to get clang OpenMP GPU offloading working in the Docker Image and it builds fine at first (cmdline: ~/llvm_project/llvm/utils/docker:$ bash build_docker_image.sh --source nvidia-...
Florian Becker's user avatar
0 votes
0 answers
109 views

For example, when I multithread a "for loop" using OpenMP, running the iterations as different threads, how does it get translated to assembly? Also, can multithreaded code run on hardware ...
Ali Asgar 3's user avatar
3 votes
1 answer
108 views

I have found myself most confused by the result of my tests regarding the effects of different scheduling for #pragma omp parallel for. Basically my confusion sprouted from the following problem: I have ...
Yas Nas's user avatar
  • 41
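For reference, a minimal sketch contrasting the scheduling clauses (the trigonometric body is only a stand-in for iterations of uneven cost): schedule(static) assigns contiguous blocks of iterations up front and is cheapest, while schedule(dynamic, chunk) hands out chunks on demand and tolerates imbalance at the price of some runtime overhead.

#include <cmath>

void work(double* out, int n) {
    // dynamic scheduling: threads grab chunks of 64 iterations as they
    // finish earlier ones, which helps when iteration costs differ
    #pragma omp parallel for schedule(dynamic, 64)
    for (int i = 0; i < n; ++i) {
        out[i] = std::sin(i) * std::cos(i);   // uneven-cost stand-in
    }
}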
0 votes
0 answers
43 views

I'm encountering a strange issue where OpenMP parallel code crashes on certain Android devices when the APK contains large files (100MB+) in assets or res/raw. The crash disappears after removing ...
nan's user avatar
  • 1
1 vote
2 answers
159 views

I have some code that previously looked like: std::vector<MyObject> data; do { // May run 100 times or more prepare(); remaining = compute1(my_data); remaining -= compute2(my_data); ...
user2233709's user avatar
0 votes
0 answers
112 views

I use the following command to compile an executable file for Android: cmake \ -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \ -DANDROID_ABI=arm64-v8a \ -...
XUHAO77's user avatar
  • 11
0 votes
1 answer
109 views

I am having trouble identifying a problem in an OpenMP-parallelized Fortran code. In doing so, I created a small reproducible example that mimics the derived types I am using in the code. The small ...
luismsfern's user avatar
2 votes
0 answers
78 views

For a school assignment I need a precise definition of OpenMP. I find the Wikipedia article and other sources lacking. Specifically, I need to understand what the difference is between OpenMP and the OpenMP API. ...
GabrijelOkorn's user avatar
1 vote
0 answers
63 views

Recent OpenMP standards introduced task multi-dependencies. This feature was originally proposed in OmpSs if I'm not mistaken. I found an example of applying multi-dependencies to the assembly step of ...
IPribec's user avatar
  • 246
1 vote
1 answer
101 views

I'm developing an R package using Rcpp with OpenMP for parallelization. I'm facing an issue where threads are randomly spread across CPU cores, often leading to threads being scheduled primarily on ...
roh's user avatar
  • 11
1 vote
1 answer
105 views

According to OpenMP specification (2.1. Directive Format) Directives may not appear in constexpr functions or in constant expressions. Variadic parameter packs cannot be expanded into a directive or ...
Chameleon's user avatar
  • 2,239
0 votes
0 answers
47 views

In gcc (GCC) 14.2.0 the following program int main(int argc, char *argv[]) { #pragma omp parallel for for (int i{0}; i < 1; ++i) { } return 0; } does compile with g++ main.cpp (without ...
phinz's user avatar
  • 1,647
1 vote
0 answers
55 views

I'm using OpenMP with Fortran 08 (compiled with GFortran) to create nested parallel regions (in my case, a 3-level nest). Before running my executable, I set some Linux environment variables to ...
Frank's user avatar
  • 11
1 vote
0 answers
94 views

I wrote these two programs to test OMP PARALLEL: 1: my_idx is declared at the beginning of the program, and afterwards a private copy is made for each thread. program my_pgm use omp_lib implicit ...
Stef1611's user avatar
  • 2,525
0 votes
0 answers
46 views

I wrote the following program : program my_pgm use omp_lib implicit none CALL OMP_set_dynamic(.FALSE.) CALL OMP_set_num_threads(2) !$OMP PARALLEL DEFAULT(SHARED) ...
Stef1611's user avatar
  • 2,525
1 vote
1 answer
116 views

The task was to implement various matrix multiplication algorithms using OpenMP. It turned out that with num_threads(1), the program runs faster than with any other number of threads. Is this due to ...
Илья Анненков's user avatar
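As a point of comparison, a naive parallel matrix multiplication sketch (not the asker's code): for small matrices, thread start-up cost and memory bandwidth can dominate, which is one common reason num_threads(1) comes out ahead.

#include <vector>

void matmul(const std::vector<double>& A, const std::vector<double>& B,
            std::vector<double>& C, int n) {
    // collapse(2) distributes the whole i/j iteration space over the team
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            double acc = 0.0;
            for (int k = 0; k < n; ++k)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
}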
2 votes
2 answers
97 views

I am writing some code in C in which I want to add the optional ability to have certain sections of the code accelerated using OpenMP, and with an additional optional ability to have them accelerated ...
Matthew G.'s user avatar
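A common pattern for making OpenMP acceleration optional (a sketch with a placeholder function and data): the pragma is ignored when the compiler is not invoked with OpenMP support, and the _OPENMP macro can guard any calls into the runtime library.

#ifdef _OPENMP
#include <omp.h>
#endif

// Without OpenMP support the pragma is ignored and the loop runs
// serially; the same source builds with or without -fopenmp.
void scale(double* v, int n, double f) {
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        v[i] *= f;
}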
4 votes
4 answers
428 views

I implemented a simple 2d separable convolution code: void SeparableConvolution2D(const float * restrict mI, float * restrict mO, float * restrict mB, int numRows, int numCols, const float * restrict ...
Royi's user avatar
  • 5,133
0 votes
1 answer
106 views

The following code gives multiple outputs Sum of first 100000 natural numbers: 5000050000 Sum of first 100000 natural numbers: 5000050000 Sum of first 100000 natural numbers: 5000050000 Sum of ...
Anjeneya Swami Kare's user avatar
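Seeing the same line printed several times usually means the print statement sits inside the parallel region and is executed by every thread. A minimal sketch, assuming that is the situation here, that computes the sum with a reduction and prints it once:

#include <cstdio>

int main() {
    long long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (long long i = 1; i <= 100000; ++i)
        sum += i;
    // outside the parallel region (or inside an 'omp single' block)
    // the line is printed exactly once
    std::printf("Sum of first 100000 natural numbers: %lld\n", sum);
    return 0;
}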
2 votes
1 answer
134 views

I'm trying to perform an std::vector sum reduction with an OpenMP reduction declaration for it: // g++ -fopenmp MRE.cpp -o MRE #include <vector> #include <algorithm> #include <omp.h> ...
Stefan de Souza's user avatar
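For reference, the usual declare-reduction pattern for element-wise summation of std::vector (the identifier vec_plus and the vector length are illustrative):

#include <vector>
#include <algorithm>
#include <functional>

// combiner adds omp_in into omp_out element-wise; the initializer gives
// each private copy the same length as the original, zero-filled
#pragma omp declare reduction(vec_plus : std::vector<double> : \
        std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                       omp_out.begin(), std::plus<double>())) \
    initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))

int main() {
    std::vector<double> acc(8, 0.0);
    #pragma omp parallel for reduction(vec_plus : acc)
    for (int i = 0; i < 1000; ++i)
        acc[i % 8] += 1.0;
    return 0;
}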
0 votes
1 answer
53 views

I am getting the following results in roughly 20% of the executions with the given code, and the lastprivate clause is not working as expected. Please note that the end value of b should be 50; however, at times it is not ...
Muhammad Arshad Islam's user avatar
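For comparison, a minimal stand-alone lastprivate example (not the asker's code): after the loop, b must hold the value written by the sequentially last iteration, i.e. 50 here.

#include <cstdio>

int main() {
    int b = 0;
    // lastprivate(b): whichever thread executes the last iteration
    // (i == 49) copies its private b back to the original variable
    #pragma omp parallel for lastprivate(b)
    for (int i = 0; i < 50; ++i)
        b = i + 1;
    std::printf("b = %d\n", b);   // expected: 50
    return 0;
}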
1 vote
0 answers
81 views

Good morning, I was trying to parallelize a loop with OMP in a program I created for a flood simulator with the goal of achieving a performance boost. The loop I want to parallelize is as follows: #...
NaeSs's user avatar
  • 11
1 vote
3 answers
173 views

I'm exploring OpenMP in Fortran, using the Mandelbrot algorithm as an example: !$omp parallel do reduction(+:iters) do iy=0,30 do ix=0,70 c = cmplx(xmin+stepx*ix, ymin+stepy*iy, qp) ...
Raf's user avatar
  • 1,789
0 votes
1 answer
102 views

I'm trying to write to file a certain value, calculated with some parallelized operations using Eigen, in particular with the flag -fopenmp. I've written a code that works perfectly with -g -o, but I'...
Fra_liturri's user avatar
6 votes
3 answers
254 views

I want to use the inclusive scan operation in OpenMP to implement an algorithm. What follows is a description of my attempt at doing so, and failing to get more than a tepid speedup. The inclusive ...
smilingbuddha's user avatar
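The OpenMP 5.0 scan construct expresses an inclusive prefix sum directly; a minimal sketch (array contents and sizes are arbitrary):

#include <vector>
#include <cstdio>

int main() {
    const int n = 1000;
    std::vector<double> a(n, 1.0), prefix(n);
    double sum = 0.0;
    // reduction(inscan, ...) plus the scan directive: the statement
    // before the directive feeds the scan, the statement after it
    // reads the inclusive prefix for iteration i
    #pragma omp parallel for reduction(inscan, +:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i];
        #pragma omp scan inclusive(sum)
        prefix[i] = sum;
    }
    std::printf("prefix[n-1] = %g\n", prefix[n - 1]);
    return 0;
}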
1 vote
1 answer
66 views

Leaving out non-essential code: double* max_of_two(double *r,double *n); int npoints = 1000000; double* xy = (double*)malloc(2*npoints*sizeof(double)); double zero_coord[2] = {0.0,0.0}; double ...
Victor Eijkhout's user avatar
0 votes
1 answer
42 views

In OpenMP specification 6.0, the barrier construct "specifies an explicit barrier at the point at which the construct appears", and "all threads of the team that executes that binding ...
Ke Du's user avatar
  • 1
