497 questions
0
votes
0
answers
78
views
Adding OpenMP reduction clause to loop inside a function
I have a function that contains an OpenMP-parallelized for loop, which calls a callback at each iteration, similar to this:
template<class Callback>
void iterate(const Callback& callback, ...
2
votes
1
answer
134
views
How to declare an OpenMP reduction for a std::vector inside a struct?
I'm trying to perform a std::vector sum reduction using an OpenMP reduction declaration for it:
// g++ -fopenmp MRE.cpp -o MRE
#include <vector>
#include <algorithm>
#include <omp.h>
...
1
vote
1
answer
99
views
Why does Algorithms Illuminated use Turing reductions instead of Karp reductions for NP-hardness?
The book Algorithms Illuminated Part 4 has the following definition to prove problems NP-hard:
A problem A reduces to another problem B if an algorithm that solves B can be easily translated into one ...
0
votes
1
answer
88
views
Can PyTorch `scatter` or `gather` be used to reproduce `torch_geometric` aggregation functions?
I can't work out whether torch.scatter or torch.gather can be used to reduce the values of a tensor according to a reduction function over specified indices.
I've frequently used the torch_geometric.nn....
1
vote
1
answer
66
views
OpenMP declare reduction: initializer refers to variable which is not 'omp_priv' nor 'omp_orig'
Leaving out non-essential code:
double* max_of_two(double *r,double *n);
int npoints = 1000000;
double* xy = (double*)malloc(2*npoints*sizeof(double));
double zero_coord[2] = {0.0,0.0};
double ...
0
votes
1
answer
27
views
Reduction/conjunction/disjunction functions for OpenCL vector types?
OpenCL offers built-in/intrinsic "vector types" (see table 3 at the link), such as int4 or float2. It also defines binary and unary elementwise operators which accept these types, e.g. ...
2
votes
0
answers
43
views
Use thrust::reduce for multiplying a sequence of matrices
I am trying to use a reduction algorithm like thrust::reduce for a sequence of matrices. Let's say I want to compute the product of N matrices: A1*A2*...*AN. I think a reduction algorithm would be great ...
0
votes
0
answers
60
views
Understanding `omp_orig` in a custom OpenMP reduction
I've encountered a bug when using Clang[1] with libomp[2] whereby using omp_priv = omp_orig in the initializer of a custom OpenMP reduction silently gives erroneous output. For example:
/* file.cpp */
...
3
votes
1
answer
105
views
What is the best way to perform parallel reduction to consolidate contributions to a matrix?
I am attempting to parallelise a calculation and consolidate the results into a matrix. A large number of calculations are performed and each one contributes to a summed matrix of all the results.
...
0
votes
0
answers
54
views
Problem with Fortran OpenMP reduction to vector
I've run into the following problem while trying to parallelize my code. The simplified code looks like this:
!$omp parallel do private(e, b0_vek) reduction(+:b_vek) schedule(static, chunk_elem)
do e = 1, ...
0
votes
0
answers
188
views
Reducing the Independent Set Problem to Independent Set for 3-Colorable Graphs
I am exploring a reduction from the general Independent Set Problem to the Independent Set Problem specifically for 3-colorable graphs. The goal is to demonstrate that the maximal independent set of a ...
-1
votes
1
answer
103
views
OpenMP code in Fortran - Reduction question
The code solves the following equation:
A1(y,bp,kp) = sum_i (B(y,yp_i)*C(yp_i,bp,kp)*sum_j (D(bpp_j,kpp_j,yp_i,bp,kp)*A0(yp_i,bpp,kpp)))
I have the following code with multiple do-loops. The purpose ...
0
votes
1
answer
101
views
How do parentheses work in Lambda Calculus Reduction?
Okay so I'm just learning some lambda calculus and I came across this problem.
Perform reduction on this term; if it cannot be reduced, say that it diverges:
(λy.(λx.xx)y)(λx.x)
These are the steps I ...
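One way to sanity-check a reduction like this is to transcribe the term into Python lambdas, where the parentheses group exactly as they do in the lambda term. This is only a sketch: Python evaluates eagerly, so it would recurse until it hits the recursion limit on a genuinely divergent term, and the names identity, self_apply and result are illustrative, not taken from the question.

```python
# (λy.(λx.x x) y)(λx.x) transcribed into Python lambdas.
identity = lambda x: x        # λx.x
self_apply = lambda x: x(x)   # λx.x x

# Outer application: substitute identity for y, giving (λx.x x)(λx.x).
# Inner application: substitute identity for x, giving (λx.x)(λx.x),
# which reduces once more to λx.x.
result = (lambda y: self_apply(y))(identity)

print(result is identity)  # True: the term reduces to the identity function
print(result(42))          # 42
```

Under these assumptions the term does not diverge; it normalizes to λx.x.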
-2
votes
1
answer
111
views
CUDA reduce kernel result off by 2
I'm following a previously answered question here about how to implement an all-reduce in CUDA, which links to a slide deck from NVIDIA. What I have works the majority of the time (when the input size is a ...
0
votes
2
answers
944
views
CUDA kernel for finding the min and max index of values in a 1D array greater than particular threshold
I am trying to write a CUDA kernel that finds the min and max index of the values in a 1D array that are greater than a particular threshold.
Below is the CPU pseudocode for doing the same:
int min_index = 0,...
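The usual way to turn a search like this into a reduction is to map each element to a small partial result and merge partial results with an associative operator, so that the merges can be grouped in any order (which is what a GPU reduction relies on). Here is a minimal CPU sketch of that idea in Python rather than CUDA; the helper names to_partial, merge and index_range_above, and the use of None for "no element above the threshold", are assumptions made for illustration.

```python
from functools import reduce

def to_partial(index, value, threshold):
    """Map one element to a partial result: (min_index, max_index), or None."""
    return (index, index) if value > threshold else None

def merge(a, b):
    """Associative merge of two partial results; None means 'nothing found'."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def index_range_above(values, threshold):
    """Return (min_index, max_index) of values above threshold, or None."""
    partials = (to_partial(i, v, threshold) for i, v in enumerate(values))
    return reduce(merge, partials, None)

data = [0.1, 0.7, 0.3, 0.9, 0.2, 0.8, 0.05]
print(index_range_above(data, 0.5))  # (1, 5)
print(index_range_above(data, 2.0))  # None
```

Because merge is associative and None acts as its identity, the same operator could be applied per thread block and then across blocks.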
2
votes
1
answer
335
views
OpenMP reduction on struct
I want to find the total sum, minimum and maximum (and their positions) in a matrix using OpenMP and, more specifically, the reduction clause.
The problem I'm having is that I can't apply reduction ...
0
votes
0
answers
208
views
Range reduction algorithm for single-precision floats
I am trying to implement a small library of mathematical functions for 32-bit floats (single precision) as part of one of my Java projects. When it comes to calculating the sine of very large ...
1
vote
2
answers
257
views
Are there aggregate operations in x86 AVX?
I am trying to write a simple game and need to learn some x86 assembly for vector operations. Using an xmm register as 4 packed single-precision floats, are there any aggregate operations? Such as:
"...
-1
votes
1
answer
167
views
Issue running SExtractor
I got this error while running SExtractor:
`----- SExtractor 2.28.0 started on 2023-12-04 at 16:30:12 with 1 thread
Reading Neural Network Weights
Error: SOM file not found: default.som
`
But I didn't ...
0
votes
1
answer
183
views
"Check if a cycle of K nodes exists" reduction to SAT?
I got the following problem statement:
"Given a undirected graph, check if a cycle of K nodes exists."
And I want to take any input and convert it to a Conjunctive Normal Form formula for ...
0
votes
1
answer
293
views
Correctly using fplll as a C++ library
I'm trying to use the bkz_reduction function of the fplll library in my C++ program; however, I always get an "undefined reference to `fplll::bkz_reduction(fplll::ZZ_mat<__mpz_struct [1]>&...
-4
votes
1
answer
60
views
Why do I get the wrong results when calculating the mean of an array using OpenMP?
#include <stdio.h>
#include <omp.h>
#define N 5
int X[N];
int main() {
int num = 0;
int moy = 0;
// Initialize the array (you should populate it as needed)
for (int i = ...
1
vote
0
answers
89
views
OpenMP: Finding the maximum value of array using reduction clause
I have the following function that calculates the maximum value of 2D/3D arrays in a nested for loop. I used a reduction clause to gain some additional speedup; however, I am not getting a good speedup ...
0
votes
1
answer
65
views
How can I fix this problem I have with OpenMP's built-in vector reduction?
I am developing my own implementation of sparse BLAS functions for CSC storage formats. To do so, I created the following data structure:
typedef struct SparseMatrixCSC {
int m; // Number ...
0
votes
1
answer
248
views
Difficult time trying to figure out where the row operations are going wrong. Following the operations given
I am taking a Linear Algebra for Data Science class through DeepLearning.AI, and one of the exercises involves a system of linear equations, where you define 3 functions (MultiplyRow, AddRows, ...
2
votes
3
answers
312
views
Numpy aggregate across multiple axes
Let's say I have a 3D NumPy array with shape (27,27,27). I want to compress this to (9,9,9) by averaging every 3 elements across every axis simultaneously (e.g. make 3x3x3 pixels into 1x1x1). The ...
1
vote
0
answers
283
views
Recommended approach for performing a reduction on an entire StructuredBuffer in HLSL?
What is the standard approach for doing a reduction operation, such as computing the maximum, on an entire structured buffer in HLSL?
Context:
I have an HLSL RWStructuredBuffer which I want to normalize ...
0
votes
1
answer
186
views
Using Reduction in nested for loops with OpenMP and Eigen
I am trying to parallelize the following function (an iterative solver) that has a while loop and a nested for loop inside. The code looks like:
static const int nx = 128;
static const int ny = 128;
...
-2
votes
1
answer
244
views
Blockwise/Strided reduction using CUDA
TLDR: I am trying to write a GPU code that computes a blockwise reduction on an array. The input looks like [block_0, trash_0, block_1, trash_1, ..., block_n, trash_n], and I want to compute block_0 + ...
1
vote
0
answers
30
views
Why is reduction working in one case but not in another?
Below are two code snippets in which I use an OpenMP reduction. In the first case (reduction variable total_num_sp_edges) I get the correct result every time, but in the second case, ...
0
votes
2
answers
285
views
How can I use a parallel reduction approach to combine partial sums in C?
I have to compute partial sums using a parallel reduction approach in C, but I don't have any idea how to do it. So I need the community's guidance to achieve this.
What I need to achieve: for example, ...
1
vote
1
answer
100
views
OpenMP reductions inside subprograms
The following Fortran code fails (random result), but replacing the call to mysum with abc=abc+1
gives the correct result. How can I make OpenMP recognize the reduction in a subprogram?
program reduc
...
-1
votes
1
answer
353
views
OpenMP reproducible reduction [duplicate]
I have the following example code:
!$omp threadprivate(var)
!$omp parallel do reduction(+:var)
do
var = var + compilated_floating_point_computation()
end do
!$omp end parallel do
print *,var
And ...
0
votes
0
answers
49
views
MPI_Unpack inside a user-defined operation of MPI_Reduce()
I have to send a struct that contains, among other things, a dynamically allocated array of another struct.
The receiver has to merge the received message with its data and then send the result to ...
2
votes
2
answers
124
views
Implementing a reduction operation for finding the maximum difference of numbers?
I am searching for a suitable stream-based reduction operation to find the maximum difference of a double-list. (Please no solutions with old-style nested for-loops...)
Let's say my list of doubles is
List&...
-1
votes
1
answer
57
views
Is there a simple way to reduce this set of tuples in Python?
In a set of tuples (pairs), like this one:
s = {(1, 2), (3, 4), (1, 3), ('v', 'n'), ('v', 'k')}
I would like to remove all pairs that share a first element, i.e. every (a, b) and (a, c), so that the resulting set is:
{(3, 4)}
Is there a ...
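One possible approach, assuming the rule is "drop every pair whose first element also appears as the first element of another pair" (which is what the example suggests), is to count first elements and keep only the pairs whose first element occurs once. The names first_counts and reduced are made up for this sketch.

```python
from collections import Counter

s = {(1, 2), (3, 4), (1, 3), ('v', 'n'), ('v', 'k')}

# Count how many pairs start with each first element.
first_counts = Counter(first for first, _ in s)

# Keep only the pairs whose first element is unique across the set.
reduced = {pair for pair in s if first_counts[pair[0]] == 1}

print(reduced)  # {(3, 4)}
```

This makes two passes over the set, so it is linear in the number of pairs.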
1
vote
2
answers
93
views
PostgreSQL sequential add/remove reduction operation
I have a table with line numbers and either a "define" or an "undefine" event of an identifier. Example:
line_no | def | undef
--------------------
1 | 'a' | NULL
2 | '...
0
votes
1
answer
724
views
Sum up an array in the special style of reduction | (i * 2)
I am currently stuck on a Java problem I found online.
We have an array with several thousand, if not millions, of entries.
The goal is to compute the full sum of the array efficiently.
The ...
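The excerpt is cut off, so the following is only a guess at what "(i * 2)" refers to: a reduction in which each pass adds elements that are a stride apart and the stride doubles after every pass. A minimal sequential sketch of that pattern in Python (the question itself is about Java); the function name stride_doubling_sum is hypothetical.

```python
def stride_doubling_sum(values):
    """Sum a sequence by adding elements `stride` apart, doubling the stride each pass."""
    data = list(values)  # work on a copy so the caller's data is untouched
    n = len(data)
    stride = 1
    while stride < n:
        # Within one pass every i touches a disjoint pair of slots,
        # so a parallel version could split this loop across threads.
        for i in range(0, n - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
    return data[0] if data else 0

print(stride_doubling_sum([1, 2, 3, 4, 5]))  # 15
```

After about log2(n) passes the total ends up in slot 0; the sequential version does the same n - 1 additions as a plain loop, so any speedup would come only from running each pass in parallel.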
0
votes
1
answer
60
views
Code for #into: is related to the code for #inject:into: , but must be solving a different problem. Explain the difference?
What does this do, and is there a simpler way to write it?
Collection>>into: a2block
| all pair |
all := ([:allIn :each| allIn key key: allIn value. each])
-> (pair := nil -...
8
votes
1
answer
4k
views
How to get the unique elements of a PyTorch tensor and the indices of their first appearance?
Assume a 2*X (always 2 rows) PyTorch tensor:
A = tensor([[ 1., 2., 2., 3., 3., 3., 4., 4., 4.],
[43., 33., 43., 76., 33., 76., 55., 55., 55.]])
torch.unique(A, dim=1) will return:
...
1
vote
0
answers
502
views
How to do PCA on multiple images by flattening the tensors?
I have an array of tensors for a single image. I want to flatten them and perform PCA on the result.
Below is the code to extract the tensors for a single image:
bottle_neck_model_tensors = ...
0
votes
1
answer
895
views
Thrust is very slow for array reduction
I am trying to use Thrust to reduce an array of 1M elements to a single value. My code is as follows:
#include<chrono>
#include<iostream>
#include<thrust/host_vector.h>
#include<...
1
vote
0
answers
491
views
How to select the best splitting criteria in decision trees with multiple best splits?
I wrote a decision tree regressor from scratch in Python. It is outperformed by the sklearn algorithm. Both trees build exactly the same splits with the same leaf nodes. BUT when looking for the best ...
0
votes
1
answer
77
views
Speeding up triple loop
Initially I had the loop
import numpy
datos = numpy.random.rand(1000,17)
clusters = 250
n_variables = 17
centros = numpy.random.rand(clusters,n_variables)
desviaciones = numpy.random.rand(n_variables)...
0
votes
0
answers
70
views
Understanding Reductions to show NP-Completeness
I have a homework problem that I am finding difficult to begin. We are working on Karp (single-call) reductions to show intractability. For this assignment, the problem is intentionally vague. I was ...
1
vote
1
answer
101
views
The result of stream().parallel().reduce() is strange [duplicate]
My computer has 4 cores. I processed the list stringList on all 4 cores using the parallel method and called the reduce method with identity = "A". Normally, this list should be ...
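A likely cause of results like this is that the value passed as the identity is not actually an identity for the operation: a parallel reduction may start every chunk from that value, so it gets folded in more than once. The sketch below simulates the effect in Python rather than Java; chunked_reduce and its even chunking are assumptions for illustration and do not reproduce how Java actually splits a stream.

```python
from functools import reduce

def chunked_reduce(items, identity, op, chunks):
    """Reduce each chunk starting from `identity`, then combine the partial results."""
    size = max(1, -(-len(items) // chunks))  # ceiling division
    partials = [
        reduce(op, items[start:start + size], identity)
        for start in range(0, len(items), size)
    ]
    if not partials:
        return identity
    return reduce(op, partials)

words = ["b", "c", "d", "e"]
concat = lambda a, b: a + b

print(reduce(concat, words, "A"))             # Abcde    (sequential)
print(chunked_reduce(words, "A", concat, 4))  # AbAcAdAe (one "A" per chunk)
print(chunked_reduce(words, "", concat, 4))   # bcde     ("" is a true identity)
```

With a true identity such as the empty string, the chunked and sequential results agree; the "A" prefix can then be added once after the reduction.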
0
votes
1
answer
413
views
OpenCL sum `cl_khr_fp64` double values into a single number
From this question and this question I managed to compile a minimal example of summing a vector into a single double inside OpenCL 1.2.
/* https://suhorukov.blogspot.com/2011/12/opencl-11-atomic-...
1
vote
1
answer
957
views
Strategy for reducing CNF-SAT to this problem
Suppose there is a satisfiability problem (call it oscillating-CNF) where the input is a list of CNF clauses and we want to show that this problem is indeed NP-complete (by reducing CNF-SAT to ...
0
votes
1
answer
339
views
Problem with last-warp loop unrolling in NVIDIA's parallel reduction tutorial
I am having trouble understanding the logic behind "the last warp loop unrolling" technique in NVIDIA's parallel reduction tutorial available here.
In case of thread31 (for which tid=31)...
1
vote
0
answers
264
views
How to bind descriptors dynamically
In ManagmentSystem I have a basic descriptor setup, binding attributes by name to the Employee descriptor class in a loop.
I want to do this dynamically, passing values to init to bind them to ...