@metzm
Contributor

@metzm metzm commented Nov 26, 2025

Parallelized r.mapcalc fails if multiple outputs are calculated with several expressions separated by ;. This PR separates the evaluation of the expressions (calculating results) and writing out the results into different loops.
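The separation described above can be illustrated with a minimal sketch. This is not the r.mapcalc code: `evaluate`, `write_row`, `run_expressions`, the array sizes, and the synthetic input are all hypothetical stand-ins chosen for illustration. For each row, a first loop evaluates every expression into its own buffer, and a second loop writes the finished buffers to the output maps.

```c
#include <assert.h>

#define NROWS 4
#define NCOLS 3
#define NEXPR 2

/* Hypothetical stand-in for evaluating one expression on one input row:
   expression 0 doubles each cell, expression 1 adds the row number. */
static void evaluate(int expr, int row, const double *in, double *result)
{
    for (int col = 0; col < NCOLS; col++)
        result[col] = (expr == 0) ? 2.0 * in[col] : in[col] + row;
}

/* Hypothetical stand-in for writing one row to one output map. */
static void write_row(double map[NROWS][NCOLS], int row, const double *result)
{
    assert(row >= 0 && row < NROWS);
    for (int col = 0; col < NCOLS; col++)
        map[row][col] = result[col];
}

/* Fills outputs[expr][row][col] for a synthetic input in which the
   cell value is row * NCOLS + col. */
void run_expressions(double outputs[NEXPR][NROWS][NCOLS])
{
    for (int row = 0; row < NROWS; row++) {
        double input[NCOLS];
        double results[NEXPR][NCOLS];

        /* The input row is produced (read) once for all expressions. */
        for (int col = 0; col < NCOLS; col++)
            input[col] = row * NCOLS + col;

        /* Loop 1: evaluate every expression into its own row buffer. */
        for (int expr = 0; expr < NEXPR; expr++)
            evaluate(expr, row, input, results[expr]);

        /* Loop 2: write the finished buffers, one output map at a time. */
        for (int expr = 0; expr < NEXPR; expr++)
            write_row(outputs[expr], row, results[expr]);
    }
}
```

Calling `run_expressions` on a `double outputs[NEXPR][NROWS][NCOLS]` array fills both output maps; no write starts before all expressions for that row have been evaluated.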

Test case with the NC sample data:

# prepare data
g.region raster=elevation
r.relief input=elevation output=relief zscale=10

expected success with

r.blend -c first=elevation second=relief output=blend percent=75 nprocs=1

without this PR it should fail with nprocs > 1, e.g.

r.blend -c first=elevation second=relief output=blend_nprocs percent=75 nprocs=16

With this PR, r.blend ... nprocs=16 succeeds and the result is identical to r.blend ... nprocs=1

Theoretically, there should be no significant speed penalty with this change, but this needs to be tested.

Fixes #6158

Please test!

Still a draft because calls to G_percent() need to be fixed.

@metzm metzm added this to the 8.5.0 milestone Nov 26, 2025
@metzm metzm added the bug (Something isn't working), raster (Related to raster data processing), C (Related code is in C), and module labels Nov 26, 2025
@nilason
Contributor

nilason commented Nov 26, 2025

Just ran a quick test, works fine!

@tmszi
Member

tmszi commented Nov 27, 2025

Tested but the output map is different:

1. r.blend output raster map without your patch (nprocs = 1)

[image: r_blend_nproc_1_without_r_mapcalc_patch]

2. r.blend output raster map with your patch (nprocs >= 1)

[image: r_blend_nproc_1_with_r_mapcalc_patch]

@neteler neteler changed the title from "r.mapcalc: fix multiple outputs wih nprocs > 1" to "r.mapcalc: fix multiple outputs with nprocs > 1" Nov 27, 2025
@tmszi
Member

tmszi commented Nov 27, 2025

Tested again with commit 6a6df84, and result raster map is ok (nprocs >= 1).

@metzm metzm marked this pull request as ready for review November 27, 2025 13:48
@metzm
Contributor Author

metzm commented Nov 27, 2025

Speed of r.blend -c is a bit disappointing with a region of a bit more than 200 million cells: highly parallelized with nprocs=16 it takes 1m25s, while a single thread is slightly faster at 1m15s. At least the results are readable and identical.

@wenzeslaus
Member

Try nprocs=4. In the tests of other tools, 4 was often optimal (some tools have benchmarks, with Python scripts and plots, in the documentation).

@metzm
Contributor Author

metzm commented Nov 27, 2025

> Try nprocs=4. In the tests of other tools, 4 was often optimal (some tools have benchmarks, with Python scripts and plots, in the documentation).

I tested nprocs=2,4,6,8,16. r.mapcalc is always slower than with nprocs=1.

I have a local version that is faster when parallelized and produces readable (not corrupt) output, but the rows are mixed up. Maybe an approach like the one in r.series is needed, where row buffers are written out after the end of the parallel region?
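A minimal sketch of the buffering idea mentioned above, assuming a chunked loop: rows of a chunk are computed in parallel into a buffer, and the buffer is written out sequentially once the parallel region has ended. This is not the r.series implementation; `compute_cell`, `process_chunked`, and the flat `out` array standing in for the output raster are hypothetical. Without OpenMP enabled, the pragma is ignored and the code runs serially with the same result.

```c
#include <assert.h>
#include <stdlib.h>

#define NCOLS 4

/* Hypothetical per-cell computation standing in for expression evaluation. */
static double compute_cell(int row, int col)
{
    return row * 100.0 + col;
}

/* Processes nrows rows in chunks of chunk_rows rows. Rows within a chunk
   are computed in parallel into buf; after the parallel loop, buf is
   appended to out sequentially, so the row order is preserved.
   Returns 0 on success, -1 if the buffer cannot be allocated. */
int process_chunked(int nrows, int chunk_rows, double *out)
{
    assert(nrows >= 0 && chunk_rows > 0);

    double *buf = malloc((size_t)chunk_rows * NCOLS * sizeof *buf);
    if (buf == NULL)
        return -1;

    int written = 0; /* next output row, only touched in the serial part */

    for (int start = 0; start < nrows; start += chunk_rows) {
        int n = (nrows - start < chunk_rows) ? nrows - start : chunk_rows;

        /* Parallel part: each iteration fills its own slot in buf. */
#pragma omp parallel for
        for (int i = 0; i < n; i++)
            for (int col = 0; col < NCOLS; col++)
                buf[i * NCOLS + col] = compute_cell(start + i, col);

        /* Serial part: write the chunk in row order. */
        for (int i = 0; i < n; i++) {
            for (int col = 0; col < NCOLS; col++)
                out[written * NCOLS + col] = buf[i * NCOLS + col];
            written++;
        }
    }

    free(buf);
    return 0;
}
```

The trade-off in this sketch is memory: the buffer holds `chunk_rows` rows per output, in exchange for a write phase that needs no synchronization inside the parallel loop.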

@nilason
Contributor

nilason commented Nov 27, 2025

I'm not quite sure of how this works, but perhaps another approach could be to serialise each ;-divided subexpression, which in turn runs parallelised. Just a thought.

@metzm
Contributor Author

metzm commented Nov 27, 2025

> I'm not quite sure of how this works, but perhaps another approach could be to serialise each ;-divided subexpression, which in turn runs parallelised. Just a thought.

This would increase input reading times. The idea is that for several expressions, as in the r.mapcalc call of r.blend, each row of each input raster needs to be read only once and, more importantly, decompressed only once for all three expressions. When the subexpressions are serialized, the number of row reads and decompressions is multiplied by the number of expressions.
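The arithmetic behind this argument can be sketched as follows. The function names are hypothetical, and the count assumes that every expression uses every input raster, so it is an upper bound rather than a measurement.

```c
#include <assert.h>

/* All expressions evaluated in one pass over the rows:
   each row of each input is read and decompressed once. */
long reads_single_pass(long nrows, long ninputs)
{
    assert(nrows >= 0 && ninputs >= 0);
    return nrows * ninputs;
}

/* One full pass over the rows per expression, assuming every
   expression uses every input: the row reads are repeated
   once for each expression. */
long reads_serialized(long nrows, long ninputs, long nexprs)
{
    assert(nexprs >= 0);
    return nexprs * reads_single_pass(nrows, ninputs);
}
```

With illustrative numbers (1000 rows, 2 inputs, 3 expressions), `reads_single_pass(1000, 2)` gives 2000 row reads, while `reads_serialized(1000, 2, 3)` gives 6000.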

@metzm
Contributor Author

metzm commented Nov 28, 2025

In r.mapcalc, as in other OpenMP-parallelized modules like r.series and r.neighbors, the loop over rows is parallelized. In the other modules, the loop over rows is broken up into nprocs chunks, and after each subloop the results are written out in the correct row order. Not so in r.mapcalc: there the results are written out in the correct row order within the parallelized loop rather than after it. This requires #pragma omp ordered within the parallelized loop, which not only effectively kills parallelization but also leads to longer total processing times and CPU times that are many times longer.
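To make the pattern concrete, here is a minimal sketch of writing in row order inside a parallelized loop with `#pragma omp ordered`. It is not the r.mapcalc source; `compute_cell`, `process_ordered`, and the flat `out` array standing in for the output raster are hypothetical. The computation may overlap between threads, but the ordered block runs one iteration at a time in loop order, so threads that finish early wait for their turn to write.

```c
#include <assert.h>

#define NCOLS 4

/* Hypothetical per-cell computation standing in for expression evaluation. */
static double compute_cell(int row, int col)
{
    return row * 100.0 + col;
}

/* Computes nrows rows and appends them to out in row order. */
void process_ordered(int nrows, double *out)
{
    int written = 0; /* next output row, only touched in the ordered block */

    assert(nrows >= 0);

#pragma omp parallel for ordered schedule(static, 1)
    for (int row = 0; row < nrows; row++) {
        double buf[NCOLS];

        /* This part can run concurrently on several threads. */
        for (int col = 0; col < NCOLS; col++)
            buf[col] = compute_cell(row, col);

        /* This part is executed strictly in loop order, one row at a time. */
#pragma omp ordered
        {
            for (int col = 0; col < NCOLS; col++)
                out[written * NCOLS + col] = buf[col];
            written++;
        }
    }
}
```

When the ordered block dominates the per-row cost, as is plausible when writing means compressing and writing raster rows, most threads spend their time waiting, which matches the slowdown reported above.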

I prefer to merge this PR as it is because it is a bugfix.

As long as parallelization in r.mapcalc is not properly implemented, I would also recommend setting the default nprocs to 1.

Proper parallelization of r.mapcalc with OpenMP should happen in a separate PR.



Development

Successfully merging this pull request may close these issues.

[Bug] r.blend (main) fails to create valid data: "Error uncompressing fp raster data for row"

4 participants