6 changes: 6 additions & 0 deletions lib/gis/CMakeLists.txt
@@ -4,7 +4,13 @@ if(NOT WIN32)
list(FILTER gislib_SRCS EXCLUDE REGEX [[.*/fmode\.c$]])
endif()

find_package(Threads)

set(grass_gis_DEFS "-DGRASS_VERSION_DATE=\"${GRASS_VERSION_DATE}\"")
if(Threads_FOUND)
list(APPEND grass_gis_DEFS "HAVE_PTHREAD")
endif()

Comment on lines +7 to +13
Contributor

All this should be replaced by the addition of check_target(Threads::Threads HAVE_PTHREAD) to this file:

check_target(PROJ::proj HAVE_PROJ_H)

if(MSVC)
set(grass_gis_DEFS "${grass_gis_DEFS};-D_USE_MATH_DEFINES=1")
set(gislib_INCLUDES "../../msvc")
2 changes: 1 addition & 1 deletion lib/gis/Makefile
@@ -2,7 +2,7 @@ MODULE_TOPDIR = ../..

LIB = GIS

-LIBES = $(OPENMP_LIBPATH) $(OPENMP_LIB)
+LIBES = $(OPENMP_LIBPATH) $(OPENMP_LIB) $(PTHREADLIB)
EXTRA_INC = $(ZLIBINCPATH) $(BZIP2INCPATH) $(ZSTDINCPATH) $(PTHREADINCPATH) $(REGEXINCPATH) $(OPENMP_INCPATH)
EXTRA_CFLAGS = $(OPENMP_CFLAGS) -DGRASS_VERSION_DATE=\"'$(GRASS_VERSION_DATE)'\"

38 changes: 36 additions & 2 deletions lib/gis/lrand48.c
@@ -3,12 +3,12 @@
*
* \brief GIS Library - Pseudo-random number generation
*
- * (C) 2014 by the GRASS Development Team
+ * (C) 2014, 2025 by the GRASS Development Team
*
* This program is free software under the GNU General Public License
* (>=v2). Read the file COPYING that comes with GRASS for details.
*
- * \author Glynn Clements
+ * \authors Glynn Clements, Maris Nartiss (thread safety)
*/

#include <stdio.h>
@@ -19,6 +19,10 @@
#include <grass/gis.h>
#include <grass/glocale.h>

#ifdef HAVE_PTHREAD
#include <pthread.h>
#endif

#ifdef HAVE_GETTIMEOFDAY
#include <sys/time.h>
#else
@@ -41,22 +45,36 @@ static const uint32 b0 = 0xB;

static int seeded;

#ifdef HAVE_PTHREAD
static pthread_mutex_t lrand48_mutex = PTHREAD_MUTEX_INITIALIZER;
#endif

#define LO(x) ((x) & 0xFFFFU)
#define HI(x) ((x) >> 16)

/*!
* \brief Seed the pseudo-random number generator
*
* \param[in] seedval 32-bit integer used to seed the PRNG
*
* This function is thread-safe.
Member

Something like the following should indicate that this applies only to certain builds. I don't know if you can actually have a combination where thread safety matters without having pthreads.

Suggested change
- * This function is thread-safe.
+ * This function is thread-safe (if threading is enabled, i.e., `HAVE_PTHREAD` is defined).

or: ...if pthreads is available,...

Contributor Author

Disagree. This makes no sense. Of course there is no thread safety when the code is compiled without threads! This is programmer documentation meant to signal to other developers that it is safe to call this function from threaded code. Attempts to compile core GRASS without pthreads and then call it from a module with pthreads should not be considered in the documentation of each function.

Member

Fair enough.

*/
void G_srand48(long seedval)
{
uint32 x = (uint32) * (unsigned long *)&seedval;

#ifdef HAVE_PTHREAD
pthread_mutex_lock(&lrand48_mutex);
#endif

x2 = (uint16)HI(x);
x1 = (uint16)LO(x);
x0 = (uint16)0x330E;
seeded = 1;

#ifdef HAVE_PTHREAD
pthread_mutex_unlock(&lrand48_mutex);
#endif
}
Comment on lines 64 to 80
Member

Calling this from multiple threads will reset the global state. So, while the locking will prevent creation of a broken global state and will prevent reading a half-baked state, it will not work as expected. If you call this from multiple threads, each call will replace the random state set by the previous thread. Only the main thread should call this function, and it would typically call it once unless resetting the seed. That should be documented.

Contributor Author

You are correct. I added a note to the documentation. The mutex guards should remain in place, but seed initialization should always be called only from the main process or a single thread. If called multiple times, it would result in a "last wins" state.
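
One way a caller could guarantee the "seed exactly once" rule is sketched below, assuming POSIX threads are available. `demo_srand48()` is a hypothetical stand-in for `G_srand48()` so that the snippet compiles without GRASS, and the seed value is arbitrary. Seeding in `main()` before any thread is created is the simpler alternative.

```c
#include <pthread.h>

/* Hypothetical stand-in for G_srand48(); it only records the call. */
static long current_seed;
static int seed_calls;

static void demo_srand48(long seedval)
{
    current_seed = seedval;
    seed_calls++;
}

static pthread_once_t seed_once = PTHREAD_ONCE_INIT;

/* Runs at most once per process, no matter how many threads ask for it. */
static void seed_rng_once(void)
{
    demo_srand48(12345L); /* arbitrary example seed */
}

static void *worker(void *arg)
{
    (void)arg;
    /* Every worker may call this; only the first call actually seeds. */
    pthread_once(&seed_once, seed_rng_once);
    return NULL;
}

/* Starts four workers and returns how many times the seed was set. */
int run_seed_demo(void)
{
    pthread_t threads[4];
    int i;

    for (i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);
    return seed_calls;
}
```

With this pattern there is no "last wins" overwrite, because later calls become no-ops.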


/*!
@@ -65,6 +83,8 @@ void G_srand48(long seedval)
* A weak hash of the current time and PID is generated and used to
* seed the PRNG
*
* This function is thread-safe.
*
* \return generated seed value passed to G_srand48()
*/
long G_srand48_auto(void)
@@ -104,6 +124,10 @@ long G_srand48_auto(void)

static void G__next(void)
{
#ifdef HAVE_PTHREAD
pthread_mutex_lock(&lrand48_mutex);
#endif

Member

This approach only locks the advancing, so there will be no broken state. However, consider this scenario:

  1. in Thread 1, mrand48 calls next
  2. in Thread 1, next locks and does the advancing
  3. in Thread 1, next unlocks and returns
  4. in Thread 2, mrand48 calls next
  5. in Thread 2, next locks and does the advancing
  6. in Thread 2, mrand48 reads the static variable values
  7. in Thread 1, mrand48 reads the static variable values which are the same ones as in Thread 2, so the Thread 1 values were never used

Alternatively, Thread 1 can be reading the half-baked state which is being overwritten by Thread 2.

Either the locking needs to happen at the level of mrand48-type functions, i.e., include both reading and writing (or lock separately from writing), or G__next (or its wrapper) needs to return the values rather than rely on the global state.

I prefer the solution which limits the global state and uses explicit return values (or even passes a state around), so my choice would be to return (pass) values from G__next. (In a more substantial rewrite scenario, an explicit "context" object would make the sharing of state explicit.)

Contributor Author

You are spot-on about the scope of the mutex being too narrow. I changed the code to also include consumption of the state inside the mutex.

The new implementation makes the RNG serial: regardless of the number of threads calling those functions, the number sequence will always be the same.

If G__next returned a state object, each thread would have its own state and, with an identical seed, produce an identical stream of random numbers, which is not good at all. If the state is passed around between calls (e.g. G_lrand48(struct state)), we are even worse off, as locking/sharing of the state would then have to be implemented by each caller and we would still face the thread synchronization problem. Did I miss something?
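
A minimal sketch of the wider lock, for illustration only: it is not the code from this PR. It keeps the state in a single `uint64_t` instead of three 16-bit words and uses the standard rand48 multiplier `0x5DEECE66D` together with the increment `0xB` and the initial low word `0x330E` seen in the file. The `demo_` names are hypothetical. The point is that the advance and the read of the state happen under one lock acquisition.

```c
#include <pthread.h>
#include <stdint.h>

#define RAND48_A    0x5DEECE66DULL
#define RAND48_C    0xBULL
#define RAND48_MASK 0xFFFFFFFFFFFFULL /* keep 48 bits */

/* Shared generator state, protected by state_mutex. */
static uint64_t state48;
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;

void demo_srand48(long seedval)
{
    pthread_mutex_lock(&state_mutex);
    /* Low 32 bits of the seed go on top, 0x330E fills the low word. */
    state48 = (((uint64_t)(uint32_t)seedval) << 16) | 0x330EULL;
    pthread_mutex_unlock(&state_mutex);
}

/* Returns a value in [0, 2^31). */
long demo_lrand48(void)
{
    long r;

    pthread_mutex_lock(&state_mutex);
    /* Advance the state ... */
    state48 = (RAND48_A * state48 + RAND48_C) & RAND48_MASK;
    /* ... and consume it before any other thread can advance it again. */
    r = (long)(state48 >> 17);
    pthread_mutex_unlock(&state_mutex);
    return r;
}
```

Because the result is copied into a local variable while the lock is held, no draw can be skipped or returned twice.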

Member

@wenzeslaus wenzeslaus Oct 20, 2025

> The new implementation makes the RNG serial: regardless of the number of threads calling those functions, the number sequence will always be the same.

Awesome, thanks. That's what I had in mind. Least changes in terms of code structure, although most mutex additions.

> If G__next returned a state object, each thread would have its own state and, with an identical seed, produce an identical stream of random numbers, which is not good at all.

This requires starting the different states with different seeds. This works well for different stochastic replicates of the same simulation. It would require additional decisions on how to behave within one raster.

> If the state is passed around between calls (e.g. G_lrand48(struct state)), we are even worse off, as locking/sharing of the state would then have to be implemented by each caller and we would still face the thread synchronization problem.

The mutex would have to be part of that state and access would have to be only through functions, basically an OOP encapsulation. In other words, every variable from the library would have to go to the state.

Edit: That is if we want the locking for the states, but my original idea was that independent states don't need locking because there is nothing shared between them, and they are simply not suitable to be used across threads. In other words, a not thread-safe API meant to be used with different instances in different threads which would be then a thread-safe usage pattern.
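
The "separate instance per thread" pattern could look roughly like the sketch below. Everything in it is hypothetical (`struct rng48_state`, `rng48_seed()`, `rng48_next()`, `run_workers()`); none of it exists in the GRASS API. Each worker owns its state, so no mutex is needed, and each state is seeded with the base seed plus the worker index so that the streams differ.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical explicit generator state: one instance per thread. */
struct rng48_state {
    uint64_t x; /* 48-bit LCG state */
};

static void rng48_seed(struct rng48_state *st, long seedval)
{
    st->x = (((uint64_t)(uint32_t)seedval) << 16) | 0x330EULL;
}

/* Not thread-safe by design: the caller must not share *st across threads. */
static long rng48_next(struct rng48_state *st)
{
    st->x = (0x5DEECE66DULL * st->x + 0xBULL) & 0xFFFFFFFFFFFFULL;
    return (long)(st->x >> 17);
}

struct worker_arg {
    struct rng48_state rng; /* owned by exactly one thread */
    long first;             /* first value drawn by that thread */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;

    arg->first = rng48_next(&arg->rng);
    return NULL;
}

/* Runs up to 8 workers, each seeded with base_seed + index, and stores
 * every worker's first draw in out[]. */
void run_workers(long base_seed, int nthreads, long *out)
{
    pthread_t tid[8];
    struct worker_arg args[8];
    int i;

    if (nthreads > 8)
        nthreads = 8;
    for (i = 0; i < nthreads; i++) {
        rng48_seed(&args[i].rng, base_seed + i);
        pthread_create(&tid[i], NULL, worker, &args[i]);
    }
    for (i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);
        out[i] = args[i].first;
    }
}
```

For a fixed base seed and thread count, each worker draws the same values on every run, regardless of scheduling.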

Contributor Author

> Edit: That is if we want the locking for the states, but my original idea was that independent states don't need locking because there is nothing shared between them, and they are simply not suitable to be used across threads. In other words, a not thread-safe API meant to be used with different instances in different threads which would be then a thread-safe usage pattern.

You mean thread-local storage? I discarded this idea because the outcome is not reproducible at all. To make things worse, each thread would need its own unique seed value; otherwise all of them would produce identical number sequences.

Member

Thread-local storage - whatever the naming, I meant passing a separate state object to each thread.

Reproducible? If each thread gets its own seed, it generates its own sequence. For a given number of threads, the same set of numbers is generated. If each thread is assigned the same rows every time it runs, the raster gets the same numbers at each position.

Each thread getting its own seed: yes, that's what I'm saying would be needed to get reproducible results. For a raster, it further depends on how threads are assigned rows: whether that depends on each thread's performance or is determined ahead of time based on the number of threads (I think that's what we have in the r.mapcalc code). To make the results reproducible not only for seed+nprocs, but for a seed and any nprocs, we would have to have a separate seed for each row so that it is independent of the number of threads (so each row seeded with seed plus row number).
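
A rough sketch of the per-row idea, with hypothetical names (`fill_row()`, `seed_state()`, `next_value()`) and the same simplified 48-bit LCG as an assumption. Since every row derives its generator only from `base_seed + row`, it does not matter which thread fills which row or in what order.

```c
#include <stdint.h>

/* Builds an initial 48-bit state from a seed (rand48-style layout). */
static uint64_t seed_state(long seedval)
{
    return (((uint64_t)(uint32_t)seedval) << 16) | 0x330EULL;
}

/* Advances the state and returns a value in [0, 2^31). */
static long next_value(uint64_t *x)
{
    *x = (0x5DEECE66DULL * *x + 0xBULL) & 0xFFFFFFFFFFFFULL;
    return (long)(*x >> 17);
}

/* Fills one raster row; the row's stream depends only on base_seed + row,
 * never on the thread that happens to process it. */
void fill_row(long base_seed, int row, long *cells, int ncols)
{
    uint64_t x = seed_state(base_seed + row);
    int col;

    for (col = 0; col < ncols; col++)
        cells[col] = next_value(&x);
}
```

Note that "seed plus row number" is taken literally here; whether adjacent seeds give sufficiently independent streams for this kind of generator is a separate question that this sketch does not address.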

In any case, this is something extra. My point here was the missing mutex.

However, the discussion also connects to the issue of the currently generated sequence. What throws me off is your mention of discarding the idea of thread-local storage because the outcome is not reproducible. In your current implementation, the number sequence is the same for every run and any number of threads, but how the numbers are distributed in practice within a raster depends on the order of execution of the threads, i.e., it may differ for every run even with the same number of threads.

So, even your current implementation does not offer reproducibility. Your comment about guarantees, outcomes, and reproducibility suggests we are in agreement on that. Is that correct? I just want to be sure we are clear about what to expect from the code now.

uint32 a0x0 = a0 * x0;
uint32 a0x1 = a0 * x1;
uint32 a0x2 = a0 * x2;
@@ -123,11 +147,17 @@ static void G__next(void)
x1 = (uint16)LO(y1);
y2 += HI(y1);
x2 = (uint16)LO(y2);

#ifdef HAVE_PTHREAD
pthread_mutex_unlock(&lrand48_mutex);
#endif
}

/*!
* \brief Generate an integer in the range [0, 2^31)
*
* This function is thread-safe.
*
* \return the generated value
*/
long G_lrand48(void)
@@ -142,6 +172,8 @@ long G_lrand48(void)
/*!
* \brief Generate an integer in the range [-2^31, 2^31)
*
* This function is thread-safe.
*
* \return the generated value
*/
long G_mrand48(void)
@@ -156,6 +188,8 @@ long G_mrand48(void)
/*!
* \brief Generate a floating-point value in the range [0,1)
*
* This function is thread-safe.
*
* \return the generated value
*/
double G_drand48(void)