Parallel Computing Project: Numerical Integration with CUDA

This project explores the implementation and benchmarking of various parallel reduction strategies using CUDA C++ to calculate definite integrals with the trapezoidal rule. It compares the performance of different GPU techniques against each other and against a sequential CPU implementation.

Description

The software calculates the definite integral of a function f(x) over an interval [a, b] by dividing the area under the curve into n trapezoids. The user can select the function to integrate, the integration limits, and the number of trapezoids. Benchmarks are then performed on several GPU implementations and one CPU implementation, measuring execution time, speedup, and efficiency.
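
With h = (b - a) / n, the composite trapezoidal rule approximates the integral as h * [ (f(a) + f(b)) / 2 + f(a + h) + f(a + 2h) + ... + f(a + (n-1)h) ]. The sketch below shows this computation in its plain sequential form; it is not the project's CPU implementation, and the function and variable names are illustrative only (the default integrand x * e^(-x) * cos(2x) is used):

    #include <math.h>
    #include <stdio.h>

    // Default integrand: f(x) = x * e^(-x) * cos(2x)
    static double f(double x) { return x * exp(-x) * cos(2.0 * x); }

    // Composite trapezoidal rule on [a, b] with n trapezoids
    static double trapezoid_cpu(double a, double b, long n)
    {
        double h   = (b - a) / n;
        double sum = 0.5 * (f(a) + f(b));   // endpoints count with weight 1/2
        for (long i = 1; i < n; ++i)        // interior points with weight 1
            sum += f(a + i * h);
        return h * sum;
    }

    int main(void)
    {
        printf("%f\n", trapezoid_cpu(0.0, 10.0, 1000000));
        return 0;
    }

The GPU variants below all compute the same sum; they differ only in how the per-point contributions f(x_i) are reduced to a single value.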

Key Features

  • Numerical Integration: Calculation of definite integrals using the trapezoidal rule.
  • Dynamic Function Selection: The user can choose from several predefined f(x) functions:
    • x * e^(-x) * cos(2x) (Default)
    • sin(x)
    • x^2
    • (Easily extensible to add more functions)
  • Configurable Parameters: The user can specify integration limits a and b, and the number of subdivisions n.
  • Implemented GPU Reduction Strategies:
    1. AtomicAdd (Global Memory): Atomic summation of partial results in global memory.
    2. AtomicAdd (Unified Memory): Same as above, but using unified memory.
    3. Reduction (Host Array Global Memory): GPU threads compute f(x_i) and write them to an array in global memory; the final sum occurs on the host.
    4. Reduction (Host Array Unified Memory): Similar to the previous, but with unified memory.
    5. Reduction (Shared Memory - Tree Structured Sum): Classic tree-structured reduction using shared memory within each block.
    6. Reduction (Shared Memory - Warp Shuffle - Tree Structured Sum): Warp-level reduction using __shfl_down_sync instructions, with partial sums from warps combined atomically (a minimal sketch follows this list).
  • Detailed Benchmarking:
    • Comparison of GPU implementations' performance against a GPU baseline (1 thread, 1 block).
    • Comparison of GPU implementations' performance against a sequential CPU implementation.
    • Tests with various GPU grid and block configurations.
  • Results Output:
    • Benchmark results are saved to CSV files (results/gpu_results.csv, results/cpu_results.csv).
    • Properties of the GPU used are saved to results/gpu_property.csv.
  • Graphical Visualization: A Python script (scripts/plot.py) generates plots for execution time, speedup, and efficiency, saving them to the plots/ directory.
  • Support Scripts:
    • compile.sh: Compiles the source code.
    • run.sh: Executes the compiled application and the plotting script.
    • compile_and_run.sh: Compiles and runs the application and the plotting script.
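
To make strategy 6 concrete, here is a minimal, hypothetical sketch of a warp-shuffle reduction kernel. It is not taken from src/main.cu: the kernel name, the grid-stride loop, and the launch configuration are illustrative assumptions, double-precision atomicAdd requires compute capability 6.0 or newer, and the project's kernel may additionally stage partial warp sums in shared memory, as the strategy name suggests.

    #include <math.h>
    #include <stdio.h>

    // Default integrand, callable from both host and device (illustrative)
    __host__ __device__ double f(double x) { return x * exp(-x) * cos(2.0 * x); }

    // Each thread accumulates its share of the interior points, the warp reduces
    // those values with __shfl_down_sync, and lane 0 of every warp adds its
    // partial sum to a single accumulator in global memory.
    __global__ void trapezoid_warp_shuffle(double a, double h, long n, double *result)
    {
        double local = 0.0;

        // Grid-stride loop over the interior points x_i = a + i*h, i = 1 .. n-1
        for (long i = blockIdx.x * blockDim.x + threadIdx.x + 1; i < n;
             i += (long)gridDim.x * blockDim.x)
            local += f(a + i * h);

        // Tree-structured reduction inside the warp using register shuffles
        for (int offset = warpSize / 2; offset > 0; offset /= 2)
            local += __shfl_down_sync(0xffffffff, local, offset);

        // One atomicAdd per warp instead of one per thread
        if ((threadIdx.x & (warpSize - 1)) == 0)
            atomicAdd(result, local);
    }

    int main(void)
    {
        const double a = 0.0, b = 10.0;
        const long   n = 1 << 24;                  // number of trapezoids
        const double h = (b - a) / n;

        double *d_result;
        cudaMalloc(&d_result, sizeof(double));
        cudaMemset(d_result, 0, sizeof(double));   // accumulator must start at 0

        trapezoid_warp_shuffle<<<256, 256>>>(a, h, n, d_result);

        double sum;
        cudaMemcpy(&sum, d_result, sizeof(double), cudaMemcpyDeviceToHost);
        cudaFree(d_result);

        // Endpoints are handled on the host, then the sum is scaled by h
        printf("integral ~= %f\n", h * (sum + 0.5 * (f(a) + f(b))));
        return 0;
    }

Reducing within the warp first means only one atomic operation per 32 threads reaches global memory, which is what distinguishes this strategy from the per-thread atomicAdd of strategies 1 and 2.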

Project Structure

PHPC-Consegna1/
├── .gitignore
├── bin/                # Compiled executables
│   └── .gitkeep
├── plots/              # Generated plots
│   └── .gitkeep
├── results/            # CSV files with benchmark results
│   └── .gitkeep
├── scripts/            # Utility scripts
│   ├── compile_and_run.sh
│   ├── compile.sh
│   ├── plot.py
│   └── run.sh
├── src/                # Source code
│   ├── functions.cu
│   ├── functions.h
│   ├── main.cu
│   ├── utils.cu
│   └── utils.h
├── requirements.txt    # Python dependencies (Provided)
└── README.md           # This file

Prerequisites

  • CUDA Toolkit: Version 9.0 or later (the compile script checks the version).
  • C++ Compiler: Compatible with C++11 standard (usually provided with the CUDA Toolkit).
  • NVIDIA GPU: Required for executing CUDA code.
  • Python 3: For the plot generation script.

Python Virtual Environment

It is highly recommended to use a Python virtual environment to manage the plotting script's dependencies, installing them from the requirements.txt file in the project root.

  1. Create and activate the virtual environment: Navigate to the root directory of the PHPC-Consegna1 project in your terminal and run the following commands:

    # Create the virtual environment (e.g., named .venv)
    python3 -m venv .venv
    
    # Activate the virtual environment
    # On Linux/macOS:
    source .venv/bin/activate
    # On Windows (Git Bash):
    source .venv/Scripts/activate
    # On Windows (Command Prompt):
    .venv\Scripts\activate.bat
    # On Windows (PowerShell):
    .venv\Scripts\Activate.ps1

    You should see the name of the virtual environment (e.g., (.venv)) in your terminal prompt.

  2. Install dependencies: Once the virtual environment is activated, install the required Python libraries from the provided requirements.txt file:

    pip install -r requirements.txt
  3. Deactivating the virtual environment: When you're done working, you can deactivate the environment by simply typing:

    deactivate

    Remember to activate the environment again (source .venv/bin/activate) whenever you want to run the Python plotting script.

Compilation

To compile the project, navigate to the root directory of PHPC-Consegna1 and run:

./scripts/compile.sh

This script will check the CUDA Toolkit version, detect (if possible) the GPU's Compute Capability, and compile the source code, placing the main_app executable in the bin/ directory.

Execution

To compile and run the application (including plot generation), navigate to the root directory of PHPC-Consegna1 and run:

./scripts/compile_and_run.sh

Alternatively, if the code is already compiled, to run only the application and plot generation:

./scripts/run.sh

The application will prompt the user to enter integration parameters and choose the f(x) function to use.

Output

  • Console: Information about benchmark progress.
  • CSV Files:
    • results/gpu_property.csv: Details about the GPU used.
    • results/gpu_results.csv: Execution times, speedup, and efficiency for GPU implementations compared to the GPU baseline.
    • results/cpu_results.csv: Execution times, speedup, and efficiency for GPU implementations compared to the CPU implementation.
  • PNG Plots:
    • plots/gpu_execution_time_vs_total_threads.png
    • plots/gpu_speedup_vs_total_threads_from_csv.png
    • plots/gpu_efficiency_vs_total_threads_from_csv.png
    • plots/gpu_speedup_vs_cpu_sequential.png
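
Unless defined otherwise in the source, the speedup and efficiency columns can be read with their standard meanings: speedup S = T_ref / T_parallel, where T_ref is the reference time (the 1-thread/1-block GPU baseline in gpu_results.csv, the sequential CPU run in cpu_results.csv), and efficiency E = S / p, where p is the total number of threads in the launch configuration.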
