Cuda filter demo, cuda-pcl is worse than pcl when I use the VoxelGrid #41

Open
NJUSTzwh opened this issue Apr 22, 2023 · 2 comments

NJUSTzwh commented Apr 22, 2023




cuda-pcl is faster than PCL in PassThrough, but it does not do as well in VoxelGrid.

@MagicalBrain

Your output confuses me: your NX is even slower than my Jetson Nano (4GB), which should not be the case.
The output from my Jetson Nano is as follows:

./demo 

GPU has cuda devices: 1
----device id: 0 info----
  GPU : NVIDIA Tegra X1 
  Capbility: 5.3
  Global memory: 3956MB
  Const memory: 64KB
  SM in a block: 48KB
  warp size: 32
  threads in a block: 1024
  block dim: (1024,1024,64)
  grid dim: (2147483647,65535,65535)


------------checking CUDA ---------------- 
CUDA Loaded 119978 data points from PCD file with the following fields: x y z

------------checking CUDA PassThrough ---------------- 
CUDA PassThrough by Time: 1.9844 ms.
CUDA PassThrough before filtering: 119978
CUDA PassThrough after filtering: 5110

------------checking CUDA VoxelGrid---------------- 
CUDA VoxelGrid by Time: 35.325 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 3440


------------checking PCL ---------------- 
PCL(CPU) Loaded 119978 data points from PCD file with the following fields: x y z

------------checking PCL(CPU) PassThrough ---------------- 
PCL(CPU) PassThrough by Time: 9.47348 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 5110 data points (x y z).

------------checking PCL VoxelGrid---------------- 
PCL VoxelGrid by Time: 24.2884 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 3440 data points (x y z).

After I run jetson_clocks, it is faster. The output is as follows:

./demo 

GPU has cuda devices: 1
----device id: 0 info----
  GPU : NVIDIA Tegra X1 
  Capbility: 5.3
  Global memory: 3956MB
  Const memory: 64KB
  SM in a block: 48KB
  warp size: 32
  threads in a block: 1024
  block dim: (1024,1024,64)
  grid dim: (2147483647,65535,65535)


------------checking CUDA ---------------- 
CUDA Loaded 119978 data points from PCD file with the following fields: x y z

------------checking CUDA PassThrough ---------------- 
CUDA PassThrough by Time: 1.39955 ms.
CUDA PassThrough before filtering: 119978
CUDA PassThrough after filtering: 5110

------------checking CUDA VoxelGrid---------------- 
CUDA VoxelGrid by Time: 11.9661 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 3440


------------checking PCL ---------------- 
PCL(CPU) Loaded 119978 data points from PCD file with the following fields: x y z

------------checking PCL(CPU) PassThrough ---------------- 
PCL(CPU) PassThrough by Time: 3.32619 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 5110 data points (x y z).

------------checking PCL VoxelGrid---------------- 
PCL VoxelGrid by Time: 16.5497 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 3440 data points (x y z).

Finally, I don't know why cuda-pcl is better than PCL in PassThrough but does not do as well in VoxelGrid. I think that may be why PCL removed CUDA support for VoxelGrid in pcl-1.13.1.

qilinhu commented Mar 25, 2024

@MagicalBrain Hello, I would like to ask for advice.
My machine environment:
[image: machine environment]

When I run the official cuFilter demo, the CUDA computation time is basically the same as the official figure, as follows:
------------checking CUDA VoxelGrid----------------
CUDA VoxelGrid by Time: 3.20768 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 3440

But when I set setP.voxelX, setP.voxelY, and setP.voxelZ to 0.09, the CUDA computation becomes much slower, which is not what I expected:
------------checking CUDA VoxelGrid----------------
CUDA VoxelGrid by Time: 3109.65 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 62844

Why is this? Is there any way to solve it? In most cases, setP.voxelX, setP.voxelY, and setP.voxelZ cannot simply be left at 1. I hope someone can help.
