Commit

Non-atomic for UnsortedSegmentCustomKernel
amd-jianli12 committed Jan 8, 2025
1 parent 67c5af6 commit 8f2d900
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion tensorflow/core/kernels/segment_reduction_ops_gpu.cu.h
@@ -902,7 +902,7 @@ struct UnsortedSegmentFunctor<GPUDevice, T, Index, InitialValueF, ReductionF> {
       config = GetGpuLaunchConfig(data_size, d);
       TF_CHECK_OK(GpuLaunchKernel(
           UnsortedSegmentCustomKernel<
-              T, Index, typename ReduceUpdateOpFor<ReductionF>::atomic_op>,
+              T, Index, typename ReduceUpdateOpFor<ReductionF>::nonatomic_op>,
           config.block_count, config.thread_per_block, 0, d.stream(),
           input_outer_dim_size, input_inner_dim_size, output_outer_dim_size,
           unsorted_segment_ids.data(), data.data(), output.data()));
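
The one-line change swaps an atomic update functor for a non-atomic one. As background, the sketch below contrasts the two update styles on the host in plain C++. It is not TensorFlow's code: `AtomicSum`, `NonAtomicSum`, `SegmentSumScatter`, and `SegmentSumGather` are hypothetical names, and the serial loops only stand in for GPU threads. The general point it illustrates is that a plain read-modify-write is safe only when no two concurrent work items can target the same output element. Whether that holds for `UnsortedSegmentCustomKernel` depends on how the kernel assigns work, which this diff does not show.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an atomic update op: safe even if several
// work items hit the same destination concurrently.
struct AtomicSum {
  void operator()(std::atomic<int>* dest, int value) const {
    dest->fetch_add(value, std::memory_order_relaxed);
  }
};

// Hypothetical stand-in for a non-atomic update op: a plain
// read-modify-write, safe only if the destination has a single writer.
struct NonAtomicSum {
  void operator()(int* dest, int value) const { *dest += value; }
};

// Scatter form: one work item per INPUT element. Different inputs may map
// to the same segment, so the update has to be atomic when run in parallel.
std::vector<int> SegmentSumScatter(const std::vector<int>& segment_ids,
                                   const std::vector<int>& data,
                                   int num_segments) {
  std::vector<std::atomic<int>> out(num_segments);
  for (auto& o : out) o.store(0);
  for (std::size_t i = 0; i < data.size(); ++i) {  // stands in for threads
    const int seg = segment_ids[i];
    if (seg < 0 || seg >= num_segments) continue;  // skip out-of-range ids
    AtomicSum()(&out[seg], data[i]);
  }
  std::vector<int> result;
  for (auto& o : out) result.push_back(o.load());
  return result;
}

// Gather form: one work item per OUTPUT segment. Each output element has
// exactly one writer, so a non-atomic update is sufficient.
std::vector<int> SegmentSumGather(const std::vector<int>& segment_ids,
                                  const std::vector<int>& data,
                                  int num_segments) {
  std::vector<int> out(num_segments, 0);
  for (int seg = 0; seg < num_segments; ++seg) {  // stands in for threads
    for (std::size_t i = 0; i < data.size(); ++i) {
      if (segment_ids[i] == seg) NonAtomicSum()(&out[seg], data[i]);
    }
  }
  return out;
}
```

Both forms compute the same per-segment sums. They differ in which dimension is parallelized, and therefore in whether the update needs to be atomic.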