
[STABLE ABI] Porting forced_align #4078

Description

@pearu

This issue collects tasks that block porting forced_align/cpu/compute.cpp and forced_align/gpu/compute.cu to the torch stable ABI.

  • expose AT_DISPATCH_FLOATING_TYPES_AND_HALF to the stable ABI; currently one needs to implement the dispatch logic with an explicit switch block. Not a blocker, but would be nice to have.
    For a workaround example, see the switch below (a sketch of the macro-based equivalent follows it):

    switch (logProbs.scalar_type()) {
      case ScalarType::Double: {
        if (targets.scalar_type() == ScalarType::Long) {
          forced_align_impl<double, ScalarType::Long>(logProbs, targets, blank, paths);
        } else if (targets.scalar_type() == ScalarType::Int) {
          forced_align_impl<double, ScalarType::Int>(logProbs, targets, blank, paths);
        } else {
          STD_TORCH_CHECK(false, "unreachable");
        }
        break;
      }
      case ScalarType::Float: {
        if (targets.scalar_type() == ScalarType::Long) {
          forced_align_impl<float, ScalarType::Long>(logProbs, targets, blank, paths);
        } else if (targets.scalar_type() == ScalarType::Int) {
          forced_align_impl<float, ScalarType::Int>(logProbs, targets, blank, paths);
        } else {
          STD_TORCH_CHECK(false, "unreachable");
        }
        break;
      }
      case ScalarType::Half: {
        if (targets.scalar_type() == ScalarType::Long) {
          forced_align_impl<c10::Half, ScalarType::Long>(logProbs, targets, blank, paths);
        } else if (targets.scalar_type() == ScalarType::Int) {
          forced_align_impl<c10::Half, ScalarType::Int>(logProbs, targets, blank, paths);
        } else {
          STD_TORCH_CHECK(false, "unreachable");
        }
        break;
      }
      default: {
        STD_TORCH_CHECK(false, "unreachable");
      }
    }
    return std::make_tuple(paths, logProbs);
    }
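
    For comparison, the switch above emulates roughly what a macro-based dispatch would look like outside the stable ABI. This is a sketch only; it assumes the ATen macro AT_DISPATCH_FLOATING_TYPES_AND_HALF (from ATen/Dispatch.h), which is exactly what is not available through the stable ABI today, and keeps the targets branch explicit as in the workaround:

    // Sketch: shape of the dispatch if AT_DISPATCH_FLOATING_TYPES_AND_HALF were
    // usable here. scalar_t is defined by the macro; everything else follows the
    // switch-based workaround above.
    AT_DISPATCH_FLOATING_TYPES_AND_HALF(logProbs.scalar_type(), "forced_align_impl", [&] {
      if (targets.scalar_type() == ScalarType::Long) {
        forced_align_impl<scalar_t, ScalarType::Long>(logProbs, targets, blank, paths);
      } else if (targets.scalar_type() == ScalarType::Int) {
        forced_align_impl<scalar_t, ScalarType::Int>(logProbs, targets, blank, paths);
      } else {
        STD_TORCH_CHECK(false, "unreachable");
      }
    });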

  • implement item<T>() as a torch::stable::Tensor template method. A workaround is implemented below:

    template <typename T>
    T item(const torch::stable::Tensor& t) {
      STD_TORCH_CHECK(t.numel() == 1, "item requires single element tensor input");
      if (t.is_cpu()) {
        return t.const_data_ptr<T>()[0];
    #ifdef USE_CUDA
      } else if (t.is_cuda()) {
        T value;
        C10_CUDA_CHECK(cudaMemcpyAsync(
            &value, t.data_ptr(), sizeof(T), cudaMemcpyDeviceToHost));
        return value;
    #endif
      } else {
        STD_TORCH_CHECK(false, "unreachable");
      }
    }
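
    Usage sketch (the tensor name below is illustrative, not from the actual port):

    // Hypothetical call site: `blankTensor` is assumed to be a 1-element int64
    // tensor that may live on either CPU or CUDA.
    const int64_t blankIndex = item<int64_t>(blankTensor);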

  • implement the accessor template as a torch::stable::Tensor template method or replace its usage; a hand-rolled substitute is sketched after the next item. Fix available: [STABLE ABI] Add accessor template method to torch::stable::Tensor pytorch#161967
  • expose PackedTensorAccessor32 to the stable ABI or replace its usage
    Fix available: [STABLE ABI] Add packed_accessor32 and generic_packed_accessor template methods to torch::stable::Tensor pytorch#161897
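
    Until the two PRs above land, one way to replace both accessor and packed_accessor32 usage is a small hand-rolled view over const_data_ptr. Sketch only, assuming contiguous row-major storage; the struct name and the way sizes are obtained are assumptions, not existing stable ABI API:

    // Minimal 2-D accessor substitute over a contiguous tensor. The caller is
    // assumed to already know the logical sizes (e.g. via torchaudio::util::sizes).
    template <typename T>
    struct Accessor2D {
      const T* data;
      int64_t cols;
      const T& operator()(int64_t r, int64_t c) const {
        return data[r * cols + c];
      }
    };

    // Hypothetical construction for a (T x C) logProbs matrix:
    // Accessor2D<float> logProbsA{logProbs.const_data_ptr<float>(), C};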
  • implement to() as a torch::stable::Tensor method
    Workarounds: use the cpu() op from [STABLE ABI] Add cpu operation. pytorch#161911, or the new_empty/copy_ approach shown below (a to()-style wrapper is sketched after the snippet):

    Tensor pathsCuda = torch::stable::new_empty(
        paths,
        torchaudio::util::sizes(paths),
        std::nullopt,
        aoti_torch_device_type_cuda(),
        logProbs.get_device_index());
    torch::stable::copy_(pathsCuda, paths);
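
    The same two calls can be wrapped into a hypothetical to()-style helper. Sketch only; the function name and the parameter types are assumptions that follow the snippet above rather than an existing API:

    // Stand-in for Tensor::to(device), built only from torch::stable::new_empty
    // and torch::stable::copy_ as used above. device_type/device_index are
    // assumed to match what aoti_torch_device_type_cuda() and get_device_index()
    // return in the snippet.
    torch::stable::Tensor to_device(
        const torch::stable::Tensor& src,
        int32_t device_type,
        int32_t device_index) {
      torch::stable::Tensor dst = torch::stable::new_empty(
          src,
          torchaudio::util::sizes(src),
          std::nullopt,
          device_type,
          device_index);
      torch::stable::copy_(dst, src);
      return dst;
    }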

cc @NicolasHug @scotts @janeyx99
