
[STABLE ABI] Porting forced_align #4078

Description

@pearu

This issue collects tasks that block porting forced_align/cpu/compute.cpp and forced_align/gpu/compute.cu to the torch stable ABI.

  • expose AT_DISPATCH_FLOATING_TYPES_AND_HALF to the stable ABI; currently one needs to implement the dispatch logic with an explicit switch block. Not a blocker, but would be nice to have.
    For a workaround example, see the switch below (a sketch of the macro-based equivalent follows it):

    switch (logProbs.scalar_type()) {
      case ScalarType::Double: {
        if (targets.scalar_type() == ScalarType::Long) {
          forced_align_impl<double, ScalarType::Long>(logProbs, targets, blank, paths);
        } else if (targets.scalar_type() == ScalarType::Int) {
          forced_align_impl<double, ScalarType::Int>(logProbs, targets, blank, paths);
        } else {
          STD_TORCH_CHECK(false, "unreachable");
        }
        break;
      }
      case ScalarType::Float: {
        if (targets.scalar_type() == ScalarType::Long) {
          forced_align_impl<float, ScalarType::Long>(logProbs, targets, blank, paths);
        } else if (targets.scalar_type() == ScalarType::Int) {
          forced_align_impl<float, ScalarType::Int>(logProbs, targets, blank, paths);
        } else {
          STD_TORCH_CHECK(false, "unreachable");
        }
        break;
      }
      case ScalarType::Half: {
        if (targets.scalar_type() == ScalarType::Long) {
          forced_align_impl<c10::Half, ScalarType::Long>(logProbs, targets, blank, paths);
        } else if (targets.scalar_type() == ScalarType::Int) {
          forced_align_impl<c10::Half, ScalarType::Int>(logProbs, targets, blank, paths);
        } else {
          STD_TORCH_CHECK(false, "unreachable");
        }
        break;
      }
      default: {
        STD_TORCH_CHECK(false, "unreachable");
      }
    }
    return std::make_tuple(paths, logProbs);
    }
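
    For comparison, the switch above emulates roughly what a macro-based dispatch would look like outside the stable ABI. This is a sketch only; it assumes the ATen macro AT_DISPATCH_FLOATING_TYPES_AND_HALF (from ATen/Dispatch.h), which is exactly what is not available through the stable ABI today, and keeps the targets branch explicit as in the workaround:

    // Sketch: shape of the dispatch if AT_DISPATCH_FLOATING_TYPES_AND_HALF were
    // usable here. scalar_t is defined by the macro; everything else follows the
    // switch-based workaround above.
    AT_DISPATCH_FLOATING_TYPES_AND_HALF(logProbs.scalar_type(), "forced_align_impl", [&] {
      if (targets.scalar_type() == ScalarType::Long) {
        forced_align_impl<scalar_t, ScalarType::Long>(logProbs, targets, blank, paths);
      } else if (targets.scalar_type() == ScalarType::Int) {
        forced_align_impl<scalar_t, ScalarType::Int>(logProbs, targets, blank, paths);
      } else {
        STD_TORCH_CHECK(false, "unreachable");
      }
    });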

  • implement item<T>() as a torch::stable::Tensor template method. A workaround is implemented below:

    template <typename T>
    T item(const torch::stable::Tensor& t) {
      STD_TORCH_CHECK(t.numel() == 1, "item requires single element tensor input");
      if (t.is_cpu()) {
        return t.const_data_ptr<T>()[0];
    #ifdef USE_CUDA
      } else if (t.is_cuda()) {
        T value;
        C10_CUDA_CHECK(cudaMemcpyAsync(
            &value, t.data_ptr(), sizeof(T), cudaMemcpyDeviceToHost));
        return value;
    #endif
      } else {
        STD_TORCH_CHECK(false, "unreachable");
      }
    }
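
    Usage sketch (the tensor name below is illustrative, not from the actual port):

    // Hypothetical call site: `blankTensor` is assumed to be a 1-element int64
    // tensor that may live on either CPU or CUDA.
    const int64_t blankIndex = item<int64_t>(blankTensor);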

  • implement the accessor template as a torch::stable::Tensor template method or replace its usage; a hand-rolled substitute is sketched after the next item. Fix available: [STABLE ABI] Add accessor template method to torch::stable::Tensor pytorch#161967
  • expose PackedTensorAccessor32 to the stable ABI or replace its usage
    Fix available: [STABLE ABI] Add packed_accessor32 and generic_packed_accessor template methods to torch::stable::Tensor pytorch#161897
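
    Until the two PRs above land, one way to replace both accessor and packed_accessor32 usage is a small hand-rolled view over const_data_ptr. Sketch only, assuming contiguous row-major storage; the struct name and the way sizes are obtained are assumptions, not existing stable ABI API:

    // Minimal 2-D accessor substitute over a contiguous tensor. The caller is
    // assumed to already know the logical sizes (e.g. via torchaudio::util::sizes).
    template <typename T>
    struct Accessor2D {
      const T* data;
      int64_t cols;
      const T& operator()(int64_t r, int64_t c) const {
        return data[r * cols + c];
      }
    };

    // Hypothetical construction for a (T x C) logProbs matrix:
    // Accessor2D<float> logProbsA{logProbs.const_data_ptr<float>(), C};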
  • implement to() as a torch::stable::Tensor method
    Workarounds: use the cpu() op from [STABLE ABI] Add cpu operation. pytorch#161911, or the new_empty/copy_ approach shown below (a to()-style wrapper is sketched after the snippet):

    Tensor pathsCuda = torch::stable::new_empty(
        paths,
        torchaudio::util::sizes(paths),
        std::nullopt,
        aoti_torch_device_type_cuda(),
        logProbs.get_device_index());
    torch::stable::copy_(pathsCuda, paths);
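
    The same two calls can be wrapped into a hypothetical to()-style helper. Sketch only; the function name and the parameter types are assumptions that follow the snippet above rather than an existing API:

    // Stand-in for Tensor::to(device), built only from torch::stable::new_empty
    // and torch::stable::copy_ as used above. device_type/device_index are
    // assumed to match what aoti_torch_device_type_cuda() and get_device_index()
    // return in the snippet.
    torch::stable::Tensor to_device(
        const torch::stable::Tensor& src,
        int32_t device_type,
        int32_t device_index) {
      torch::stable::Tensor dst = torch::stable::new_empty(
          src,
          torchaudio::util::sizes(src),
          std::nullopt,
          device_type,
          device_index);
      torch::stable::copy_(dst, src);
      return dst;
    }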

cc @NicolasHug @scotts @janeyx99
