[Frontend] Add Comprehensive Core Aten IR Operator Support #637
Background
PyTorch exposes more than 2000 operators. Implementing a custom lowering for every single operator in a backend would be prohibitively expensive. Fortunately, PyTorch provides decomposition facilities: complex high-level operators (such as `torch.nn.functional.softmax` or `torch.batch_norm`) can be decomposed and normalized by the AOTAutograd component into Core Aten IR.

According to the official documentation for `torch.compiler_ir` (https://docs.pytorch.org/docs/2.8/torch.compiler_ir.html), decomposed graphs are expressed using both Core Aten IR and Prims IR. Compared to Core Aten IR, Prims IR is a lower-level, more fine-grained IR.

On PyTorch 2.8.0, this change targets all 191 Core Aten IR operators listed in that document and provides full coverage for them in our backend.
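For background only (this is not code from this PR), a minimal sketch of how a high-level operator is normalized into aten-level Core IR, using the stock `torch.export` decomposition path rather than AOTAutograd, purely to make the decomposition visible:

```python
import torch

class Softmax(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.softmax(x, dim=-1)

# Capture the module with torch.export, then decompose it; with no explicit
# table, run_decompositions() targets the Core ATen operator set.
ep = torch.export.export(Softmax(), (torch.randn(2, 8),))
core_ir = ep.run_decompositions()
print(core_ir.graph_module.graph)  # the printed graph contains only aten-level nodes
```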
New Operator Implementations
Frontend Core (`frontend.py`, `operation.py`)
- New `Op` classes for the Core Aten IR operators.
- Renamed comparison ops to match the Aten naming (`LessThanOp` → `LtTensorOp`, `GreaterThanOp` → `GtTensorOp`).
- Support for the `_embedding_bag_forward_only` operator.
- New `aot_autograd_decomposition` parameter for selecting the decomposition rules.

TOSA Dialect (`tosa.py`)
- Unary math ops: `abs`, `log`, `ceil`, `floor`, `log10`, `log2`, `log1p`, `exp2`, `expm1`, `sqrt`, `rsqrt`, `sign`, `digamma`, `lgamma`, `i0`, etc.
- Comparison ops: `eq`, `ne`, `gt`, `ge`, `lt`, `le` (both Tensor and Scalar variants).
- Logical ops: `logical_and`, `logical_or`, `logical_xor`, `logical_not`.
- Bitwise ops: `bitwise_not`, `bitwise_and`, `bitwise_or`, `bitwise_xor`.
- Reductions: `sum`, `prod`, `mean`, `std`, `var`, `amin`, `amax`, `argmax`, `argmin`, `cumsum`, `cumprod`, `all`, `any`, `norm`.
- Pooling: `avg_pool1d/2d/3d`, `max_pool1d/3d`, `adaptive_avg_pool1d/2d/3d`, `adaptive_max_pool1d/2d`.
- Activations: `gelu`, `elu`, `selu`, `celu`, `leaky_relu`, `hardsigmoid`, `hardswish`, `hardtanh`, `hardshrink`, `softshrink`, `softplus`, `mish`, `prelu`, `rrelu`, `threshold`.
- Shape and indexing ops: `reshape`, `squeeze`, `unsqueeze`, `permute`, `transpose`, `expand`, `slice`, `select`, `gather`, `scatter`, `flip`, `roll`, `tile`, `unfold`, `narrow`, `split_with_sizes`.
- Normalization: `native_layer_norm`, `native_group_norm`, and `native_batch_norm` variants.
- Others: `upsample`, `pad`, `clamp`, `where`, `masked_fill`, `sort`, `topk`, `embedding`, `diagonal`, etc.

Linalg Dialect (`linalg.py`)
- `index_op` support for mixed index shapes such as `idx0: (1, 1)` + `idx1: (N,)` (see the short example below).
- New `ops_registry` mappings for `LtTensorOp` and `SqueezeDimOp`.
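To illustrate the mixed-shape case above at the PyTorch level (shapes taken from the item; the snippet itself is only illustrative, not backend code):

```python
import torch

x = torch.randn(4, 8)
idx0 = torch.tensor([[2]])       # shape (1, 1)
idx1 = torch.tensor([0, 3, 5])   # shape (3,), i.e. (N,)

# Advanced indexing (aten.index): the two index tensors broadcast against
# each other, so the result has shape (1, 3).
out = x[idx0, idx1]
print(out.shape)  # torch.Size([1, 3])
```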
Math Dialect (`math.py`)
- `sinh`, `cosh`, `tan`, `erf`, `exp2`, and other math operations.

Test Coverage
New tests were added under `tests/Python/AtenOps`. Each test runs the operator through `DynamoCompiler` and uses FileCheck for IR validation.
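The tests follow the repository's existing FileCheck-driven pattern; a trimmed-down sketch is shown below (the operator choice, decomposition table, and CHECK lines are illustrative only and vary per test):

```python
# RUN: %PYTHON %s 2>&1 | FileCheck %s
import torch
from torch._inductor.decomposition import decompositions as inductor_decomp

from buddy.compiler.frontend import DynamoCompiler
from buddy.compiler.ops import tosa


def foo(x, y):
    return torch.lt(x, y)  # exercised through the renamed LtTensorOp path


dynamo_compiler = DynamoCompiler(
    primary_registry=tosa.ops_registry,
    aot_autograd_decomposition=inductor_decomp,
)
graphs = dynamo_compiler.importer(foo, torch.randn(10), torch.randn(10))
graph = graphs[0]
graph.lower_to_top_level_ir()
print(graph._imported_module)

# CHECK: module {
# CHECK-LABEL: func.func @forward
# CHECK: return
# CHECK: }
```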
Known Issues
In PyTorch 2.8.0, the interaction between AOTAutograd and Aten IR still has some limitations. Under the default `inductor_decomp` decomposition rules, certain operators cannot be lowered correctly and require custom decomposition settings:
- `tests/Python/AtenOps/test_reflection_pad1d.py` must set `aot_autograd_decomposition=None`. Otherwise, the Inductor decompositions produce a complex sequence of `tosa.const` + `tosa.abs` + `tosa.sub` + `linalg.generic` + `tensor.extract`, even though `aten.reflection_pad1d` is marked as a Core Aten IR operator in the documentation.
- In `tests/Python/AtenOps/test_max_pool3d_linalg.py`, `max_pool3d` must use `core_aten_decompositions()` instead of the default `inductor_decomp` rules. The Inductor decomposition for `max_pool3d` triggers a tracing error, while `core_aten_decompositions()` keeps the operator intact so that buddy-mlir can directly use its own implementation.

These issues are limitations of the current PyTorch 2.8.0 decomposition behavior rather than of the backend changes in this PR.
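For reference, a sketch of the two workarounds described above; the registry choices and call shape are illustrative, only the `aot_autograd_decomposition` values come from the affected tests:

```python
from torch._decomp import core_aten_decompositions

from buddy.compiler.frontend import DynamoCompiler
from buddy.compiler.ops import linalg, tosa

# test_reflection_pad1d.py: disable AOTAutograd decompositions entirely so the
# frontend receives aten.reflection_pad1d directly.
compiler_pad = DynamoCompiler(
    primary_registry=tosa.ops_registry,
    aot_autograd_decomposition=None,
)

# test_max_pool3d_linalg.py: use the Core Aten decomposition table instead of
# inductor_decomp so max_pool3d stays intact and buddy-mlir lowers it itself.
compiler_pool = DynamoCompiler(
    primary_registry=linalg.ops_registry,
    aot_autograd_decomposition=core_aten_decompositions(),
)
```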