Skip to content

Conversation

@thomasw21
Copy link
Owner

Related to pytorch#66442

def f(x):
    y = torch.sigmoid(x)
    z = torch.sigmoid(x)
    return y + z

def pattern(x):
    return torch.sigmoid(x)

def replacement(x):
    return torch.relu(x)

def comparison(x):
    y = torch.relu(x)
    z = torch.relu(x)
    return y + z

traced = symbolic_trace(f)

subgraph_rewriter.replace_pattern(traced, pattern, replacement) # This replaces only one sigmoid

As shown in the code previously, we allow a single node to match a pattern multiple time. This is need if you look at the traced f:

# Tracing f
.graph():
    %x : [#users=2] = placeholder[target=x]
    %sigmoid : [#users=1] = call_function[target=torch.sigmoid](args = (%x,), kwargs = {})
    %sigmoid_1 : [#users=1] = call_function[target=torch.sigmoid](args = (%x,), kwargs = {})
    %add : [#users=1] = call_function[target=operator.add](args = (%sigmoid, %sigmoid_1), kwargs = {})
    return add

You can see that add aggregates two values.

This is mainly due to the matching algorithm to support on Dict[Node, Node] where the key are pattern nodes and values are nodes living in the graph module graph.

@thomasw21 thomasw21 changed the title [Torch.fx] Fix pattern matching single node multiple times [torch.fx] Fix pattern matching single node multiple times Oct 13, 2021
@thomasw21 thomasw21 changed the title [torch.fx] Fix pattern matching single node multiple times [torch.fx] ~Fix pattern matching single node multiple times~ Oct 13, 2021
@thomasw21 thomasw21 changed the title [torch.fx] ~Fix pattern matching single node multiple times~ [torch.fx] Fix pattern matching single node multiple times Oct 13, 2021
@thomasw21
Copy link
Owner Author

Closing in favor of #2

@thomasw21 thomasw21 closed this Oct 13, 2021
thomasw21 pushed a commit that referenced this pull request Jun 1, 2023
Pass size argument.

<details>
<summary>ASAN report</summary>

```
==1640574==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x609000022160 at pc 0x03ff31a04b42 bp 0x03ff69885dc0 sp 0x03ff69885db0
READ of size 16 at 0x609000022160 thread T1
    #0 0x3ff31a04b41 in at::vec::ZVECTOR::Vectorized<unsigned char, void>::loadu(void const*, int) /home/user/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h:397
    #1 0x3ff31a04b41 in at::vec::ZVECTOR::Vectorized<c10::quint8, void>::loadu(void const*, int) /home/user/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h:1574
    #2 0x3ff31a04b41 in operator() /home/user/pytorch/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp:2668
    pytorch#3 0x3ff31cefa5d in void at::internal::invoke_parallel<at::native::(anonymous namespace)::quantized_normalize_kernel(at::Tensor const&, at::Tensor const&, at::Tensor const&, bool, int, int, long, long
, double, at::Tensor*)::{lambda()#1}::operator()() const::{lambda()#2}::operator()() const::{lambda(long, long)#1}>(long, long, long, at::native::(anonymous namespace)::quantized_normalize_kernel(at::Tens
or const&, at::Tensor const&, at::Tensor const&, bool, int, int, long, long, double, at::Tensor*)::{lambda()#1}::operator()() const::{lambda()#2}::operator()() const::{lambda(long, long)#1} const&) [clone
 ._omp_fn.0] /home/user/pytorch/aten/src/ATen/ParallelOpenMP.h:42
    pytorch#4 0x3ff6f31f52d in gomp_thread_start /var/tmp/portage/sys-devel/gcc-12.2.1_p20230304/work/gcc-12-20230304/libgomp/team.c:129
    pytorch#5 0x3ff82218381 in start_thread /usr/src/debug/sys-libs/glibc-2.37-r1/glibc-2.37/nptl/pthread_create.c:444
    pytorch#6 0x3ff822943f1  (/lib64/libc.so.6+0x1143f1)

0x609000022160 is located 0 bytes to the right of 32-byte region [0x609000022140,0x609000022160)
allocated by thread T0 here:
    #0 0x3ff82a3663f in __interceptor_posix_memalign /usr/src/debug/sys-devel/gcc-11.3.1_p20230303/gcc-11-20230303/libsanitizer/asan/asan_malloc_linux.cpp:226
    #1 0x3ff6f53ad95 in c10::alloc_cpu(unsigned long) /home/user/pytorch/c10/core/impl/alloc_cpu.cpp:74

Thread T1 created by T0 here:
    #0 0x3ff829dc263 in __interceptor_pthread_create /usr/src/debug/sys-devel/gcc-11.3.1_p20230303/gcc-11-20230303/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x3ff6f31fad5 in gomp_team_start /var/tmp/portage/sys-devel/gcc-12.2.1_p20230304/work/gcc-12-20230304/libgomp/team.c:858

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h:397 in at::vec::ZVECTOR::Vectorized<unsigned char, void>::loadu(void const*, int)
Shadow bytes around the buggy address:
  0x100c12000043d0: 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c12000043e0: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c12000043f0: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1200004400: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1200004410: fa fa fa fa fa fa fa fa fd fa fa fa fa fa fa fa
=>0x100c1200004420: fa fa fa fa fa fa fa fa 00 00 00 00[fa]fa fa fa
  0x100c1200004430: fa fa fa fa fa fa fa fa fd fd fa fa fa fa fa fa
  0x100c1200004440: fa fa fa fa fa fa fa fa fd fd fa fa fa fa fa fa
  0x100c1200004450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1200004460: 00 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1200004470: 00 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1640574==ABORTING
```
</details>

Pull Request resolved: pytorch#101970
Approved by: https://github.com/Skylion007, https://github.com/jgong5
thomasw21 pushed a commit that referenced this pull request Jun 20, 2023
Hi! I found heap-buffer-overflow during PyTorch RPC-module fuzzing.

[crash-9cc26b8da3b688a9c26614481239943b357c5636.zip](https://github.com/pytorch/pytorch/files/11707706/crash-9cc26b8da3b688a9c26614481239943b357c5636.zip)

```
    "==10634==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060001b6a98 at pc 0x000000639a2e bp 0x7fffffff9100 sp 0x7fffffff90f8",
    "READ of size 4 at 0x6060001b6a98 thread T0",
    "    #0 0x639a2d in c10::IValue::isTensor() const /pytorch/aten/src/ATen/core/ivalue.h:432:27",
    "    #1 0x639a2d in c10::IValue::toTensor() && /pytorch/aten/src/ATen/core/ivalue_inl.h:159:7",
    "    #2 0xc5eb105 in at::Tensor c10::IValue::to<at::Tensor>() && /pytorch/aten/src/ATen/core/ivalue_inl.h:1690:1",
    "    pytorch#3 0xc5eb105 in void torch::jit::pop<at::Tensor>(std::vector<c10::IValue, std::allocator<c10::IValue> >&, at::Tensor&) /pytorch/aten/src/ATen/core/stack.h:130:55",
    "    pytorch#4 0xc5eaedb in torch::jit::dtype(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/mobile/promoted_prim_ops.cpp:105:3",
    "    pytorch#5 0xcc79600 in torch::jit::InterpreterStateImpl::runImpl(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:682:13",
    "    pytorch#6 0xcc4158b in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:1052:9",
    "    pytorch#7 0x60f378 in runGraph(std::shared_ptr<torch::jit::Graph>, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) /jit_differential.cc:66:38",
    "    pytorch#8 0x610bb9 in LLVMFuzzerTestOneInput /jit_differential.cc:107:25",
    "    pytorch#9 0x535c91 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15",
    "    pytorch#10 0x51fb9c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6",
    "    pytorch#11 0x5258eb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9",
    "    pytorch#12 0x54eea2 in main /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10",
    "    pytorch#13 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
    "    pytorch#14 0x51a4bd in _start (/jit_differential_fuzz+0x51a4bd)",
    "",
    "0x6060001b6a98 is located 8 bytes to the left of 64-byte region [0x6060001b6aa0,0x6060001b6ae0)",
    "allocated by thread T0 here:",
    "    #0 0x60c66d in operator new(unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_new_delete.cpp:95:3",
    "    #1 0xa5a41b in std::_Vector_base<c10::IValue, std::allocator<c10::IValue> >::_M_allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:346:20",
    "    #2 0xa5a41b in void std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_realloc_insert<c10::IValue&>(__gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue, std::allocator<c10::IValue> > >, c10::IValue&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:440:33",
    "    pytorch#3 0xa5a241 in c10::IValue& std::vector<c10::IValue, std::allocator<c10::IValue> >::emplace_back<c10::IValue&>(c10::IValue&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:121:4",
    "    pytorch#4 0xcc8209c in torch::jit::InterpreterStateImpl::runImpl(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:345:19",
    "    pytorch#5 0xcc4158b in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:1052:9",
    "    pytorch#6 0x60f378 in runGraph(std::shared_ptr<torch::jit::Graph>, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) /jit_differential.cc:66:38",
    "    pytorch#7 0x610bb9 in LLVMFuzzerTestOneInput /jit_differential.cc:107:25",
    "    pytorch#8 0x535c91 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15",
    "    pytorch#9 0x51fb9c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6",
    "    pytorch#10 0x5258eb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9",
    "    pytorch#11 0x54eea2 in main /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10",
    "    pytorch#12 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
    "",
    "SUMMARY: AddressSanitizer: heap-buffer-overflow /pytorch/aten/src/ATen/core/ivalue.h:432:27 in c10::IValue::isTensor() const",
    "Shadow bytes around the buggy address:",
    "  0x0c0c8002ed00: 00 00 00 00 00 00 00 fa fa fa fa fa fd fd fd fd",
    "  0x0c0c8002ed10: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd",
    "  0x0c0c8002ed20: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa",
    "  0x0c0c8002ed30: fd fd fd fd fd fd fd fd fa fa fa fa 00 00 00 00",
    "  0x0c0c8002ed40: 00 00 00 00 fa fa fa fa fd fd fd fd fd fd fd fd",
    "=>0x0c0c8002ed50: fa fa fa[fa]00 00 00 00 00 00 00 00 fa fa fa fa",
    "  0x0c0c8002ed60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c0c8002ed70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c0c8002ed80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c0c8002ed90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "  0x0c0c8002eda0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
    "Shadow byte legend (one shadow byte represents 8 application bytes):",
    "  Addressable:           00",
    "  Partially addressable: 01 02 03 04 05 06 07",
    "  Heap left redzone:       fa",
    "  Freed heap region:       fd",
    "  Stack left redzone:      f1",
    "  Stack mid redzone:       f2",
    "  Stack right redzone:     f3",
    "  Stack after return:      f5",
    "  Stack use after scope:   f8",
    "  Global redzone:          f9",
    "  Global init order:       f6",
    "  Poisoned by user:        f7",
    "  Container overflow:      fc",
    "  Array cookie:            ac",
    "  Intra object redzone:    bb",
    "  ASan internal:           fe",
    "  Left alloca redzone:     ca",
    "  Right alloca redzone:    cb",
    "==10634==ABORTING"
```
Pull Request resolved: pytorch#103327
Approved by: https://github.com/Skylion007
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants