Support dynamically quantized 2D convolutions #10248

Closed
244 commits
0b5b0e8
WIP: add initial support for dq 2D conv
keyprocedure Apr 8, 2025
8fcb117
Permute before quant
keyprocedure Apr 12, 2025
4d064da
Refactor permute code
keyprocedure Apr 12, 2025
2905b98
Corrects input to conv
keyprocedure Apr 12, 2025
0fef04a
Add is_dequant check for trace back when inserting permute
keyprocedure Apr 12, 2025
f8f998c
Fix node identity check
keyprocedure Apr 12, 2025
2efe9bb
Use existing is_dequant check and update atol
keyprocedure Apr 13, 2025
3762e0d
Implement replace_all_uses_with function
keyprocedure Apr 15, 2025
4112c6a
Remove cmake file
keyprocedure Apr 15, 2025
cdd6f2d
Restore original supported conv2d operators
keyprocedure Apr 16, 2025
7150872
Add dynamic quant check before NHWC permute
keyprocedure Apr 16, 2025
6b44c4b
Refactor dq conv2d test
keyprocedure Apr 16, 2025
7054f2e
Revert formatting
keyprocedure Apr 16, 2025
fc48e03
Add check to only annotate dq conv2d
keyprocedure Apr 16, 2025
84b3634
Remove unused import
keyprocedure Apr 16, 2025
62e30e5
Add computation for non-batch dims; remove non-batch dims check
keyprocedure Apr 16, 2025
3c7fe32
Refactor test and imports
keyprocedure Apr 16, 2025
064671b
Update comments
keyprocedure Apr 16, 2025
d5c4970
Support slice ops with default start
pssrawat Apr 6, 2025
c573b6f
Arm backend: Add pytest.mark.flaky on U85 tests in test_mm.py (#9926)
martinlsm Apr 7, 2025
644e55d
Arm backend: Convert assert to throw ValueError in op_exp (#9929)
Sebastian-Larsson Apr 7, 2025
9451faa
Arm backend: Convert assert to raise ValueError for comparison operat…
Sebastian-Larsson Apr 7, 2025
968dec9
Qualcomm AI Engine Direct - Fix mobilebert finetune script (#9927)
shewu-quic Apr 7, 2025
412b1c0
aten.leakyrelu.default Op registery
hossein1387 Apr 7, 2025
24825d9
Update release.yaml (#9891)
metascroy Apr 7, 2025
78f152b
Add code structure and a few other links to CONTRIBUTING.md (#9793)
larryliu0820 Apr 7, 2025
d1e4aa8
Add a path to use quantized gemm from torchao in sdpa
kimishpatel Apr 7, 2025
d2af44b
per_channel_group can't be dynamic
mcr229 Apr 7, 2025
3cb3316
Update Executorch ops registration for rms_norm (#9920)
Vysarat Apr 7, 2025
15dc927
[Easy] Fix numbering typo in Llama README docs (#9936)
Jack-Khuu Apr 7, 2025
7238cda
[ET-VK] Improve packing format for int4 linear operator + misc improv…
SS-JIA Apr 7, 2025
9846436
[ET-VK][ez] Make squeeze insertion requirements more strict (#9950)
SS-JIA Apr 7, 2025
84acf8d
[ET-VK][ez] Allow logit linear layer to be lowered to Vulkan (#9951)
SS-JIA Apr 7, 2025
ec114d9
Expose L4 ops to ExecuTorch Client and add MWA to ExecuTorch Client
derekxu Apr 8, 2025
8166803
introducing filter to etdumpgen
Gasoonjia Apr 8, 2025
80191a5
Arm backend: Add pre-push checks for op tests (#9899)
AdrianLundell Apr 8, 2025
265d5fe
Fix naming convention in quantizer
mcremon-meta Apr 8, 2025
4f6b6e0
Minor fixes on Intro How It Works (#9939)
mergennachin Apr 8, 2025
75190a9
executorch/backends/xnnpack/test/ops
gmagogsfm Apr 8, 2025
edbd2e4
Allow emitting mutable buffer names in schema
JacobSzwejbka Apr 8, 2025
409916e
Update test_ios_ci.sh to use the app from executorch-examples repo (#…
shoumikhin Apr 8, 2025
ff38d9f
android-release-artifacts.yml allow upload AAR to maven (#9954)
kirklandsign Apr 8, 2025
64a21c8
[Release 0.6] update torchao pin (#9947)
metascroy Apr 8, 2025
fb2e9d8
Update test_ios.sh (#9968)
shoumikhin Apr 8, 2025
faad344
Update demo-apps-ios.md (#9970)
shoumikhin Apr 8, 2025
6771afe
Getting Started, compare against reference eager model for verificati…
mergennachin Apr 8, 2025
4235eca
Add python version in building from source page (#9975)
mergennachin Apr 8, 2025
3134050
Delete examples/demo-apps/apple_ios/ExecuTorchDemo directory (#9976)
shoumikhin Apr 8, 2025
a0a9a2a
Add op_amax support (#9955)
cccclai Apr 9, 2025
f61d719
Modify executorch_module_static and executorch_tensor spelling proble…
zxc503 Apr 9, 2025
65787ec
Just build AAR in place
kirklandsign Apr 9, 2025
df57605
[ez][release blocker fix] Insert `linalg_vector_norm` into decomp tab…
SS-JIA Apr 9, 2025
7e34471
Add maven version in main in Getting Started page (#9980)
mergennachin Apr 9, 2025
f51623b
[doc] Fix tokenizer related documentation (#10000)
larryliu0820 Apr 9, 2025
c42a6b9
Fix tokenizer convert in xnnpack_README.md (#10003)
larryliu0820 Apr 9, 2025
7cfc148
using-executorch-android.md: Use an easier example (#10008)
kirklandsign Apr 9, 2025
2b071be
[Doc] Update getting-started-architecture.md (#10027)
iseeyuan Apr 9, 2025
f061ae8
Typo updates for README and Contributions (#10025)
Jack-Khuu Apr 9, 2025
da65dcb
Qualcomm AI Engine Direct - Add submodule quant config setting (#9355)
chunit-quic Apr 10, 2025
a7dddb4
Fix LLM getting-started.md (#10028)
kirklandsign Apr 10, 2025
799d526
Fix deprecated unittest asserts in executorch
itamaro Apr 10, 2025
7567b7e
Add a namespace for ATen mode
larryliu0820 Apr 10, 2025
d797d41
remove reduntant log_delegate_intermediate_logging_helper call
Gasoonjia Apr 10, 2025
a805c77
Arm backend: Set all 16/32 bit sigmoid tests to flaky (#9993)
martinlsm Apr 10, 2025
fca5f85
Arm backend: Raise atol in MobileNetV3 unit tests (#9959)
martinlsm Apr 10, 2025
9e49784
Add examples for CPP demo app (#10022)
mergennachin Apr 10, 2025
1a1bb7a
[arm][ez] Add `xfail` for `norm` tests (#10043)
SS-JIA Apr 10, 2025
c962480
Add link to executorch-examples in module/c++ doc (#10026)
lucylq Apr 10, 2025
9fee71b
Arm Backend: Add Tiny Stories llama test case for TOSA BI (#9996)
ArmRyan Apr 10, 2025
09113d8
[Executorch][SDPA] Refactor + Make quantized sdpa handle sequence at …
pytorchbot Apr 10, 2025
20198b0
[Executorch][llama] Renamed quantized_kv_cache to custom_kv_cache (#1…
pytorchbot Apr 10, 2025
0174b6d
[Executorch][llama] Enable quantized sdpa (#10062)
pytorchbot Apr 10, 2025
0d5b742
Fix up some Docs (#10038)
mcr229 Apr 10, 2025
61c00df
Qualcomm AI Engine Direct - oss model enablement (EfficientSAM) (#9266)
DannyYuyang-quic Apr 10, 2025
0722ec0
memory_planning algos take the specs as inputs instead of calculating…
JacobSzwejbka Apr 10, 2025
57f4dc3
Update kernels readme
manuelcandales Apr 10, 2025
e876a80
[#9971] Gracefully error out in ETDump for get_flatbuffer_scalar_type…
pytorchbot Apr 10, 2025
8845477
Llama exported model from executorch-community (#10064)
mergennachin Apr 10, 2025
bdef23f
Qualcomm AI Engine Direct - Support tile op for different I/O rank (#…
DannyYuyang-quic Apr 10, 2025
ab2a2f3
llama doc update (#10082)
lucylq Apr 10, 2025
2707f96
Update using-executorch-android.md (#10084)
kirklandsign Apr 10, 2025
fa07c0d
Update LlamaDemo AAR build docs (#10086)
kirklandsign Apr 10, 2025
614b35c
Improve android instrumentation test experience and add docs (#9989)
kirklandsign Apr 10, 2025
7e3dfed
Update backends-xnnpack.md (#10024)
metascroy Apr 10, 2025
defd83b
Arm backend: Tosa tools update (#9451)
digantdesai Apr 10, 2025
05fe758
New Contributor Guide documentation for newcomers to ExecuTorch or op…
jhelsby Apr 11, 2025
508a5ae
Remove unused lines in build_android_library.sh (#10088)
kirklandsign Apr 11, 2025
c89d037
Return None as dynamic shape when enable_dynamic_shape is False
larryliu0820 Apr 11, 2025
eca8718
fix lint (#10090)
cccclai Apr 11, 2025
739768d
In quantized sdpa dequant v
kimishpatel Apr 11, 2025
f19069e
Arm backend: Convert asserts to raise errors in op_minimum (#10055)
Sebastian-Larsson Apr 11, 2025
ac89464
Arm backend: Add TOSA support for gt.Scalar and lt.Scalar (#9908)
YufengShi-dudu Apr 11, 2025
2b38695
Arm backend: Add support for sqrt (#9928)
fumchin Apr 11, 2025
0d8006f
Arm backend: Placeholder processing now handles non-persistent buffer…
iliyan-georgiev-arm Apr 11, 2025
51d03bc
Arm backend: Add test_arm_baremetal.sh checks (#10057)
perheld Apr 11, 2025
10e0f89
Arm backend: Convert assert to raise ValueError in op_clamp (#9932)
Sebastian-Larsson Apr 11, 2025
85abbd5
Arm backend: Convert asserts to raise errors in op_maximum (#10102)
Sebastian-Larsson Apr 11, 2025
ce9a98a
Arm backend: Convert assert to raise TypeError in op_sub (#9958)
Sebastian-Larsson Apr 11, 2025
de8fe99
Arm backend: Remove logger.setLevel (#10103)
oscarandersson8218 Apr 11, 2025
93e0535
Arm backend: Convert asserts to raise errors in op_reciprocal (#10105)
Sebastian-Larsson Apr 11, 2025
5a5c481
Arm backend: Add check to not partition ops with float64 input (#10106)
YufengShi-dudu Apr 11, 2025
0e0945e
Arm backend: Convert asserts to raise errors in op_rsqrt (#10107)
Sebastian-Larsson Apr 11, 2025
7649d57
Prevent decomposing RMSNorm in Jarvis (#10074)
Vysarat Apr 11, 2025
02234e8
Remove partitioner/quantizer discussion from readme and link to docs …
metascroy Apr 11, 2025
14b7ba2
allow not memory planning mutable buffers
JacobSzwejbka Apr 11, 2025
3be7833
Create daily AAR snapshot (#10092)
kirklandsign Apr 11, 2025
a85f043
Arm Ethos-u: Buckify Slice tests
digantdesai Apr 11, 2025
4495b63
Arm backend: Add GELU operator (#10109)
iliyan-georgiev-arm Apr 11, 2025
e9e1e86
Corrected New Contributor Guide PR merging details (#10101)
jhelsby Apr 11, 2025
41a0af7
Simplify AAR copy in build_android_library.sh (#10121)
kirklandsign Apr 11, 2025
dd046cf
Update CONTRIBUTING.md and New Contributors Guide to reflect that any…
jhelsby Apr 11, 2025
01a6fa2
llava cmakelists (#10127)
lucylq Apr 12, 2025
d9caa79
Fixup op_slice negative start arguments
3l1 Apr 14, 2025
08e8e17
{Executorch][llm] quantized sdpa. update attn_scores @ v gemm
kimishpatel Apr 14, 2025
a41403b
Arm backend: Added 8 new unit tests for testing various passes. (#9037)
Michiel-Olieslagers Apr 14, 2025
eca61ff
Arm backend: Remove node vistor for full (#9904)
gggekov Apr 14, 2025
8b9662c
Arm backend: Add support for TOSA 1.0 serializer (#10135)
per Apr 14, 2025
3849366
Fix paths in LLaMa project.pbxproj (#10132)
shoumikhin Apr 14, 2025
fe695c9
Make llama model search case insensitive for benchmark app (#10133)
shoumikhin Apr 14, 2025
bf10011
Clone submodules recursively in install_executorch.py (#10131)
shoumikhin Apr 14, 2025
64a362a
LLM custom ops tutorial should direct to general custom ops (#10139)
kirklandsign Apr 14, 2025
1c6264f
[#9971] Gracefully error out in ETDump for set_debug_buffer (#10130)
zhongmingyuan Apr 14, 2025
1ea5aff
Mimi: sqnr and test without streaming
iseeyuan Apr 14, 2025
dd5581c
NXP backend: Add NeutronQuantizer (#9876)
skywall Apr 14, 2025
341e6df
Update demo-apps-ios.md (#10146)
shoumikhin Apr 14, 2025
7cb4860
Update using-executorch-ios.md (#10128)
shoumikhin Apr 14, 2025
0624298
[ET-VK][ez] Support convolutions with padding > 0 and dilation > 1 (#…
pytorchbot Apr 14, 2025
f9423fc
Add memory requirement and clarify image format for llava example
larryliu0820 Apr 14, 2025
72dfc6b
Delete obsolete docs (#10159)
shoumikhin Apr 14, 2025
49918f0
Update doc links to relative markdown files (#10164)
shoumikhin Apr 14, 2025
4bce90f
Update android docs for nightly snapshots
kirklandsign Apr 14, 2025
a306f4f
Update mps_README.md (#10167)
shoumikhin Apr 14, 2025
f76bb50
[doc] Link Hugging Face models to the ExecuTorch doc (#10154)
guangy10 Apr 14, 2025
22c69ff
Update README.md (#10168)
shoumikhin Apr 14, 2025
64ad4cf
Fix compiler warnings in a few places (#10165)
GregoryComer Apr 14, 2025
83dbad6
Add approximate gelu replacement to opt level 2
mcremon-meta Apr 14, 2025
264942c
Add '--recursive' to git submodule update --init (#10178)
lucylq Apr 15, 2025
8fff36c
Consolidate references in docs (#10175)
shoumikhin Apr 15, 2025
ee37fba
Fix android instrumentation (#10125)
kirklandsign Apr 15, 2025
fab63dc
Update instrumentation test docs (#10173)
kirklandsign Apr 15, 2025
2f0f732
Add docs for $BUILD_AAR_DIR (#10174)
kirklandsign Apr 15, 2025
c956664
[Core ML] Improve error logging
cymbalrush Apr 15, 2025
aa6b31d
Fix linter (#10183)
kirklandsign Apr 15, 2025
56f4955
FIx links in docs (#10184)
shoumikhin Apr 15, 2025
34ef971
Update README.md (#10186)
shoumikhin Apr 15, 2025
2484bca
import complex.h from c10
manuelcandales Apr 15, 2025
39fbde9
Refactor internal switch cases
manuelcandales Apr 15, 2025
71349df
Qualcomm AI Engine Direct - Mimi Enablement Stage 2 (#10098)
winskuo-quic Apr 15, 2025
78d747e
Arm backend: Fixing typos (#10189)
wwwind Apr 15, 2025
b3cea58
Arm backend: Add support to ge.Scalar (#10195)
fumchin Apr 15, 2025
a9a0fda
Arm Backend: Add New DecomposeSilu pass to arm_pass_manager (#9448)
ArmRyan Apr 15, 2025
532d593
Update llama cmake for custom ops (#10176)
lucylq Apr 15, 2025
935e9ed
port hardtanh and add hardtanh test
zonglinpeng Apr 15, 2025
f16cf3e
Remove layer norm from the default quantizer, add one that has it
mcremon-meta Apr 15, 2025
be8b7ee
Update pytorch-labs/tokenizers to 295ee78 (#10161)
jathu Apr 15, 2025
6eec7fb
Strip .html suffix from doc links (#10210)
shoumikhin Apr 15, 2025
5b1c2ea
Experiment with private rooted Pixel 3 devices (#10192)
huydhn Apr 15, 2025
d22689d
Run Android release job on ephemeral runners (#10190)
huydhn Apr 15, 2025
f52dcd7
Arm backend: Add support alias_copy operator (#10199)
iliyan-georgiev-arm Apr 15, 2025
ce1c7a6
Add error message for empty string in filedataloader.
JacobSzwejbka Apr 15, 2025
e10883d
Set the default list of models running on private devices (#10217)
huydhn Apr 16, 2025
12ac9d1
Complex Support: bmm
manuelcandales Apr 16, 2025
06b3feb
[0.6 documentation] Fix Page Developer Tools: Bundled Program (#10222)
pytorchbot Apr 16, 2025
94ccfad
add BoxWithNMSLimit_out to DSP as a custom portable op
zonglinpeng Apr 16, 2025
7120bc6
[ET][Testing] Build test_backend_compiler_lib when testing is on (#9953)
mcr229 Apr 16, 2025
a5a9247
[Executorch][to_backend] Introduce preprocess_multimethod (#9823)
mcr229 Apr 16, 2025
947dfab
Android MV2 E2E instrumentation test (#10219)
kirklandsign Apr 16, 2025
3793b36
Qualcomm AI Engine Direct - OSS models breakage fix (#10191)
DannyYuyang-quic Apr 16, 2025
0554b91
Increase Android perf test timeout to 4h (#10232)
huydhn Apr 16, 2025
a68a71d
Use sccache to accelerate android build (#9587)
kirklandsign Apr 16, 2025
dabade9
Android E2E with real input (#10230)
kirklandsign Apr 16, 2025
e29aead
Add quantized kernels to executorch_jni_full
GregoryComer Apr 16, 2025
d6624cd
[ET-VK][ez] Add support for buffer backed qparams in int4 linear + ad…
pytorchbot Apr 16, 2025
d47d0bd
[ET-VK] Allow int4 linear to execute without 8bit buffer support (#10…
pytorchbot Apr 16, 2025
49e4f77
[ET-VK] Add co-op algorithm for 4 bit weight only quantized linear (#…
pytorchbot Apr 16, 2025
bc537e9
Qualcomm AI Engine Direct - Add block quantization to llama (#10225)
chunit-quic Apr 16, 2025
21c37aa
[ET-VK] Use performant tiled algorithm for 4 bit weight only quantize…
pytorchbot Apr 16, 2025
d6e4c18
clean up complex tests
manuelcandales Apr 16, 2025
5589ab4
[ET-VK] Manual sync to fbsource (#10238)
SS-JIA Apr 16, 2025
73be01f
Add redirects for relocated docs (#10221)
GregoryComer Apr 16, 2025
625d47a
[ET-VK] Manual sync native layer norm (#10242)
SS-JIA Apr 16, 2025
9dd1dde
forward fix preprocess multimethod
mcr229 Apr 16, 2025
e43cc9b
Add split_with_sizes to block list
metascroy Apr 16, 2025
f0ed485
Buckify Sigmoid test
3l1 Apr 16, 2025
0e8bb30
Add unit tests for dynamic quant sequential and parallel convs
keyprocedure Apr 20, 2025
85692ee
Add unit test for dynamic quant conv2d with channels-last permute
keyprocedure Apr 20, 2025
877e31b
Add check to determine if node feeds into conv and set non-batch dims…
keyprocedure Apr 20, 2025
206cf78
Add depthwise conv checks for dynamic quant
keyprocedure Apr 20, 2025
de1926d
Fix timespec_get not compatiable for AOSP OS Android N14
derekxu Apr 17, 2025
db70196
reset devtool webpage tutorial
Gasoonjia Apr 17, 2025
32732b7
Update screenshot in using on iOS page (#10256)
shoumikhin Apr 17, 2025
d1775bc
Update LLaMA iOS docs (#10255)
shoumikhin Apr 17, 2025
1bb2665
Update demo-apps-ios.md (#10252)
shoumikhin Apr 17, 2025
3c67d11
Add new dependency library for vulkan tests (#10136)
FFFrog Apr 17, 2025
0a023dc
Qualcomm AI Engine Direct - add op support list (#10253)
haowhsu-quic Apr 17, 2025
f67466f
[ET-VK] Enable auto-generated operator correctness tests and benchmar…
SS-JIA Apr 17, 2025
7e9f370
Use __XTENSA__ in et_pal.cpp
hsharma35 Apr 17, 2025
73b4bad
Revert "Add new dependency library for vulkan tests" (#10273)
SS-JIA Apr 17, 2025
401ac5a
Buckify Tanh test
3l1 Apr 17, 2025
f4fe9e1
Qualcomm AI Engine Direct - Fix the bug in rms_norm builder (#10250)
shewu-quic Apr 17, 2025
271b972
Qualcomm AI Engine Direct - add more profile event (#10227)
haowhsu-quic Apr 17, 2025
bef9a97
fix typo
cccclai Apr 17, 2025
5676250
Fix links in docs (#10277)
shoumikhin Apr 17, 2025
9377a0b
Instruct users to run llama for qnn to the active repro
cccclai Apr 17, 2025
aaef69e
Bump torchao pin, adjust llama export to support pre-quantization via…
metascroy Apr 17, 2025
8acdf56
Fix x86_64 emulator stuck issue and enable tests
kirklandsign Apr 17, 2025
f46e321
Fix bugs in executorch package
danachang Apr 18, 2025
43bddf3
Update CONTRIBUTING.md (#10291)
metascroy Apr 18, 2025
3165aeb
Introduce GenerationConfig
larryliu0820 Apr 18, 2025
e1c153c
Automatically update version name for maven upload (#10290)
kirklandsign Apr 18, 2025
1c74c01
Script to validate URLs (#10289)
shoumikhin Apr 18, 2025
8b6ef57
Remove unused pass and test to replace `linalg.vector_norm`.
hsharma35 Apr 18, 2025
2b3ee97
Add view_as_real_copy.out
pssrawat Apr 18, 2025
85a95b7
Support pre-quantization via torchao quantize_ (#10293)
metascroy Apr 18, 2025
818098e
Runtime API to retrieve attributes
JacobSzwejbka Apr 18, 2025
3db6a2b
Build vulkan+xnnpack AAR (#10301)
kirklandsign Apr 18, 2025
c3e6f78
Script to validate links (#10309)
shoumikhin Apr 18, 2025
49f1a3d
Lint docs before building (#10310)
shoumikhin Apr 18, 2025
feb0a80
Fix docs link in android (#10311)
kirklandsign Apr 18, 2025
5834390
[Android] Use same stats as llm::Stats
kirklandsign Apr 19, 2025
0bf2369
Implement _fft_c2r core ATen op
pssrawat Apr 19, 2025
7d65187
Remove args from LLMEdgeManager and misc cleanup
jackzhxng Apr 19, 2025
4083282
Print actual numel in et_view
larryliu0820 Apr 19, 2025
49d7040
Fix cross-links (#10313)
shoumikhin Apr 19, 2025
34c59c3
retrieve cadence_passes in apply_jarvis_passes
zonglinpeng Apr 20, 2025
26033dd
Add aten_lib to executorch_llama
larryliu0820 Apr 20, 2025
915c1be
Android Linter fix (#10317)
kirklandsign Apr 21, 2025
8040581
Fix URLs (#10316)
shoumikhin Apr 21, 2025
3537781
Fixed inaccurate PR labelling instructions (#10268)
jhelsby Apr 21, 2025
c1f133b
Documentation updates for OpenVINO backend (#10172)
suryasidd Apr 21, 2025
5fbbe11
Move depthwise conv check to helper function in utils
keyprocedure Apr 21, 2025
acb2321
Use existing Conv2d class; get conv count from model
keyprocedure Apr 21, 2025
fc0a15e
[exir] Allow verifiers in _transform (#10322)
pytorchbot Apr 21, 2025
10b12ae
[mps] Disable dialect verifier under mps preprocess (#10323)
pytorchbot Apr 21, 2025
d25715b
[Android] Remove old onStats
kirklandsign Apr 21, 2025
adc4892
Fix Linter (#10333)
kirklandsign Apr 21, 2025
40b15a0
Refactor export_delegated_program (#10334)
pytorchbot Apr 21, 2025
f5bd273
Update check_urls.sh (#10321)
shoumikhin Apr 21, 2025
25ceeec
Fix android instrumentation (#10335)
kirklandsign Apr 21, 2025
84ca4fe
[Executorch][BE] Fix error logging with better message (#10339)
pytorchbot Apr 21, 2025
3d33785
[Executorch][llama] bug fix for custom sdpa for attention bias (#10340)
pytorchbot Apr 21, 2025
8115322
[Executorch][llama] Allow custom sdpa op replacement pass to leverage…
pytorchbot Apr 21, 2025
b03a2d8
[Executorch][llama] Hookup use_attention_mask option in the source tr…
pytorchbot Apr 21, 2025
9693061
Update check_xrefs.sh (#10343)
shoumikhin Apr 21, 2025
1 change: 1 addition & 0 deletions .ci/docker/requirements-ci.txt
@@ -17,6 +17,7 @@ parameterized==0.9.0

# Doc build requirements, same as https://github.com/pytorch/pytorch/blob/main/.ci/docker/requirements-docs.txt
sphinx==5.3.0
+sphinx-reredirects==0.1.4
sphinx-gallery==0.14.0
breathe==4.34.0
exhale==0.2.3
21 changes: 0 additions & 21 deletions .ci/scripts/build_android_instrumentation.sh

This file was deleted.

1 change: 1 addition & 0 deletions .ci/scripts/gather_benchmark_configs.py
@@ -23,6 +23,7 @@
"samsung_galaxy_s22": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa",
"samsung_galaxy_s24": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/98f8788c-2e25-4a3c-8bb2-0d1e8897c0db",
"google_pixel_8_pro": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/d65096ab-900b-4521-be8b-a3619b69236a",
+"google_pixel_3_private_rooted": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/98d23ca8-ea9e-4fb7-b725-d402017b198d",
}

# Predefined benchmark configurations
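The table in this hunk maps a short device name to an AWS Device Farm pool ARN. A minimal sketch of how a comma-separated `devices` input could be resolved against such a table follows. It assumes a dict named `DEVICE_POOLS` and a hypothetical helper `resolve_device_pools`; neither name is confirmed by this diff, and only the Pixel 3 entry is copied from it.

```python
# Assumed name for the mapping; only this one entry is taken from the diff.
DEVICE_POOLS = {
    "google_pixel_3_private_rooted": (
        "arn:aws:devicefarm:us-west-2:308535385114:devicepool:"
        "02a2cf0f-6d9b-45ee-ba1a-a086587469e6/98d23ca8-ea9e-4fb7-b725-d402017b198d"
    ),
}


def resolve_device_pools(devices: str) -> dict:
    """Hypothetical helper: map each requested device name to its pool ARN."""
    names = [name.strip() for name in devices.split(",") if name.strip()]
    unknown = [name for name in names if name not in DEVICE_POOLS]
    if unknown:
        # Fail early so a typo in the workflow input is caught before scheduling.
        raise ValueError(f"Unknown device(s): {', '.join(unknown)}")
    return {name: DEVICE_POOLS[name] for name in names}


if __name__ == "__main__":
    print(resolve_device_pools("google_pixel_3_private_rooted"))
```

Rejecting unknown names up front is a design choice of this sketch, not something the diff shows.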
6 changes: 5 additions & 1 deletion .ci/scripts/test_ios_ci.sh
@@ -7,7 +7,7 @@

set -e

-APP_PATH="examples/demo-apps/apple_ios/ExecuTorchDemo/ExecuTorchDemo"
+APP_PATH="executorch-examples/apple/ExecuTorchDemo/ExecuTorchDemo"
MODEL_NAME="mv3"
SIMULATOR_NAME="executorch"

@@ -34,6 +34,10 @@ say() {
echo -e "\033[1m\n\t** $1 **\n\033[0m"
}

+say "Cloning the Demo App"
+
+git clone --depth 1 https://github.com/pytorch-labs/executorch-examples.git
+
say "Installing CoreML Backend Requirements"

./backends/apple/coreml/scripts/install_requirements.sh
2 changes: 1 addition & 1 deletion .ci/scripts/test_llava.sh
@@ -154,7 +154,7 @@ run_and_verify() {
EXPECTED_PREFIX="ASSISTANT: image captures a basketball game in progress, with several players on the court. One of the players is dribbling the ball, while the others are in various"
else
# set the expected prefix to be the same as prompt because there's a bug in sdpa_with_kv_cache that causes <unk> tokens.
-EXPECTED_PREFIX="ASSISTANT:"
+EXPECTED_PREFIX="ASSISTANT: image"
fi
if [[ "${RESULT}" == *"${EXPECTED_PREFIX}"* ]]; then
echo "Expected result prefix: ${EXPECTED_PREFIX}"
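The `[[ "${RESULT}" == *"${EXPECTED_PREFIX}"* ]]` test in this hunk is a substring match: it passes when the expected text appears anywhere in the output, not only at the start. A rough Python equivalent is sketched below; the function name and the sample output are made up for illustration.

```python
def output_matches(result: str, expected_prefix: str) -> bool:
    # Mirrors the shell glob *"${EXPECTED_PREFIX}"*: containment, not startswith().
    return expected_prefix in result


# Made-up runner output; real output is whatever the llava runner prints.
sample = "USER: <image> What is this? ASSISTANT: image captures a game"

if __name__ == "__main__":
    print(output_matches(sample, "ASSISTANT: image"))   # containment check
    print(sample.startswith("ASSISTANT: image"))        # a strict prefix check differs
```

With the expected text changed from `ASSISTANT:` to `ASSISTANT: image`, an output that stops right after `ASSISTANT:` no longer passes.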
61 changes: 43 additions & 18 deletions .github/release.yml
@@ -15,57 +15,82 @@ changelog:
- title: ARM
labels:
- "release notes: arm"
- "module: arm"
- "partner: arm"
- title: NXP
labels:
labels:
- "release notes: nxp"
- "module: nxp"
- title: Exir
labels:
labels:
- "release notes: exir"
- "module: exir"
- title: Misc
labels:
labels:
- "release notes: misc"
- title: Apple
labels:
labels:
- "release notes: apple"
- "module: coreml"
- "module: mps"
- title: Android
labels:
- "module: android"
- title: IOS
labels:
- "module: ios"
- title: Build
labels:
labels:
- "release notes: build"
- title: Vulkan
labels:
labels:
- "release notes: vulkan"
- "module: vulkan"
- title: Cadence
labels:
labels:
- "release notes: cadence"
- "module: cadence"
- title: Runtime
labels:
labels:
- "release notes: runtime"
- "module: runtime"
- title: XNNPACK
labels:
labels:
- "release notes: xnnpack"
- "module: xnnpack"
- title: Devtools
labels:
labels:
- "release notes: devtools"
- "module: devtools"
- title: Examples
labels:
labels:
- "release notes: examples"
- title: LLM
labels:
- "module: llm"
- title: Mediatek
labels:
labels:
- "release notes: mediatek"
- "partner: mediatek"
- title: Openvino
labels:
labels:
- "release notes: openvino"
- title: Qualcomm
labels:
labels:
- "release notes: qualcomm"
- "partner: qualcomm"
- "module: qnn"
- title: Training
labels:
labels:
- "release notes: training"
- "module: training"
- title: Quantization
labels:
labels:
- "release notes: quantization"
- title: Ops & kernels
labels:
- "release notes: ops & kernels"
labels:
- "release notes: ops & kernels"
- "module: kernels"
- title: Other Changes
labels:
- "*"
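One way to read the category list above: each pull request is filed under the first category that lists one of its labels, and `*` catches whatever is left. The sketch below assumes that first-match behavior (worth checking against GitHub's release-notes documentation before relying on it), uses a hypothetical `categorize` helper, and copies only three of the categories from the diff.

```python
# Subset of the categories from .github/release.yml, in file order.
CATEGORIES = [
    ("ARM", ["release notes: arm", "module: arm", "partner: arm"]),
    ("Android", ["module: android"]),
    ("Other Changes", ["*"]),
]


def categorize(pr_labels):
    """Hypothetical helper: return the title of the first matching category."""
    for title, labels in CATEGORIES:
        # "*" is treated as a catch-all; otherwise any shared label is a match.
        if "*" in labels or any(label in labels for label in pr_labels):
            return title
    return None


if __name__ == "__main__":
    print(categorize(["module: android"]))
    print(categorize(["ciflow/trunk"]))
```

Under this assumption the order of categories matters, which is why the catch-all sits last.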
14 changes: 10 additions & 4 deletions .github/workflows/_android.yml
@@ -14,14 +14,18 @@ jobs:
with:
runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-clang12-android
-submodules: 'true'
+submodules: 'recursive'
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
timeout: 90
upload-artifact: android-apps
upload-artifact-to-s3: true
script: |
set -eux

+# Use sccache for NDK compiler as well
+export CMAKE_CXX_COMPILER_LAUNCHER=sccache
+export CMAKE_C_COMPILER_LAUNCHER=sccache
+
# The generic Linux job chooses to use base env, not the one setup by the image
CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
conda activate "${CONDA_ENV}"
@@ -36,8 +40,9 @@ jobs:
cp ${BUILD_AAR_DIR}/executorch.aar $ARTIFACTS_DIR_NAME

mkdir -p ${ARTIFACTS_DIR_NAME}/library_test_dir
-bash .ci/scripts/build_android_instrumentation.sh
-cp ${BUILD_AAR_DIR}/executorch_android/build/outputs/apk/androidTest/debug/executorch_android-debug-androidTest.apk "${ARTIFACTS_DIR_NAME}/library_test_dir"
+bash extension/android/executorch_android/android_test_setup.sh
+(cd extension/android; ANDROID_HOME="${ANDROID_SDK:-/opt/android/sdk}" ./gradlew :executorch_android:assembleAndroidTest)
+cp extension/android/executorch_android/build/outputs/apk/androidTest/debug/executorch_android-debug-androidTest.apk "${ARTIFACTS_DIR_NAME}/library_test_dir"

mkdir -p ${ARTIFACTS_DIR_NAME}/fp32-xnnpack-custom
bash examples/models/llama/install_requirements.sh
@@ -130,7 +135,8 @@ jobs:
# https://github.com/ReactiveCircus/android-emulator-runner. The max number
# of cores we can set is 6, any higher number will be reduced to 6.
cores: 6
-ram-size: 12288M
+ram-size: 16384M
+heap-size: 12288M
force-avd-creation: false
disable-animations: true
emulator-options: -no-snapshot-save -no-window -gpu swiftshader_indirect -noaudio -no-boot-anim -camera-back none
62 changes: 62 additions & 0 deletions .github/workflows/android-perf-private-device-experiment.yml
@@ -0,0 +1,62 @@
name: android-perf (private devices)

on:
schedule:
- cron: 0 0,4,8,12,16,20 * * *
pull_request:
paths:
- .github/workflows/android-perf-private-device-experiment.yml
push:
branches:
- main
paths:
- .github/workflows/android-perf-private-device-experiment.yml
# Note: GitHub has an upper limit of 10 inputs
workflow_dispatch:
inputs:
models:
description: Models to be benchmarked
required: false
type: string
default: mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
devices:
description: Target devices to run benchmark
required: false
type: string
default: google_pixel_3_private_rooted
benchmark_configs:
description: The list of configs used the benchmark
required: false
type: string
workflow_call:
inputs:
models:
description: Models to be benchmarked
required: false
type: string
default: mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
devices:
description: Target devices to run benchmark
required: false
type: string
default: google_pixel_3_private_rooted
benchmark_configs:
description: The list of configs used the benchmark
required: false
type: string

concurrency:
group: android-perf-private-devices-${{ github.event.pull_request.number || github.ref_name }}-${{ github.ref_type == 'branch' && github.sha }}-${{ github.event_name == 'workflow_dispatch' }}-${{ github.event_name == 'schedule' }}
cancel-in-progress: true

jobs:
android:
uses: ./.github/workflows/android-perf.yml
secrets: inherit
permissions:
id-token: write
contents: read
with:
models: ${{ inputs.models || 'mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8' }}
devices: google_pixel_3_private_rooted
benchmark_configs: ${{ inputs.benchmark_configs }}
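The `models:` line in the `with:` block above, like the `ref:` lines in the other workflows, relies on `&&` and `||` returning one of their operands, which gives a ternary-like fallback. Python's `and`/`or` behave the same way, so the idea can be sketched as follows. The function names and sample values are illustrative only.

```python
DEFAULT_MODELS = (
    "mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,"
    "meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8"
)


def pick_models(inputs_models):
    # ${{ inputs.models || '<default>' }}: an empty or missing input
    # falls back to the default list.
    return inputs_models or DEFAULT_MODELS


def pick_ref(event_name, pr_head_sha, sha):
    # ${{ github.event_name == 'pull_request'
    #     && github.event.pull_request.head.sha || github.sha }}
    return (event_name == "pull_request" and pr_head_sha) or sha


if __name__ == "__main__":
    print(pick_models(""))                          # scheduled run: no inputs
    print(pick_ref("pull_request", "abc123", "def456"))
    print(pick_ref("push", "", "def456"))
```

The usual caveat of this idiom applies: if the middle operand is empty, the expression falls through to the last one even when the condition is true.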
8 changes: 6 additions & 2 deletions .github/workflows/android-perf.yml
@@ -345,14 +345,18 @@ jobs:
with:
runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-clang12-android
-submodules: 'true'
+submodules: 'recursive'
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
timeout: 90
upload-artifact: android-apps
upload-artifact-to-s3: true
script: |
set -eux

+# Use sccache for NDK compiler as well
+export CMAKE_CXX_COMPILER_LAUNCHER=sccache
+export CMAKE_C_COMPILER_LAUNCHER=sccache
+
# The generic Linux job chooses to use base env, not the one setup by the image
CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
conda activate "${CONDA_ENV}"
@@ -392,7 +396,7 @@ jobs:
fail-fast: false
with:
# Due to scheduling a job may be pushed beyond the default 60m threshold
-timeout: 120
+timeout: 240
device-type: android
runner: linux.2xlarge
test-infra-ref: ''