Releases: EnzymeAD/Reactant.jl
v0.2.27
Reactant v0.2.27
Merged pull requests:
- Format code of branch "main" (#711) (@github-actions[bot])
- feat: overload ifelse for more types (#712) (@avik-pal)
- fix: multi-device execution and sharding [take III] (#713) (@avik-pal)
- Replace capture maps with `Holded` wrapper (#715) (@mofeing)
- refactor: split XLA.jl into multiple files (#716) (@avik-pal)
- feat: enable async on CPU (#717) (@avik-pal)
- [ReactantExtra] IFRT bindings (round 4) (#718) (@mofeing)
- [ReactantExtra] feat: OpSharding bindings for Julia (#721) (@avik-pal)
- [ReactantExtra] fix: build on mac (#722) (@avik-pal)
- Update WORKSPACE (#723) (@avik-pal)
- Fix jll (#724) (@wsmoses)
Closed issues:
- shardy functions not visible on macos (#714)
v0.2.26
Reactant v0.2.26
Merged pull requests:
- `@trace` function calls (#366) (@jumerckx)
- chore: missing upstream optimization passes (#624) (@avik-pal)
- feat: shardy and multi device execution (#637) (@avik-pal)
- Regenerate MLIR Bindings (#686) (@github-actions[bot])
- Misc fixes (#687) (@wsmoses)
- dict value fix (#688) (@wsmoses)
- [deps] Some improvements to the `build_local.jl` script (#689) (@giordano)
- Multiple device error (#690) (@wsmoses)
- feat: API changes for multi-device execution [ReactantExtra JLL changes] (#692) (@avik-pal)
- Wrapping RCReferences (#697) (@hhkit)
- Ref ptr fix (#698) (@wsmoses)
- Add GPUCompiler and LLVM as deps to CUDA extension and run CUDA tests on macOS (#700) (@giordano)
- vendor optimize (#703) (@wsmoses)
- [ReactantExtra] Stop removing references to `hardware_interference_size` (#704) (@giordano)
- Update Project.toml (#705) (@wsmoses)
- JLL related fixups (#706) (@wsmoses)
- Regenerate MLIR Bindings (#708) (@github-actions[bot])
- Format code of branch "main" (#709) (@github-actions[bot])
- fix: don't trace val (#710) (@avik-pal)
Closed issues:
v0.2.25
Reactant v0.2.25
Merged pull requests:
- make `similar` return empty tensors. (#632) (@jumerckx)
- Use `LLVMOpenMP_jll` to call OpenMP functions (#673) (@giordano)
- [ReactantCUDAExt] Skip precompile load on Julia v1.11.3 (#675) (@giordano)
- Regenerate MLIR Bindings (#680) (@github-actions[bot])
- [ReactantExtra] Add argument to `ClientCompile` to pass CUDA data dir (#683) (@giordano)
- CUDA: fix gc issues (#685) (@wsmoses)
Closed issues:
v0.2.24
v0.2.23
Reactant v0.2.23
Merged pull requests:
- Regenerate MLIR Bindings (#627) (@github-actions[bot])
- [CI] Add workflow to clean up docs previews (#628) (@giordano)
- fix: build error with shardy (#629) (@avik-pal)
- [ReactantExtra] Improvements to BUILD file to compile CUDA for aarch64 (#631) (@giordano)
- fix cuda abi setting (#633) (@wsmoses)
- Format code of branch "main" (#634) (@github-actions[bot])
- [tests] Replace random custom type numbers with fixed set of numbers (#636) (@giordano)
- Add IR dumping (#638) (@wsmoses)
- [ReactantExtra] Bump XLA version (#640) (@giordano)
- TPU profiler (#642) (@Pangoraw)
- Applehw (#643) (@wsmoses)
- Regenerate MLIR Bindings (#644) (@github-actions[bot])
- feat: add dispatch for KA get_backend (#645) (@avik-pal)
- Use `xla/stream_executor/cuda:cuda_compute_capability_proto_cc_impl` only on non CUDA (#646) (@giordano)
- CPU backend (#647) (@wsmoses)
- docs: add shardy to docs (#648) (@avik-pal)
- chore: generate shardy c wrappers (#650) (@avik-pal)
- Regenerate MLIR Bindings (#651) (@github-actions[bot])
- chore: missing header files in API (#652) (@avik-pal)
- feat: the big jll PR (#653) (@avik-pal)
- [CI] Fix path of previews directory in PreviewCleanup workflow (#656) (@giordano)
- Detect TPU using PCI devices (#659) (@Pangoraw)
- Replace `trim` -> `strip` (#661) (@giordano)
- Silence various warnings in tests (#662) (@giordano)
- Feature: allow colon indexing of traced vectors (#664) (@floffy-f)
- Format code of branch "main" (#665) (@github-actions[bot])
- Regenerate MLIR Bindings (#666) (@github-actions[bot])
- KA ext (#667) (@wsmoses)
- [docs] Add information about configuration on GPU and TPU systems (#668) (@giordano)
- Fix ntuple traced type issue on unionall (#669) (@wsmoses)
Closed issues:
v0.2.22
Reactant v0.2.22
Merged pull requests:
- [CI] Move tests on aarch64 linux to GitHub Actions (#543) (@giordano)
- feat: multi GPU support (#587) (@avik-pal)
- feat: expose gpu memory allocation options (#589) (@avik-pal)
- Fix condition to skip CUDA tests on aarch64 (#592) (@giordano)
- feat: add the new optimization passes (#595) (@avik-pal)
- feat: support lowering custom fp types (#596) (@avik-pal)
- Update ReactantCUDAExt.jl (#597) (@wsmoses)
- Add convert (#598) (@wsmoses)
- feat: support dynamic indexing for reshaped arrays (#601) (@avik-pal)
- Fix dense elements attribute in `Enzyme.autodiff` #593 (#604) (@mofeing)
- feat: overload LinearAlgebra.kron (#607) (@avik-pal)
- feat: more indexing support (#608) (@avik-pal)
- feat: forward more base ops to chlo (#611) (@avik-pal)
- Add hermetic cuda getter (#612) (@wsmoses)
- [tests] Always skip CUDA tests on non-CUDA machines (#615) (@giordano)
- Typed rounding (#619) (@wsmoses)
- Regenerate MLIR Bindings (#621) (@github-actions[bot])
- feat: build the shardy dialect (#622) (@avik-pal)
- feat: support more set indexing (#625) (@avik-pal)
- Add bound optimizations (#626) (@wsmoses)
Closed issues:
v0.2.21
v0.2.20
Reactant v0.2.20
Merged pull requests:
- [ReactantExtra] Use XLA commit for building with CUDA 12.1 (#579) (@giordano)
- Regenerate MLIR Bindings (#580) (@github-actions[bot])
- Profiler annotations & tutorial (#582) (@Pangoraw)
- Fix for unknown cuda drivers (#586) (@wsmoses)
Closed issues:
- Profiling Tutorial (#581)
v0.2.19
Reactant v0.2.19
Merged pull requests:
- respect scopping rules in for (#310) (@Pangoraw)
- Despecialize make_tracer (#540) (@wsmoses)
- XLA profiler (#541) (@Pangoraw)
- feat: add isnan and isfinite dispatches (#544) (@avik-pal)
- unionnone (#545) (@wsmoses)
- print (#547) (@wsmoses)
- add int override (#549) (@wsmoses)
- docs: missing doc links in sidebar and navbar (#551) (@avik-pal)
- Fix downgrader CI job (#553) (@giordano)
- Simplify process to builds docs (#554) (@giordano)
- fix: define getindexing into sub reshaped array (#556) (@avik-pal)
- fix: inconsistent return dims (#558) (@avik-pal)
- [CI] Format generated files twice to work around JuliaFormatter bug (#560) (@giordano)
- Regenerate MLIR Bindings (#561) (@github-actions[bot])
- Format code of branch "main" (#562) (@github-actions[bot])
- [GHA] Add `paths` settings for workflow triggers (#563) (@giordano)
- CUDA: fix nv intrinsic errs (#564) (@wsmoses)
- feat: support arbitrary structures in control flow (#565) (@avik-pal)
- [GHA] Fix syntax of regenerate MLIR bindings workflow (#566) (@giordano)
- More jll/cuda stuff (#567) (@wsmoses)
- Format code of branch "main" (#568) (@github-actions[bot])
- fix: reduction of integers (#573) (@avik-pal)
- profiler: Add option to generate perfetto url (#575) (@Pangoraw)
- [CI] Remove useless call to `Pkg.instantiate` (#576) (@giordano)
- fix: specialize / on integer types (#577) (@avik-pal)
Closed issues:
- `setindex!` doesn't work with `@trace` (#210)
- Incorrect code-generation for `@trace for ...` (#301)
- Precompiling `Reactant` errors in GPU-related symbol (#526)
- How to check NaN? (#542)
- MethodError for `setindex!` in `sum!` (#548)
- Error in `fill!` (#550)
- [Docs] Stableurl doesn't go anywhere (#552)
- Subarray indexing error (#555)
- Mark all kernel arguments with the llvm.noalias attribute (#571)
- [Profiling] Add option to autogenerate perfetto and/or tensorboad url (#572)
- Incorrect division of Integers (`/` operator not `div`) (#574)
v0.2.18
Reactant v0.2.18
Merged pull requests:
- linearize kernel args (#497) (@mofeing)
- Ka2 (#498) (@wsmoses)
- Regenerate MLIR Bindings (#501) (@github-actions[bot])
- linearize aliased kernel args (#504) (@jumerckx)
- Split `should_rewrite_ft` for `call` and `invoke` expressions, and overlay `Base._unique_dims` (#505) (@mofeing)
- feat: add rsqrt simplification (#506) (@avik-pal)
- Regenerate MLIR Bindings (#507) (@github-actions[bot])
- Format code of branch "main" (#509) (@github-actions[bot])
- feat: optimization passes (#510) (@avik-pal)
- Regenerate MLIR Bindings (#513) (@github-actions[bot])
- Make v and hcat with numbers work. (#514) (@jaeminoh)
- XLA Allocator stats (#517) (@Pangoraw)
- fix: generalize broadcast_in_dims for setindex (#518) (@avik-pal)
- Format code of branch "main" (#520) (@github-actions[bot])
- WIP: adapt to sroa jll (#521) (@wsmoses)
- Kernel: support constant input arg (#522) (@wsmoses)
- Implement `isnan` and `isfinite` for TracedRNumber (#525) (@Pangoraw)
- Format code of branch "main" (#528) (@github-actions[bot])
- feat: sorting and related functions (#529) (@avik-pal)
- Regenerate MLIR Bindings (#531) (@github-actions[bot])
- Format code of branch "main" (#533) (@github-actions[bot])
- Generalize precompilation support (#534) (@wsmoses)
- More constprop (#536) (@wsmoses)
- Fix tolerance on loggamma integration test (#537) (@wsmoses)
- Fix missing dialects in docs (#538) (@wsmoses)
Closed issues:
- minimize XLA error in gemm_autotuner for CUDA (#444)
- Not support `partialsortperm`? (#485)
- KernelAbstractions + Reactant: UndefVarError: `pop` not defined (#488)
- Infinite recursion on `unique(::Vector{Symbol})` within Reactant (#493)
- Concatenation of scalar and ConcreteRArray gives a Vector (#511)
- Incorrect `broadcast_to_size` implementation (#512)
- How to set NaN values in an RArray to a certain number? (#524)
- Precompilation of Reactant 0.2.1x fails (#527)