v2.0.0-beta.3
Pre-release
New feature:
- derivatives for deep tensor (#805)
Performance improvement:
- speed up ROCm kernels that use atomicAdd (#809, #815) (from ByteDance)
- speed up CUDA kernels that use atomicAdd by reducing global memory writes (#811)
Enhancement:
- add type-embedding developer doc (#762)
- add model compression support for models with exclude_types feature (#754)
- improve the doc and user interface of model compression (#772)
- allow c++ tests to run without internet (#785)
- support converting models generated in v1.3 to be compatible with v2.0 (#725)
- give a default value to T and convert models from v1.2 to be compatible with v2.0 (#789)
- improve documentation for conda (#798)
- show a message if the TensorFlow runtime is incompatible (#797)
- catch out-of-memory (OOM) errors and print a debug message (#801)
Bug fixes: