temporarily disable zero on device memory during kv_cache alloc (#1269)
Co-authored-by: Ryan Hill <[email protected]>
guschmue and RyanUnderhill authored Feb 21, 2025
1 parent dee4160 commit 0f58d95
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/models/kv_cache.cpp
@@ -187,7 +187,10 @@ DefaultKeyValueCache::DefaultKeyValueCache(State& state)
 sb_kv_caches_.empty() ? OrtValue::CreateTensor(Allocator(), shape_, type_)
 : sb_kv_caches_[i]->CreateTensorOnStaticBuffer(shape_, type_));
 // Zero the memory so we don't leak any data from the previous run
-ByteWrapTensor(Device(), *presents_.back()).Zero();
+// WebGPU device has no Zero() implementation yet. Since this zeroing is optional we disable it for WebGPU for now
+if (Device().GetType() != DeviceType::WEBGPU) {
+  ByteWrapTensor(Device(), *presents_.back()).Zero();
+}
 }
 } catch (const Ort::Exception&) {
 std::ostringstream oss;
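
The guard in this change can be illustrated in isolation. The following is a minimal sketch, not the project's code: `DeviceKind`, `SupportsZero`, and `ZeroIfSupported` are hypothetical names, and a host-side `std::vector` stands in for device memory. It assumes only what the commit states, namely that zeroing is optional and that the WebGPU backend cannot do it yet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the project's DeviceType enum.
enum class DeviceKind { Cpu, Cuda, WebGpu };

// Hypothetical capability check: true when the backend can zero a buffer.
// Per the commit, WebGPU has no Zero() implementation yet.
constexpr bool SupportsZero(DeviceKind kind) {
  return kind != DeviceKind::WebGpu;
}

// Zeroes `bytes` when the backend supports it and reports whether it did.
// When zeroing is skipped, whatever was in the buffer stays there.
bool ZeroIfSupported(DeviceKind kind, std::vector<std::uint8_t>& bytes) {
  if (!SupportsZero(kind)) {
    return false;  // Skipped: buffer contents are left untouched.
  }
  std::fill(bytes.begin(), bytes.end(), std::uint8_t{0});
  // Postcondition: every byte is now zero.
  assert(std::all_of(bytes.begin(), bytes.end(),
                     [](std::uint8_t b) { return b == 0; }));
  return true;
}
```

Returning a flag lets a caller log or test whether a freshly allocated cache buffer may still hold data from a previous run, which is the case the original comment warns about.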
