temporarily disable zero on device memory during kv_cache alloc
guschmue committed Feb 20, 2025
1 parent 16fb079 commit 15b251d
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/models/kv_cache.cpp
@@ -187,7 +187,10 @@ DefaultKeyValueCache::DefaultKeyValueCache(State& state)
 sb_kv_caches_.empty() ? OrtValue::CreateTensor(Allocator(), shape_, type_)
 : sb_kv_caches_[i]->CreateTensorOnStaticBuffer(shape_, type_));
 // Zero the memory so we don't leak any data from the previous run
-ByteWrapTensor(Device(), *presents_.back()).Zero();
+if (Device().GetType() != DeviceType::WEBGPU) {
+// ort c api does have a method to update device memory - temporarily disable.
+ByteWrapTensor(Device(), *presents_.back()).Zero();
+}
 }
 } catch (const Ort::Exception&) {
 std::ostringstream oss;
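
The guard added in this hunk can be shown in isolation with a minimal sketch. It does not use the real onnxruntime-genai types: `FakeBuffer` and `ZeroUnlessWebGpu` are hypothetical stand-ins, and the `DeviceType` enum here is a local assumption that only borrows the `WEBGPU` name from the diff. The point is the effect of the branch: every device except WebGPU gets its freshly allocated buffer zeroed, while a WebGPU buffer keeps whatever bytes the allocator handed back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the device enum; only WEBGPU is taken from the diff.
enum class DeviceType { CPU, CUDA, WEBGPU };

// Hypothetical stand-in for a freshly allocated KV cache tensor.
struct FakeBuffer {
  DeviceType device;
  std::vector<std::uint8_t> bytes;
};

// Zeroes the buffer unless it lives on WebGPU, mirroring the guard in the hunk.
// Returns true if the buffer was zeroed, false if zeroing was skipped.
inline bool ZeroUnlessWebGpu(FakeBuffer& buf) {
  if (buf.device == DeviceType::WEBGPU) {
    return false;  // skipped: contents are left exactly as allocated
  }
  std::fill(buf.bytes.begin(), buf.bytes.end(), std::uint8_t{0});
  return true;
}
```

Under these assumptions, a caller that relies on the cache starting at zero would need to treat the WebGPU path separately for as long as the guard stays in place, since the "don't leak any data from the previous run" comment no longer holds there.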