[tmva][sofie] Add support for optimal memory allocation of dynamic tensors #20434
base: master
Conversation
Test Results: 22 files, 22 suites, 3d 23h 10m 4s ⏱️. For more details on these failures, see this check. Results for commit 8db589a. ♻️ This comment has been updated with latest results.
sanjibansg
left a comment
Looks good to me overall, just some questions:
for (size_t i = 0; i < fNBroadcastedInputs.size(); i++) {
   inputs[i] = fNBroadcastedInputs[i] + "[id]";
}
// implement operator without broadcasting, but using loos on all indices
Suggested change:
- // implement operator without broadcasting, but using loos on all indices
+ // implement operator without broadcasting, but using loops on all indices
std::copy(inputData, inputData + inputLength, outputData.begin() + offset);
offset += inputLength;
// data do not need to be written as a weight
// data do not need to be written in teh generated code
Suggested change:
- // data do not need to be written in teh generated code
+ // data do not need to be written in the generated code
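The `std::copy`-plus-offset pattern quoted above is the core of concatenation by appending contiguous chunks. A minimal, self-contained sketch of concatenating along an arbitrary axis, which also illustrates why a non-leading concat axis needs an outer loop over the preceding dimensions (the function name and signature are illustrative, not SOFIE's actual Concat implementation):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Concatenate several float tensors along 'axis'. All inputs are assumed to
// share the dimensions before 'axis'. For each outer slice, each input's
// contiguous chunk is copied at a running offset into the output.
std::vector<float> ConcatAxis(const std::vector<std::vector<float>> &inputs,
                              const std::vector<std::vector<size_t>> &shapes,
                              size_t axis)
{
   // number of outer slices = product of dims before the concat axis
   size_t outer = 1;
   for (size_t d = 0; d < axis; d++) outer *= shapes[0][d];
   // per-input contiguous chunk = product of dims from the axis onwards
   std::vector<size_t> chunk(inputs.size());
   size_t outChunk = 0;
   for (size_t i = 0; i < inputs.size(); i++) {
      size_t c = 1;
      for (size_t d = axis; d < shapes[i].size(); d++) c *= shapes[i][d];
      chunk[i] = c;
      outChunk += c;
   }
   std::vector<float> out(outer * outChunk);
   for (size_t o = 0; o < outer; o++) {
      size_t offset = o * outChunk;
      for (size_t i = 0; i < inputs.size(); i++) {
         std::copy(inputs[i].begin() + o * chunk[i],
                   inputs[i].begin() + (o + 1) * chunk[i],
                   out.begin() + offset);
         offset += chunk[i];
      }
   }
   return out;
}
```

With axis = 0 the outer loop runs once and the function degenerates to the simple append shown in the hunk above; the bug mentioned in the commit log concerned the non-leading-axis case.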
tmva/sofie/src/RModel.cxx
Outdated
//fGC += "std::vector<float> fTensor_" + i.first + ";\n";
fGC += "float * tensor_" + i.first + " = nullptr;\n";
} else if (i.second.type == ETensorType::DOUBLE) {
fGC += "std::vector<double> fTensor_" + i.first + ";\n";
//fGC += "std::vector<double> fTensor_" + i.first + ";\n";
fGC += "double * tensor_" + i.first + " = nullptr;\n";
} else if (i.second.type == ETensorType::INT64) {
fGC += "std::vector<int64_t> fTensor_" + i.first + ";\n";
//fGC += "std::vector<int64_t> fTensor_" + i.first + ";\n";
fGC += "int64_t * tensor_" + i.first + " = nullptr;\n";
} else if (i.second.type == ETensorType::BOOL) {
//fGC += "std::vector<uint8_t> fTensor_" + i.first + ";\n";
fGC += "uint8_t * tensor_" + i.first + " = nullptr;\n";
maybe we remove the commented out code?
tmva/sofie/src/RModel.cxx
Outdated
bool modelHasWeights = false;
for (auto &i : fInitializedTensors) {
   if (i.second.type() == ETensorType::FLOAT) {
      if (i.second.IsWeightTensor()) {
Will it be an issue if we do not make type checks here?
// for (auto &i : fDynamicTensorInfos) {
//    auto length = ConvertDynamicShapeToLength(i.second.shape);
//    out << SP << "if (" << length << " > 0) {\n";
//    out << SP << SP << "fTensor_" << i.first << ".resize(" << length << ");\n";
//    out << SP << SP << "tensor_" << i.first << " = fTensor_" << i.first << ".data();\n";
//    out << SP << "}\n";
// }
maybe we can remove this commented code?
struct MemoryEvent {
   int t;    // time (i.e. operator index)
   int type; // 0 = END first, 1 = START
what does the tensor index signify here?
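Judging from the struct above and the reviewer's question, each event presumably also carries the index of the tensor whose lifetime starts or ends at time `t`, so the allocator can look up that tensor's size and offset while sweeping events in order. A hedged sketch of such an event type and its ordering (field name `itensor` and the comparator are assumptions, not SOFIE's actual code); the point of `type` sorting 0-before-1 is that memory freed at step `t` can be reused by a tensor that starts at the same step:

```cpp
#include <algorithm>
#include <vector>

struct MemoryEvent {
   int t;       // time (i.e. operator index)
   int type;    // 0 = END first, 1 = START
   int itensor; // assumed: index of the tensor this event refers to
};

// Order by time; at equal time, END events (type 0) come before START events,
// so a buffer released at step t is available to a tensor starting at step t.
bool operator<(const MemoryEvent &a, const MemoryEvent &b)
{
   return a.t < b.t || (a.t == b.t && a.type < b.type);
}
```

Sorting a vector of such events then yields the sweep order in which lifetime intervals open and close.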
// In case of session add broadcasting code in Session constructor and in GenerateInitCode
// we need to add a new intermediate tensor for broadcasted bias tensor
// fNC2 = fNC + "bcast";
// if (!fIsDynamic) {
//    model.AddIntermediateTensor(fNC2, model.GetTensorType(fNC), shapeY);
// }
// else
//    model.AddDynamicTensor(fNC2, model.GetTensorType(fNC), fShapeY);
// // do not add to lists of input/output tensors since broadcasted tensors are special
// // and we manage their memory separatly
// //fInputTensorNames.emplace_back(fNC2);
// //fOutputTensorNames.emplace_back(fNC2);
if this else block is not needed anymore, maybe we can remove the if-else branching completely?
Add missing support for dynamic tensors for some operators. With this commit, full support for dynamic tensors is available for the ParticleNet model. Also fix a bug in the Concat operator when the concat axis is not the first one.
Since we now use a std::vector<uint8_t> for boolean tensors, no special treatment is needed when the output type of the operator is a boolean (e.g. in Comparison).
…ensors Add a new function in SOFIE_common, OrganizeMemory, which computes the total memory and the offset for each tensor given the tensor begin/end lifetime and size. Fix also some small issues with dynamic tensors. One is for the bias of Gemm and Conv: the broadcasting of the bias is done for dynamic tensors in the Session constructor only if needed. For the broadcasted tensor there is no need to create a new tensor; the existing one is resized to the needed broadcasted size using vector::resize.
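The commit describes computing a total size and per-tensor offsets from lifetimes and sizes. A minimal greedy first-fit sketch of that idea, placing each tensor at the lowest offset not occupied by any already-placed tensor with an overlapping lifetime (all names and the signature are illustrative assumptions, not SOFIE's actual OrganizeMemory):

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// A tensor is alive during the operator-index interval [begin, end).
struct TensorLife {
   int begin;
   int end;
   size_t size; // bytes
};

// Returns (total bytes needed, offset assigned to each tensor).
std::pair<size_t, std::vector<size_t>>
OrganizeMemorySketch(const std::vector<TensorLife> &tensors)
{
   std::vector<size_t> offsets(tensors.size());
   size_t total = 0;
   // place tensors in order of lifetime start
   std::vector<size_t> order(tensors.size());
   std::iota(order.begin(), order.end(), 0);
   std::sort(order.begin(), order.end(), [&](size_t a, size_t b) {
      return tensors[a].begin < tensors[b].begin;
   });
   for (size_t pos = 0; pos < order.size(); pos++) {
      size_t i = order[pos];
      // collect byte ranges of already-placed tensors whose lifetimes overlap
      std::vector<std::pair<size_t, size_t>> busy;
      for (size_t q = 0; q < pos; q++) {
         size_t j = order[q];
         if (tensors[j].begin < tensors[i].end && tensors[i].begin < tensors[j].end)
            busy.emplace_back(offsets[j], offsets[j] + tensors[j].size);
      }
      std::sort(busy.begin(), busy.end());
      // first-fit: slide past busy ranges until the tensor fits in a gap
      size_t off = 0;
      for (auto &b : busy) {
         if (off + tensors[i].size <= b.first) break; // gap before b fits
         off = std::max(off, b.second);
      }
      offsets[i] = off;
      total = std::max(total, off + tensors[i].size);
   }
   return {total, offsets};
}
```

With three 4-byte tensors alive during [0,2), [1,3) and [2,4), the first and third do not overlap, so they can share offset 0 and the pool needs only 8 bytes instead of 12.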
… broadcasting The assert that was generated when broadcasting dynamic tensors was not correct.
Apply also other fixes for the SOFIE tests and add a new test for StackMul.
The execution order was not set for tensor inputs to operators added using the GNN SOFIE classes. This is now fixed and the correct memory management can be performed.
Some fixes are needed for the tests, since the Session is not used for these tests. Need also to force using the Session in case of dynamic tensors. Fix also a warning in the Gemm operator and in RModel.
…on ctor Sort the Session shape parameters for dynamic tensors alphabetically, otherwise they may get a random order; a different order was observed on different platforms. Add some small improvements in the generated code (add number and shape information) when generating Gemm code.
…on ctor Avoid creating a broadcasted bias tensor, which uses lots of memory. Do the broadcasting of the bias on the fly before computing Gemm, using the output tensor. This saves a large amount of memory on models using large Gemm calls, like the ATLAS GNN model used for tracking.
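The on-the-fly broadcast described in this commit can be sketched as follows: instead of materialising an M×N broadcasted bias tensor, the length-N bias is copied row by row directly into the output buffer, which a subsequent gemm call with beta = 1 then accumulates into (the helper name is an assumption for illustration, not SOFIE's generated code):

```cpp
#include <algorithm>
#include <cstddef>

// Fill the M x N row-major output with copies of the length-N bias row.
// A following C = alpha*A*B + beta*C gemm call with beta = 1 then adds the
// matrix product on top, so no separate broadcasted bias tensor is needed.
void BroadcastBiasIntoOutput(const float *bias, float *out, size_t M, size_t N)
{
   for (size_t r = 0; r < M; r++)
      std::copy(bias, bias + N, out + r * N);
}
```

The memory saving is the point: for a large Gemm the broadcasted bias would be as large as the output itself, so reusing the output buffer halves the footprint of that step.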
Add alias tensors to cope with identity operators; in this case just a pointer assignment is performed by the operator. Exclude these tensors from the allocation and take care of them in the dynamic memory pool. Optimise the Slice operator when the slice is an identity.
This pull request provides support for optimal memory allocation of dynamic tensors.
A function to compute the total size and the optimal offset for each tensor, given the dynamic input parameters (e.g. batch size, number of input features, etc.), is added in SOFIE_Common.