Description
I think it is unclear to users how they should choose between `model.predict()` and `model.execute()`. The docstrings are too similar to understand the differences. The main distinction seems to be that `predict()` can provide batching, while `execute()` allows requesting intermediate Tensors. These are orthogonal concerns, and beyond those the functionality is so similar that perhaps these methods could be consolidated (mod API stability guarantees, I know).
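For concreteness, here is a minimal sketch of the two entry points on a `LayersModel`. The model, layer names, and shapes are made up for illustration; in particular, the output name passed to `execute()` is assumed to resolve to that layer's output Tensor, whereas the names a real model exposes depend on how it was built.

```ts
import * as tf from '@tensorflow/tfjs';

// Toy model with a named intermediate layer.
const model = tf.sequential({
  layers: [
    tf.layers.dense({units: 8, inputShape: [10], name: 'hidden'}),
    tf.layers.dense({units: 2, name: 'out'}),
  ],
});

const xs = tf.ones([64, 10]);

// predict(): strict input checking, optional internal batching via the config.
const preds = model.predict(xs, {batchSize: 16}) as tf.Tensor;

// execute(): request an intermediate Tensor by name ('hidden' is the layer
// name above; this is an assumption, real tensor names vary by model).
const hidden = model.execute(xs, 'hidden') as tf.Tensor;

preds.print();
hidden.print();
```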
Another difference between `predict()` and `execute()` is that the former does strict type and shape checking of the inputs while the latter does not (in both `LayersModel` and `GraphModel`, afaict). This fact (which may be relevant for performance, I guess) is not mentioned in the docstrings, so the user can't make an informed decision on that basis.
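A small sketch of that validation difference, again with a toy model: `predict()` rejects a badly shaped input up front with a descriptive error, whereas `execute()` performs no such explicit check, so any failure would surface from the underlying ops instead.

```ts
import * as tf from '@tensorflow/tfjs';

const model = tf.sequential({
  layers: [tf.layers.dense({units: 2, inputShape: [10]})],
});

// Wrong feature dimension: the model expects inputs of shape [null, 10].
const badInput = tf.ones([4, 7]);

try {
  model.predict(badInput);  // rejected by predict()'s explicit shape check
} catch (e) {
  console.log('predict() rejected the input:', (e as Error).message);
}
```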
`GraphModel` and `LayersModel` implement `InferenceModel.predict()` differently: `LayersModel.predict()` slices its input into batches according to the `config.batchSize` argument. `GraphModel.predict()` ignores the config, so it actually corresponds to `LayersModel.predictOnBatch()`.
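A sketch of that difference (the model URLs are hypothetical placeholders; any `LayersModel`/`GraphModel` pair would do):

```ts
import * as tf from '@tensorflow/tfjs';

async function compareBatching() {
  const layersModel = await tf.loadLayersModel('https://example.com/layers/model.json');
  const graphModel = await tf.loadGraphModel('https://example.com/graph/model.json');

  const xs = tf.ones([1024, 10]);

  layersModel.predict(xs, {batchSize: 32});  // 32 internal forward passes
  layersModel.predictOnBatch(xs);            // one forward pass over everything
  graphModel.predict(xs, {batchSize: 32});   // config ignored: also one pass
}
```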
What's more, `LayersModel` extends `Layer`, so it also has `call()` (which just delegates to `execute()` after reformatting the arguments and wrapping in `tidy()`), as well as `apply()`, which delegates to `call()` in the eager case. (At least these symbols, while exported, are not exposed in our API docs).
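For completeness, a sketch of that overlap on a toy model: in the eager case both paths below amount to the same forward pass.

```ts
import * as tf from '@tensorflow/tfjs';

const model = tf.sequential({
  layers: [tf.layers.dense({units: 2, inputShape: [10]})],
});
const xs = tf.ones([4, 10]);

const viaPredict = model.predictOnBatch(xs) as tf.Tensor;
const viaApply = model.apply(xs) as tf.Tensor;  // eager: apply() -> call() -> execute()

viaPredict.print();
viaApply.print();
```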
Finally, our tfjs-models variously use `predict*()`, `infer()`, and `classify()`. These are a bit less concerning, but still perhaps unnecessarily diverse. It's not obvious that e.g. our MobileNet wrapper should extend `InferenceModel`, but it's not obvious that it shouldn't, either.
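For example, the MobileNet wrapper's surface looks roughly like this (the dummy input tensor is illustrative; a real call would typically pass an image element):

```ts
import * as mobilenet from '@tensorflow-models/mobilenet';
import * as tf from '@tensorflow/tfjs';

async function demo() {
  const model = await mobilenet.load();
  const img = tf.zeros([224, 224, 3]) as tf.Tensor3D;

  const labels = await model.classify(img);  // [{className, probability}, ...]
  const embedding = model.infer(img, true);  // internal embedding tensor

  // ...and neither predict() nor execute() is part of this wrapper's surface.
  console.log(labels);
  embedding.print();
}
```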
The upshot is: it would be great to converge (over time) on fewer ways of doing the same thing, and to clearly explain the remaining choices in docstrings and maybe even a 'guide' doc.