EDIT: the previous answer referred to TensorFlow Lite code; I have updated it to refer to TensorFlow.
Looking at the implementation of TensorFlow's `quantize_weights` transform, these are the cases where a weight tensor does not get quantized:
- the tensor is not of type float
- the tensor has fewer than 1024 weights (or another threshold specified by the `minimum_size` parameter)
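The two exclusion rules above can be sketched as a small predicate; this is an illustrative re-statement in Python (with NumPy arrays standing in for weight tensors), not the actual TensorFlow source:

```python
import numpy as np

DEFAULT_MINIMUM_SIZE = 1024  # default threshold used by quantize_weights

def would_be_quantized(weights, minimum_size=DEFAULT_MINIMUM_SIZE):
    """Mirror the two exclusion rules described above (illustrative only)."""
    if weights.dtype != np.float32:    # rule 1: only float tensors are quantized
        return False
    if weights.size < minimum_size:    # rule 2: small tensors are skipped
        return False
    return True

small_floats = np.zeros((10, 10), dtype=np.float32)  # 100 weights -> skipped
large_floats = np.zeros((64, 64), dtype=np.float32)  # 4096 weights -> quantized
large_ints   = np.zeros((64, 64), dtype=np.int32)    # not float -> skipped
```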
If you can modify the nodes in the graph so that they are excluded by one of the rules above, run the quantization, and then revert those nodes to their pre-quantized state, you might be able to achieve this.
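For reference, the second rule is also tunable directly: when invoking `quantize_weights` through the `transform_graph` tool you can raise `minimum_size` so that small tensors are left alone. The graph names and file paths below are placeholders for your own model:

```shell
# Illustrative invocation of the graph_transforms tool (TF 1.x);
# raising minimum_size excludes any tensor with fewer weights than the threshold.
bazel run tensorflow/tools/graph_transforms:transform_graph -- \
  --in_graph=model.pb \
  --out_graph=model_quantized.pb \
  --inputs=input \
  --outputs=output \
  --transforms='quantize_weights(minimum_size=2048)'
```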