# Larq Compute Engine Inference
To perform inference with Larq Compute Engine (LCE), we use the TensorFlow Lite interpreter. An LCE-compatible TensorFlow Lite interpreter drives the Larq model inference and uses LCE custom operators instead of built-in TensorFlow Lite operators for each applicable subgraph of the model.
This guide describes how to create a TensorFlow Lite interpreter with the LCE custom ops registered and run inference on a converted Larq model using the LCE C++ API.
## Load and run a model in C++
Running inference with TensorFlow Lite consists of multiple steps, which are described comprehensively in the TensorFlow Lite inference guide. Below we list these steps, with one additional step to register the LCE custom operators using the LCE C++ function `RegisterLCECustomOps()`; a minimal end-to-end sketch combining the steps follows the list:
1.  Load the FlatBuffer model:

    ```cpp
    // Load model
    std::unique_ptr<tflite::FlatBufferModel> model =
        tflite::FlatBufferModel::BuildFromFile(filename);
    ```

2.  Build a `BuiltinOpResolver` with the LCE custom operators registered:

    ```cpp
    // Create a builtin OpResolver
    tflite::ops::builtin::BuiltinOpResolver resolver;

    // Register LCE custom ops
    compute_engine::tflite::RegisterLCECustomOps(&resolver);
    ```

3.  Build an `Interpreter` with the custom `OpResolver`:

    ```cpp
    // Build the interpreter
    tflite::InterpreterBuilder builder(*model, resolver);
    std::unique_ptr<tflite::Interpreter> interpreter;
    builder(&interpreter);
    ```

4.  Set the input tensor values:

    ```cpp
    // Resize input tensors, if desired, then allocate tensor buffers.
    interpreter->AllocateTensors();

    float* input = interpreter->typed_input_tensor<float>(0);
    // Fill `input`.
    ```

5.  Invoke inference:

    ```cpp
    interpreter->Invoke();
    ```

6.  Read the inference results:

    ```cpp
    float* output = interpreter->typed_output_tensor<float>(0);
    ```
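Putting these steps together, the sketch below shows what a minimal end-to-end inference program could look like. It only restates the steps above; the include path `larq_compute_engine/tflite/kernels/lce_ops_register.h` for `RegisterLCECustomOps()`, the file name handling, and the error handling are illustrative assumptions, so refer to the LCE minimal example for the authoritative version.

```cpp
// Minimal LCE inference sketch. Header paths for the LCE registration
// function are assumed; see the LCE minimal example for the real thing.
#include <cstdio>
#include <memory>

#include "larq_compute_engine/tflite/kernels/lce_ops_register.h"  // assumed path
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main(int argc, char* argv[]) {
  if (argc != 2) {
    fprintf(stderr, "Usage: lce_inference <converted_model.tflite>\n");
    return 1;
  }
  const char* filename = argv[1];

  // 1. Load the FlatBuffer model.
  std::unique_ptr<tflite::FlatBufferModel> model =
      tflite::FlatBufferModel::BuildFromFile(filename);
  if (model == nullptr) return 1;

  // 2. Build an OpResolver with the LCE custom ops registered.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  compute_engine::tflite::RegisterLCECustomOps(&resolver);

  // 3. Build the interpreter.
  tflite::InterpreterBuilder builder(*model, resolver);
  std::unique_ptr<tflite::Interpreter> interpreter;
  builder(&interpreter);
  if (interpreter == nullptr) return 1;

  // 4. Allocate tensor buffers and fill the input.
  if (interpreter->AllocateTensors() != kTfLiteOk) return 1;
  float* input = interpreter->typed_input_tensor<float>(0);
  // ... fill `input` with preprocessed data here ...
  (void)input;

  // 5. Run inference.
  if (interpreter->Invoke() != kTfLiteOk) return 1;

  // 6. Read the output.
  float* output = interpreter->typed_output_tensor<float>(0);
  printf("First output value: %f\n", output[0]);
  return 0;
}
```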
When building the inference binary with Bazel, link it against the `//larq_compute_engine/tflite/kernels:lce_op_kernels` target; a sketch of such a target is shown below. See the LCE minimal example for a complete working program.
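For reference, a Bazel target for such a binary might look like the following sketch. The target name and source file are hypothetical, and the `@org_tensorflow//tensorflow/lite:framework` dependency label is an assumption; only the `lce_op_kernels` dependency comes from this guide.

```
# BUILD sketch (hypothetical names; TF Lite dependency label assumed)
cc_binary(
    name = "lce_inference",
    srcs = ["lce_inference.cc"],
    deps = [
        "//larq_compute_engine/tflite/kernels:lce_op_kernels",
        "@org_tensorflow//tensorflow/lite:framework",
    ],
)
```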