A Quantizer defines the way of transforming a full precision input to a quantized output and the pseudo-gradient method used for the backwards pass.
Quantizers can either be used through the quantizer arguments supported by Larq layers, such as input_quantizer and kernel_quantizer; or they can be used similarly to activations, i.e. either through an Activation layer or through the activation argument supported by all forward layers:
import tensorflow as tf
import larq as lq
x = lq.layers.QuantDense(64, activation=None)(x)
x = lq.layers.QuantDense(64, input_quantizer="ste_sign")(x)
is equivalent to:
x = lq.layers.QuantDense(64)(x)
x = tf.keras.layers.Activation("ste_sign")(x)
x = lq.layers.QuantDense(64)(x)
as well as:
x = lq.layers.QuantDense(64, activation="ste_sign")(x)
x = lq.layers.QuantDense(64)(x)
We highly recommend using the first of these formulations: for the other two formulations, intermediate layers - like batch normalization or average pooling - and shortcut connections may result in non-binary input to the convolutions.
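To illustrate why intermediate layers matter, the following NumPy sketch (not Larq code) averages pairs of binary activations the way an average pooling layer would. The array values and the reshape-based pooling are assumptions chosen for brevity.

```python
import numpy as np

# Binary activations, e.g. the output of a "ste_sign" activation.
binary = np.array([1.0, 1.0, -1.0, 1.0])

# Stand-in for average pooling with window size 2 (illustrative only).
pooled = binary.reshape(2, 2).mean(axis=1)

# The pooled values are no longer restricted to {-1, +1}.
is_binary = bool(np.all(np.isin(pooled, [-1.0, 1.0])))

# Quantizing at the layer input, which is what input_quantizer does,
# restores binary values right before the next layer consumes them.
requantized = np.where(pooled >= 0, 1.0, -1.0)
```

Here the second pooled value is 0, which is neither -1 nor +1, so a following layer without an input quantizer would receive non-binary input.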
Quantizers can either be referenced by string or called directly. The following usages are equivalent:
lq.layers.QuantDense(64, kernel_quantizer="ste_sign")
lq.layers.QuantDense(64, kernel_quantizer=lq.quantizers.SteSign(clip_value=1.0))
larq.quantizers.NoOpQuantizer(precision, **kwargs)
Instantiates a serializable no-op quantizer.
\[ q(x) = x \]
This quantizer will not change the input variable. It is only intended to mark variables with a desired precision that will be recognized by optimizers like Bop and add training metrics to track variable changes.
layer = lq.layers.QuantDense(
    16, kernel_quantizer=lq.quantizers.NoOpQuantizer(precision=1),
)
layer.build((32,))  # the kernel variable only exists once the layer is built
assert layer.kernel.precision == 1
- precision: Set the desired precision of the variable. This can be used to tag
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
larq.quantizers.SteSign(clip_value=1.0, **kwargs)
Instantiates a serializable binary quantizer.
\[ q(x) = \begin{cases} -1 & x < 0 \\ 1 & x \geq 0 \end{cases} \]
The gradient is estimated using the Straight-Through Estimator (essentially the binarization is replaced by a clipped identity on the backward pass). \[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & \left|x\right| \leq \texttt{clip_value} \\ 0 & \left|x\right| > \texttt{clip_value} \end{cases}\]
- clip_value: Threshold for clipping gradients. If None, gradients are not clipped.
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
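The forward and backward rules above can be sketched in plain NumPy. This is an illustration of the formulas, not Larq's implementation; the function names are hypothetical, and treating clip_value=None as "no clipping" is an assumption.

```python
import numpy as np

def ste_sign_forward(x):
    """Forward pass: -1 for x < 0, +1 for x >= 0."""
    return np.where(x >= 0, 1.0, -1.0)

def ste_sign_grad(x, upstream, clip_value=1.0):
    """Backward pass: clipped identity applied to the upstream gradient."""
    if clip_value is None:
        # Assumption: None disables clipping, so the gradient passes unchanged.
        return upstream
    # Gradient is 1 where |x| <= clip_value and 0 elsewhere.
    return upstream * (np.abs(x) <= clip_value)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
y = ste_sign_forward(x)
g = ste_sign_grad(x, np.ones_like(x))
```

With the default clip_value of 1.0, the two inputs at -2.0 and 2.0 receive zero gradient while the others pass the upstream gradient through unchanged.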
larq.quantizers.ApproxSign(*args, metrics=None, **kwargs)
Instantiates a serializable binary quantizer. \[ q(x) = \begin{cases} -1 & x < 0 \\ 1 & x \geq 0 \end{cases} \]
The gradient is estimated using the ApproxSign method. \[\frac{\partial q(x)}{\partial x} = \begin{cases} (2 - 2 \left|x\right|) & \left|x\right| \leq 1 \\ 0 & \left|x\right| > 1 \end{cases} \]
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
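As a quick check of the ApproxSign gradient formula, the NumPy sketch below evaluates it at a few points. The function name is hypothetical and the code only mirrors the equation above.

```python
import numpy as np

def approx_sign_grad(x):
    """Piecewise-linear pseudo-gradient: 2 - 2|x| inside [-1, 1], 0 outside."""
    return np.where(np.abs(x) <= 1.0, 2.0 - 2.0 * np.abs(x), 0.0)

x = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5])
grad = approx_sign_grad(x)
```

The gradient peaks at 2 for x = 0 and falls linearly to 0 at |x| = 1. Its integral over [-1, 1] is 2, which matches the jump of the sign function from -1 to +1.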
larq.quantizers.SteHeaviside(clip_value=1.0, **kwargs)
Instantiates a binarization quantizer with output values 0 and 1. \[ q(x) = \begin{cases} +1 & x > 0 \\ 0 & x \leq 0 \end{cases} \]
The gradient is estimated using the Straight-Through Estimator (essentially the binarization is replaced by a clipped identity on the backward pass).
\[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & \left|x\right| \leq \texttt{clip_value} \\ 0 & \left|x\right| > \texttt{clip_value} \end{cases}\]
- clip_value: Threshold for clipping gradients. If None, gradients are not clipped.
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
Returns: AND binarization function.
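The forward pass of SteHeaviside differs from the sign quantizers both in its output range and in how it treats zero. A minimal NumPy sketch of the formula above, with hypothetical function names:

```python
import numpy as np

def ste_heaviside_forward(x):
    """Forward pass: 1 for x > 0, 0 for x <= 0."""
    return np.where(x > 0, 1.0, 0.0)

def sign_forward(x):
    """Sign quantizer forward pass, shown for comparison: +1 for x >= 0."""
    return np.where(x >= 0, 1.0, -1.0)

x = np.array([-0.5, 0.0, 0.5])
heaviside_out = ste_heaviside_forward(x)
sign_out = sign_forward(x)
```

Note that x = 0 maps to 0 under the Heaviside rule but to +1 under the sign rule, so the Heaviside output is not simply a shifted and rescaled sign output at that point.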
larq.quantizers.SwishSign(beta=5.0, **kwargs)
Sign binarization function.
\[ q(x) = \begin{cases} -1 & x < 0 \\ 1 & x \geq 0 \end{cases} \]
The gradient is estimated using the SignSwish method.
\[ \frac{\partial q_{\beta}(x)}{\partial x} = \frac{\beta\left\{2-\beta x \tanh \left(\frac{\beta x}{2}\right)\right\}}{1+\cosh (\beta x)} \]
- beta: Larger values result in a closer approximation to the derivative of the sign.
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
Returns: SwishSign quantization function.
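To get a feel for the role of beta, the SignSwish gradient formula above can be evaluated directly. The sketch below is a plain NumPy transcription of that equation with a hypothetical function name; it is not Larq's implementation.

```python
import numpy as np

def swish_sign_grad(x, beta=5.0):
    """Evaluate beta * (2 - beta*x*tanh(beta*x/2)) / (1 + cosh(beta*x))."""
    bx = beta * x
    return beta * (2.0 - bx * np.tanh(bx / 2.0)) / (1.0 + np.cosh(bx))

x = np.array([-1.0, 0.0, 1.0])
grad_default = swish_sign_grad(x)          # beta = 5.0
grad_sharp = swish_sign_grad(x, beta=10.0)  # narrower, taller peak
```

At x = 0 the expression reduces to beta, so a larger beta gives a taller and narrower peak. Unlike the clipped-identity estimators, this gradient is symmetric in x and becomes slightly negative away from zero (for example at x = 1 with beta = 5).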
larq.quantizers.MagnitudeAwareSign(clip_value=1.0, **kwargs)
Instantiates a serializable magnitude-aware sign quantizer for Bi-Real Net.
A scaled sign function computed according to Section 3.3 in Zechun Liu et al.
- clip_value: Threshold for clipping gradients. If None, gradients are not clipped.
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
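The text above does not spell out the scaling, so the following NumPy sketch rests on an assumption: the scale is taken to be the mean absolute value of the weights, computed over every axis except the last (output channel) one. The function name is hypothetical.

```python
import numpy as np

def magnitude_aware_sign(w):
    """Scaled sign: sign(w) multiplied by a per-output-channel mean of |w|.

    Assumption: the last axis indexes output channels, and the scale is
    averaged over all remaining axes.
    """
    reduce_axes = tuple(range(w.ndim - 1))
    scale = np.mean(np.abs(w), axis=reduce_axes)
    return scale * np.where(w >= 0, 1.0, -1.0)

# Two input features (rows), two output channels (columns).
w = np.array([[0.5, -2.0],
              [-1.5, 4.0]])
q = magnitude_aware_sign(w)
```

In this example the first output channel has mean magnitude 1.0 and the second 3.0, so the quantized kernel keeps the sign pattern of w while each column carries its own scale.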
larq.quantizers.SteTern(threshold_value=0.05, ternary_weight_networks=False, clip_value=1.0, **kwargs)
Instantiates a serializable ternarization quantizer.
\[ q(x) = \begin{cases} +1 & x > \Delta \\ 0 & |x| < \Delta \\ -1 & x < - \Delta \end{cases} \]
where \(\Delta\) is defined as the threshold and can be passed as an argument, or can be calculated as per the Ternary Weight Networks original paper, such that
\[ \Delta = \frac{0.7}{n} \sum_{i=1}^{n} |W_i| \] where we assume that \(W_i\) is generated from a normal distribution.
The gradient is estimated using the Straight-Through Estimator (essentially the ternarization is replaced by a clipped identity on the backward pass). \[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & \left|x\right| \leq \texttt{clip_value} \\ 0 & \left|x\right| > \texttt{clip_value} \end{cases}\]
- threshold_value: The value for the threshold, \(\Delta\).
- ternary_weight_networks: Boolean of whether to use the Ternary Weight Networks threshold calculation.
- clip_value: Threshold for clipping gradients. If None, gradients are not clipped.
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
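The two ways of choosing the threshold can be compared with a small NumPy sketch of the forward pass. The function name is hypothetical, and mapping values with |x| exactly equal to the threshold to 0 is an assumption, since the piecewise definition above leaves that case open.

```python
import numpy as np

def ste_tern_forward(x, threshold_value=0.05, ternary_weight_networks=False):
    """Ternarize x to {-1, 0, +1} using a fixed or data-dependent threshold."""
    if ternary_weight_networks:
        # Ternary Weight Networks rule: 0.7 times the mean absolute value.
        threshold = 0.7 * np.mean(np.abs(x))
    else:
        threshold = threshold_value
    # Assumption: |x| == threshold falls into the 0 bucket.
    return np.where(x > threshold, 1.0, np.where(x < -threshold, -1.0, 0.0))

x = np.array([-1.0, -0.1, 0.0, 0.1, 0.5])
fixed = ste_tern_forward(x)                               # threshold 0.05
twn = ste_tern_forward(x, ternary_weight_networks=True)   # threshold 0.238
```

For this input the mean absolute value is 0.34, so the Ternary Weight Networks threshold is 0.238 and the two small entries at -0.1 and 0.1 are zeroed, whereas the fixed threshold of 0.05 keeps them as -1 and +1.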
larq.quantizers.DoReFaQuantizer(k_bit=2, **kwargs)
Instantiates a serializable k_bit quantizer as in the DoReFa paper.
\[ q(x) = \begin{cases} 0 & x < \frac{1}{2n} \\ \frac{i}{n} & \frac{2i-1}{2n} < x < \frac{2i+1}{2n} \text{ for } i \in \{1, \dots, n-1\}\\ 1 & \frac{2n-1}{2n} < x \end{cases} \]
where \(n = 2^{\text{k_bit}} - 1\). The number of bits, k_bit, needs to be passed as an argument.
The gradient is estimated using the Straight-Through Estimator (essentially the quantization is replaced by a clipped identity on the backward pass). \[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & 0 \leq x \leq 1 \\ 0 & \text{else} \end{cases}\]
While the DoReFa paper describes how to do quantization for both weights and activations, this implementation is only valid for activations, and this quantizer should therefore not be used as a kernel quantizer.
- k_bit: number of bits for the quantization.
- metrics: An array of metrics to add to the layer. If None, the metrics set in larq.context.metrics_scope are used. Currently only the flip_ratio metric is available.
Returns: Quantization function.
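The piecewise definition of the k_bit quantizer amounts to clipping the input to [0, 1] and rounding to the nearest of the n + 1 evenly spaced levels. The NumPy sketch below expresses it that way; the function names are hypothetical, and the behaviour exactly at interval boundaries (which the definition leaves open) simply follows np.round here.

```python
import numpy as np

def dorefa_forward(x, k_bit=2):
    """Clip to [0, 1], then round to the nearest multiple of 1/n."""
    n = 2 ** k_bit - 1
    return np.round(np.clip(x, 0.0, 1.0) * n) / n

def dorefa_grad(x, upstream):
    """Straight-through gradient: identity inside [0, 1], zero outside."""
    return upstream * ((x >= 0.0) & (x <= 1.0))

x = np.array([-0.5, 0.1, 0.2, 0.4, 0.9, 1.5])
y = dorefa_forward(x)                 # levels for k_bit=2: 0, 1/3, 2/3, 1
g = dorefa_grad(x, np.ones_like(x))
```

With k_bit=2 there are four output levels. Negative inputs collapse to 0 and inputs above 1 collapse to 1, which is consistent with the note above that this quantizer is meant for activations rather than kernels.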