Binary Neural Network (BNN)

Basic Principle

The biggest difference between a BNN and a CNN is how matrix multiplication is handled: in both the convolutional and fully connected layers, values are quantized to just two levels, represented as +1 and -1, as in the following truth table.

| $\mathbf{x}$ | $\mathbf{y}$ | $\mathbf{x}\cdot\mathbf{y}$ |
|:---:|:---:|:---:|
| -1 | -1 | +1 |
| -1 | +1 | -1 |
| +1 | -1 | -1 |
| +1 | +1 | +1 |

Mapping $-1\mapsto 0$ and $+1\mapsto 1$, the multiplication becomes an XNOR, written $\odot$:

| $\hat{\mathbf{x}}$ | $\hat{\mathbf{y}}$ | $\hat{\mathbf{x}}\odot\hat{\mathbf{y}}$ |
|:---:|:---:|:---:|
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

$$\begin{aligned} \mathbf{x}\cdot\mathbf{y} &=\sum_{i=1}^L x_i\cdot y_i\\ &=\sum_{i=1}^L \left(2(\hat{x}_i\odot \hat{y}_i)-1\right)\\ &=2\sum_{i=1}^L (\hat{x}_i\odot\hat{y}_i) - L \end{aligned}$$
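The identity above can be checked numerically; a minimal sketch in plain Python (vector length and values are illustrative):

```python
import random

L = 8  # vector length (example)

# Random ±1 vectors and their {0,1} encodings (-1 -> 0, +1 -> 1)
x = [random.choice([-1, 1]) for _ in range(L)]
y = [random.choice([-1, 1]) for _ in range(L)]
x_hat = [(v + 1) // 2 for v in x]
y_hat = [(v + 1) // 2 for v in y]

# Reference: ordinary dot product over ±1 values
dot = sum(a * b for a, b in zip(x, y))

# XNOR of two bits: 1 when equal, 0 when different
popcount = sum(1 - (a ^ b) for a, b in zip(x_hat, y_hat))

# The derivation: x·y = 2·popcount(x̂ ⊙ ŷ) - L
assert dot == 2 * popcount - L
```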

Conv

A standard convolution accumulates multiply-accumulate (MAC) operations:

$a\leftarrow a+(b\times c)$

With binarized operands, the multiply degenerates to an XOR/XNOR and the accumulation to a popcount.

```python
# Assumed from the enclosing function: name, bitwidth, stride_h, stride_w,
# the padding amounts (pad_h, pad_w), Padded_Input, and out_dtype
batch, in_channel, in_height, in_width = Input.shape
num_filter, filter_channel, kernel_h, kernel_w = Filter.shape
out_channel = num_filter
out_height = (in_height + 2 * pad_h - kernel_h) // stride_h + 1
out_width = (in_width + 2 * pad_w - kernel_w) // stride_w + 1
# Reduction axes over input channels, the kernel window, and the packed bits
rc = hcl.reduce_axis(0, in_channel, name=name + '_rc')
ry = hcl.reduce_axis(0, kernel_h, name=name + '_ry')
rx = hcl.reduce_axis(0, kernel_w, name=name + '_rx')
rb = hcl.reduce_axis(0, bitwidth, name=name + '_rb')
kernel_size = kernel_h * kernel_w
# dot(x, w) = N - 2 * popcount(x XOR w), with N = kernel_size * bitwidth * in_channel;
# [rb] selects bit rb of the XORed word, and "<< 1" is the multiply by 2
out = hcl.compute(
    (batch, out_channel, out_height, out_width),
    lambda nn, ff, yy, xx: kernel_size * bitwidth * in_channel -
        (hcl.sum((Padded_Input[nn, rc, yy * stride_h + ry, xx * stride_w + rx]
                  ^ Filter[ff, rc, ry, rx])[rb],
                 axis=[rc, ry, rx, rb], dtype=out_dtype, name=name) << 1),
    name=name, dtype=out_dtype)
```
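The XOR-popcount trick in the kernel above can be sanity-checked against an ordinary ±1 dot product with NumPy (a standalone sketch; the size and names are illustrative, not taken from the HeteroCL code):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3 * 3 * 64  # one output element: kernel_size * in_channel (single bit-plane)

# ±1 activations and weights, plus their {0,1} encodings
x = rng.choice([-1, 1], size=N)
w = rng.choice([-1, 1], size=N)
x_bits = (x + 1) // 2
w_bits = (w + 1) // 2

# Reference: plain MAC accumulation
ref = int(np.dot(x, w))

# BNN version: N - 2 * popcount(x XOR w)
mismatches = int(np.sum(x_bits ^ w_bits))
bnn = N - 2 * mismatches

assert ref == bnn
```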


Batch norm

$\mathbf{y}=\frac{\mathbf{x}-\boldsymbol{\mu}}{\sqrt{\boldsymbol{\sigma}^2+\varepsilon}}\gamma+\beta$
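In BNN inference the batch norm is usually followed by a sign binarization, and since the conv output is an integer, the two can be folded into a single integer threshold comparison. A hedged sketch of this common optimization (parameter values are illustrative; the formula for `tau` assumes $\gamma > 0$, otherwise the comparison flips):

```python
import math

# Per-channel batch-norm parameters (illustrative values)
mu, var, eps, gamma, beta = 3.0, 4.0, 1e-5, 0.5, -1.0

def bn_then_sign(x):
    """Batch norm followed by sign binarization."""
    y = (x - mu) / math.sqrt(var + eps) * gamma + beta
    return 1 if y >= 0 else -1

# Fold BN into a threshold: sign(gamma*(x - mu)/s + beta) == sign(x - tau)
# when gamma > 0, with tau = mu - beta*s/gamma
s = math.sqrt(var + eps)
tau = mu - beta * s / gamma

for x in range(-10, 11):  # conv outputs are integers
    assert bn_then_sign(x) == (1 if x >= tau else -1)
```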

Popcount

The core of the conv2d and dense layers in a BNN is the popcount implementation, for which there are many bit-level tricks. gcc provides a built-in intrinsic on CPU (`__builtin_popcount`), but on an FPGA it has to be implemented by hand. See:
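One common hardware-friendly approach is the tree-style SWAR reduction, whose adder-tree structure maps naturally onto an FPGA; a minimal sketch in Python using the standard 32-bit masks:

```python
def popcount32(v: int) -> int:
    """Tree-reduction (SWAR) popcount for a 32-bit value."""
    v = v & 0xFFFFFFFF
    v = v - ((v >> 1) & 0x55555555)                  # 2-bit partial sums
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)   # 4-bit partial sums
    v = (v + (v >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    return ((v * 0x01010101) & 0xFFFFFFFF) >> 24     # sum the four bytes

assert popcount32(0b1011_0010) == 4
assert all(popcount32(x) == bin(x).count("1") for x in range(1 << 12))
```

Each line halves the number of partial sums while doubling their width, which is exactly the balanced adder tree an HLS tool would synthesize.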