NICS-fix Documentation

Memorizing it

Posted by tianchen on April 21, 2020

nics_fix_pt (PyTorch)

import nics_fix_pt as nfp
import nics_fix_pt.nn_fix as nnf
  • const.py contains the numeric codes corresponding to the different methods
class RangeMethod:
    RANGE_MAX = 0
    RANGE_3SIGMA = 1
    RANGE_MAX_TENPERCENT = 2
    RANGE_SWEEP = 3

Model

Fix Cfg

  • The fix config is given in three parts, each a dict:
    • Activation
    • Param
    • Grad
  • Fields of each entry:
    • name (weight/bias/running_mean/running_var)
    • method (Nearest / Stochastic Rounding)
    • scale (the N of 2^N)
    • bitwidth
    • range_method (RANGE_MAX/RANGE_SWEEP/...)
import numpy as np
import torch

def _generate_default_fix_cfg(names, scale=0, bitwidth=8, method=0):
    return {
        n: {
            "method": torch.autograd.Variable(
                torch.IntTensor(np.array([method])), requires_grad=False
            ),
            "scale": torch.autograd.Variable(
                torch.IntTensor(np.array([scale])), requires_grad=False
            ),
            "bitwidth": torch.autograd.Variable(
                torch.IntTensor(np.array([bitwidth])), requires_grad=False
            ),
            # "range_method": nfp.RangeMethod.RANGE_MAX_TENPERCENT
            # "range_method": nfp.RangeMethod.RANGE_SWEEP
            "range_method": nfp.RangeMethod.RANGE_MAX
        }
        for n in names
    }
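
For example, default 8-bit configs for a conv layer's parameters and for an activation fix module can be generated with the helper above (a usage sketch; the "activation" name matches the get_fix_configs dump shown later on this page):

# usage sketch: 8-bit default configs for a conv layer and an activation fix
fix_param_cfg = _generate_default_fix_cfg(["weight", "bias"], scale=0, bitwidth=8)
fix_act_cfg = _generate_default_fix_cfg(["activation"], scale=0, bitwidth=8)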

Fix module

  • Compared with the original nn.Module, each fixed-point module takes two extra constructor arguments (both dicts): `nf_fix_params` and `nf_fix_params_grad`
    layer = nnf.Conv2d_fix(
        cin, cout,
        nf_fix_params=self.fix_param_cfgs[name],
        nf_fix_params_grad=self.fix_grad_cfgs[name] if fix_grad else None,
        kernel_size=kernel_size,
        padding=(kernel_size - 1) // 2 if name != "conv3_1" else 0)
    
  • Usage of the activation fix modules:
for i in range(20):
    setattr(self, "fix" + str(i), nnf.Activation_fix(
        nf_fix_params=self.fix_act_cfgs[i],
        nf_fix_params_grad=self.fix_act_grad_cfgs[i] if fix_grad else None))

def forward(self, x):
    # every conv output and every activation is wrapped by a fix module
    x = self.fix0(x)
    x = self.fix2(F.relu(self.bn1_1(self.fix1(self.conv1_1(x)))))
    x = self.fix4(F.relu(self.bn1_2(self.fix3(self.conv1_2(x)))))
    x = self.pool1(self.fix6(F.relu(self.bn1_3(self.fix5(self.conv1_3(x))))))
    x = self.fix8(F.relu(self.bn2_1(self.fix7(self.conv2_1(x)))))
    x = self.fix10(F.relu(self.bn2_2(self.fix9(self.conv2_2(x)))))
    x = self.pool2(self.fix12(F.relu(self.bn2_3(self.fix11(self.conv2_3(x))))))
    x = self.fix14(F.relu(self.bn3_1(self.fix13(self.conv3_1(x)))))
    x = self.fix16(F.relu(self.bn3_2(self.fix15(self.nin3_2(x)))))
    x = self.fix18(F.relu(self.bn3_3(self.fix17(self.nin3_3(x)))))

Fix Net

  • The class is nnf.FixTopModule, which inherits from nn.Module
class FixNet(nnf.FixTopModule):
    def __init__(self):
        ...
  • Class methods
    • fix_state_dict
    • load_fix_configs
    • get_fix_configs
    • print_fix_configs
    • set_fix_method
  • fix_state_dict is an OrderedDict containing the fixed-point model's data (weight/activation) together with the corresponding fp_scale (meaning the data should be multiplied by 2^scale)
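
A hypothetical read-out sketch (the key names below are assumptions for illustration, not the library's verified layout):

# hypothetical: recover the real-valued weight from a fix_state_dict entry
sd = net.fix_state_dict()
data = sd["conv1_1.weight"]              # stored fixed-point data (key name assumed)
n = sd["conv1_1.weight.fp_scale"]        # exponent N (key name assumed)
real = data.float() * (2.0 ** float(n))  # real value = data * 2^N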

  • get_fix_configs(grad=False, data_only=False)
    • grad=True returns the fixed-point configs of the gradients; the default (grad=False) returns the configs of the components themselves
    • data_only=True converts the internal int32 tensors into plain Python ints
    • Resulting format:
      conv1_1:
          weight:
              method: tensor([1])
              scale: tensor([2])
              bitwidth: tensor([8], dtype=torch.int32)
              range_method: 0
          bias:
              method: tensor([1])
              scale: tensor([2])
              bitwidth: tensor([8], dtype=torch.int32)
              range_method: 0
      fix10:
          activation:
              method: tensor([1])
              scale: tensor([2])
              bitwidth: tensor([8], dtype=torch.int32)
              range_method: 0
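
A minimal usage sketch of these methods (the load_fix_configs argument form is an assumption):

net = FixNet()
cfgs = net.get_fix_configs(data_only=True)  # nested dict of plain ints
net.print_fix_configs()                     # human-readable dump of all configs
net.load_fix_configs(cfgs)                  # restore saved configs (form assumed)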
    
    

Quantization Process

(implemented in the fix modules; invoked during the forward pass)

Quantize(fix_cfg={}, fix_grad_cfg={}):
    load the cfgs
    QuantizeForward / QuantizeGradient:
        quantize_cfg(**cfgs):
            switch (quantize method):
                switch (range_method):
                    _do_quantize()
  • RangeMethods determine the scale
  • _do_quantize(data, scale, bitwidth)
    • scale: given by the range_method
    • Flow (see the sketch after this list)
      • (when symmetric) range = 2 * scale
      • step = range / 2^bitwidth
      • clamp(int(data / step) * step, min, max)
      • returns the quantized Tensor and the step
  • The main quantization entry point is the quantize_cfg function
    • It contains the various quantization modes; the core is the _do_quantize function
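A minimal sketch of the _do_quantize flow above, assuming symmetric quantization with nearest rounding (the actual rounding is selected by the "method" field, e.g. stochastic rounding); the function name and clamp bounds here are illustrative assumptions:

import torch

def _do_quantize_sketch(data, scale, bitwidth):
    rng = 2.0 * scale                     # symmetric range: [-scale, scale)
    step = rng / (2.0 ** bitwidth)        # quantization step
    quantized = torch.clamp(
        torch.round(data / step) * step,  # snap to the nearest step
        -scale, scale - step)             # clamp into the representable range
    return quantized, step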

Others

Registering a Self-Written nn.Module

  • After registration, use nnf.{$NAME}_fix() as the class to instantiate
from nics_fix_pt import register_fix_module
register_fix_module(MyBN)
self.bn1 = nnf.MyBN_fix()
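
A slightly fuller sketch (assuming the fix-config kwargs work the same way as for the built-in *_fix modules above):

import torch.nn as nn
import nics_fix_pt.nn_fix as nnf
from nics_fix_pt import register_fix_module

class MyBN(nn.BatchNorm2d):
    """A self-written module to be quantized."""

register_fix_module(MyBN)  # makes nnf.MyBN_fix available

# reuse the default-cfg helper from above; the kwarg mirrors the Conv2d_fix usage
bn_cfg = _generate_default_fix_cfg(["weight", "bias", "running_mean", "running_var"])
bn1 = nnf.MyBN_fix(16, nf_fix_params=bn_cfg)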

TensorFlow Lite

Reference:

Main Scheme

  • real_value = scale (a float) * (int8_value - int8_zero_point) (see the sketch after this list)
  • Two grouping granularities (one scale and one zero_point per group):
    • Per-Tensor(Per-Layer)
    • Per-Dim(Per-Channel)
  • Symmetric (signed INT8) / Asymmetric (unsigned INT8)
    • Normally: activations asymmetric, weights symmetric
    • See the specification web page for details
      • Also, the bias is sometimes kept in int32
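
The affine mapping above as a small sketch (illustrative helper names):

def dequantize(q: int, scale: float, zero_point: int) -> float:
    # real_value = scale * (int8_value - zero_point)
    return scale * (q - zero_point)

def quantize(real: float, scale: float, zero_point: int) -> int:
    q = round(real / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the signed int8 range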

Update & Test

  • Note: originally "scale" stored the exponent N of 2^N; to support float scales we switched to storing S = 2^N directly, and the corresponding tests need to be updated to match (see the sketch below)
  • One FloatTensor in _generate_default_fix_cfg also needs to be changed to an int tensor
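
A sketch of the convention change (illustrative only):

import torch

n = 2                                      # example exponent
old_scale = torch.IntTensor([n])           # old: store the exponent N, so S = 2**N
new_scale = torch.FloatTensor([2.0 ** n])  # new: store S = 2**N directly as a float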