TensorFlow Lite Study Notes

TensorFlow Lite Demo

Freezing the model with freeze_graph and optimizing it with optimize_for_inference

Converting the model to tflite: toco

TensorFlow Lite Converter

Model quantization tool: quantize_graph

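TensorFlow Lite learning resources

▷ Core ML support in TensorFlow Lite

▷ On-device conversational modeling with TensorFlow Lite

▷ Google's first TF video tutorial in Chinese released | TensorFlow Lite deep dive

▷ New Chinese video series released | TensorFlow Lite overview and introduction to model conversion

▷ How does Youdao Cloud Note use TensorFlow Lite?

▷ Chinese video tutorial | Using TensorFlow Lite on Android

▷ Chinese video tutorial | Using TensorFlow Lite on iOS

▷ Case study: TensorFlow Lite in Kika Keyboard

▷ Mobvoi (出门问问): deploying a hotword detection model on embedded devices with TensorFlow Lite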

TensorFlow Lite Demo

https://github.com/Robinatp/Tensorflow_Lite_Demo

Freezing the model with freeze_graph and optimizing it with optimize_for_inference

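Mobile devices are tightly constrained, so any preprocessing that shrinks the application's footprint is worth considering. One way the TensorFlow library stays small on mobile is by supporting only the subset of operations that are commonly used during inference. This is a reasonable approach, because training is rarely done on mobile platforms. For the same reason, it also leaves out operations that depend on large external libraries. The list of supported operations is in the file tensorflow/contrib/makefile/tf_op_files.txt.

By default, most graphs contain training operations that the mobile version of TensorFlow does not support. TensorFlow will not load a graph that contains an unsupported operation, even if that operation plays no part in inference.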

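To freeze a model, you can use the convert_variables_to_constants method described in 《tensorflow实现将ckpt转pb文件》 (converting a ckpt to a pb file in TensorFlow), or run the freeze_graph script directly.
To optimize a model, use the script tensorflow.python.tools.optimize_for_inference.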

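To avoid problems caused by unsupported training operations, the TensorFlow installation includes a tool, optimize_for_inference, which removes every node that is not needed for a given set of inputs and outputs.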
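The idea of "removing every node that is not needed for a given set of inputs and outputs" can be sketched as a backward traversal from the outputs. This is only an illustration of the concept in plain Python, not the code of optimize_for_inference; the toy graph and the helper name prune_graph are made up for this example.

```python
from collections import deque


def prune_graph(nodes, input_names, output_names):
    """Keep only the nodes reachable backwards from the outputs.

    `nodes` maps a node name to the list of node names it consumes.
    The walk stops at the declared inputs.
    """
    stop = set(input_names)
    keep = set()
    queue = deque(output_names)
    while queue:
        name = queue.popleft()
        if name in keep:
            continue
        keep.add(name)
        if name in stop:
            continue  # do not walk past a declared input
        queue.extend(nodes[name])
    return {name: deps for name, deps in nodes.items() if name in keep}


# Hypothetical toy graph: an inference path plus training-only nodes.
toy_graph = {
    "input": [],
    "weights": [],
    "conv": ["input", "weights"],
    "final_result": ["conv"],
    "labels": [],
    "loss": ["final_result", "labels"],
    "train_op": ["loss", "weights"],
}

pruned = prune_graph(toy_graph, ["input"], ["final_result"])
print(sorted(pruned))
```

The training-only nodes (labels, loss, train_op) are never reached from final_result, so they drop out of the pruned graph.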

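The script also performs a few other optimizations that help speed up the model, such as merging explicit batch normalization operations into the convolution weights to reduce the number of computations. Depending on the input model, this can give a speedup of around 30%. Run the script as follows: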

python -m tensorflow.python.tools.optimize_for_inference \
  --input=tf_files/retrained_graph.pb \
  --output=tf_files/optimized_graph.pb \
  --input_names="input" \
  --output_names="final_result"

Running this script creates a new file, tf_files/optimized_graph.pb.
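Merging batch normalization into the convolution weights works because both steps are linear at inference time. A minimal sketch of the arithmetic, assuming a single scalar weight in place of a full convolution kernel (the function names and numbers are illustrative only, not taken from the tool):

```python
import math


def conv_then_bn(x, weight, bias, gamma, beta, mean, variance, eps=1e-3):
    """Reference path: a (scalar) convolution followed by batch normalization."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(variance + eps) + beta


def fold_batch_norm(weight, bias, gamma, beta, mean, variance, eps=1e-3):
    """Fold the batch-norm scale and shift into the weight and bias."""
    scale = gamma / math.sqrt(variance + eps)
    return weight * scale, (bias - mean) * scale + beta


# Illustrative parameter values.
params = dict(weight=0.5, bias=0.1, gamma=1.2, beta=-0.3, mean=0.4, variance=0.25)
folded_w, folded_b = fold_batch_norm(**params)

x = 2.0
reference = conv_then_bn(x, **params)
folded = folded_w * x + folded_b  # one multiply-add instead of conv + BN
print(reference, folded)
```

After folding, inference needs only the adjusted weight and bias, which is where the saved computation comes from.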

A complete usage example:

#!/usr/bin/env bash
# Model directory
model_dir=/home/ubuntu/project/ImageEnhance/triple_path_networks/models/TMFCN_l2_sigmoid_best_sky
# Checkpoint file
ckpt=tpn-52000
# Input and output tensors
input_tensor=orig_images
output_tensor=output/Sigmoid
# Output file for the frozen model
output_pb=frozen_graph2.pb
# Output file for the optimized model
optimize_pb=optimize_graph2.pb

# Activate the TensorFlow environment
source activate tensorflow-cpu-py35

# Freeze the model
echo 'freeze_graph'
freeze_graph \
    --input_graph=$model_dir/graph.pbtxt \
    --input_checkpoint=$model_dir/$ckpt \
    --input_binary=false \
    --output_graph=$model_dir/$output_pb \
    --output_node_names=$output_tensor

echo 'freeze graph done...'

# Optimize the model
echo 'optimize_for_inference'
python -m tensorflow.python.tools.optimize_for_inference \
    --input=$model_dir/$output_pb \
    --output=$model_dir/$optimize_pb \
    --frozen_graph=True \
    --input_names=$input_tensor \
    --output_names=$output_tensor

echo 'optimized done...'

Converting the model to tflite: toco

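The models used by TensorFlow Lite are converted from TensorFlow models with the TOCO tool, and the input to the conversion is the Frozen Graph produced by freezing. If you already have a model that is "good enough" and you have neither the source code nor the data to retrain it, just use the current model; that is perfectly fine. If you do have the source code and the data, converting the model directly with TOCO is the best choice. Example script: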

#!/usr/bin/env bash
# Model directory
model_dir=/home/ubuntu/project/ImageEnhance/triple_path_networks/models/YNet_sigmoid_best_sky

# Input and output tensors
input_tensor=orig_images
output_tensor=output/concat
# Optimized model used as input
optimize_pb=optimize_graph2.pb

# Activate the TensorFlow environment
source activate tensorflow-cpu-py35

# Convert to a float model
echo 'TF Lite:float'
toco \
    --graph_def_file=$model_dir/$optimize_pb \
    --output_file=$model_dir/optimize_graph_float_128.tflite \
    --output_format=TFLITE \
    --inference_type=FLOAT \
    --input_type=FLOAT \
    --input_arrays=$input_tensor \
    --output_arrays=$output_tensor \
    --input_shapes=1,128,128,3

# Convert to a QUANTIZED_UINT8 model
echo 'TF Lite:QUANTIZED_UINT8'
toco \
    --graph_def_file=$model_dir/$optimize_pb \
    --output_file=$model_dir/optimize_graph_uint8_128.tflite \
    --output_format=TFLITE \
    --input_arrays=$input_tensor \
    --output_arrays=$output_tensor \
    --input_shapes=1,128,128,3 \
    --inference_type=QUANTIZED_UINT8 \
    --inference_input_type=QUANTIZED_UINT8 \
    --mean_values=128 \
    --std_dev_values=128 \
    --default_ranges_min=0 \
    --default_ranges_max=255
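
The mean and standard-deviation flags in the quantized command above describe how uint8 input values relate to the real values the model was trained on: real_value = (quantized_value - mean) / std_dev. A small sketch of that mapping with the values from the script (128 and 128); the function names are illustrative, not part of any TensorFlow API.

```python
def dequantize_input(q, mean_value=128.0, std_dev_value=128.0):
    """Real value that a uint8 input value q stands for."""
    return (q - mean_value) / std_dev_value


def quantize_input(real, mean_value=128.0, std_dev_value=128.0):
    """Inverse mapping, rounded and clamped to the uint8 range."""
    q = round(real * std_dev_value + mean_value)
    return max(0, min(255, q))


# With mean 128 and std_dev 128, the uint8 range covers roughly [-1, 1).
for q in (0, 128, 255):
    print(q, dequantize_input(q))
```

If the model was instead trained on inputs in [0, 1], a mean of 0 and a std_dev of 255 would be the matching choice under the same formula.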

TensorFlow Lite Converter

You can also use the Python TFLiteConverter API directly, for example:

PS: this requires TensorFlow 1.12.0
Official documentation: https://tensorflow.google.cn/lite/convert/python_api?hl=zh-cn

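import os

# Assumption: in TensorFlow 1.12 the converter lives under tensorflow.contrib.lite;
# from 1.13 on, use `from tensorflow import lite` instead.
from tensorflow.contrib import lite


def convert_tflite():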
    graph_def_file = "../models/YNet_sigmoid_best_sky/optimize_graph.pb"
    # input_arrays = ['image', 'sp', 'Hsp_boxes', 'O_boxes']
    # output_arrays = ["classification/op_store"]
    input_arrays = ['orig_images']
    output_arrays = ['output/concat']
    out_tflite=os.path.dirname(graph_def_file)
    out_tflite=os.path.join(out_tflite,'converted_model_64.tflite')

    input_shapes={"orig_images":[1,64,64,3]}
    # Converting a GraphDef from session.
    # converter = lite.TFLiteConverter.from_session(sess, in_tensors, out_tensors)
    # tflite_model = converter.convert()
    # open("converted_model.tflite", "wb").write(tflite_model)

    # Converting a GraphDef from file.
    converter = lite.TFLiteConverter.from_frozen_graph(
        graph_def_file, input_arrays, output_arrays,input_shapes)
    tflite_model = converter.convert()
    with open(out_tflite, "wb") as f:
        f.write(tflite_model)

    # Converting a SavedModel.
    # converter = lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # tflite_model = converter.convert()

    # Converting a tf.keras model.
    # converter = lite.TFLiteConverter.from_keras_model_file(keras_model)
    # tflite_model = converter.convert()

The tflite model is meant to be ported to mobile devices, but you can also run inference with it through tflite's Python interface, lite.Interpreter:

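import datetime
import os

import cv2
import numpy as np
# Assumption: TensorFlow 1.12; on 1.13+ use `from tensorflow import lite`.
from tensorflow.contrib import lite

# The author's own helper modules (not shown in these notes).
import image_processing
import load_data


def tflite_test(filename,orig_dir,out_dir,tflite_path,resize_width=0, resize_height=0):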
    images_list =load_data.read_data(filename)
    images_list=[os.path.join(orig_dir,name) for name in images_list]

    if not os.path.exists(out_dir):
        os.makedirs(out_dir)

    # Get input and output tensors.
    interpreter = lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    input_shape = input_details[0]['shape']
    
    print("input_shape:{}".format(input_shape))
    print(" input_details.index:{}".format(input_details[0]['index']))
    print("output_details.index:{}".format(output_details[0]['index']))

    for image_path in images_list:
        if not os.path.exists(image_path):
            print("no image:{}".format(image_path))
            continue

        orig_image = image_processing.read_image(image_path, 0, 0, normalization=True)
        orig_shape = orig_image.shape
        input_image = orig_image

        if resize_height > 0 and resize_width > 0:
            input_image = cv2.resize(input_image, (resize_width, resize_height))
            
        # The input dtype must match the tflite model, usually float32 or uint8
        T0 = datetime.datetime.now()
        input_data = np.array(input_image[np.newaxis, :],dtype=np.float32)
        # input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details[0]['index'])
        T1 = datetime.datetime.now()

        A_net, B_net = np.array_split(output_data, indices_or_sections=2, axis=3)
        pre_net1 = A_net[0, :, :, :]  # tf.squeeze
        pre_net2 = B_net[0, :, :, :]
        if resize_height > 0 and resize_width > 0:
            pre_net1 = cv2.resize(pre_net1, (orig_shape[1], orig_shape[0]), interpolation=cv2.INTER_LINEAR)
            pre_net2 = cv2.resize(pre_net2, (orig_shape[1], orig_shape[0]), interpolation=cv2.INTER_LINEAR)

        pre_images = np.multiply(pre_net1, orig_image) + pre_net2

        # Guard against pixel values going out of range
        # pre_images = tf.cast(255.0 * tf.clip_by_value(pre_images, 0, 1), tf.uint8)
        pre_images = np.clip(pre_images, 0, 1)
        T2 = datetime.datetime.now()

        # load_data.show_image("image", pre_images)
        name = os.path.splitext(os.path.basename(image_path))[0]
        image_processing.combime_save_image(orig_image, pre_images, out_dir, name,
                                            prefix="YNet_pb_resize" + str(resize_height))
        T3 = datetime.datetime.now()
        print("processing image:{},shape:{},run time:tpn:{}ms,mul:{}ms,all:{}ms"
              .format(image_path,
                      pre_images.shape,
                      (T1 - T0).seconds * 1000 + (T1 - T0).microseconds / 1000.0,
                      (T2 - T1).seconds * 1000 + (T2 - T1).microseconds / 1000.0,
                      (T3 - T0).seconds * 1000 + (T3 - T0).microseconds / 1000.0))

Model quantization tool: quantize_graph

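Put simply, quantization approximates 32-bit floating-point numbers with 8-bit integers for both storage and computation. A quantized model takes 75% less storage, so quantization also compresses the model.

A simple example of 8-bit quantization: the parameter values within one layer of a model fall into a fairly narrow range, say [-1, 1], so all parameters of that layer can be mapped linearly onto the range [0, 255], for example: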

 float | Quantized
-------+----------
 -1.0  | 0
  1.0  | 255
  0.0  | 128
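
The linear mapping described above can be written out in a few lines. This is a minimal sketch of per-layer 8-bit quantization for a range of [-1, 1], not the algorithm used inside quantize_graph; the function names and sample weights are made up for illustration.

```python
def quantize(value, lo=-1.0, hi=1.0, levels=256):
    """Map a float in [lo, hi] linearly onto an integer in [0, levels - 1]."""
    q = round((value - lo) * (levels - 1) / (hi - lo))
    return max(0, min(levels - 1, q))  # clamp values outside [lo, hi]


def dequantize(q, lo=-1.0, hi=1.0, levels=256):
    """Recover the approximate float that the integer code q stands for."""
    return lo + q * (hi - lo) / (levels - 1)


# Illustrative weights from one layer.
weights = [-1.0, -0.25, 0.3, 1.0]
codes = [quantize(w) for w in weights]
restored = [dequantize(q) for q in codes]

# The round-trip error is bounded by half a quantization step.
step = 2.0 / 255
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# float32 takes 4 bytes per value, uint8 takes 1 byte.
saving = 1 - 1 / 4
print(codes, max_error, saving)
```

The saving of 0.75 is the 75% storage reduction mentioned above, and the bounded round-trip error is the price paid for it.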

Run the command:

bazel-bin/tensorflow/tools/quantization/quantize_graph \
--input=./tmp/classify_image_graph_def.pb \
--output_node_names="softmax" --output=./tmp/quantized_graph.pb \
--mode=eightbit

From: https://blog.51cto.com/u_15764210/5900522
