-
The NPU handles AI computation, specifically model inference.
-
Corresponding code location
hardware/rockchip/rknpu2
-
examples: yolov5 build
- Download the NDK
https://developer.android.google.cn/ndk/downloads?hl=zh-cn
- The source uses r16b; in testing, r16/r17/r18/r19/r20 work and r21 does not; later releases were not tested
- Modify: ANDROID_NDK_PATH=~/opt/android-ndk-r20b
- Command: bash build-android_RK3588.sh
- Output: rknn_yolov5_demo, model/, lib/ under the build directory
-
examples: yolov5 run
post process config: box_conf_threshold = 0.25, nms_threshold = 0.45
Read meetting03.jpg ...
img width = 3840, img height = 2160
Loading mode...
sdk version: 1.5.2 (c6b7b351a@2023-08-23T15:27:35) driver version: 0.9.2
model input num: 1, output num: 3
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 640, size_with_stride=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=0, name=output, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, w_stride = 0, size_with_stride=1638400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003860
index=1, name=283, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, w_stride = 0, size_with_stride=491520, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=2, name=285, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, w_stride = 0, size_with_stride=163840, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003915
model is NHWC input fmt
model input height=640, width=640, channel=3
resize with RGA!
scal
once run use 22.702000 ms
once run use 20.701000 ms
loadLabelName ./model/coco_80_labels_list.txt
person @ (1482 475 1944 938) 0.704657
person @ (792 604 1632 2116) 0.614134
person @ (126 661 882 1815) 0.602546
person @ (84 540 432 1096) 0.582360
person @ (1290 428 1644 860) 0.570635
person @ (1272 759 2784 2160) 0.494397
person @ (2952 475 3834 1383) 0.484189
person @ (1800 506 2340 968) 0.442070
loop count = 1000 , average run 21.540229 ms

emo model/RK3588/yolov5s-640-640.rknn meetting03.jpg <
post process config: box_conf_threshold = 0.25, nms_threshold = 0.45
Read meetting03.jpg ...
img width = 1920, img height = 1080
Loading mode...
sdk version: 1.5.2 (c6b7b351a@2023-08-23T15:27:35) driver version: 0.9.2
model input num: 1, output num: 3
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 640, size_with_stride=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=0, name=output, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, w_stride = 0, size_with_stride=1638400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003860
index=1, name=283, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, w_stride = 0, size_with_stride=491520, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=2, name=285, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, w_stride = 0, size_with_stride=163840, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003915
model is NHWC input fmt
model input height=640, width=640, channel=3
resize with RGA!
scal
once run use 13.794000 ms
once run use 21.535000 ms
loadLabelName ./model/coco_80_labels_list.txt
person @ (738 237 975 469) 0.704657
person @ (396 303 816 1051) 0.614134
person @ (57 327 444 897) 0.604937
person @ (645 214 822 426) 0.601903
person @ (42 270 216 551) 0.566727
person @ (645 386 1389 1080) 0.486672
person @ (1473 237 1911 691) 0.472939
person @ (903 253 1173 482) 0.426558
loop count = 1000 , average run 21.586659 ms
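The tensor attributes in the log can be read as follows: every tensor is INT8 with affine quantization, so a raw value q corresponds to the real value (q - zp) * scale, and n_elems is the product of dims. The sketch below only illustrates that arithmetic and is not the demo's code; the names dequantize_affine, quantize_affine, and tensor_elems are hypothetical. Reading the 255 output channels as 3 anchors x (5 + 80 classes) assumes the standard YOLOv5 head together with the 80 COCO labels named in the log.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Assumption: standard YOLOv5 head, 3 anchors per cell, each with
// 4 box values + 1 objectness score + 80 class scores.
static_assert(3 * (5 + 80) == 255, "channel count of each output tensor");

// Affine INT8 dequantization: real = (q - zp) * scale.
static float dequantize_affine(int8_t q, int32_t zp, float scale) {
    return (static_cast<float>(q) - static_cast<float>(zp)) * scale;
}

// Inverse mapping, rounded to nearest and clamped to the INT8 range.
static int8_t quantize_affine(float value, int32_t zp, float scale) {
    float q = value / scale + static_cast<float>(zp);
    if (q < -128.0f) q = -128.0f;
    if (q > 127.0f) q = 127.0f;
    return static_cast<int8_t>(std::lround(q));
}

// Element count of a tensor: the product of its dimensions.
static uint64_t tensor_elems(const std::vector<uint32_t>& dims) {
    uint64_t n = 1;
    for (uint32_t d : dims) n *= d;
    return n;
}
```

With the first output's zp=-128 and scale=0.003860, a raw value of -128 dequantizes to 0.0, and the 0.25 confidence threshold quantizes to a raw value of -63, so a threshold can be compared in the INT8 domain before any dequantization. tensor_elems({1, 255, 80, 80}) gives 1632000, matching n_elems in the log.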
-
examples: yolov5 analysis
Image data flow
-
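The model input format is RGB888 at 640x640. The image to be recognized is a JPG file: OpenCV decodes it and converts BGR to RGB, and RGA (imresize) scales it to the model input size.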
- cv::Mat orig_img = cv::imread(image_name, 1);
- cv::cvtColor(orig_img, img, cv::COLOR_BGR2RGB);
- src = wrapbuffer_virtualaddr((void*)img.data, img_width, img_height, RK_FORMAT_RGB_888);
- dst = wrapbuffer_virtualaddr((void*)resize_buf, width, height, RK_FORMAT_RGB_888);
- IM_STATUS STATUS = imresize(src, dst);
- inputs[0].buf = resize_buf;
-
Image scaling uses RGA: a 4K RGB frame takes 12 ms, a 4K YUV frame takes 9 ms.
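Because the frame is scaled straight to 640x640, boxes found in model coordinates have to be mapped back to the source resolution before they are printed. The sketch below assumes a plain stretch with no letterbox padding, which is consistent with the log: the x coordinates are multiples of 6 for the 3840-wide image and multiples of 3 for the 1920-wide one. Box, scale_coord, and map_box_to_source are hypothetical names, not the demo's.

```cpp
#include <algorithm>
#include <cstdint>

// Box in pixel coordinates (left, top, right, bottom).
struct Box {
    int left;
    int top;
    int right;
    int bottom;
};

// Scale one coordinate from the model axis to the source axis,
// truncating toward zero and clamping to [0, src_dim].
static int scale_coord(int v, int model_dim, int src_dim) {
    const int64_t scaled = static_cast<int64_t>(v) * src_dim / model_dim;
    return static_cast<int>(
        std::min<int64_t>(std::max<int64_t>(scaled, 0), src_dim));
}

// Map a box from model-input coordinates back to the source image,
// assuming the source was stretched to model_w x model_h without padding.
static Box map_box_to_source(const Box& b, int model_w, int model_h,
                             int src_w, int src_h) {
    return Box{scale_coord(b.left, model_w, src_w),
               scale_coord(b.top, model_h, src_h),
               scale_coord(b.right, model_w, src_w),
               scale_coord(b.bottom, model_h, src_h)};
}
```

As an illustration, a model-space box (247, 141, 324, 278), back-computed here and not taken from the log, maps to (1482, 475, 1944, 938) on a 3840x2160 source, which is the first detection the 4K run reports.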
Model execution flow
-
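Load the model, initialize rknn, query rknn information, run rknn, get the results, post-process, release the results, release post-processing resources, release rknn resources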
- unsigned char* model_data = load_model(model_name, &model_data_size);
- ret = rknn_init(&ctx, model_data, model_data_size, 0, NULL);
- ret = rknn_query(ctx, RKNN_QUERY_SDK_VERSION, &version, sizeof(rknn_sdk_version));
- ret = rknn_query(ctx, RKNN_QUERY_IN_OUT_NUM, &io_num, sizeof(io_num));
- ret = rknn_query(ctx, RKNN_QUERY_INPUT_ATTR, &(input_attrs[i]), sizeof(rknn_tensor_attr));
- ret = rknn_query(ctx, RKNN_QUERY_OUTPUT_ATTR, &(output_attrs[i]), sizeof(rknn_tensor_attr));
- ret = rknn_run(ctx, NULL);
- ret = rknn_outputs_get(ctx, io_num.n_output, outputs, NULL);
- post_process((int8_t*)outputs[0].buf, (int8_t*)outputs[1].buf, (int8_t*)outputs[2].buf,…);
- ret = rknn_outputs_release(ctx, io_num.n_output, outputs);
- deinitPostProcess();
- ret = rknn_destroy(ctx);
-