Basic idea: I recently wanted to try deploying NanoDet on a Jetson Nano, so this post records the training process. I had a dataset annotated with labelme (jpg images plus Pascal VOC style xml files), so off we go~
First, split the dataset into a training set and a validation set.
import os
import random
import time
import shutil

totalfilepath = r'/home/ubuntu/ty/hand_open'   # folder holding the jpg/xml pairs
saveBasePath = r"/home/ubuntu/ty/A"            # output root for the split
trainval_percent = 0.8                         # 80% of the samples go to train+val
train_percent = 0.8                            # of those, 80% go to train

total_xml = os.listdir(totalfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)
print("train and val size", tv)
print("train size", tr)

start = time.time()
test_num = 0
val_num = 0
train_num = 0
for i in indices:
    if total_xml[i].endswith('xml'):
        xmlname = total_xml[i]
        jpgname, _ = os.path.splitext(xmlname)
        jpgname = jpgname + ".jpg"
        if i in trainval:  # train and val set
            if i in train:
                directoryjpg = "trainjpg"
                directoryxml = "trainxml"
                train_num += 1
            else:
                directoryjpg = "validationjpg"
                directoryxml = "validationxml"
                val_num += 1  # count validation samples (the original mistakenly incremented train_num here)
            jpg_path = os.path.join(saveBasePath, directoryjpg)
            if not os.path.exists(jpg_path):
                os.mkdir(jpg_path)
            xml_path = os.path.join(saveBasePath, directoryxml)
            if not os.path.exists(xml_path):
                os.mkdir(xml_path)
            # copy the image and its annotation into the matching split folder
            fulljpgname = os.path.join(totalfilepath, jpgname)
            newfile = os.path.join(saveBasePath, directoryjpg, jpgname)
            shutil.copyfile(fulljpgname, newfile)
            fullxmlname = os.path.join(totalfilepath, xmlname)
            newfile = os.path.join(saveBasePath, directoryxml, xmlname)
            shutil.copyfile(fullxmlname, newfile)

end = time.time()
seconds = end - start
print("train total : " + str(train_num))
print("val total : " + str(val_num))
print("test total : " + str(test_num))
total_num = train_num + val_num + test_num
print("total number : " + str(total_num))
print("Time taken : {0} seconds".format(seconds))
Then clone the nanodet source code:
ubuntu@ubuntu:~$ git clone https://github.com/RangiLyu/nanodet
ubuntu@ubuntu:~$ cd nanodet/config/
ubuntu@ubuntu:~/nanodet/config$ cp nanodet_custom_xml_dataset.yml nanodet_custom_xml_datasetsxj.yml
The modified configuration file is:
ubuntu@ubuntu:~/nanodet/config$ cat nanodet_custom_xml_datasetsxj.yml
#Config File example
save_dir: workspace/nanodet_m
model:
  arch:
    name: GFL
    backbone:
      name: ShuffleNetV2
      model_size: 1.0x
      out_stages: [2,3,4]
      activation: LeakyReLU
    fpn:
      name: PAN
      in_channels: [116, 232, 464]
      out_channels: 96
      start_level: 0
      num_outs: 3
    head:
      name: NanoDetHead
      num_classes: 8 #Please fill in the number of categories (not include background category)
      input_channel: 96
      feat_channels: 96
      stacked_convs: 2
      share_cls_reg: True
      octave_base_scale: 5
      scales_per_octave: 1
      strides: [8, 16, 32]
      reg_max: 7
      norm_cfg:
        type: BN
      loss:
        loss_qfl:
          name: QualityFocalLoss
          use_sigmoid: True
          beta: 2.0
          loss_weight: 1.0
        loss_dfl:
          name: DistributionFocalLoss
          loss_weight: 0.25
        loss_bbox:
          name: GIoULoss
          loss_weight: 2.0
class_names: &class_names ['保密','保密','保密','保密','保密','保密','保密','保密'] #Please fill in the category names (not include background category)
data:
  train:
    name: xml_dataset
    class_names: *class_names
    img_path: /home/ubuntu/dataset/trainjpg #Please fill in train image path
    ann_path: /home/ubuntu/dataset/trainxml #Please fill in train xml path
    input_size: [320,320] #[w,h]
    keep_ratio: True
    pipeline:
      perspective: 0.0
      scale: [0.6, 1.4]
      stretch: [[1, 1], [1, 1]]
      rotation: 0
      shear: 0
      translate: 0.2
      flip: 0.5
      brightness: 0.2
      contrast: [0.8, 1.2]
      saturation: [0.8, 1.2]
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
  val:
    name: xml_dataset
    class_names: *class_names
    img_path: /home/ubuntu/dataset/validationjpg #Please fill in val image path
    ann_path: /home/ubuntu/dataset/validationxml #Please fill in val xml path
    input_size: [320,320] #[w,h]
    keep_ratio: True
    pipeline:
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
device:
  gpu_ids: [0]
  workers_per_gpu: 12
  batchsize_per_gpu: 160
schedule:
#  resume:
#  load_model: YOUR_MODEL_PATH
  optimizer:
    name: SGD
    lr: 0.14
    momentum: 0.9
    weight_decay: 0.0001
  warmup:
    name: linear
    steps: 300
    ratio: 0.1
  total_epochs: 190
  lr_schedule:
    name: MultiStepLR
    milestones: [130,160,175,185]
    gamma: 0.1
  val_intervals: 10
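Before starting training, it is worth confirming that num_classes and class_names in the yml match what the annotations actually contain. A small helper sketch (my addition, assuming the train xml folder path used in the config above):

import os
import xml.etree.ElementTree as ET

ann_path = "/home/ubuntu/dataset/trainxml"
names = set()
for f in os.listdir(ann_path):
    if f.endswith(".xml"):
        root = ET.parse(os.path.join(ann_path, f)).getroot()
        for obj in root.iter("object"):
            names.add(obj.findtext("name"))
print(len(names), sorted(names))  # should match num_classes / class_names in the yml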
Then start training:
ubuntu@ubuntu:~/nanodet$ python3 tools/train.py config/nanodet_custom_xml_datasetsxj.yml
Then run a quick test.
Watch the opencv-python version (I ran into: qt.qpa.plugin: Could not load the Qt platform plugin "xcb"):
ubuntu@ubuntu:~/nanodet$ pip3 install opencv-python==4.1.0.25
ubuntu@ubuntu:~/nanodet$ python3 demo/demo.py image --config config/nanodet_custom_xml_datasetsxj.yml --model workspace/nanodet_m/model_best/model_best.pth --path /home/ps/TESTINT8YOL5/dataset/1.jpg
I won't attach the result images here; the customer requires confidentiality~
Next, convert the model from pth to onnx. The --input_shape must match the [320,320] input_size set in nanodet_custom_xml_dataset.yml.
The conversion went smoothly. I later noticed that my exported model differs slightly from nihui's, but it works fine, so I didn't dig deeper.
ubuntu@ubuntu:~/nanodet$ python3 tools/export.py --cfg_path=config/nanodet_custom_xml_datasetsxj.yml --model_path=workspace/nanodet_m/model_best/model_best.pth --out_path=/home/ubuntu/nanodet/result.onnx --input_shape=320,320
For the newer version:
ubuntu@ubuntu:~/nanodet$ python3 tools/export_onnx.py --cfg_path=/home/ubuntu/nanodet/config/nanodet_custom_xml_dataset.yml --model_path=/home/ubuntu/nanodet/tools/workspace/nanodet_m/model_best/model_best.ckpt --out_path=/home/ubuntu/nanodet/result.onnx --input_shape=320,320
The model export then finishes successfully:
%838 : Float(1:3200, 32:100, 100:1, requires_grad=1, device=cpu) = onnx::Reshape(%832, %837) # /home/ps/TESTINT8YOL5/nanodet/nanodet/model/head/nanodet_head.py:128:0
%839 : Float(1:3200, 100:1, 32:100, requires_grad=1, device=cpu) = onnx::Transpose[perm=[0, 2, 1]](%838) # /home/ps/TESTINT8YOL5/nanodet/nanodet/model/head/nanodet_head.py:128:0
return (%792, %814, %836, %795, %817, %839)
finished exporting onnx
Model saved to: /home/ubuntu/nanodet/result.onnx
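Optionally, verify the raw export before simplifying it (my addition, assuming the onnx package is installed and the output path above):

import onnx

model = onnx.load("/home/ubuntu/nanodet/result.onnx")
onnx.checker.check_model(model)  # raises if the exported graph is malformed
print("graph outputs:", [o.name for o in model.graph.output])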
Then simplify the model:
ubuntu@ubuntu:~/nanodet$ pip install onnx-simplifier
ubuntu@ubuntu:~/nanodet$ python -m onnxsim result.onnx result-smi.onnx
Simplifying...
Checking 0/3...
Checking 1/3...
Checking 2/3...
Ok!
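As a quick sanity check of the simplified model (my addition, assuming onnxruntime is installed and the file is named result-smi.onnx as above), load it, feed a dummy 1x3x320x320 tensor, and print the output blob names and shapes; these are the names the ncnn demo later extracts:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("result-smi.onnx")
inp = sess.get_inputs()[0]
print("input:", inp.name, inp.shape)
outs = sess.run(None, {inp.name: np.zeros((1, 3, 320, 320), dtype=np.float32)})
for meta, o in zip(sess.get_outputs(), outs):
    print(meta.name, o.shape)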
I then used 大老师's web-based tool (one-click conversion of Caffe, ONNX, TensorFlow to NCNN, MNN, Tengine) to convert the simplified onnx to ncnn.
Next, deploy with ncnn: take the ncnn demo source that ships with nanodet, modify its main function, and adjust nanodet.h according to the generated *.param and *.bin files.
The program works as follows: nanodet performs the object detection; based on the target position, region-search and movement commands are generated, and both the search and movement steps forward their data over the NVIDIA Nano's serial port.
Core code (the yolov5 ncnn example plus the serial forwarding routine):
#include "ncnn/benchmark.h"
#include "ncnn/cpu.h"
#include "ncnn/datareader.h"
#include "ncnn/net.h"
#include "ncnn/gpu.h"
#include "ncnn/layer.h"
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
#include <vector>
#include <stdio.h>
#include <iostream>
#include <string>
#include <opencv4/opencv2/opencv.hpp>
#include <opencv4/opencv2/core.hpp>
#include <opencv4/opencv2/highgui.hpp>
#include <opencv4/opencv2/imgproc.hpp>
#include <opencv4/opencv2/objdetect.hpp>
#include <opencv4/opencv2/imgproc/types_c.h>
#include <opencv4/opencv2/videoio.hpp>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>
#include<unistd.h>
#include<stdio.h>
#include<sys/types.h>
#include<sys/stat.h>
using namespace cv;
using namespace std;
using namespace ncnn;
static ncnn::Net yolov5;
class YoloV5Focus : public ncnn::Layer
{
public:
YoloV5Focus()
{
one_blob_only = true;
}
virtual int forward(const ncnn::Mat& bottom_blob, ncnn::Mat& top_blob, const ncnn::Option& opt) const
{
int w = bottom_blob.w;
int h = bottom_blob.h;
int channels = bottom_blob.c;
int outw = w / 2;
int outh = h / 2;
int outc = channels * 4;
top_blob.create(outw, outh, outc, 4u, 1, opt.blob_allocator);
if (top_blob.empty())
return -100;
#pragma omp parallel for num_threads(opt.num_threads)
for (int p = 0; p < outc; p++)
{
const float* ptr = bottom_blob.channel(p % channels).row((p / channels) % 2) + ((p / channels) / 2);
float* outptr = top_blob.channel(p);
for (int i = 0; i < outh; i++)
{
for (int j = 0; j < outw; j++)
{
*outptr = *ptr;
outptr += 1;
ptr += 2;
}
ptr += w;
}
}
return 0;
}
};
DEFINE_LAYER_CREATOR(YoloV5Focus)
struct Object
{
float x;
float y;
float w;
float h;
int label;
float prob;
};
static inline float sigmoid(float x)
{
return static_cast<float>(1.f / (1.f + exp(-x)));
}
static void generate_proposals(const ncnn::Mat& anchors, int stride, const ncnn::Mat& in_pad, const ncnn::Mat& feat_blob, float prob_threshold, std::vector<Object>& objects)
{
const int num_grid = feat_blob.h;
int num_grid_x;
int num_grid_y;
if (in_pad.w > in_pad.h)
{
num_grid_x = in_pad.w / stride;
num_grid_y = num_grid / num_grid_x;
}
else
{
num_grid_y = in_pad.h / stride;
num_grid_x = num_grid / num_grid_y;
}
const int num_class = feat_blob.w - 5;
const int num_anchors = anchors.w / 2;
for (int q = 0; q < num_anchors; q++)
{
const float anchor_w = anchors[q * 2];
const float anchor_h = anchors[q * 2 + 1];
const ncnn::Mat feat = feat_blob.channel(q);
for (int i = 0; i < num_grid_y; i++)
{
for (int j = 0; j < num_grid_x; j++)
{
const float* featptr = feat.row(i * num_grid_x + j);
// find class index with max class score
int class_index = 0;
float class_score = -FLT_MAX;
for (int k = 0; k < num_class; k++)
{
float score = featptr[5 + k];
if (score > class_score)
{
class_index = k;
class_score = score;
}
}
float box_score = featptr[4];
float confidence = sigmoid(box_score) * sigmoid(class_score);
if (confidence >= prob_threshold)
{
// yolov5/models/yolo.py Detect forward
// y = x[i].sigmoid()
// y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
// y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
float dx = sigmoid(featptr[0]);
float dy = sigmoid(featptr[1]);
float dw = sigmoid(featptr[2]);
float dh = sigmoid(featptr[3]);
float pb_cx = (dx * 2.f - 0.5f + j) * stride;
float pb_cy = (dy * 2.f - 0.5f + i) * stride;
float pb_w = pow(dw * 2.f, 2) * anchor_w;
float pb_h = pow(dh * 2.f, 2) * anchor_h;
float x0 = pb_cx - pb_w * 0.5f;
float y0 = pb_cy - pb_h * 0.5f;
float x1 = pb_cx + pb_w * 0.5f;
float y1 = pb_cy + pb_h * 0.5f;
Object obj;
obj.x = x0;
obj.y = y0;
obj.w = x1 - x0;
obj.h = y1 - y0;
obj.label = class_index;
obj.prob = confidence;
objects.push_back(obj);
}
}
}
}
}
static inline float intersection_area(const Object& a, const Object& b)
{
if (a.x > b.x + b.w || a.x + a.w < b.x || a.y > b.y + b.h || a.y + a.h < b.y)
{
// no intersection
return 0.f;
}
float inter_width = std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x);
float inter_height = std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y);
return inter_width * inter_height;
}
static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
{
int i = left;
int j = right;
float p = faceobjects[(left + right) / 2].prob;
while (i <= j)
{
while (faceobjects[i].prob > p)
i++;
while (faceobjects[j].prob < p)
j--;
if (i <= j)
{
// swap
std::swap(faceobjects[i], faceobjects[j]);
i++;
j--;
}
}
#pragma omp parallel sections
{
#pragma omp section
{
if (left < j) qsort_descent_inplace(faceobjects, left, j);
}
#pragma omp section
{
if (i < right) qsort_descent_inplace(faceobjects, i, right);
}
}
}
static void qsort_descent_inplace(std::vector<Object>& faceobjects)
{
if (faceobjects.empty())
return;
qsort_descent_inplace(faceobjects, 0, faceobjects.size() - 1);
}
static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
{
picked.clear();
const int n = faceobjects.size();
std::vector<float> areas(n);
for (int i = 0; i < n; i++)
{
areas[i] = faceobjects[i].w * faceobjects[i].h;
}
for (int i = 0; i < n; i++)
{
const Object& a = faceobjects[i];
int keep = 1;
for (int j = 0; j < (int)picked.size(); j++)
{
const Object& b = faceobjects[picked[j]];
// intersection over union
float inter_area = intersection_area(a, b);
float union_area = areas[i] + areas[picked[j]] - inter_area;
// float IoU = inter_area / union_area
if (inter_area / union_area > nms_threshold)
keep = 0;
}
if (keep)
picked.push_back(i);
}
}
int sendSerialPort(const char *W_BUF)
{
/*
int fd=-1;
fd=open("/dev/ttyUSB0",O_RDWR);
if(fd<0)
{ printf("open fail \r\n");
return -1;
}
write(fd,W_BUF,7);
close(fd);
*/
int tty_fd = -1 ;
int rv = -1 ;
struct termios options;
tty_fd = open("/dev/ttyTHS1",O_RDWR|O_NOCTTY|O_NDELAY) ; //打开串口设备
if(tty_fd < 0)
{
printf("open tty failed:%s\n", strerror(errno)) ;
// goto cleanup ;
return -1;
}
printf("open devices sucessful!\n") ;
memset(&options, 0, sizeof(options)) ;
rv = tcgetattr(tty_fd, &options); // read the current serial attributes
if(rv != 0)
{
printf("tcgetattr() failed:%s\n",strerror(errno)) ;
// goto cleanup ;
return -1;
}
options.c_cflag |= (CLOCAL | CREAD); // CREAD: enable the receiver, CLOCAL: ignore modem control lines
options.c_cflag &= ~CSIZE; // clear the character-size bits first
options.c_cflag |= CS8; // 8 data bits
options.c_cflag &= ~PARENB; // no parity
cfsetispeed(&options, B500000);
cfsetospeed(&options, B500000);
options.c_cflag &= ~CSTOPB;
options.c_cc[VTIME] = 0;
options.c_cc[VMIN] = 0;
tcflush(tty_fd ,TCIFLUSH);
if((tcsetattr(tty_fd, TCSANOW,&options))!=0)
{
printf("tcsetattr failed:%s\n", strerror(errno));
//goto cleanup ;
return -1;
}
//while(1)
//{
std::cout<<W_BUF<<std::endl;
rv = write(tty_fd, W_BUF,strlen(W_BUF)) ;
if(rv < 0)
{
printf("Write() error:%s\n",strerror(errno)) ;
//goto cleanup ;
return -1;
}
//sleep(3) ;
//}
cleanup:
close(tty_fd) ;
return 0 ;
}
string direction(string str,int h,int w,int x0,int y0,int x1,int y1)
{
int A = w / 2;
int B = h / 2;
if (A >= x0 && A <= x1 && B >= y0 && B <= y1) {
return "000"+str;
} else if (A >= x1) {
return "001"+str;
} else if(A<=x0)
{
return "010"+str;
} else if (B >= y1) {
return "011"+str;
} else if(B <=y0)
{
return "100"+str;
}
return "000" + str; // the cases above are exhaustive; this fallback just silences the missing-return warning
}
int demo(cv::Mat& image, ncnn::Net &detector, int detector_size_width, int detector_size_height)
{
// static const char* direction[]={"000000","010000","100000","110000"};
static const char* class_names[] = { "start","A","H","W","Z","car","person","house" };
string class_code[] = { "00001","0010","0011","0100","0101","0110","0111","1000" }; // the height level 1-2 axis, the low level aims
const int target_size = 640;
int width=image.cols;
int height=image.rows;
// letterbox pad to multiple of 32
int w = width;
int h = height;
float scale = 1.f;
if (w > h)
{
scale = (float)target_size / w;
w = target_size;
h = h * scale;
}
else
{
scale = (float)target_size / h;
h = target_size;
w = w * scale;
}
ncnn::Mat in = ncnn::Mat::from_pixels_resize(image.data, ncnn::Mat::PIXEL_BGR2RGB,\
image.cols, image.rows, w, h);
// pad to target_size rectangle
// yolov5/utils/datasets.py letterbox
int wpad = (w + 31) / 32 * 32 - w;
int hpad = (h + 31) / 32 * 32 - h;
ncnn::Mat in_pad;
ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);
// yolov5
std::vector<Object> objects;
{
const float prob_threshold = 0.25f;
const float nms_threshold = 0.45f;
const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
in_pad.substract_mean_normalize(0, norm_vals);
ncnn::Extractor ex = yolov5.create_extractor();
ex.set_num_threads(4);
ex.input("images", in_pad);
std::vector<Object> proposals;
// anchor setting from yolov5/models/yolov5s.yaml
// stride 8
{
ncnn::Mat out;
ex.extract("output", out);
ncnn::Mat anchors(6);
anchors[0] = 10.f;
anchors[1] = 13.f;
anchors[2] = 16.f;
anchors[3] = 30.f;
anchors[4] = 33.f;
anchors[5] = 23.f;
std::vector<Object> objects8;
generate_proposals(anchors, 8, in_pad, out, prob_threshold, objects8);
proposals.insert(proposals.end(), objects8.begin(), objects8.end());
}
// stride 16
{
ncnn::Mat out;
ex.extract("417", out);
ncnn::Mat anchors(6);
anchors[0] = 30.f;
anchors[1] = 61.f;
anchors[2] = 62.f;
anchors[3] = 45.f;
anchors[4] = 59.f;
anchors[5] = 119.f;
std::vector<Object> objects16;
generate_proposals(anchors, 16, in_pad, out, prob_threshold, objects16);
proposals.insert(proposals.end(), objects16.begin(), objects16.end());
}
// stride 32
{
ncnn::Mat out;
ex.extract("437", out);
ncnn::Mat anchors(6);
anchors[0] = 116.f;
anchors[1] = 90.f;
anchors[2] = 156.f;
anchors[3] = 198.f;
anchors[4] = 373.f;
anchors[5] = 326.f;
std::vector<Object> objects32;
generate_proposals(anchors, 32, in_pad, out, prob_threshold, objects32);
proposals.insert(proposals.end(), objects32.begin(), objects32.end());
}
// sort all proposals by score from highest to lowest
qsort_descent_inplace(proposals);
// apply nms with nms_threshold
std::vector<int> picked;
nms_sorted_bboxes(proposals, picked, nms_threshold);
int count = picked.size();
objects.resize(count);
if(count){
for (int i = 0; i < count; i++)
{
objects[i] = proposals[picked[i]];
// adjust offset to original unpadded
float x0 = (objects[i].x - (wpad / 2)) / scale;
float y0 = (objects[i].y - (hpad / 2)) / scale;
float x1 = (objects[i].x + objects[i].w - (wpad / 2)) / scale;
float y1 = (objects[i].y + objects[i].h - (hpad / 2)) / scale;
// clip
x0 = std::max(std::min(x0, (float)(width - 1)), 0.f);
y0 = std::max(std::min(y0, (float)(height - 1)), 0.f);
x1 = std::max(std::min(x1, (float)(width - 1)), 0.f);
y1 = std::max(std::min(y1, (float)(height - 1)), 0.f);
objects[i].x = x0;
objects[i].y = y0;
objects[i].w = x1 - x0;
objects[i].h = y1 - y0;
float centerX=x0+(x1-x0)/2;
float centerY=y0+(y1-y0)/2;
string codestr = direction(class_code[objects[i].label], image.cols, image.rows, x0, y0, x1, y1);
const char *data = codestr.c_str(); // keep the std::string alive; calling .c_str() on a temporary would leave a dangling pointer
//cv::rectangle (image, cv::Point(x0, y0), cv::Point(x1, y1), cv::Scalar(255, 255, 0), 1, 1, 0);
// char text[256];
// sprintf(text, "%s %.1f%%", class_names[objects[i].label], objects[i].prob);
// int baseLine = 0;
// cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
// cv::putText(image, text, cv::Point(x1, y1 + label_size.height),cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 0));
//std::cout<<class_code[objects[i].label]<<endl;
//const char *data=class_code[objects[i].label].c_str();
sendSerialPort(data);
}
}else{
string temp="000000";
std::cout<<temp<<endl;
const char *data=temp.c_str();
sendSerialPort(data);
}
}
return 0;
}
string gstreamer_pipeline (int capture_width, int capture_height, int display_width, int display_height, int framerate, int flip_method)
{
return "nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)" + to_string(capture_width) + ", height=(int)" +
to_string(capture_height) + ", format=(string)NV12, framerate=(fraction)" + to_string(framerate) +
"/1 ! nvvidconv flip-method=" + to_string(flip_method) + " ! video/x-raw, width=(int)" + to_string(display_width) + ", height=(int)" +
to_string(display_height) + ", format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink";
}
// camera test
int test_cam()
{
int capture_width = 1280 ;
int capture_height = 720 ;
int display_width = 1280 ;
int display_height = 720 ;
int framerate = 60 ;
int flip_method = 0 ;
// build the GStreamer pipeline string
string pipeline = gstreamer_pipeline(capture_width,
capture_height,
display_width,
display_height,
framerate,
flip_method);
std::cout << "Using GStreamer pipeline: \n\t" << pipeline << "\n";
// bind the pipeline to the video capture
VideoCapture cap(pipeline, CAP_GSTREAMER);
// set up the ncnn detector
ncnn::Net detector;
yolov5.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);
yolov5.load_param("/home/nano/DetectDemo/model/yolov5s-smi.param");
yolov5.load_model("/home/nano/DetectDemo/model/yolov5s-smi.bin");
int detector_size_width = 320;
int detector_size_height = 320;
//cv::VideoCapture cap(0);
cv::Mat src;
// make sure the capture opened
if (!cap.isOpened())
{
cout << "Error opening video stream or file" << endl;
return -1;
}
cv::Mat frame;
long int i=0;
while (1) {
if(i==8*500) i=0;
// Capture frame-by-frame
cap >> frame;
// If the frame is empty, break immediately
if(frame.empty()) break;
//******************
if(i%8==0){
double start = ncnn::get_current_time();
demo(frame, detector, detector_size_width, detector_size_height);
double end = ncnn::get_current_time();
double time = end - start;
printf("Time:%7.2f \n", time);
}
i++;
cv::imshow("Display", frame);
//imwrite("F:\\ll.jpg", src);
// short wait so the window refreshes and key events are processed
cv::waitKey(1);
}
cap.release();
///home/ubuntu/yolov5/yolov5s-smi.param
return 0;
}
int main()
{
test_cam();
return 0;
}
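To bench-test the serial forwarding without the rest of the airframe, a minimal receiver sketch for the peer side (my addition, assuming pyserial and that the peer sees the link as /dev/ttyUSB0; the Nano side opens /dev/ttyTHS1 at 500000 baud, 8N1, as in sendSerialPort above):

import serial

with serial.Serial("/dev/ttyUSB0", 500000, timeout=1) as port:
    while True:
        data = port.read(16)  # commands are short codes such as "000000"; no terminator is sent
        if data:
            print(data.decode(errors="replace"))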
Drone prototype picture (not included here); it will be integrated into ROS later, and the code is still being tidied up.
A supplementary example: the newer nanodet versions save models with the .ckpt suffix.
After the nanodet source code was updated, the model conversion steps follow the official repo:
ubuntu@ubuntu:~/nanodet$ python tools/export_onnx.py --cfg_path config/nanodet_custom_xml_datasetsxj.yml --model_path workspace/nanodet_m/model_best/model_best.ckpt --input_shape=320,320
ubuntu@ubuntu:~/nanodet$ python3 -m onnxsim nanodet.onnx nanodet_sim.onnx
Simplifying...
Checking 0/3...
Checking 1/3...
Checking 2/3...
Ok!
The rest of the conversion is the same as before.
Model post-processing:
Just adapt the demo to these output names (ncnn code snippet):
// stride 8
{
ncnn::Mat cls_pred;
ncnn::Mat dis_pred;
ex.extract("cls_pred_stride_8", cls_pred);
ex.extract("dis_pred_stride_8", dis_pred);
std::vector<Object> objects8;
generate_proposals(cls_pred, dis_pred, 8, in_pad, prob_threshold, objects8);
proposals.insert(proposals.end(), objects8.begin(), objects8.end());
}
// stride 16
{
ncnn::Mat cls_pred;
ncnn::Mat dis_pred;
ex.extract("cls_pred_stride_16", cls_pred);
ex.extract("dis_pred_stride_16", dis_pred);
std::vector<Object> objects16;
generate_proposals(cls_pred, dis_pred, 16, in_pad, prob_threshold, objects16);
proposals.insert(proposals.end(), objects16.begin(), objects16.end());
}
// stride 32
{
ncnn::Mat cls_pred;
ncnn::Mat dis_pred;
ex.extract("cls_pred_stride_32", cls_pred);
ex.extract("dis_pred_stride_32", dis_pred);
std::vector<Object> objects32;
generate_proposals(cls_pred, dis_pred, 32, in_pad, prob_threshold, objects32);
proposals.insert(proposals.end(), objects32.begin(), objects32.end());
}
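If you are unsure which blob names your own export produced, you can list the onnx graph outputs first (my addition; the names usually survive the onnx2ncnn conversion, but verify them against the generated .param file):

import onnx

model = onnx.load("nanodet_sim.onnx")
print([o.name for o in model.graph.output])  # e.g. cls_pred_stride_8, dis_pred_stride_8, ...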
Training nanodet on a COCO-style dataset: convert the jpg/xml (Pascal VOC style) annotations to COCO json.
Conversion script:
import xml.etree.ElementTree as ET
import os
import json

coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []

category_set = dict()
image_set = set()

category_item_id = 0
image_id = 0
annotation_id = 0


def addCatItem(name):
    global category_item_id
    category_item = dict()
    category_item['supercategory'] = 'none'
    category_item_id += 1
    category_item['id'] = category_item_id
    category_item['name'] = name
    coco['categories'].append(category_item)
    category_set[name] = category_item_id
    return category_item_id


def addImgItem(file_name, size):
    global image_id
    if file_name is None:
        raise Exception('Could not find filename tag in xml file.')
    if size['width'] is None:
        raise Exception('Could not find width tag in xml file.')
    if size['height'] is None:
        raise Exception('Could not find height tag in xml file.')
    image_id += 1
    image_item = dict()
    image_item['id'] = image_id
    image_item['file_name'] = file_name
    image_item['width'] = size['width']
    image_item['height'] = size['height']
    coco['images'].append(image_item)
    image_set.add(file_name)
    return image_id


def addAnnoItem(object_name, image_id, category_id, bbox):
    global annotation_id
    annotation_item = dict()
    annotation_item['segmentation'] = []
    seg = []
    # bbox[] is x,y,w,h
    # left_top
    seg.append(bbox[0])
    seg.append(bbox[1])
    # left_bottom
    seg.append(bbox[0])
    seg.append(bbox[1] + bbox[3])
    # right_bottom
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1] + bbox[3])
    # right_top
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1])
    annotation_item['segmentation'].append(seg)
    annotation_item['area'] = bbox[2] * bbox[3]
    annotation_item['iscrowd'] = 0
    annotation_item['ignore'] = 0
    annotation_item['image_id'] = image_id
    annotation_item['bbox'] = bbox
    annotation_item['category_id'] = category_id
    annotation_id += 1
    annotation_item['id'] = annotation_id
    coco['annotations'].append(annotation_item)


def parseXmlFiles(xml_path):
    for f in os.listdir(xml_path):
        if not f.endswith('.xml'):
            continue

        bndbox = dict()
        size = dict()
        current_image_id = None
        current_category_id = None
        file_name = None
        size['width'] = None
        size['height'] = None
        size['depth'] = None

        xml_file = os.path.join(xml_path, f)
        print(xml_file)

        tree = ET.parse(xml_file)
        root = tree.getroot()
        if root.tag != 'annotation':
            raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))

        # elem is <folder>, <filename>, <size>, <object>
        for elem in root:
            current_parent = elem.tag
            current_sub = None
            object_name = None

            if elem.tag == 'folder':
                continue

            if elem.tag == 'filename':
                file_name = elem.text
                if file_name in category_set:
                    raise Exception('file_name duplicated')

            # add img item only after parse <size> tag
            elif current_image_id is None and file_name is not None and size['width'] is not None:
                if file_name not in image_set:
                    current_image_id = addImgItem(file_name, size)
                    print('add image with {} and {}'.format(file_name, size))
                else:
                    raise Exception('duplicated image: {}'.format(file_name))

            # subelem is <width>, <height>, <depth>, <name>, <bndbox>
            for subelem in elem:
                bndbox['xmin'] = None
                bndbox['xmax'] = None
                bndbox['ymin'] = None
                bndbox['ymax'] = None

                current_sub = subelem.tag
                if current_parent == 'object' and subelem.tag == 'name':
                    object_name = subelem.text
                    if object_name not in category_set:
                        current_category_id = addCatItem(object_name)
                    else:
                        current_category_id = category_set[object_name]
                elif current_parent == 'size':
                    if size[subelem.tag] is not None:
                        raise Exception('xml structure broken at size tag.')
                    size[subelem.tag] = int(subelem.text)

                # option is <xmin>, <ymin>, <xmax>, <ymax>, when subelem is <bndbox>
                for option in subelem:
                    if current_sub == 'bndbox':
                        if bndbox[option.tag] is not None:
                            raise Exception('xml structure corrupted at bndbox tag.')
                        bndbox[option.tag] = int(option.text)

                # only after parse the <object> tag
                if bndbox['xmin'] is not None:
                    if object_name is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_image_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_category_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    bbox = []
                    # x
                    bbox.append(bndbox['xmin'])
                    # y
                    bbox.append(bndbox['ymin'])
                    # w
                    bbox.append(bndbox['xmax'] - bndbox['xmin'])
                    # h
                    bbox.append(bndbox['ymax'] - bndbox['ymin'])
                    print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id, bbox))
                    addAnnoItem(object_name, current_image_id, current_category_id, bbox)


if __name__ == '__main__':
    xml_path = r"C:\Users\sxj\Desktop\dataset\valxml"
    json_file = r"C:\Users\sxj\Desktop\dataset\val.json"
    # xml_path = 'Annotations'
    # json_file = 'instances.json'
    parseXmlFiles(xml_path)
    json.dump(coco, open(json_file, 'w'))
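A quick look at the generated json (my addition, using the val.json path from the script) to confirm the counts and category names before pointing the config at it:

import json

with open(r"C:\Users\sxj\Desktop\dataset\val.json") as f:
    d = json.load(f)
print(len(d["images"]), "images,", len(d["annotations"]), "annotations")
print([c["name"] for c in d["categories"]])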
After generating the data, modify the configuration file nanodet-plus-m_416.yml:
# nanodet-plus-m_416
# COCO mAP(0.5:0.95) = 0.304
# AP_50 = 0.459
# AP_75 = 0.317
# AP_small = 0.106
# AP_m = 0.322
# AP_l = 0.477
save_dir: workspace/nanodet-plus-m_416
model:
  weight_averager:
    name: ExpMovingAverager
    decay: 0.9998
  arch:
    name: NanoDetPlus
    detach_epoch: 10
    backbone:
      name: ShuffleNetV2
      model_size: 1.0x
      out_stages: [2,3,4]
      activation: LeakyReLU
    fpn:
      name: GhostPAN
      in_channels: [116, 232, 464]
      out_channels: 96
      kernel_size: 5
      num_extra_level: 1
      use_depthwise: True
      activation: LeakyReLU
    head:
      name: NanoDetPlusHead
      num_classes: 14
      input_channel: 96
      feat_channels: 96
      stacked_convs: 2
      kernel_size: 5
      strides: [8, 16, 32, 64]
      activation: LeakyReLU
      reg_max: 7
      norm_cfg:
        type: BN
      loss:
        loss_qfl:
          name: QualityFocalLoss
          use_sigmoid: True
          beta: 2.0
          loss_weight: 1.0
        loss_dfl:
          name: DistributionFocalLoss
          loss_weight: 0.25
        loss_bbox:
          name: GIoULoss
          loss_weight: 2.0
    # Auxiliary head, only use in training time.
    aux_head:
      name: SimpleConvHead
      num_classes: 14
      input_channel: 192
      feat_channels: 192
      stacked_convs: 4
      strides: [8, 16, 32, 64]
      activation: LeakyReLU
      reg_max: 7
data:
  train:
    name: CocoDataset
    img_path: C:\Users\sxj\Desktop\dataset\trainjpg
    ann_path: C:\Users\sxj\Desktop\dataset\train.json
    input_size: [416,416] #[w,h]
    keep_ratio: False
    pipeline:
      perspective: 0.0
      scale: [0.6, 1.4]
      stretch: [[0.8, 1.2], [0.8, 1.2]]
      rotation: 0
      shear: 0
      translate: 0.2
      flip: 0.5
      brightness: 0.2
      contrast: [0.6, 1.4]
      saturation: [0.5, 1.2]
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
  val:
    name: CocoDataset
    img_path: C:\Users\sxj\Desktop\dataset\valjpg
    ann_path: C:\Users\sxj\Desktop\dataset\val.json
    input_size: [416,416] #[w,h]
    keep_ratio: False
    pipeline:
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
device:
  gpu_ids: [0]
  workers_per_gpu: 1   # keep this small, otherwise training can hang here
  batchsize_per_gpu: 8 # keep this small, otherwise training can hang here
schedule:
#  resume:
#  load_model:
  optimizer:
    name: AdamW
    lr: 0.001
    weight_decay: 0.05
  warmup:
    name: linear
    steps: 500
    ratio: 0.0001
  total_epochs: 300
  lr_schedule:
    name: CosineAnnealingLR
    T_max: 300
    eta_min: 0.00005
  val_intervals: 10
grad_clip: 35
D:\Python39\python.exe F:/sxj731533730demo/nanodet/tools/train.py F:\sxj731533730demo\nanodet\config\nanodet-plus-m_416.yml
loading annotations into memory...
[NanoDet][01-10 14:37:09]INFO:Setting up data...
Done (t=0.01s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
model size is 1.0x
[NanoDet][01-10 14:37:09]INFO:Creating model...
init weights...
=> loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
Finish initialize NanoDet-Plus Head.
D:\Python39\lib\site-packages\pytorch_lightning\callbacks\progress\progress.py:21: LightningDeprecationWarning: `ProgressBar` has been deprecated in v1.5 and will be removed in v1.7. It has been renamed to `TQDMProgressBar` instead.
rank_zero_deprecation(
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
D:\Python39\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:143: LightningDeprecationWarning: The `LightningModule.get_progress_bar_dict` method was deprecated in v1.5 and will be removed in v1.7. Please use the `ProgressBarBase.get_metrics` instead.
rank_zero_deprecation(
D:\Python39\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:284: LightningDeprecationWarning: Base `LightningModule.on_train_batch_end` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
rank_zero_deprecation(
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
------------------------------------------
0 | model | NanoDetPlus | 4.2 M
1 | avg_model | NanoDetPlus | 4.2 M
------------------------------------------
8.4 M Trainable params
0 Non-trainable params
8.4 M Total params
33.529 Total estimated model params size (MB)
[NanoDet][01-10 14:37:12]INFO:Weight Averaging is enabled
D:\Python39\lib\site-packages\pytorch_lightning\trainer\data_loading.py:132: UserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
D:\Python39\lib\site-packages\pytorch_lightning\trainer\data_loading.py:132: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
D:\Python39\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
D:\Python39\lib\site-packages\torch\nn\functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
warnings.warn(
[NanoDet][01-10 14:37:20]INFO:Train|Epoch1/300|Iter0(0)| lr:1.00e-07| loss_qfl:0.5994| loss_bbox:1.1017| loss_dfl:0.4796| aux_loss_qfl:0.5966| aux_loss_bbox:1.1563| aux_loss_dfl:0.5350|
D:\Python39\lib\site-packages\torch\nn\functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
warnings.warn(
[NanoDet][01-10 14:37:22]INFO:Train|Epoch1/300|Iter1(1)| lr:2.10e-06| loss_qfl:0.8244| loss_bbox:0.9529| loss_dfl:0.5201| aux_loss_qfl:0.8318| aux_loss_bbox:0.9386| aux_loss_dfl:0.5379|
[NanoDet][01-10 14:37:22]INFO:Train|Epoch1/300|Iter2(2)| lr:4.10e-06| loss_qfl:0.7075| loss_bbox:1.0347| loss_dfl:0.5200| aux_loss_qfl:0.7199| aux_loss_bbox:0.9950| aux_loss_dfl:0.5338|
[NanoDet][01-10 14:37:23]INFO:Train|Epoch1/300|Iter3(3)| lr:6.10e-06| loss_qfl:0.6569| loss_bbox:1.1143| loss_dfl:0.5187| aux_loss_qfl:0.6602| aux_loss_bbox:1.0917| aux_loss_dfl:0.5386|
[NanoDet][01-10 14:37:23]INFO:Train|Epoch1/300|Iter4(4)| lr:8.10e-06| loss_qfl:0.6826| loss_bbox:1.0714| loss_dfl:0.5200| aux_loss_qfl:0.6791| aux_loss_bbox:1.0339| aux_loss_dfl:0.5320|
[NanoDet][01-10 14:37:24]INFO:Train|Epoch1/300|Iter5(5)| lr:1.01e-05| loss_qfl:0.7539| loss_bbox:1.0295| loss_dfl:0.5191| aux_loss_qfl:0.7168| aux_loss_bbox:0.9948| aux_loss_dfl:0.5305|
[NanoDet][01-10 14:37:24]INFO:Train|Epoch1/300|Iter6(6)| lr:1.21e-05| loss_qfl:0.7362| loss_bbox:1.0625| loss_dfl:0.5196| aux_loss_qfl:0.7167| aux_loss_bbox:1.0083| aux_loss_dfl:0.5265|
[NanoDet][01-10 14:37:25]INFO:Train|Epoch1/300|Iter7(7)| lr:1.41e-05| loss_qfl:0.6765| loss_bbox:1.1261| loss_dfl:0.5204| aux_loss_qfl:0.6551| aux_loss_bbox:1.0559| aux_loss_dfl:0.5232|
[NanoDet][01-10 14:37:26]INFO:Train|Epoch1/300|Iter8(8)| lr:1.61e-05| loss_qfl:0.7322| loss_bbox:1.0200| loss_dfl:0.5194| aux_loss_qfl:0.6927| aux_loss_bbox:1.0168| aux_loss_dfl:0.5345|
[NanoDet][01-10 14:37:26]INFO:Train|Epoch1/300|Iter9(9)| lr:1.81e-05| loss_qfl:0.6557| loss_bbox:1.2017| loss_dfl:0.5200| aux_loss_qfl:0.5928| aux_loss_bbox:1.0757| aux_loss_dfl:0.5204|
[NanoDet][01-10 14:37:27]INFO:Train|Epoch1/300|Iter10(10)| lr:2.01e-05| loss_qfl:0.6138| loss_bbox:1.1489| loss_dfl:0.5216| aux_loss_qfl:0.5608| aux_loss_bbox:1.0903| aux_loss_dfl:0.5191|
[NanoDet][01-10 14:37:27]INFO:Train|Epoch1/300|Iter11(11)| lr:2.21e-05| loss_qfl:0.6394| loss_bbox:1.1630| loss_dfl:0.5210| aux_loss_qfl:0.6133| aux_loss_bbox:1.0757| aux_loss_dfl:0.5074|
[NanoDet][01-10 14:37:28]INFO:Train|Epoch1/300|Iter12(12)| lr:2.41e-05| loss_qfl:0.6971| loss_bbox:1.1372| loss_dfl:0.5184| aux_loss_qfl:0.6033| aux_loss_bbox:1.0216| aux_loss_dfl:0.5039|
[NanoDet][01-10 14:37:28]INFO:Train|Epoch1/300|Iter13(13)| lr:2.61e-05| loss_qfl:0.8788| loss_bbox:0.9752| loss_dfl:0.5172| aux_loss_qfl:0.6990| aux_loss_bbox:0.9284| aux_loss_dfl:0.5035|