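(Unfinished; to be completed.)
Get the Faster RCNN source code
(Many open-source implementations exist, and the paper provides one as well, so the details are omitted here.)
Replace the dataset with your own (images + label files)
(The label files need to be generated with labelImg.)
Open a terminal and activate the gpupytorch environment
Run voc_annotation.py to generate the files used for training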
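For readers who have not seen a label file before, here is a minimal sketch of reading one with the Python standard library. It assumes labelImg was used in Pascal VOC mode, which writes one XML file per image. The sample XML, the file name, and the box coordinates are made up for illustration; only the class name boar comes from this dataset. The helper name parse_voc_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout (all values are made up).
SAMPLE_XML = """
<annotation>
  <filename>example_0001.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>boar</name>
    <difficult>0</difficult>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>430</xmax><ymax>390</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_xml(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bnd = obj.find("bndbox")
        # Coordinates may be stored as "120" or "120.0", so go through float.
        coords = [int(float(bnd.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((name, *coords))
    return boxes


print(parse_voc_xml(SAMPLE_XML))
```

Checking a few files this way before training is a cheap way to catch misspelled class names or empty annotations.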
E:\DeepLearningModel\Model01>activate gpupytorch
(gpupytorch) E:\DeepLearningModel\Model01>python voc_annotation.py
The output is as follows:
(gpupytorch) E:\DeepLearningModel\Model01>python voc_annotation.py
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
  warnings.warn("loaded more than 1 DLL from .libs:\n%s" %
Generate txt in ImageSets.
train and val size 777
train size 699
Generate txt in ImageSets done.
Generate 2007_train.txt and 2007_val.txt for train.
Generate 2007_train.txt and 2007_val.txt for train done.
| leopard | 174 |
| boar | 491 |
| roe_deer | 352 |
(gpupytorch) E:\DeepLearningModel\Model01>
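The log reports 777 train+val samples, of which 699 go to training. That is consistent with a 90/10 random split, since int(777 * 0.9) is 699. The sketch below reproduces the arithmetic. The ratio of 0.9, the fixed seed, the placeholder ids, and the helper name split_train_val are assumptions for illustration and are not read from voc_annotation.py.

```python
import random


def split_train_val(sample_ids, train_ratio=0.9, seed=0):
    """Randomly split sample ids into (train_ids, val_ids), keeping file order."""
    ids = list(sample_ids)
    rng = random.Random(seed)
    num_train = int(len(ids) * train_ratio)
    # Pick which positions go to training, then partition by membership.
    train_idx = set(rng.sample(range(len(ids)), num_train))
    train_ids = [ids[i] for i in range(len(ids)) if i in train_idx]
    val_ids = [ids[i] for i in range(len(ids)) if i not in train_idx]
    return train_ids, val_ids


ids = [f"img_{i:04d}" for i in range(777)]  # placeholder ids
train_ids, val_ids = split_train_val(ids)
print(len(train_ids), len(val_ids))
```

Under these assumptions the remaining 78 samples form the validation set, matching the num_val value printed later by train.py.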
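Run the train.py file. (Note: the listing that follows defines the FRCNN prediction class, which loads trained weights from logs/ for inference; it is not the training script itself.)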
import colorsys
import os
import time

import numpy as np
import torch
import torch.nn as nn
from PIL import Image, ImageDraw, ImageFont

from nets.frcnn import FasterRCNN
from utils.utils import (cvtColor, get_classes, get_new_img_size, resize_image,
                         preprocess_input, show_config)
from utils.utils_bbox import DecodeBox


class FRCNN(object):
    _defaults = {
        "model_path"    : 'logs/loss_2024_03_05_22_26_24.pth',
        "classes_path"  : 'model_data/voc_classes.txt',
        "backbone"      : "resnet50",
        "confidence"    : 0.5,
        "nms_iou"       : 0.3,
        'anchors_size'  : [8, 16, 32],
        "cuda"          : True,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults)
        for name, value in kwargs.items():
            setattr(self, name, value)
            self._defaults[name] = value

        self.class_names, self.num_classes = get_classes(self.classes_path)

        self.std = torch.Tensor([0.1, 0.1, 0.2, 0.2]).repeat(self.num_classes + 1)[None]
        if self.cuda:
            self.std = self.std.cuda()
        self.bbox_util = DecodeBox(self.std, self.num_classes)

        # One distinct color per class for drawing boxes
        hsv_tuples = [(x / self.num_classes, 1., 1.) for x in range(self.num_classes)]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), self.colors))
        self.generate()

        show_config(**self._defaults)

    #---------------------------------------------------#
    #   Load the model
    #---------------------------------------------------#
    def generate(self):
        self.net = FasterRCNN(self.num_classes, "predict", anchor_scales = self.anchors_size, backbone = self.backbone)
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.net.load_state_dict(torch.load(self.model_path, map_location=device))
        self.net = self.net.eval()
        print('{} model, anchors, and classes loaded.'.format(self.model_path))

        if self.cuda:
            self.net = nn.DataParallel(self.net)
            self.net = self.net.cuda()

    #---------------------------------------------------#
    #   Detect objects in an image
    #---------------------------------------------------#
    def detect_image(self, image, crop = False, count = False):
        # Height and width of the input image
        image_shape = np.array(np.shape(image)[0:2])
        # Size after resizing; the short side of the resized image is 600
        input_shape = get_new_img_size(image_shape[0], image_shape[1])
        # Convert to RGB so that grayscale images do not fail at prediction time.
        # The code only supports RGB prediction; every other mode is converted to RGB.
        image = cvtColor(image)
        # Resize the original image so that its short side is 600
        image_data = resize_image(image, [input_shape[1], input_shape[0]])
        # Add the batch_size dimension
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()

            # roi_cls_locs  adjustment parameters of the proposal boxes
            # roi_scores    class scores of the proposal boxes
            # rois          coordinates of the proposal boxes
            roi_cls_locs, roi_scores, rois, _ = self.net(images)
            # Decode the proposals with the classifier output to obtain the predicted boxes
            results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                             nms_iou = self.nms_iou, confidence = self.confidence)
            # If no object is detected, return the original image
            if len(results[0]) <= 0:
                return image

            top_label = np.array(results[0][:, 5], dtype = 'int32')
            top_conf = results[0][:, 4]
            top_boxes = results[0][:, :4]

        # Font and box thickness
        font = ImageFont.truetype(font='model_data/simhei.ttf', size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
        thickness = int(max((image.size[0] + image.size[1]) // np.mean(input_shape), 1))

        # Count detections per class
        if count:
            print("top_label:", top_label)
            classes_nums = np.zeros([self.num_classes])
            for i in range(self.num_classes):
                num = np.sum(top_label == i)
                if num > 0:
                    print(self.class_names[i], " : ", num)
                classes_nums[i] = num
            print("classes_nums:", classes_nums)

        # Optionally crop each detected object and save it
        if crop:
            for i, c in list(enumerate(top_label)):
                top, left, bottom, right = top_boxes[i]
                top = max(0, np.floor(top).astype('int32'))
                left = max(0, np.floor(left).astype('int32'))
                bottom = min(image.size[1], np.floor(bottom).astype('int32'))
                right = min(image.size[0], np.floor(right).astype('int32'))

                dir_save_path = "img_crop"
                if not os.path.exists(dir_save_path):
                    os.makedirs(dir_save_path)
                crop_image = image.crop([left, top, right, bottom])
                crop_image.save(os.path.join(dir_save_path, "crop_" + str(i) + ".png"), quality=95, subsampling=0)
                print("save crop_" + str(i) + ".png to " + dir_save_path)

        # Draw the boxes and labels
        for i, c in list(enumerate(top_label)):
            predicted_class = self.class_names[int(c)]
            box = top_boxes[i]
            score = top_conf[i]

            top, left, bottom, right = box
            top = max(0, np.floor(top).astype('int32'))
            left = max(0, np.floor(left).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom).astype('int32'))
            right = min(image.size[0], np.floor(right).astype('int32'))

            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            # Note: ImageDraw.textsize was removed in Pillow 10;
            # on newer Pillow versions use draw.textbbox instead.
            label_size = draw.textsize(label, font)
            label = label.encode('utf-8')
            # print(label, top, left, bottom, right)

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            # Use a separate loop variable so the outer index i is not shadowed
            for t in range(thickness):
                draw.rectangle([left + t, top + t, right - t, bottom - t], outline=self.colors[c])
            draw.rectangle([tuple(text_origin), tuple(text_origin + label_size)], fill=self.colors[c])
            draw.text(text_origin, str(label, 'UTF-8'), fill=(0, 0, 0), font=font)
            del draw

        return image

    def get_FPS(self, image, test_interval):
        # Height and width of the input image
        image_shape = np.array(np.shape(image)[0:2])
        input_shape = get_new_img_size(image_shape[0], image_shape[1])
        # Convert to RGB so that grayscale images do not fail at prediction time
        image = cvtColor(image)
        # Resize the original image so that its short side is 600
        image_data = resize_image(image, [input_shape[1], input_shape[0]])
        # Add the batch_size dimension
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        # One forward pass outside the timed loop
        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()

            roi_cls_locs, roi_scores, rois, _ = self.net(images)
            # Decode the proposals with the classifier output to obtain the predicted boxes
            results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                             nms_iou = self.nms_iou, confidence = self.confidence)

        t1 = time.time()
        for _ in range(test_interval):
            with torch.no_grad():
                roi_cls_locs, roi_scores, rois, _ = self.net(images)
                # Decode the proposals with the classifier output to obtain the predicted boxes
                results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                                 nms_iou = self.nms_iou, confidence = self.confidence)
        t2 = time.time()
        tact_time = (t2 - t1) / test_interval
        return tact_time

    #---------------------------------------------------#
    #   Write detection results to a txt file for mAP evaluation
    #---------------------------------------------------#
    def get_map_txt(self, image_id, image, class_names, map_out_path):
        f = open(os.path.join(map_out_path, "detection-results/" + image_id + ".txt"), "w")
        # Height and width of the input image
        image_shape = np.array(np.shape(image)[0:2])
        input_shape = get_new_img_size(image_shape[0], image_shape[1])
        # Convert to RGB so that grayscale images do not fail at prediction time
        image = cvtColor(image)
        # Resize the original image so that its short side is 600
        image_data = resize_image(image, [input_shape[1], input_shape[0]])
        # Add the batch_size dimension
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()

            roi_cls_locs, roi_scores, rois, _ = self.net(images)
            # Decode the proposals with the classifier output to obtain the predicted boxes
            results = self.bbox_util.forward(roi_cls_locs, roi_scores, rois, image_shape, input_shape,
                                             nms_iou = self.nms_iou, confidence = self.confidence)
            # If no object is detected, close the (empty) file and return early
            if len(results[0]) <= 0:
                f.close()
                return

            top_label = np.array(results[0][:, 5], dtype = 'int32')
            top_conf = results[0][:, 4]
            top_boxes = results[0][:, :4]

        for i, c in list(enumerate(top_label)):
            predicted_class = self.class_names[int(c)]
            box = top_boxes[i]
            score = str(top_conf[i])

            top, left, bottom, right = box
            if predicted_class not in class_names:
                continue

            f.write("%s %s %s %s %s %s\n" % (predicted_class, score[:6], str(int(left)), str(int(top)), str(int(right)), str(int(bottom))))

        f.close()
        return
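The confidence (0.5) and nms_iou (0.3) entries in _defaults control which boxes survive decoding. As a rough illustration of what such thresholds do, the sketch below drops low-scoring boxes and then applies greedy non-maximum suppression in pure Python. It is not the DecodeBox implementation from the repository: the helper names iou and filter_detections, the sample boxes, and the class-agnostic handling are all assumptions made for this example. Boxes follow the (top, left, bottom, right) order used in the listing.

```python
def iou(a, b):
    """IoU of two boxes given as (top, left, bottom, right)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, confidence=0.5, nms_iou=0.3):
    """Return indices kept after score thresholding and greedy NMS."""
    # Keep only boxes at or above the confidence threshold, best score first.
    order = sorted((i for i, s in enumerate(scores) if s >= confidence),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A box survives only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_iou for j in keep):
            keep.append(i)
    return keep


# Made-up boxes: 0 and 1 overlap heavily, 2 is separate, 3 has a low score.
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300), (0, 0, 50, 50)]
scores = [0.90, 0.80, 0.70, 0.30]
print(filter_detections(boxes, scores))  # [0, 2]
```

Lowering nms_iou suppresses more overlapping boxes, while raising confidence trades recall for precision, which is worth keeping in mind when reading the precision and recall figures in the evaluation log.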
Run it from the terminal or an editor:
E:\DeepLearningModel\Model01>activate gpupytorch
(gpupytorch) E:\DeepLearningModel\Model01>python train.py
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
  warnings.warn("loaded more than 1 DLL from .libs:\n%s" %
Number of devices: 1
initialize network with normal type
Load weights model_data/voc_weights_resnet.pth.
Successful Load Key: ['extractor.0.weight', 'extractor.1.weight', 'extractor.1.bias', 'extractor.1.running_mean', 'extractor.1.running_var', 'extractor.1.num_batches_tracked', 'extractor.4.0.conv1.weight', 'extractor.4.0.bn1.weight', 'extractor.4.0.bn1.bias', 'extractor.4.0.bn1.running_mean', 'extractor.4.0.bn1.running_var', 'extractor.4.0.bn1.num_batches_tracked', 'extractor.4.0.conv2.weight', 'extractor.4.0.bn2.weight', 'extractor.4.0.bn2.bias', 'extractor.4.0.bn2.running_mean', 'extractor.4.0.bn2.running_var', 'e ……
Successful Load Key Num: 324
Fail To Load Key: ['head.cls_loc.weight', 'head.cls_loc.bias', 'head.score.weight', 'head.score.bias'] ……
Fail To Load Key num: 4
Note: it is normal for the head weights to fail to load; failing to load the backbone weights is an error.
Configurations:
----------------------------------------------------------------------
| keys | values|
----------------------------------------------------------------------
| classes_path | model_data/voc_classes.txt|
| model_path | model_data/voc_weights_resnet.pth|
| input_shape | [600, 600]|
| Init_Epoch | 0|
| Freeze_Epoch | 50|
| UnFreeze_Epoch | 100|
| Freeze_batch_size | 4|
| Unfreeze_batch_size | 2|
| Freeze_Train | True|
| Init_lr | 0.0001|
| Min_lr | 1.0000000000000002e-06|
| optimizer_type | adam|
| momentum | 0.9|
| lr_decay_type | cos|
| save_period | 5|
| save_dir | logs|
| num_workers | 4|
| num_train | 699|
| num_val | 78|
----------------------------------------------------------------------
Start Train
Epoch 1/100: 0%| | 0/174 [00:00<?, ?it/s<class 'dict'>]
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
D:\Anaconda\envs\gpupytorch\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
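The progress bar total of 174 iterations is consistent with 699 training samples at Freeze_batch_size 4, dropping the last partial batch (699 // 4 = 174). The sketch below works through that arithmetic and a plain cosine decay between Init_lr and Min_lr. It rests on several assumptions that are not confirmed by the log: that epochs 0 to 49 form the frozen phase, that partial batches are dropped, and that the schedule is a bare cosine curve (the actual train.py may add warm-up or scale the learning rate by batch size). The function names are hypothetical.

```python
import math

# Values copied from the configuration table (Min_lr rounded to 1e-6).
NUM_TRAIN = 699
FREEZE_BATCH_SIZE, UNFREEZE_BATCH_SIZE = 4, 2
FREEZE_EPOCH, UNFREEZE_EPOCH = 50, 100
INIT_LR, MIN_LR = 1e-4, 1e-6


def steps_per_epoch(epoch):
    """Iterations in a 0-based epoch, dropping the last partial batch."""
    batch = FREEZE_BATCH_SIZE if epoch < FREEZE_EPOCH else UNFREEZE_BATCH_SIZE
    return NUM_TRAIN // batch


def cosine_lr(epoch, total_epochs=UNFREEZE_EPOCH):
    """Plain cosine annealing from INIT_LR (first epoch) to MIN_LR (last epoch)."""
    progress = epoch / (total_epochs - 1)
    return MIN_LR + 0.5 * (INIT_LR - MIN_LR) * (1 + math.cos(math.pi * progress))


for epoch in (0, 49, 50, 99):
    print(epoch, steps_per_epoch(epoch), f"{cosine_lr(epoch):.2e}")
```

Under these assumptions the unfrozen phase would run 349 iterations per epoch, so each of its epochs takes roughly twice as many steps as a frozen one.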
Check the results:
Calculate Map.
96.35% = boar AP || score_threhold=0.5 : F1=0.81 ; Recall=97.92% ; Precision=69.12%
94.74% = leopard AP || score_threhold=0.5 : F1=0.90 ; Recall=94.74% ; Precision=85.71%
94.97% = roe_deer AP || score_threhold=0.5 : F1=0.86 ; Recall=96.88% ; Precision=77.50%
mAP = 95.35%
Get map done.
Epoch:100/100
Total Loss: 0.505 || Val Loss: 0.621
Save best model to best_epoch_weights.pth
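The figures in the evaluation log can be cross-checked by hand: F1 is the harmonic mean of precision and recall, and mAP is the mean of the per-class AP values. The short sketch below recomputes both from the numbers printed above; the dictionary layout and the helper name f1_score are only for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    return 2 * precision * recall / (precision + recall)


# class: (AP %, Recall %, Precision %), copied from the log above
results = {
    "boar": (96.35, 97.92, 69.12),
    "leopard": (94.74, 94.74, 85.71),
    "roe_deer": (94.97, 96.88, 77.50),
}

f1_by_class = {name: round(f1_score(p / 100, r / 100), 2)
               for name, (_, r, p) in results.items()}
mean_ap = round(sum(ap for ap, _, _ in results.values()) / len(results), 2)

print(f1_by_class)  # {'boar': 0.81, 'leopard': 0.9, 'roe_deer': 0.86}
print(mean_ap)      # 95.35
```

The recomputed values agree with the log. The boar class shows the widest gap between recall and precision, which suggests that most of its errors at the 0.5 score threshold are false positives.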