
news 2025/7/4 3:51:04


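1. Research background and significance

With continuing advances in technology, infrared remote sensing is widely used in military, security, and environmental-monitoring applications. Infrared imagery has a distinctive advantage: it can capture target information at night or in bad weather, which makes it valuable for small-target detection. However, because infrared images suffer from low contrast and noise, small-target detection remains a challenging problem.

Deep learning has produced strong results in computer vision, particularly in object detection. YOLO (You Only Look Once) is a deep-learning-based real-time object detector that recasts detection as a regression problem and predicts the location and class of each target at the same time. Because it is fast and accurate, YOLO has attracted wide attention in object detection.

However, the standard YOLO algorithm has several problems when detecting small targets in infrared remote-sensing images. First, small targets in infrared images often have low contrast, so their edges are indistinct and hard to detect accurately. Second, infrared images contain a large amount of noise, which interferes with detection and recognition. In addition, small targets in infrared images often vary in scale and orientation, and the standard YOLO algorithm handles this complexity poorly.

An improved YOLOv5 system for infrared remote-sensing small-target detection, built around a dedicated small-target detection head, is therefore of real research interest. The small-target head allows the network to be optimized specifically for small targets in infrared images, improving detection accuracy and robustness. The improved YOLOv5 algorithm can also make fuller use of the feature information in infrared images. Both effects improve detection in infrared remote-sensing imagery and support the further development of infrared remote sensing.

In practice, the system can be applied to military reconnaissance, border patrol, and environmental monitoring. In military reconnaissance, it can monitor and recognize small hostile targets in infrared images in real time, improving intelligence gathering. In border patrol, it can help personnel spot potential security threats promptly. In environmental monitoring, it can be used to detect and monitor natural disasters and forest fires, supporting timely warning and rescue.

In summary, the improved YOLOv5 infrared remote-sensing small-target detection system based on a small-target detection head has both research significance and practical value. By optimizing the detection algorithm to improve the accuracy and robustness of small-target detection in infrared images, it can advance infrared remote sensing and provide more reliable and efficient solutions for military, security, and environmental-monitoring applications.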

2. Image demo

(Figure: demo image)

3. Video demo

Improved YOLOv5 infrared remote-sensing small-target detection system based on a small-target detection head (video on bilibili)

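4. Dataset collection, annotation, and organization

Collecting images

First, collect the images you need. This can be done in several ways, for example by using the existing infrared remote-sensing small-target dataset infrared_COCO_format.

Annotating with labelImg

labelImg is a graphical image annotation tool that supports the VOC and YOLO formats. The steps for annotating images in VOC format with labelImg are:

(1) Download and install labelImg.
(2) Open labelImg and choose "Open Dir" to select your image directory.
(3) Set the label name for your target object.
(4) Draw a rectangle on the image and choose the matching label.
(5) Save the annotation; this creates an XML file with the same name as the image in the image directory.
(6) Repeat until every image has been annotated.

Converting to YOLO format

YOLO uses txt annotation files, so the VOC annotations must be converted to YOLO format. Various conversion tools or scripts can do this.

One simple approach is a Python script that reads each XML file and converts it to the txt format YOLO expects: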

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import xml.etree.ElementTree as ET

classes = []  # class names, appended in the order they are first seen

CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
XML_DIR = os.path.join(CURRENT_DIR, 'label_xml')  # input VOC XML files
TXT_DIR = os.path.join(CURRENT_DIR, 'label_txt')  # output YOLO txt files


def convert(size, box):
    """Convert a pixel box (xmin, xmax, ymin, ymax) to normalized (x, y, w, h)."""
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)


def convert_annotation(image_id):
    # Build paths with os.path.join so the script also works outside Windows
    in_path = os.path.join(XML_DIR, '%s.xml' % image_id)
    out_path = os.path.join(TXT_DIR, '%s.txt' % image_id)
    root = ET.parse(in_path).getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    with open(out_path, 'w') as out_file:  # write the YOLO txt file
        for obj in root.iter('object'):
            cls = obj.find('name').text
            if cls not in classes:
                classes.append(cls)  # add class names that have not been seen yet
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                 float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


os.makedirs(TXT_DIR, exist_ok=True)
for img_xml in sorted(os.listdir(XML_DIR)):
    if not img_xml.endswith('.xml'):
        continue
    label_name = os.path.splitext(img_xml)[0]
    print(label_name)
    convert_annotation(label_name)

print("Classes:")
print(classes)  # final class list; the index of each name is its YOLO class id
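As a quick sanity check on the conversion above, the sketch below applies the same normalization to one hand-picked box and then inverts it. The function names `voc_to_yolo` and `yolo_to_voc` and the sample numbers are illustrative assumptions, not part of the original script.

```python
def voc_to_yolo(size, box):
    """size = (width, height); box = (xmin, xmax, ymin, ymax) in pixels."""
    img_w, img_h = size
    xmin, xmax, ymin, ymax = box
    return ((xmin + xmax) / 2.0 / img_w, (ymin + ymax) / 2.0 / img_h,
            (xmax - xmin) / img_w, (ymax - ymin) / img_h)


def yolo_to_voc(size, yolo_box):
    """Inverse mapping: normalized (x, y, w, h) back to pixel (xmin, xmax, ymin, ymax)."""
    img_w, img_h = size
    x, y, w, h = yolo_box
    return ((x - w / 2) * img_w, (x + w / 2) * img_w,
            (y - h / 2) * img_h, (y + h / 2) * img_h)


# A 16x16-pixel target in a 640x512 infrared frame (made-up values)
yolo = voc_to_yolo((640, 512), (320, 336, 128, 144))
print(yolo)
print(yolo_to_voc((640, 512), yolo))
```

If the round trip does not return the original pixel box (up to floating-point error), the axis order of the box tuple is the first thing to check: this script uses (xmin, xmax, ymin, ymax), not the more common (xmin, ymin, xmax, ymax).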

Organizing the dataset folder structure

The dataset needs to be organized into the following structure:

-----data
   |-----train
   |   |-----images
   |   |-----labels
   |
   |-----valid
   |   |-----images
   |   |-----labels
   |
   |-----test
       |-----images
       |-----labels

Make sure that:

All training images are in data/train/images, with the matching label files in data/train/labels.
All validation images are in data/valid/images, with the matching label files in data/valid/labels.
All test images are in data/test/images, with the matching label files in data/test/labels.
This layout keeps data management simple and makes training, validation, and testing straightforward.
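One way to produce this layout is sketched below. The helper name `split_dataset`, the 80/10/10 split, and the fixed random seed are assumptions made for illustration; the article does not say how its own split was made.

```python
import random
import shutil
from pathlib import Path


def split_dataset(image_dir, label_dir, out_dir, ratios=(0.8, 0.1, 0.1), seed=0):
    """Copy image/label pairs into out_dir/{train,valid,test}/{images,labels}."""
    images = sorted(p for p in Path(image_dir).iterdir()
                    if p.suffix.lower() in {'.jpg', '.jpeg', '.png', '.bmp'})
    random.Random(seed).shuffle(images)  # reproducible shuffle
    n_train = int(len(images) * ratios[0])
    n_valid = int(len(images) * ratios[1])
    splits = {'train': images[:n_train],
              'valid': images[n_train:n_train + n_valid],
              'test': images[n_train + n_valid:]}
    for split, files in splits.items():
        img_out = Path(out_dir) / split / 'images'
        lbl_out = Path(out_dir) / split / 'labels'
        img_out.mkdir(parents=True, exist_ok=True)
        lbl_out.mkdir(parents=True, exist_ok=True)
        for img in files:
            shutil.copy2(img, img_out / img.name)
            label = Path(label_dir) / (img.stem + '.txt')
            if label.exists():  # background images may have no label file
                shutil.copy2(label, lbl_out / label.name)
    return {name: len(files) for name, files in splits.items()}
```

Calling `split_dataset('images', 'label_txt', 'data')` would fill the tree shown above; the returned dictionary reports how many images went into each split.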

5. Core code walkthrough

5.1 detect.py

from pathlib import Path

import torch

# Helpers from the YOLOv5 repository
from models.experimental import attempt_load
from utils.general import check_img_size, increment_path, set_logging
from utils.torch_utils import load_classifier, select_device


class YOLOv5Detector:
    def __init__(self, weights='yolov5s.pt', source='data/images', imgsz=640, conf_thres=0.25, iou_thres=0.45,
                 max_det=1000, device='', view_img=False, save_txt=False, save_conf=False, save_crop=False,
                 nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False,
                 project='runs/detect', name='exp', exist_ok=False, line_thickness=3, hide_labels=False,
                 hide_conf=False, half=False):
        self.weights = weights
        self.source = source
        self.imgsz = imgsz
        self.conf_thres = conf_thres
        self.iou_thres = iou_thres
        self.max_det = max_det
        self.device = device
        self.view_img = view_img
        self.save_txt = save_txt
        self.save_conf = save_conf
        self.save_crop = save_crop
        self.nosave = nosave
        self.classes = classes
        self.agnostic_nms = agnostic_nms
        self.augment = augment
        self.visualize = visualize
        self.update = update
        self.project = project
        self.name = name
        self.exist_ok = exist_ok
        self.line_thickness = line_thickness
        self.hide_labels = hide_labels
        self.hide_conf = hide_conf
        self.half = half

    @torch.no_grad()
    def run(self):
        save_img = not self.nosave and not self.source.endswith('.txt')  # save inference images
        webcam = self.source.isnumeric() or self.source.endswith('.txt') or self.source.lower().startswith(
            ('rtsp://', 'rtmp://', 'http://', 'https://'))

        # Directories
        save_dir = increment_path(Path(self.project) / self.name, exist_ok=self.exist_ok)  # increment run
        (save_dir / 'labels' if self.save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

        # Initialize
        set_logging()
        device = select_device(self.device)
        # `and` rather than `&`: `&` binds tighter than `!=` and would be applied to a bool and a str
        half = self.half and device.type != 'cpu'  # half precision only supported on CUDA

        # Load model
        model = attempt_load(self.weights, map_location=device)  # load FP32 model
        stride = int(model.stride.max())  # model stride
        imgsz = check_img_size(self.imgsz, s=stride)  # check image size
        names = model.module.names if hasattr(model, 'module') else model.names  # get class names
        if half:
            model.half()  # to FP16

        # Second-stage classifier
        classify = False
        if classify:
            modelc = load_classifier(name='resnet50', n=2)  # initialize
            modelc.load_state_dict(torch.load('resnet50.pt', map_location=device)['model'])
            modelc.to(device).eval()

        # Dataloader
        if webcam:
            ...  # the excerpt is cut off here in the source

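This file is a script for running YOLOv5 inference on images, videos, directories, and streams. It accepts command-line arguments that specify the model, the input source, the inference size, and so on.

The main function is run(), which is configured by a set of parameters. It first loads the model and sets a few options, then creates a data loader that matches the type of input source. It then iterates over every image or video frame in the loader and runs inference on it. The results are the bounding boxes and classes of the detected objects. Finally, the results can be saved to file or displayed on screen.

parse_opt() parses the command-line arguments and returns a namespace object containing them.

main() is the script's entry point: it calls parse_opt() to parse the command-line arguments and then calls run() to perform inference.

Overall, the file provides a convenient way to run a YOLOv5 model for object detection. Through command-line arguments the user can customize every aspect of inference, such as the model, input source, and inference size.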

5.2 export.py

import time
from pathlib import Path

import torch
import torch.nn as nn
from torch.utils.mobile_optimizer import optimize_for_mobile

# Helpers from the YOLOv5 repository
from models.common import Conv
from models.experimental import attempt_load
from models.yolo import Detect
from utils.activations import Hardswish, SiLU
from utils.general import check_img_size, check_requirements, colorstr, file_size
from utils.torch_utils import select_device


class ModelExporter:
    def __init__(self, weights='./yolov5s.pt', img_size=(640, 640), batch_size=1, device='cpu',
                 include=('torchscript', 'onnx', 'coreml'), half=False, inplace=False, train=False,
                 optimize=False, dynamic=False, simplify=False, opset_version=12):
        self.weights = weights
        self.img_size = img_size
        self.batch_size = batch_size
        self.device = device
        self.include = include
        self.half = half
        self.inplace = inplace
        self.train = train
        self.optimize = optimize
        self.dynamic = dynamic
        self.simplify = simplify
        self.opset_version = opset_version

    def export_torchscript(self, model, img, file, optimize):
        # TorchScript model export
        prefix = colorstr('TorchScript:')
        try:
            print(f'\n{prefix} starting export with torch {torch.__version__}...')
            f = file.with_suffix('.torchscript.pt')
            ts = torch.jit.trace(model, img, strict=False)
            (optimize_for_mobile(ts) if optimize else ts).save(f)
            print(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
            return ts
        except Exception as e:
            print(f'{prefix} export failure: {e}')

    def export_onnx(self, model, img, file, opset_version, train, dynamic, simplify):
        # ONNX model export
        prefix = colorstr('ONNX:')
        try:
            check_requirements(('onnx', 'onnx-simplifier'))
            import onnx

            print(f'\n{prefix} starting export with onnx {onnx.__version__}...')
            f = file.with_suffix('.onnx')
            torch.onnx.export(model, img, f, verbose=False, opset_version=opset_version,
                              training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
                              do_constant_folding=not train,
                              input_names=['images'],
                              output_names=['output'],
                              dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'},  # shape(1,3,640,640)
                                            'output': {0: 'batch', 1: 'anchors'}  # shape(1,25200,85)
                                            } if dynamic else None)

            # Checks
            model_onnx = onnx.load(f)  # load onnx model
            onnx.checker.check_model(model_onnx)  # check onnx model
            # print(onnx.helper.printable_graph(model_onnx.graph))  # print

            # Simplify
            if simplify:
                try:
                    import onnxsim

                    print(f'{prefix} simplifying with onnx-simplifier {onnxsim.__version__}...')
                    model_onnx, check = onnxsim.simplify(
                        model_onnx,
                        dynamic_input_shape=dynamic,
                        input_shapes={'images': list(img.shape)} if dynamic else None)
                    assert check, 'assert check failed'
                    onnx.save(model_onnx, f)
                except Exception as e:
                    print(f'{prefix} simplifier failure: {e}')
            print(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        except Exception as e:
            print(f'{prefix} export failure: {e}')

    def export_coreml(self, model, img, file):
        # CoreML model export
        prefix = colorstr('CoreML:')
        try:
            import coremltools as ct

            print(f'\n{prefix} starting export with coremltools {ct.__version__}...')
            f = file.with_suffix('.mlmodel')
            model.train()  # CoreML exports should be placed in model.train() mode
            ts = torch.jit.trace(model, img, strict=False)  # TorchScript model
            model = ct.convert(ts, inputs=[ct.ImageType('image', shape=img.shape, scale=1 / 255.0, bias=[0, 0, 0])])
            model.save(f)
            print(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        except Exception as e:
            print(f'{prefix} export failure: {e}')

    def run(self):
        t = time.time()
        include = [x.lower() for x in self.include]
        # Expand a single size such as (640,) to [640, 640]; keep a (height, width) pair unchanged.
        # The flattened original assigned 1 to img_size whenever two values were given.
        img_size = list(self.img_size) * (2 if len(self.img_size) == 1 else 1)
        file = Path(self.weights)

        # Load PyTorch model
        device = select_device(self.device)
        assert not (device.type == 'cpu' and self.half), '--half only compatible with GPU export, i.e. use --device 0'
        model = attempt_load(self.weights, map_location=device)  # load FP32 model
        names = model.names

        # Input
        gs = int(max(model.stride))  # grid size (max stride)
        img_size = [check_img_size(x, gs) for x in img_size]  # verify img_size are gs-multiples
        img = torch.zeros(self.batch_size, 3, *img_size).to(device)  # image size(1,3,320,192) iDetection

        # Update model
        if self.half:
            img, model = img.half(), model.half()  # to FP16
        model.train() if self.train else model.eval()  # training mode = no Detect() layer grid construction
        for k, m in model.named_modules():
            if isinstance(m, Conv):  # assign export-friendly activations
                if isinstance(m.act, nn.Hardswish):
                    m.act = Hardswish()
                elif isinstance(m.act, nn.SiLU):
                    m.act = SiLU()
            elif isinstance(m, Detect):
                m.inplace = self.inplace
                m.onnx_dynamic = self.dynamic
                # m.forward = m.forward_export  # assign forward (optional)

        for _ in range(2):
            y = model(img)  # dry runs
        print(f"\n{colorstr('PyTorch:')} starting from {self.weights} ({file_size(self.weights):.1f} MB)")

        # Exports
        if 'torchscript' in include:
            self.export_torchscript(model, img, file, self.optimize)
        if 'onnx' in include:
            self.export_onnx(model, img, file, self.opset_version, self.train, self.dynamic, self.simplify)
        if 'coreml' in include:
            self.export_coreml(model, img, file)

        # Finish
        print(f'\nExport complete ({time.time() - t:.2f}s). Visualize with https://github.com/lutzroeder/netron.')

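This file exports a YOLOv5 model to the TorchScript, ONNX, and CoreML formats. It provides command-line arguments for the weights path, image size, batch size, device type, and so on. Its main job is to load the PyTorch model and export it in the formats the user selects. During export the user can also choose to optimize the model, use half-precision floats, set YOLOv5's Detect() layer to inplace mode, or put the model in training mode. When the export finishes, the program prints the path and size of each exported file and notes that the result can be visualized with Netron.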
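Before exporting, the image size is expanded to a (height, width) pair and each side is made a multiple of the model's largest stride. The sketch below shows that arithmetic in isolation. `expand_img_size` and `round_up_to_stride` are hypothetical helper names, and rounding up with `math.ceil` is an assumption about how the size check behaves, not a copy of the repository's `check_img_size`.

```python
import math


def expand_img_size(img_size):
    """Turn (640,) into [640, 640]; leave a (height, width) pair unchanged."""
    img_size = list(img_size)
    return img_size * 2 if len(img_size) == 1 else img_size


def round_up_to_stride(size, stride=32):
    """Smallest multiple of `stride` that is >= size."""
    return math.ceil(size / stride) * stride


print([round_up_to_stride(x) for x in expand_img_size((600,))])       # [608, 608]
print([round_up_to_stride(x) for x in expand_img_size((1080, 1920))])  # [1088, 1920]
```

A side that is not a multiple of the stride cannot be divided evenly by the downsampling layers, which is why the export step normalizes the size before building the dummy input tensor.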

5.3 hubconf.py

import torch


class YOLOv5:
    def __init__(self, name='yolov5s', pretrained=True, channels=3, classes=80, autoshape=True, verbose=True,
                 device=None):
        self.name = name
        self.pretrained = pretrained
        self.channels = channels
        self.classes = classes
        self.autoshape = autoshape
        self.verbose = verbose
        self.device = device
        self.model = self._create()

    def _create(self):
        from pathlib import Path

        from models.yolo import Model, attempt_load
        from utils.general import check_requirements, set_logging
        from utils.google_utils import attempt_download
        from utils.torch_utils import select_device

        file = Path(__file__).absolute()
        check_requirements(requirements=file.parent / 'requirements.txt',
                           exclude=('tensorboard', 'thop', 'opencv-python'))
        set_logging(verbose=self.verbose)

        save_dir = Path('') if str(self.name).endswith('.pt') else file.parent
        path = (save_dir / self.name).with_suffix('.pt')  # checkpoint path
        try:
            device = select_device(('0' if torch.cuda.is_available() else 'cpu') if self.device is None else self.device)

            if self.pretrained and self.channels == 3 and self.classes == 80:
                model = attempt_load(path, map_location=device)  # download/load FP32 model
            else:
                cfg = list((Path(__file__).parent / 'models').rglob(f'{self.name}.yaml'))[0]  # model.yaml path
                model = Model(cfg, self.channels, self.classes)  # create model
                if self.pretrained:
                    ckpt = torch.load(attempt_download(path), map_location=device)  # load
                    msd = model.state_dict()  # model state_dict
                    csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
                    # keep only tensors that exist in the new model with the same shape
                    csd = {k: v for k, v in csd.items() if k in msd and msd[k].shape == v.shape}  # filter
                    model.load_state_dict(csd, strict=False)  # load
                    if len(ckpt['model'].names) == self.classes:
                        model.names = ckpt['model'].names  # set class names attribute
            if self.autoshape:
                model = model.autoshape()  # for file/URI/PIL/cv2/np inputs and NMS
            return model.to(device)

        except Exception as e:
            help_url = 'https://github.com/ultralytics/yolov5/issues/36'
            s = 'Cache may be out of date, try `force_reload=True`. See %s for help.' % help_url
            raise Exception(s) from e

    def inference(self, imgs):
        return self.model(imgs)


if __name__ == '__main__':
    import cv2
    import numpy as np
    from PIL import Image

    model = YOLOv5(name='yolov5s', pretrained=True, channels=3, classes=80, autoshape=True, verbose=True)
    imgs = ['data/images/zidane.jpg',  # filename
            'https://github.com/ultralytics/yolov5/releases/download/v1.0/zidane.jpg',  # URI
            cv2.imread('data/images/bus.jpg')[:, :, ::-1],  # OpenCV
            Image.open('data/images/bus.jpg'),  # PIL
            np.zeros((320, 640, 3))]  # numpy
    results = model.inference(imgs)
    results.print()
    results.save()

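This file is a Python script for loading and using YOLOv5 models. It defines the hubconf.py module, through which YOLOv5 models can be loaded and used via PyTorch Hub.

The module provides the following:

The _create function: creates a YOLOv5 model from the given model name.
The custom function: loads a custom or local model.
The yolov5s, yolov5m, yolov5l, yolov5x, yolov5s6, yolov5m6, yolov5l6, and yolov5x6 functions: load pretrained YOLOv5 models; the function is chosen according to the desired model size and complexity.
The __main__ section: verifies inference by loading images and running batched inference.

Overall, the file provides a convenient interface for loading YOLOv5 models and using them for object detection.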

5.4 img2videos.py

import glob
import os

import cv2


class ImageToVideoConverter:
    def __init__(self, input_folder='./image', output_file='./output.mp4', frame_size=(960, 540), fps=30):
        self.input_folder = input_folder
        self.output_file = output_file
        self.frame_size = frame_size
        self.fps = fps

    def convert(self):
        image_extensions = ["*.png", "*.PNG", "*.JPG", "*.JPEG", "*.jpg", "*.jpeg", "*.bmp"]
        image_files = []
        for ext in image_extensions:
            image_files.extend(glob.glob(os.path.join(self.input_folder, ext)))
        # De-duplicate: on case-insensitive file systems "*.png" and "*.PNG" match the same files
        image_files = sorted(set(image_files))

        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        out = cv2.VideoWriter(self.output_file, fourcc, self.fps, self.frame_size)
        for image_file in image_files:
            img = cv2.imread(image_file)
            if img is None:  # skip unreadable files instead of failing in cv2.resize
                continue
            img_resized = cv2.resize(img, self.frame_size)
            out.write(img_resized)
        out.release()


converter = ImageToVideoConverter(input_folder='./images', output_file='./output.mp4')
converter.convert()

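This file, img2videos.py, converts a folder of images into a video. It reads every image file in the given folder and combines the images, in filename order, into a single video file. The input folder, output file path, frame size, and frame rate can all be set by the user. The program uses OpenCV to process the images and write the video.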
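Because frames are combined in filename order, plain string sorting puts `frame10.png` before `frame2.png` unless the numbers are zero-padded. The sketch below shows a natural-sort key that could be passed to the sort; `natural_key` is an illustrative name and is not part of the original script.

```python
import re


def natural_key(filename):
    """Split 'frame10.png' into ['frame', 10, '.png'] so numbers compare numerically."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', filename)]


frames = ['frame10.png', 'frame2.png', 'frame1.png']
print(sorted(frames))                   # ['frame1.png', 'frame10.png', 'frame2.png']
print(sorted(frames, key=natural_key))  # ['frame1.png', 'frame2.png', 'frame10.png']
```

If the frames are already zero-padded (frame0001.png, frame0002.png, ...), the default sort gives the same result and no key is needed.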

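5.5 subdivision.py

The code, wrapped in a class, is as follows:

from pathlib import Path

import torch

from utils.general import increment_path, non_max_suppression  # YOLOv5 repository helpers


class ImageProcessor:
    def __init__(self, model, img, augment, save_dir, path, visualize,
                 conf_thres=0.25, iou_thres=0.45, classes=None, agnostic_nms=False, max_det=1000):
        # The NMS settings were undefined free variables in the original excerpt; here they are
        # constructor arguments, with the same defaults as the detector in section 5.1.
        self.model = model
        self.img = img
        self.augment = augment
        self.save_dir = save_dir
        self.path = path
        self.visualize = visualize
        self.conf_thres = conf_thres
        self.iou_thres = iou_thres
        self.classes = classes
        self.agnostic_nms = agnostic_nms
        self.max_det = max_det

    def process_image(self):
        mulpicplus = "3"  # 1 for normal, 2 for 4pic plus, 3 for 9pic plus and so on
        assert (int(mulpicplus) >= 1)
        if mulpicplus == "1":
            pred = self.model(self.img,
                              augment=self.augment,
                              visualize=increment_path(self.save_dir / Path(self.path).stem, mkdir=True) if self.visualize else False)[0]
        else:
            xsz = self.img.shape[2]  # the tensor is (batch, channel, height, width), so xsz is the height
            ysz = self.img.shape[3]  # and ysz is the width
            mulpicplus = int(mulpicplus)
            x_smalloccur = int(xsz / mulpicplus * 1.2)  # tile size, 20% larger than an even split so tiles overlap
            y_smalloccur = int(ysz / mulpicplus * 1.2)
            for i in range(mulpicplus):
                x_startpoint = int(i * (xsz / mulpicplus))
                for j in range(mulpicplus):
                    y_startpoint = int(j * (ysz / mulpicplus))
                    x_real = min(x_startpoint + x_smalloccur, xsz)
                    y_real = min(y_startpoint + y_smalloccur, ysz)
                    # trim each tile so that its side length is a multiple of 64
                    if (x_real - x_startpoint) % 64 != 0:
                        x_real = x_real - (x_real - x_startpoint) % 64
                    if (y_real - y_startpoint) % 64 != 0:
                        y_real = y_real - (y_real - y_startpoint) % 64
                    dicsrc = self.img[:, :, x_startpoint:x_real, y_startpoint:y_real]
                    pred_temp = self.model(dicsrc,
                                           augment=self.augment,
                                           visualize=increment_path(self.save_dir / Path(self.path).stem, mkdir=True) if self.visualize else False)[0]
                    pred_temp[..., 0] = pred_temp[..., 0] + y_startpoint  # box x-centre: add the column offset
                    pred_temp[..., 1] = pred_temp[..., 1] + x_startpoint  # box y-centre: add the row offset
                    if i == 0 and j == 0:
                        pred = pred_temp
                    else:
                        pred = torch.cat([pred, pred_temp], dim=1)

        # Apply NMS
        pred = non_max_suppression(pred, self.conf_thres, self.iou_thres, self.classes, self.agnostic_nms,
                                   max_det=self.max_det)
        return pred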

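To use it, instantiate the class and call its process_image method to process an image.

This file, subdivision.py, subdivides an image before detection. It defines a variable mulpicplus that controls the subdivision: if mulpicplus is 1, the image is processed whole; if it is greater than 1, the image is split into mulpicplus × mulpicplus tiles.

In the whole-image case, the program calls model on the image and gets back the result pred.

In the tiled case, the program splits the image into tiles according to mulpicplus and processes each one separately. From the image size and the number of subdivisions it computes the start and end point of each tile, passes each tile to model, and concatenates the results.

Finally, the program applies non-maximum suppression (NMS) to the combined results, using the configured thresholds and parameters to keep the most likely targets, and returns the final pred.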
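The tile boundaries can be reproduced without a model. The sketch below mirrors the slicing rule used in subdivision.py for one image axis; the function name `tile_bounds` and the example sizes are illustrative assumptions.

```python
def tile_bounds(length, n_tiles, overlap=1.2, multiple=64):
    """Return (start, end) pairs along one image axis, following the slicing rule above."""
    tile = int(length / n_tiles * overlap)  # nominal tile length including overlap
    bounds = []
    for i in range(n_tiles):
        start = int(i * (length / n_tiles))
        end = min(start + tile, length)
        end -= (end - start) % multiple  # trim so the tile length is a multiple of 64
        bounds.append((start, end))
    return bounds


print(tile_bounds(1920, 3))  # [(0, 768), (640, 1408), (1280, 1920)]
print(tile_bounds(1080, 3))  # [(0, 384), (360, 744), (720, 1040)]
```

Note what the trimming does on the 1080-pixel axis: the last tile ends at 1040, so under this rule the final 40 rows of the image are never passed to the model. Padding the input to a multiple of 64 beforehand would be one way to avoid that gap.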

5.6 train.py

import logging
import os
from pathlib import Path

import torch
import torch.nn as nn
import torch.optim as optim
import yaml
from torch.utils.tensorboard import SummaryWriter

# Helpers from the YOLOv5 repository
from models.yolo import Model
from utils.general import check_dataset, colorstr, init_seeds
from utils.google_utils import attempt_download
from utils.torch_utils import intersect_dicts, torch_distributed_zero_first
from utils.wandb_logging.wandb_utils import WandbLogger

LOGGER = logging.getLogger(__name__)
RANK = int(os.getenv('RANK', -1))


class YOLOv5Trainer:
    def __init__(self, hyp, opt, device):
        self.hyp = hyp
        self.opt = opt
        self.device = device

    def train(self):
        save_dir, epochs, batch_size, weights, single_cls, evolve, data, cfg, resume, noval, nosave, workers, = \
            self.opt.save_dir, self.opt.epochs, self.opt.batch_size, self.opt.weights, self.opt.single_cls, \
            self.opt.evolve, self.opt.data, self.opt.cfg, self.opt.resume, self.opt.noval, self.opt.nosave, \
            self.opt.workers

        # Directories
        save_dir = Path(save_dir)
        wdir = save_dir / 'weights'
        wdir.mkdir(parents=True, exist_ok=True)  # make dir
        last = wdir / 'last.pt'
        best = wdir / 'best.pt'
        results_file = save_dir / 'results.txt'

        # Hyperparameters
        if isinstance(self.hyp, str):
            with open(self.hyp) as f:
                self.hyp = yaml.safe_load(f)  # load hyps dict
        LOGGER.info(colorstr('hyperparameters: ') + ', '.join(f'{k}={v}' for k, v in self.hyp.items()))

        # Save run settings
        with open(save_dir / 'hyp.yaml', 'w') as f:
            yaml.safe_dump(self.hyp, f, sort_keys=False)
        with open(save_dir / 'opt.yaml', 'w') as f:
            yaml.safe_dump(vars(self.opt), f, sort_keys=False)

        # Configure
        plots = not evolve  # create plots
        cuda = self.device.type != 'cpu'
        init_seeds(1 + RANK)
        with open(data) as f:
            data_dict = yaml.safe_load(f)  # data dict

        # Loggers
        loggers = {'wandb': None, 'tb': None}  # loggers dict
        if RANK in [-1, 0]:
            # TensorBoard
            if not evolve:
                prefix = colorstr('tensorboard: ')
                LOGGER.info(f"{prefix}Start with 'tensorboard --logdir {self.opt.project}', view at http://localhost:6006/")
                loggers['tb'] = SummaryWriter(str(save_dir))

            # W&B
            self.opt.hyp = self.hyp  # add hyperparameters
            run_id = torch.load(weights).get('wandb_id') if weights.endswith('.pt') and os.path.isfile(weights) else None
            run_id = run_id if self.opt.resume else None  # start fresh run if transfer learning
            wandb_logger = WandbLogger(self.opt, save_dir.stem, run_id, data_dict)
            loggers['wandb'] = wandb_logger.wandb
            if loggers['wandb']:
                data_dict = wandb_logger.data_dict
                weights, epochs, self.hyp = self.opt.weights, self.opt.epochs, self.opt.hyp  # may update weights, epochs if resuming

        nc = 1 if single_cls else int(data_dict['nc'])  # number of classes
        names = ['item'] if single_cls and len(data_dict['names']) != 1 else data_dict['names']  # class names
        assert len(names) == nc, '%g names found for nc=%g dataset in %s' % (len(names), nc, data)  # check
        is_coco = data.endswith('coco.yaml') and nc == 80  # COCO dataset

        # Model
        pretrained = weights.endswith('.pt')
        if pretrained:
            with torch_distributed_zero_first(RANK):
                weights = attempt_download(weights)  # download if not found locally
            ckpt = torch.load(weights, map_location=self.device)  # load checkpoint
            model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=self.hyp.get('anchors')).to(self.device)  # create
            exclude = ['anchor'] if (cfg or self.hyp.get('anchors')) and not resume else []  # exclude keys
            state_dict = ckpt['model'].float().state_dict()  # to FP32
            state_dict = intersect_dicts(state_dict, model.state_dict(), exclude=exclude)  # intersect
            model.load_state_dict(state_dict, strict=False)  # load
            LOGGER.info('Transferred %g/%g items from %s' % (len(state_dict), len(model.state_dict()), weights))  # report
        else:
            model = Model(cfg, ch=3, nc=nc, anchors=self.hyp.get('anchors')).to(self.device)  # create
        with torch_distributed_zero_first(RANK):
            check_dataset(data_dict)  # check
        train_path = data_dict['train']
        val_path = data_dict['val']

        # Freeze
        freeze = []  # parameter names to freeze (full or partial)
        for k, v in model.named_parameters():
            v.requires_grad = True  # train all layers
            if any(x in k for x in freeze):
                print('freezing %s' % k)
                v.requires_grad = False

        # Optimizer
        nbs = 64  # nominal batch size
        accumulate = max(round(nbs / batch_size), 1)  # accumulate loss before optimizing
        self.hyp['weight_decay'] *= batch_size * accumulate / nbs  # scale weight_decay
        LOGGER.info(f"Scaled weight_decay = {self.hyp['weight_decay']}")

        pg0, pg1, pg2 = [], [], []  # optimizer parameter groups
        for k, v in model.named_modules():
            if hasattr(v, 'bias') and isinstance(v.bias, nn.Parameter):
                pg2.append(v.bias)  # biases
            if isinstance(v, nn.BatchNorm2d):
                pg0.append(v.weight)  # no decay
            elif hasattr(v, 'weight') and isinstance(v.weight, nn.Parameter):
                pg1.append(v.weight)  # apply decay

        if self.opt.adam:
            optimizer = optim.Adam(pg0, lr=self.hyp['lr0'], betas=(self.hyp['momentum'], 0.999))  # adjust beta1 to momentum
        else:
            optimizer = optim.SGD(pg0, lr=self.hyp['lr0'], momentum=self.hyp['momentum'], nesterov=True)

        # optimizer.add_param_group({'params': pg1, ...  (the excerpt is cut off here in the source)

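This file trains a YOLOv5 model on a custom dataset. It is driven by command-line arguments that specify the dataset configuration file, the model weights file, the image size, and other settings. The program loads the model and dataset according to these arguments and runs training. During training it saves the model weights and training results, which can be visualized and logged with TensorBoard and W&B. Training also uses several optimization techniques, such as learning-rate scheduling, weight decay, and EMA. Once training is finished, the results can be used for object detection and similar tasks.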
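The optimizer setup in train.py ties gradient accumulation and weight decay to a nominal batch size of 64. The sketch below isolates that arithmetic; `accumulation_settings` is a hypothetical helper and the base weight decay of 0.0005 is an illustrative value, not one taken from the article.

```python
def accumulation_settings(batch_size, weight_decay, nbs=64):
    """Return (accumulate, scaled_weight_decay) for a given batch size.

    nbs is the nominal batch size: gradients are accumulated over
    `accumulate` batches so that each optimizer step sees about nbs images.
    """
    accumulate = max(round(nbs / batch_size), 1)
    scaled_decay = weight_decay * batch_size * accumulate / nbs
    return accumulate, scaled_decay


for bs in (8, 16, 48, 128):
    print(bs, accumulation_settings(bs, 0.0005))
```

With a batch size of 8, eight batches are accumulated and the weight decay is unchanged. With a batch size of 128 there is no accumulation and the decay is doubled, because each step already covers twice the nominal batch.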

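6. Overall system structure

Overview of functionality and architecture:
The project is a YOLOv5-based system for detecting small targets in infrared remote-sensing images. It consists of several program files that implement different functions, including model training, inference, export, and data processing. The main files are detect.py, export.py, hubconf.py, img2videos.py, subdivision.py, and train.py.

The table below summarizes what each file does: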

File path	Function
detect.py	Script that runs YOLOv5 inference
export.py	Script that exports a YOLOv5 model to TorchScript, ONNX, and CoreML
hubconf.py	Python module for loading and using YOLOv5 models
img2videos.py	Script that converts a folder of images into a video
subdivision.py	Script that subdivides images into tiles
train.py	Script that trains a YOLOv5 model
ui.py	User-interface script
val.py	Script that validates a YOLOv5 model
yaml.py	Script that parses and processes YAML files
models\common.py	Common functions and classes for YOLOv5 models
models\experimental.py	Experimental functions and classes for YOLOv5 models
models\tf.py	TensorFlow-related functions and classes for YOLOv5 models
models\yolo.py	Main implementation of the YOLOv5 model
models\__init__.py	Model package initialization
utils\activations.py	Activation functions
utils\augmentations.py	Data augmentation
utils\autoanchor.py	Automatic anchor generation
utils\autobatch.py	Automatic batch sizing
utils\callbacks.py	Callback functions
utils\datasets.py	Dataset loading and processing
utils\downloads.py	File downloads
utils\general.py	General-purpose functions and classes
utils\google_utils.py	Google cloud storage helpers
utils\loss.py	Loss functions
utils\metrics.py	Evaluation metrics
utils\plots.py	Plotting functions
utils\torch_utils.py	PyTorch-related functions and classes
utils\__init__.py	Utility package initialization
utils\aws\resume.py	AWS-related functions
utils\aws\__init__.py	AWS package initialization
utils\flask_rest_api\example_request.py	Example request for the Flask REST API
utils\flask_rest_api\restapi.py	Flask REST API
utils\loggers\__init__.py	Logger package initialization
utils\loggers\wandb\log_dataset.py	Logging datasets with WandB
utils\loggers\wandb\sweep.py	Hyperparameter sweeps with WandB
utils\loggers\wandb\wandb_utils.py	Logging with WandB
utils\loggers\wandb\__init__.py	WandB logger package initialization
utils\wandb_logging\log_dataset.py	Logging datasets with WandB
utils\wandb_logging\sweep.py	Hyperparameter sweeps with WandB
utils\wandb_logging\wandb_utils.py	Logging with WandB
utils\wandb_logging\__init__.py	WandB logging package initialization

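The table above briefly summarizes each file. Each one covers a different part of the improved YOLOv5 infrared remote-sensing small-target detection system based on a small-target detection head.

7. Dataset annotation and augmentation

Dataset annotation

Although YOLOv5 uses PyTorch as its framework, the YOLO family was not originally built on any existing framework; instead it introduced darknet, a lightweight framework written in pure C. As a result, YOLO labels have their own format and do not use the well-known VOC or COCO label formats. For a newly created dataset, the target boxes are therefore usually annotated with the most common tool, LabelImg [50], to obtain VOC or COCO labels, which are then converted to YOLO's own label format with a conversion script written in Python or another language. This made building annotations difficult for many YOLO users. The difficulty was only partly reduced when AlexeyAB, from Russia, released YOLO_mark [51], a dedicated YOLO annotation tool written in C++. However, the performance of a deep learning algorithm depends heavily on the size of the dataset, so even though YOLO_mark produces YOLO-format labels directly, the sheer workload remains a burden for researchers. Given this workload, the infrared dim-small-target dataset built for this study does not contain a very large number of images, and only one target class, UAV, was kept. The dataset has 1,530 images in total; after excluding the scattered negative samples that contain no target, nearly 1,500 images with useful targets remain. Because the targets are also very small, this step was the most difficult part of the whole experiment.
The YOLO_mark annotation interface is shown in Figure 4.2. During annotation, simply draw a rectangle around the target, enclosing it as tightly as possible without background, to obtain a valid label. The resulting label format is shown in the figure:
(Figure: YOLO-format label example)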

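Data augmentation

For a deep network, a large dataset is a prerequisite for success, and in terms of image count alone the dataset built above is not favorable for training. Data augmentation addresses this problem: by applying a series of random transformations to the images, it produces training samples that resemble the original data in target features, background, and color gamut, so that the network treats them as the same kind of target, yet are not identical to the originals. This enlarges the dataset. Augmented samples also reduce the model's over-reliance on particular features or attributes of the targets.
The earliest augmentation methods simply flipped or cropped the original training samples [52] to obtain new ones. Flipping only requires mirroring a sample horizontally or vertically. Although the operation is simple and the visually recognizable features do not change, for a detection network the positions of both the target and the background have changed, so the result is a new training sample; flipping is a simple and effective augmentation. Cropping is somewhat more involved: a region covering 10% to 100% of the original image area is cut out at random and then enlarged to the original image size to produce a new image. Samples produced this way are very likely to cut away part of the region containing the target, so the network learns local features of the target and relies less on particular attributes. Although more complex than flipping, cropping is still a simple and effective method. Because images have contrast, saturation, brightness, and hue parameters, a color-based augmentation [53] also emerged: randomly adjusting these parameters to between 50% and 150% of their original values yields another set of samples. These samples leave the target's position and class unchanged, so this method is also effective.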
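For detection, a flip has to be applied to the labels as well as to the pixels. With normalized YOLO labels this is a one-line change per box, as the sketch below shows; the function names are illustrative and the label values are made up.

```python
def hflip_labels(labels):
    """Mirror labels left-right. labels: list of (class_id, x, y, w, h), normalized."""
    return [(c, 1.0 - x, y, w, h) for c, x, y, w, h in labels]


def vflip_labels(labels):
    """Mirror labels top-bottom."""
    return [(c, x, 1.0 - y, w, h) for c, x, y, w, h in labels]


labels = [(0, 0.25, 0.5, 0.1, 0.2)]
print(hflip_labels(labels))  # [(0, 0.75, 0.5, 0.1, 0.2)]
print(vflip_labels(labels))  # [(0, 0.25, 0.5, 0.1, 0.2)]
```

Only the box centre moves; width and height are unchanged, which is why flipping is cheap to apply on the fly during training.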
The three methods above are the most basic augmentations. When the samples contain multiple classes, such simple processing may add too many invalid samples to the training set and actually reduce the network's robustness. This motivated the CutMix [54] augmentation. As the name suggests, CutMix first cuts and then mixes images: a crop region is generated at random on one image, the region at the same position in another image is pasted into it, and the classification label is assigned in proportion. CutMix has the following advantages: (1) it involves no zero-padding, so no uninformative pixels appear during training, which improves training efficiency; (2) it forces the network to focus on local information of the target, further strengthening localization; (3) the targets in the synthesized images do not look unnatural, which improves the model's classification ability.
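The "assigned in proportion" step can be made concrete with a small sketch. It follows the usual CutMix recipe of drawing a box whose area is about (1 - lam) of the image and weighting the two labels by pixel area; the function names and the way `lam` is supplied are assumptions for illustration.

```python
import random


def cutmix_box(width, height, lam, rng=random):
    """Pick a random cut region covering about (1 - lam) of the image area."""
    cut_w = int(width * (1.0 - lam) ** 0.5)
    cut_h = int(height * (1.0 - lam) ** 0.5)
    cx, cy = rng.randrange(width), rng.randrange(height)  # centre of the cut region
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    return x1, y1, x2, y2


def mixed_label_weights(box, width, height):
    """Weights for (base image label, pasted image label), by pixel area."""
    x1, y1, x2, y2 = box
    pasted = (x2 - x1) * (y2 - y1) / (width * height)
    return 1.0 - pasted, pasted


box = cutmix_box(640, 512, lam=0.7, rng=random.Random(0))
print(box, mixed_label_weights(box, 640, 512))
```

Because the box is clipped at the image border, the pasted fraction is recomputed from the final box rather than taken directly from `lam`.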
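However, CutMix brings no real improvement on single-class datasets, so this work uses an enhanced version of CutMix: Mosaic augmentation. Mosaic augmentation [55] takes four images from the original training set and stitches them into one new training sample, while also keeping the target-box positions from all four images. An image after Mosaic augmentation is shown in the figure:
(Figure: image after Mosaic augmentation)
As the figure above shows, Mosaic augmentation works as follows. Four images are first drawn at random from the original training set, and each undergoes random basic transformations such as rotation, flipping, cropping, and color-gamut changes. The four images are then placed in the order top-left, bottom-left, bottom-right, top-right, and fixed-size regions are cut from them in a matrix layout and stitched together. The stitched image has very visible seam lines, whose positions can be chosen at random or set manually. Parts that extend beyond the seams are discarded, the target-box information of the original images is kept, and areas without content are zero-padded so that the image size matches the original sample size.
Mosaic augmentation enlarges the dataset and also greatly increases sample diversity, making it easier for the network to learn the diversity of the images it will be tested on. It is a novel and effective augmentation method.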
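The label bookkeeping behind Mosaic can be sketched in a few lines. This is a deliberately simplified version: four equally sized images are placed on a fixed 2x2 canvas with no random seam position, cropping, or padding, so it only illustrates how the boxes of the four source images are remapped. The function name `mosaic_labels` is an assumption.

```python
def mosaic_labels(labels_per_image, size):
    """Remap normalized YOLO labels of four size x size images onto a 2x2 mosaic.

    labels_per_image: four lists of (class_id, x, y, w, h), ordered
    top-left, bottom-left, bottom-right, top-right (the order used in the text).
    Returns labels normalized to the (2*size) x (2*size) mosaic canvas.
    """
    offsets = [(0, 0), (0, size), (size, size), (size, 0)]  # (x, y) corner of each tile
    canvas = 2 * size
    merged = []
    for labels, (ox, oy) in zip(labels_per_image, offsets):
        for c, x, y, w, h in labels:
            merged.append((c,
                           (x * size + ox) / canvas,
                           (y * size + oy) / canvas,
                           w * size / canvas,
                           h * size / canvas))
    return merged


one_box = [(0, 0.5, 0.5, 0.2, 0.2)]
for label in mosaic_labels([one_box, one_box, one_box, one_box], size=640):
    print(label)
```

Each box ends up at half its original relative size, which is one reason Mosaic makes already small targets even smaller and is often paired with a larger training resolution.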

Model training

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 1/200     20.8G   0.01576   0.01955  0.007536        22      1280: 100%|██████████| 849/849 [14:42<00:00,  1.04s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 213/213 [01:14<00:00,  2.87it/s]
             all       3395      17314      0.994      0.957      0.0957      0.0843

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 2/200     20.8G   0.01578   0.01923  0.007006        22      1280: 100%|██████████| 849/849 [14:44<00:00,  1.04s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 213/213 [01:12<00:00,  2.95it/s]
             all       3395      17314      0.996      0.956      0.0957      0.0845

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 3/200     20.8G   0.01561    0.0191  0.006895        27      1280: 100%|██████████| 849/849 [10:56<00:00,  1.29it/s]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 213/213 [00:52<00:00,  4.04it/s]
             all       3395      17314      0.996      0.957      0.0957      0.0845

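8. YOLOv5 network improvements for small-target detection

Adding a small-scale detection head

An extra group of anchor boxes was added. On its own this probably matters little, since YOLO already adapts its anchors automatically.
The main change, following the approach in the web page referenced above, is to add several feature-extraction layers dedicated to small targets:
After layer 17, the feature map is upsampled further so that it keeps growing. At layer 20, the resulting 160×160 feature map is concatenated with the layer-2 feature map from the backbone, which gives a larger feature map for small-target detection.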

# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [5,6, 8,14, 15,11]  # P2/4
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],  # 2  160*160
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],  # 4  80*80
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],  # 6  40*40
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],  # 8
   [-1, 3, BottleneckCSP, [1024, False]],  # 9  20*20
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],  # 10  20*20
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 11  40*40
   [[-1, 6], 1, Concat, [1]],  # 12  cat backbone P4  40*40
   [-1, 3, BottleneckCSP, [512, False]],  # 13  40*40

   [-1, 1, Conv, [512, 1, 1]],  # 14  40*40
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 15
   [[-1, 4], 1, Concat, [1]],  # 16  cat backbone P3  80*80
   [-1, 3, BottleneckCSP, [512, False]],  # 17 (P3/8-small)  80*80

   [-1, 1, Conv, [256, 1, 1]],  # 18  80*80
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 19  160*160
   [[-1, 2], 1, Concat, [1]],  # 20  cat backbone P2  160*160
   [-1, 3, BottleneckCSP, [256, False]],  # 21  160*160

   [-1, 1, Conv, [256, 3, 2]],  # 22  80*80
   [[-1, 18], 1, Concat, [1]],  # 23  80*80
   [-1, 3, BottleneckCSP, [256, False]],  # 24  80*80

   [-1, 1, Conv, [256, 3, 2]],  # 25  40*40
   [[-1, 14], 1, Concat, [1]],  # 26  cat head P4  40*40
   [-1, 3, BottleneckCSP, [512, False]],  # 27 (P4/16-medium)  40*40

   [-1, 1, Conv, [512, 3, 2]],  # 28  20*20
   [[-1, 10], 1, Concat, [1]],  # 29  cat head P5  20*20
   [-1, 3, BottleneckCSP, [1024, False]],  # 30 (P5/32-large)  20*20

   [[21, 24, 27, 30], 1, Detect, [nc, anchors]],  # Detect(P2, P3, P4, P5)
  ]
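The grid sizes written in the comments of this configuration follow directly from the stride of each detection head. The sketch below reproduces them for a 640-pixel input and counts how many boxes the network predicts with and without the extra P2 head. The helper names are illustrative, and the figure of three anchors per cell is read off the anchors list in the configuration.

```python
def head_grid_sizes(img_size, strides=(4, 8, 16, 32)):
    """Map each detection head (named by its pyramid level) to its grid side length."""
    return {f'P{s.bit_length() - 1}/{s}': img_size // s for s in strides}


def total_predictions(img_size, strides, anchors_per_cell=3):
    """Number of candidate boxes the Detect layer outputs for one square image."""
    return anchors_per_cell * sum((img_size // s) ** 2 for s in strides)


print(head_grid_sizes(640))                    # {'P2/4': 160, 'P3/8': 80, 'P4/16': 40, 'P5/32': 20}
print(total_predictions(640, (8, 16, 32)))     # 25200 with the three standard heads
print(total_predictions(640, (4, 8, 16, 32)))  # 102000 with the extra P2 head
```

The 160×160 head alone contributes 76,800 of the 102,000 candidates, which is the price paid for the finer grid: roughly four times as many boxes go through NMS.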

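Deep learning algorithms do not need the target regions to be selected by hand. To find candidate regions automatically, one or more groups of anchor boxes with different scales and positions are preset; keeping only the predicted boxes whose IoU with the anchor boxes exceeds a preset threshold then yields the target information. The anchor settings of YOLOv5s are shown in the figure:
(Figure: YOLOv5s anchor settings)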
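To illustrate the IoU test described above, the sketch below compares a target's width and height with each preset anchor, treating both boxes as if they shared the same centre. It is a minimal illustration of IoU-based anchor matching, not the matching rule of the YOLOv5 training code, which compares width and height ratios instead. The anchor list is taken from the first two rows of the configuration above; the function names are assumptions.

```python
def wh_iou(box_wh, anchor_wh):
    """IoU of two boxes that share the same centre, given only (width, height)."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


ANCHORS = [(5, 6), (8, 14), (15, 11), (10, 13), (16, 30), (33, 23)]


def best_anchor(box_wh, anchors=ANCHORS):
    """Anchor with the highest IoU for a target of the given size."""
    return max(anchors, key=lambda a: wh_iou(box_wh, a))


print(best_anchor((6, 6)))    # (5, 6)
print(best_anchor((12, 12)))  # (10, 13)
```

A 6×6-pixel target overlaps the new (5, 6) anchor far better than any anchor in the standard P3 row, which is the motivation for adding the small-scale anchor group.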

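Although the precision and recall of YOLOv5s are not low, its anchor settings are general-purpose: they suit the target scales of most datasets and do not focus the feature-extraction capacity on small scales. The next improvement therefore targets the anchors, so that the network locks as closely as possible onto the high-level semantic information where the actual targets lie. As mentioned above, the YOLOv5 detection head follows PANet, so a high-resolution layer has to be added to the PAN part of the head output. Given PANet's two-tower structure, one more downsampling step also has to be added on the FPN side, so that the model preserves and learns the semantic features of dim, small targets more accurately and in finer detail. After the high-resolution small-scale detection head is added, the network's anchor information is as shown in the figure:
(Figure: anchor settings after adding the small-scale detection head)
With the small-scale detection head for dim, small infrared targets in place, the training parameters were kept the same as in the previous experiment. Training ran on Ubuntu 18.04 with PyCharm as the software platform and PyTorch 1.6.0 as the training and validation framework, using CUDA 10.1, Python 3.6, an i9-9900K CPU, and two NVIDIA RTX 2080 Ti GPUs. The batch size was 8, the input image size 1920*1080, and the initial learning rate 0.01. Training ran for 100 epochs and ended when the training loss looked as in the figure.

(Figure: training loss curve)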

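Adding an image-slicing layer

The idea, proposed at AAAI, is that when the image resolution is too large, the image is split into several tiles that are fed to the YOLOv5 network separately; the results from all tiles are then collected, their coordinates are converted back to the original image, and a single NMS pass is run over all of them.
Small-target detection performs poorly mainly because of the size of the targets.
Take a network input of 608×608 as an example. YOLOv5 downsamples five times, so the final feature maps are 19×19, 38×38, and 76×76. Of the three, the largest, 76×76, is responsible for small targets. Mapped back to the 608×608 input, each cell of that feature map covers 608/76 = 8, that is, an 8×8-pixel region. If a target in the original image is less than 8 pixels wide or high, the network can hardly learn its features.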
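The arithmetic above can be checked with a short sketch that also shows what happens to a target when a large frame is resized to the network input. The helper names and the 20-pixel example target are assumptions chosen for illustration.

```python
def cells_per_side(net_size=608, stride=8):
    """Side length of the finest detection grid."""
    return net_size // stride


def size_after_resize(target_px, image_side, net_size=608):
    """Target size in pixels once an image side of image_side is scaled to net_size."""
    return target_px * net_size / image_side


print(cells_per_side())                       # 76 cells, each covering 8x8 input pixels
print(round(size_after_resize(20, 1920), 2))  # 6.33: a 20-pixel target shrinks below one cell
print(size_after_resize(20, 608))             # 20.0: unchanged when a 608-pixel tile is used
```

A target that is comfortably larger than 8 pixels in the original 1920-pixel-wide frame drops below the 8-pixel limit after whole-image resizing, but keeps its full size when the frame is cut into 608-pixel tiles.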
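In addition, many images have very high resolution. If they are simply downsampled by a large factor, information is easily lost. If the factor is too small, the forward pass has to keep many large feature maps in memory, which exhausts GPU resources, easily causes out-of-memory errors, and prevents normal training and inference.
In this situation the image can be sliced: the large image is first split into small tiles and each tile is detected separately. In the figure, many of the cars in the middle region are detected. The approach has both an advantage and a drawback.
Advantage: accuracy. When the tiles are fed into the detection network, the lower limit on the pixel size of detectable targets drops considerably. For example, with 608×608 tiles fed into a network whose input size is 608×608, by the calculation above the features of every small target wider and taller than 8 pixels in the original image can be learned.
Drawback: more computation. For example, an original 1920×1080 image is processed in a single pass with whole-image detection. With slicing, it is cut into four 912×608 tiles and N detection passes are run, which greatly increases detection time.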
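The "collect all tiles, restore coordinates, run one NMS" step can be sketched in plain Python. This is a minimal greedy NMS over boxes in (x1, y1, x2, y2, score) form, written for illustration; the function names, the tile offsets, and the example detections are assumptions, and a real pipeline would use the batched NMS of the detection framework instead.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def merge_tile_detections(tile_results, iou_thres=0.45):
    """tile_results: list of ((offset_x, offset_y), [(x1, y1, x2, y2, score), ...])."""
    # 1. Shift every box from tile coordinates back to full-image coordinates
    boxes = [(x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)
             for (ox, oy), dets in tile_results
             for x1, y1, x2, y2, score in dets]
    # 2. Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much
    boxes.sort(key=lambda b: b[4], reverse=True)
    kept = []
    for box in boxes:
        if all(iou(box, k) < iou_thres for k in kept):
            kept.append(box)
    return kept


tiles = [((0, 0), [(600, 100, 620, 120, 0.9)]),
         ((500, 0), [(101, 100, 121, 120, 0.8), (300, 300, 320, 320, 0.7)])]
print(merge_tile_detections(tiles))
```

The first detection of the second tile lands on almost the same full-image location as the detection from the first tile, so the single NMS pass removes it; this is what keeps targets in the overlap between tiles from being reported twice.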
(Figure: detection results on the sliced image)

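9. System integration

The figure below shows the complete source code, dataset, environment-setup video tutorial, and custom UI:

(Figure: system overview)

Reference blog post: "Improved YOLOv5 infrared remote-sensing small-target detection system based on a small-target detection head"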
