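1. Preface:
Before reading on, install a PyTorch version that matches your environment.
For general learning, the CPU build is sufficient.
To import PyTorch, use the following statement.
import torch  # Note: the library is called PyTorch, but the package you import is torch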
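2. How a neural network gets its data
Feeding data to a neural network mainly relies on two classes, Dataset and DataLoader.
Dataset retrieves each data sample together with its ground-truth label.
DataLoader wraps a Dataset and delivers the samples to the network in the form it needs, for example grouped into batches or shuffled.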
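The division of labor between the two classes can be sketched as follows. This is a minimal illustration, not part of the original tutorial: the toy tensors and the batch size of 4 are assumptions chosen only to make the batching visible, and TensorDataset is used as a ready-made Dataset so that no files are needed.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten made-up samples with two features each, plus a 0/1 label per sample.
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10) % 2

# The Dataset answers "what is sample i and its label?"
dataset = TensorDataset(features, labels)

# The DataLoader answers "how are samples grouped and ordered for the network?"
loader = DataLoader(dataset, batch_size=4, shuffle=False)

# Collect the shape of every batch the loader yields.
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

With ten samples and a batch size of 4, the loader yields two full batches and one smaller final batch.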
The torch.utils.data package provides the Dataset class. Its description can be found on the PyTorch website and in the source code:
class Dataset(Generic[T_co]):
    r"""An abstract class representing a :class:`Dataset`.

    All datasets that represent a map from keys to data samples should subclass
    it. All subclasses should overwrite :meth:`__getitem__`, supporting fetching a
    data sample for a given key. Subclasses could also optionally overwrite
    :meth:`__len__`, which is expected to return the size of the dataset by many
    :class:`~torch.utils.data.Sampler` implementations and the default options
    of :class:`~torch.utils.data.DataLoader`. Subclasses could also
    optionally implement :meth:`__getitems__`, for speedup batched samples
    loading. This method accepts list of indices of samples of batch and returns
    list of samples.

    .. note::
      :class:`~torch.utils.data.DataLoader` by default constructs an index
      sampler that yields integral indices. To make it work with a map-style
      dataset with non-integral indices/keys, a custom sampler must be provided.
    """

    def __getitem__(self, index) -> T_co:
        raise NotImplementedError("Subclasses of Dataset should implement __getitem__.")

    # def __getitems__(self, indices: List) -> List[T_co]:
    # Not implemented to prevent false-positives in fetcher check in
    # torch.utils.data._utils.fetch._MapDatasetFetcher

    def __add__(self, other: "Dataset[T_co]") -> "ConcatDataset[T_co]":
        return ConcatDataset([self, other])

    # No `def __len__(self)` default?
    # See NOTE [ Lack of Default `__len__` in Python Abstract Base Classes ]
    # in pytorch/torch/utils/data/sampler.py
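As the description above shows, Dataset is an abstract class that represents a dataset. You define a custom dataset by subclassing it and implementing the following methods:
__len__(self): returns the size of the dataset, that is, the number of samples it contains. The base class does not require it, but the default DataLoader settings expect it.
__getitem__(self, idx): returns the sample at index idx together with its label.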
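The two methods above can be sketched with a dataset that needs no files at all. SquaresDataset is a hypothetical name invented for this illustration: sample i is the number i and its "label" is i squared.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the number i, its label is i * i."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        # Number of samples in the dataset.
        return self.size

    def __getitem__(self, idx):
        # Raise IndexError for out-of-range indices, like a normal sequence.
        if idx < 0 or idx >= self.size:
            raise IndexError(f"index {idx} out of range for size {self.size}")
        sample = torch.tensor(float(idx))
        label = torch.tensor(float(idx * idx))
        return sample, label


squares = SquaresDataset(5)
print(len(squares))                 # len() calls __len__
sample, label = squares[3]          # indexing calls __getitem__
print(sample.item(), label.item())
```

Once both methods exist, the built-in len() and square-bracket indexing work on the object, which is how the case study below uses its dataset.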
3. Case study
Use Dataset to read every image in the folder E:\Python_learning\Deep_learning\dataset\hymenoptera_data\train\ants and obtain the corresponding label. In this dataset, each folder name is the label, and the folder contains the training images for that label.
import os
from torch.utils.data import Dataset
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms


class MyDataset(Dataset):
    def __init__(self, root_path, label):
        self.root_path = root_path
        self.label = label
        self.img_path = os.path.join(root_path, label)  # join the root path and the label folder
        print(f"Image folder: {self.img_path}")  # print the path for debugging
        try:
            self.img_path_list = os.listdir(self.img_path)  # list the files in the folder
            print(f"Image list: {self.img_path_list}")  # print the file list for debugging
        except PermissionError as e:
            print(f"Permission error: {e}")
            raise  # without a file list the dataset is unusable, so do not continue
        except FileNotFoundError as e:
            print(f"File not found error: {e}")
            raise

    def __getitem__(self, index):
        img_name = self.img_path_list[index]
        img_path = os.path.join(self.img_path, img_name)
        try:
            img = Image.open(img_path)
        except Exception as e:
            print(f"Error while reading image: {e}, image path: {img_path}")
            raise
        label = self.label
        return img, label

    def __len__(self):
        return len(self.img_path_list)


# Instantiate the class
my_data = MyDataset(root_path=r'E:\Python_learning\Deep_learning\dataset\hymenoptera_data\train', label='ants')
writer = SummaryWriter('logs')
for i in range(len(my_data)):
    img, label = my_data[i]  # fetch each image in turn
    # img is a PIL Image here; transforms.ToTensor converts it to a tensor
    writer.add_image(tag=label, img_tensor=transforms.ToTensor()(img), global_step=i)
writer.close()
print(f"All {len(my_data)} images in the folder have been read. View them in TensorBoard.")
In the console, run tensorboard --logdir='E:\Python_learning\Deep_learning\note\logs'
Then open TensorBoard in the browser to view the images.
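The Dataset source quoted earlier also defines __add__, so two datasets can be joined with the + operator. The sketch below shows how a one-folder-per-label layout could be merged into a single training set this way. LabeledItems and the file names are hypothetical stand-ins (plain strings instead of images) so that the example runs without the hymenoptera data. The second label, "bees", is also an assumption made for illustration.

```python
from torch.utils.data import ConcatDataset, Dataset


class LabeledItems(Dataset):
    """Hypothetical stand-in for MyDataset: holds items that all share one label."""

    def __init__(self, items, label):
        self.items = items
        self.label = label

    def __getitem__(self, index):
        return self.items[index], self.label

    def __len__(self):
        return len(self.items)


# Made-up file names; a real dataset would list them from the label folders.
ants = LabeledItems(["ant_0.jpg", "ant_1.jpg"], "ants")
bees = LabeledItems(["bee_0.jpg"], "bees")

# Dataset.__add__ returns a ConcatDataset that indexes across both parts.
train = ants + bees

print(type(train).__name__)
print(len(train))
print(train[2])
```

Indices 0 and 1 come from the first dataset and index 2 from the second, so each sample keeps the label of the folder it came from.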