Classic Neural Networks: LeNet-5
Proposed by Yann LeCun et al. in 1998, LeNet-5 was the first convolutional neural network applied to handwritten digit recognition that delivered real commercial value (in the postal industry).
Reference: paper notes on "Gradient-Based Learning Applied to Document Recognition" (CSDN blog)
1 Network Architecture
Reading the overall structure:
Input image: 32×32×1
Three convolutional layers:
C1: input image 32×32, six 5×5 kernels, output feature map size 28×28 (32−5+1 = 28), one bias per kernel;
trainable parameters in total: (5×5+1)×6 = 156
C3: input 14×14, sixteen feature maps built from 5×5 kernels, with 6×3+6×4+3×4+1×6 = 60 input-channel connections; output feature map size 10×10 ((14−5)/1+1 = 10), one bias per feature map;
trainable parameters in total: 6×(3×5×5+1)+6×(4×5×5+1)+3×(4×5×5+1)+1×(6×5×5+1) = 1516
C3's non-dense feature-map connections:
The first 6 feature maps of C3 each connect to 3 consecutive feature maps of S2, the next 6 each connect to 4 consecutive feature maps, the next 3 each connect to 4 non-consecutive feature maps, and the last one connects to all of S2's feature maps. This non-dense connection scheme breaks symmetry while reducing computation, for 60 kernel groups in total; the main motivation was to save compute.
C5: input 5×5 with 16 channels; 120 filters, each convolving all 16 input maps with 5×5 kernels (120×16 kernels in total); output feature map size 1×1 (5−5+1 = 1), one bias per filter;
trainable parameters in total: 120×(16×5×5+1) = 48120 (a quick arithmetic check follows this list)
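The three totals above can be re-derived with a few lines of plain Python arithmetic; this standalone check simply mirrors the formulas in the text.

c1 = (5 * 5 + 1) * 6                 # six 5x5 kernels, one bias each
c3 = (6 * (3 * 5 * 5 + 1)            # 6 groups over 3 input maps
      + 6 * (4 * 5 * 5 + 1)          # 6 groups over 4 consecutive maps
      + 3 * (4 * 5 * 5 + 1)          # 3 groups over 4 non-consecutive maps
      + 1 * (6 * 5 * 5 + 1))         # 1 group over all 6 maps
c5 = 120 * (16 * 5 * 5 + 1)          # 120 filters over all 16 maps
print(c1, c3, c5)                    # 156 1516 48120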
Two pooling layers, S2 and S4:
Both are 2×2 average pooling with a nonlinear mapping added (a sketch follows this list).
S2 (subsampling layer): input 28×28, 2×2 pooling regions; the four inputs are summed, multiplied by a trainable coefficient, a trainable bias is added, and the result passes through a sigmoid; output feature map size: 14×14 (28/2)
S4 (subsampling layer): input 10×10, 2×2 pooling regions; the four inputs are summed, multiplied by a trainable coefficient, a trainable bias is added, and the result passes through a sigmoid; output feature map size: 5×5 (10/2)
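A minimal sketch of one such subsampling layer, assuming nn.AvgPool2d (which has no trainable parameters) plus an explicit per-channel coefficient and bias; the class name Subsample is ours, and averaging (sum/4) lets the trainable coefficient absorb the factor of 4:

import torch
import torch.nn as nn

class Subsample(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=2)  # mean over each 2x2 region
        # trainable per-channel coefficient and bias, as described above
        self.coeff = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        return torch.sigmoid(self.pool(x) * self.coeff + self.bias)

For example, Subsample(6) maps a (N, 6, 28, 28) tensor to (N, 6, 14, 14), matching S2.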
Two fully connected layers:
First fully connected layer: takes the 120-dimensional vector as input and outputs 84 neurons; it computes the dot product between the input vector and a weight vector, adds a bias, and passes the result through a sigmoid. The reason for 84: characters were encoded as ASCII-style bitmaps of size 7×12 (−1 for white, 1 for black), so 84 outputs can estimate the value of each pixel.
Second fully connected layer (Output layer): 10 neurons, i.e. 10 nodes, representing the digits 0-9.
All activation functions are sigmoid.
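As a sanity check on the feature-map sizes quoted throughout section 1, the valid-convolution formula out = (in − k)/stride + 1 can be traced through the network; a small sketch (the helper name conv_out is ours):

def conv_out(size, k=5, stride=1):
    # valid convolution: out = (in - k) / stride + 1
    return (size - k) // stride + 1

s = 32
s = conv_out(s)  # C1: 28
s = s // 2       # S2: 14
s = conv_out(s)  # C3: 10
s = s // 2       # S4: 5
s = conv_out(s)  # C5: 1
print(s)         # 1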
2 Network Implementation
2.1 Model Definition
import torch
import torch.nn as nn


class LeNet5s(nn.Module):
    def __init__(self):
        super(LeNet5s, self).__init__()  # initialize the parent class
        # First convolutional layer
        self.C1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,   # input channels
                out_channels=6,  # output channels
                kernel_size=5,   # kernel size
            ),
            nn.ReLU(),
        )
        # Pooling: average pooling
        self.S2 = nn.AvgPool2d(kernel_size=2)

        # C3: fusion unit over 3 channels
        self.C3_unit_6x3 = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=5)
        # C3: fusion unit over 4 consecutive channels
        self.C3_unit_6x4 = nn.Conv2d(in_channels=4, out_channels=1, kernel_size=5)
        # C3: fusion unit over 4 channels, with the middle channel removed
        self.C3_unit_3x4_pop1 = nn.Conv2d(in_channels=4, out_channels=1, kernel_size=5)
        # C3: fusion unit over all 6 channels
        self.C3_unit_1x6 = nn.Conv2d(in_channels=6, out_channels=1, kernel_size=5)

        # S4: pooling
        self.S4 = nn.AvgPool2d(kernel_size=2)
        # Fully connected layers
        self.fc1 = nn.Sequential(nn.Linear(in_features=16 * 5 * 5, out_features=120), nn.ReLU())
        self.fc2 = nn.Sequential(nn.Linear(in_features=120, out_features=84), nn.ReLU())
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        # batch size of the input
        num = x.shape[0]

        x = self.C1(x)
        x = self.S2(x)
        # empty tensor to accumulate the 16 C3 feature maps along dim 1
        outchannel = torch.empty((num, 0, 10, 10), device=x.device)
        # 6 units over 3 consecutive channels
        for i in range(6):
            # indices of the channels to extract
            channel_idx = tuple(j % 6 for j in range(i, i + 3))
            x1 = self.C3_unit_6x3(x[:, channel_idx, :, :])
            outchannel = torch.cat([outchannel, x1], dim=1)

        # 6 units over 4 consecutive channels
        for i in range(6):
            channel_idx = tuple(j % 6 for j in range(i, i + 4))
            x1 = self.C3_unit_6x4(x[:, channel_idx, :, :])
            outchannel = torch.cat([outchannel, x1], dim=1)

        # 3 units over 4 channels: take 5 consecutive maps, drop the middle one
        for i in range(3):
            channel_idx = tuple(j % 6 for j in range(i, i + 5))
            # remove the third (middle) element
            channel_idx = channel_idx[:2] + channel_idx[3:]
            x1 = self.C3_unit_3x4_pop1(x[:, channel_idx, :, :])
            outchannel = torch.cat([outchannel, x1], dim=1)

        # 1 unit over all 6 channels
        x1 = self.C3_unit_1x6(x)
        outchannel = torch.cat([outchannel, x1], dim=1)
        outchannel = nn.ReLU()(outchannel)

        x = self.S4(outchannel)
        # flatten the feature maps
        x = x.view(x.size(0), -1)
        # fully connected layers
        x = self.fc1(x)
        x = self.fc2(x)
        # TODO: softmax (fc3 currently returns raw logits)
        output = self.fc3(x)

        return output


def test001():
    net = LeNet5s()
    # random test input
    input = torch.randn(128, 1, 32, 32)
    output = net(input)
    print(output.shape)


if __name__ == "__main__":
    test001()
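A quick way to compare this implementation against the hand-computed counts from section 1 (a sketch, not part of the original code). Note that each C3 unit here is a single shared nn.Conv2d reused across its groups, so the live parameter count is lower than the paper's 1516 for C3:

net = LeNet5s()
# C1 matches 156 and fc1 matches C5's 48120, but the four shared C3 units
# contribute only 76 + 101 + 101 + 151 = 429 parameters instead of 1516.
total = sum(p.numel() for p in net.parameters())
print(total)  # should print 59719 given the sharing noted above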
2.2 Global Variables
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import os

dir = os.path.dirname(__file__)
modelpath = os.path.join(dir, "weight/model.pth")
datapath = os.path.join(dir, "data")

# Data preprocessing and loading
transform = transforms.Compose(
    [
        transforms.Resize((32, 32)),  # resize input images to 32x32
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,)),
    ]
)
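The constants (0.1307,) and (0.3081,) are the commonly quoted mean and standard deviation of the MNIST training pixels. If in doubt, they can be recomputed with a short sketch like this (it loads the entire training set into memory):

raw = torchvision.datasets.MNIST(
    root=datapath, train=True, download=True, transform=transforms.ToTensor()
)
imgs = torch.stack([img for img, _ in raw])   # (60000, 1, 28, 28), values in [0, 1]
print(imgs.mean().item(), imgs.std().item())  # ≈ 0.1307, ≈ 0.3081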
2.3 Model Training
def train():
    trainset = torchvision.datasets.MNIST(
        root=datapath, train=True, download=True, transform=transform
    )
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)

    # instantiate the model
    net = LeNet5s()

    # use MSELoss as the loss function
    criterion = nn.MSELoss()

    # use the SGD optimizer
    optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

    # train the model
    num_epochs = 10
    for epoch in range(num_epochs):
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data

            # convert labels to one-hot encoding
            labels_one_hot = torch.zeros(labels.size(0), 10).scatter_(
                1, labels.view(-1, 1), 1.0
            )
            labels_one_hot = labels_one_hot.to(torch.float32)
            optimizer.zero_grad()
            outputs = net(inputs)
            loss = criterion(outputs, labels_one_hot)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            if i % 100 == 99:
                print(f"[{epoch + 1}, {i + 1}] loss: {running_loss / 100:.3f}")
                running_loss = 0.0
    # save the model parameters
    torch.save(net.state_dict(), modelpath)
    print("Finished Training")
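MSE over one-hot targets works, but the more idiomatic choice for classification is nn.CrossEntropyLoss, which applies log-softmax to the raw fc3 logits internally and consumes integer class labels, so the one-hot conversion disappears. A variant sketch (the name train_ce is ours, not part of the original code):

def train_ce():
    trainset = torchvision.datasets.MNIST(
        root=datapath, train=True, download=True, transform=transform
    )
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)
    net = LeNet5s()
    criterion = nn.CrossEntropyLoss()  # log-softmax + NLL over integer labels
    optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
    for epoch in range(10):
        for inputs, labels in trainloader:
            optimizer.zero_grad()
            loss = criterion(net(inputs), labels)  # labels: (N,) class indices
            loss.backward()
            optimizer.step()
    torch.save(net.state_dict(), modelpath)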
2.4 Validation
def valid():
    testset = torchvision.datasets.MNIST(
        root=datapath, train=False, download=True, transform=transform
    )
    testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False)
    # instantiate the model and load the trained weights
    net = LeNet5s()
    net.load_state_dict(torch.load(modelpath))
    # evaluate the model on the test set
    correct = 0
    total = 0
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            outputs = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print(f"Validation accuracy: {100 * correct / total:.2f}%")
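To run the whole pipeline end to end, a hypothetical entry point could look like this (assumption: the weight/ directory must exist before torch.save can write model.pth):

if __name__ == "__main__":
    os.makedirs(os.path.join(dir, "weight"), exist_ok=True)  # ensure save path exists
    train()
    valid()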