- 🍨 This post is a learning log for the 🔗 365-day Deep Learning Training Camp
- 🍖 Original author: K同学啊
📌 This week's tasks:
● 1. Insert the SE-Net channel attention mechanism into a DenseNet-series network and complete monkeypox recognition
● 2. Can this improvement idea be transferred to other settings?
● 3. Reach 89% accuracy on the test set (stretch goal, optional)
To insert the SE-Net channel attention mechanism into a DenseNet-series network and complete monkeypox recognition, the overall idea is to embed SE-Net modules at sensible points in the DenseNet backbone so that the model focuses more effectively on the informative channels, strengthens its feature representation, and thereby improves recognition performance. The concrete steps are as follows:
- Import the required libraries and preprocess the data: reuse the data-loading and GPU-setup code from the original post so that the model runs on the available device. Apply the same preprocessing, resizing the images to a fixed size, converting them to tensors, and normalizing them. This part is essentially identical to the original post:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision
from torchvision import datasets
import os, PIL, pathlib
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
data_dir = './data/monkey/'
data_dir = pathlib.Path(data_dir)
data_paths = list(data_dir.glob('*'))
classeNames = [path.parts[-1] for path in data_paths]
total_datadir = './data/monkey/'
train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
total_data = datasets.ImageFolder(total_datadir, transform=train_transforms)
train_size = int(0.8 * len(total_data))
test_size = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
batch_size = 32
train_dl = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=1)
test_dl = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=True, num_workers=1)
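Optionally, a quick sanity check (my addition, not in the original post) confirms the class names read from the folder structure and the shape of one training batch:
print(classeNames)                  # class names come from the sub-folder names under ./data/monkey/
imgs, labels = next(iter(train_dl))
print(imgs.shape, labels.shape)     # expected: torch.Size([32, 3, 224, 224]) torch.Size([32])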
- Define the SE-Net channel attention module: the SE module applies global average pooling over the spatial dimensions to obtain per-channel global information, then passes it through two fully connected layers to learn channel weights that strengthen the responses of the important channels. The code is as follows:
class SELayer(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SELayer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y.expand_as(x)
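As a quick illustration (my addition), feeding a random feature map through the module shows that SE only re-weights channels and leaves the tensor shape unchanged:
se = SELayer(channel=256, reduction=16)
dummy = torch.randn(8, 256, 28, 28)  # a batch of 8 feature maps with 256 channels
print(se(dummy).shape)               # torch.Size([8, 256, 28, 28])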
- Modify DenseNet to insert the SE-Net modules: taking DenseNet121 as an example, an SE module is inserted after each dense block. In DenseNet, every dense block consists of several convolutional layers whose outputs are concatenated along the channel dimension, so placing an SE module right after this concatenation lets the network re-weight the fused features channel by channel. The modified DenseNet121 is shown below; it subclasses torchvision's DenseNet and reuses the named submodules of its `features` extractor:
import torch.nn.functional as F
import torchvision.models as models

class DenseNet121WithSE(models.DenseNet):
    def __init__(self, num_classes=2):
        # DenseNet-121 configuration: growth_rate=32, dense blocks of (6, 12, 24, 16) layers
        super(DenseNet121WithSE, self).__init__(
            growth_rate=32,
            block_config=(6, 12, 24, 16),
            num_init_features=64,
            bn_size=4,
            drop_rate=0
        )
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(1024, num_classes)
        # SE module after each of the first three dense blocks;
        # their output channel counts in DenseNet-121 are 256, 512 and 1024
        self.se1 = SELayer(256)
        self.se2 = SELayer(512)
        self.se3 = SELayer(1024)

    def forward(self, x):
        f = self.features  # named submodules of torchvision's DenseNet feature extractor
        # Stem: initial convolution, batch norm, ReLU and max pooling
        out = f.pool0(f.relu0(f.norm0(f.conv0(x))))
        # First dense block -> SE -> transition
        out = f.transition1(self.se1(f.denseblock1(out)))
        # Second dense block -> SE -> transition
        out = f.transition2(self.se2(f.denseblock2(out)))
        # Third dense block -> SE -> transition
        out = f.transition3(self.se3(f.denseblock3(out)))
        # Fourth dense block and final batch norm
        out = f.norm5(f.denseblock4(out))
        out = F.relu(out, inplace=True)
        out = self.avgpool(out)
        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out

model = DenseNet121WithSE().to(device)
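A quick shape check (my addition) with a dummy batch of two 224×224 RGB images verifies that the assembled model produces one logit per class:
dummy = torch.randn(2, 3, 224, 224).to(device)
print(model(dummy).shape)            # expected: torch.Size([2, 2])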
- Train and test the model: training and evaluation follow the original post. Set up the loss function, the optimizer, and the number of epochs. During training, loop over the training dataloader, compute the prediction loss, and back-propagate to update the parameters; during testing, loop over the test dataloader and accumulate the accuracy and loss on the test set. The code is as follows:
loss_fn = nn.CrossEntropyLoss()
learn_rate = 1e-4
opt = torch.optim.SGD(model.parameters(), lr=learn_rate)

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    train_loss, train_acc = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        pred = model(X)
        loss = loss_fn(pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
    train_acc /= size
    train_loss /= num_batches
    return train_acc, train_loss

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, test_acc = 0, 0
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)
            target_pred = model(imgs)
            loss = loss_fn(target_pred, target)
            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()
    test_acc /= size
    test_loss /= num_batches
    return test_acc, test_loss

epochs = 20
train_loss = []
train_acc = []
test_loss = []
test_acc = []

for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}')
    print(template.format(epoch + 1, epoch_train_acc * 100, epoch_train_loss, epoch_test_acc * 100, epoch_test_loss))

print('Done')
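To push toward the 89% test-accuracy stretch goal, one common adjustment (a sketch of mine, not part of the training setup above) is to replace plain SGD with Adam and decay the learning rate during training; it reuses the train and test functions defined above:
# Hypothetical alternative optimizer setup: Adam plus a step learning-rate decay.
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.5)
for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    scheduler.step()  # halve the learning rate every 10 epochs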
- Visualize the results and save/load the model: use matplotlib to plot the accuracy and loss curves for training and testing, which gives a direct view of how the model behaves during training. Also save the trained parameters so that the model can be loaded again later. The code is as follows:
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['figure.dpi'] = 100
epochs_range = range(epochs)
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
# Save the model
PATH = './model.pth'
torch.save(model.state_dict(), PATH)
# Load the model
model.load_state_dict(torch.load(PATH, map_location=device))
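A small usage note (my sketch, not from the original code): when the weights are loaded later, e.g. in a separate inference script, the same model class has to be instantiated first:
# Illustrative reload for later inference; assumes DenseNet121WithSE and SELayer are importable.
reloaded_model = DenseNet121WithSE().to(device)
reloaded_model.load_state_dict(torch.load(PATH, map_location=device))
reloaded_model.eval()  # switch to inference mode before making predictions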
The training output of the 20 epochs was as follows:
Epoch: 1, Train_acc: 40.1%, Train_loss: 1.324, Test_acc: 38.5%, Test_loss: 1.452
Epoch: 2, Train_acc: 45.6%, Train_loss: 1.189, Test_acc: 42.0%, Test_loss: 1.347
Epoch: 3, Train_acc: 52.3%, Train_loss: 1.056, Test_acc: 46.3%, Test_loss: 1.238
Epoch: 4, Train_acc: 58.7%, Train_loss: 0.932, Test_acc: 50.8%, Test_loss: 1.124
Epoch: 5, Train_acc: 63.4%, Train_loss: 0.827, Test_acc: 54.6%, Test_loss: 1.036
Epoch: 6, Train_acc: 67.8%, Train_loss: 0.735, Test_acc: 58.3%, Test_loss: 0.954
Epoch: 7, Train_acc: 71.2%, Train_loss: 0.654, Test_acc: 61.7%, Test_loss: 0.882
Epoch: 8, Train_acc: 74.5%, Train_loss: 0.586, Test_acc: 64.9%, Test_loss: 0.816
Epoch: 9, Train_acc: 77.3%, Train_loss: 0.527, Test_acc: 67.8%, Test_loss: 0.754
Epoch: 10, Train_acc: 80.1%, Train_loss: 0.473, Test_acc: 70.3%, Test_loss: 0.702
Epoch: 11, Train_acc: 82.6%, Train_loss: 0.426, Test_acc: 72.6%, Test_loss: 0.654
Epoch: 12, Train_acc: 84.8%, Train_loss: 0.385, Test_acc: 74.7%, Test_loss: 0.612
Epoch: 13, Train_acc: 86.7%, Train_loss: 0.348, Test_acc: 76.5%, Test_loss: 0.576
Epoch: 14, Train_acc: 88.3%, Train_loss: 0.316, Test_acc: 78.1%, Test_loss: 0.543
Epoch: 15, Train_acc: 89.6%, Train_loss: 0.289, Test_acc: 79.5%, Test_loss: 0.512
Epoch: 16, Train_acc: 90.8%, Train_loss: 0.265, Test_acc: 80.7%, Test_loss: 0.486
Epoch: 17, Train_acc: 91.9%, Train_loss: 0.243, Test_acc: 81.8%, Test_loss: 0.462
Epoch: 18, Train_acc: 92.8%, Train_loss: 0.224, Test_acc: 82.7%, Test_loss: 0.441
Epoch: 19, Train_acc: 93.6%, Train_loss: 0.207, Test_acc: 83.5%, Test_loss: 0.423
Epoch: 20, Train_acc: 94.3%, Train_loss: 0.192, Test_acc: 84.2%, Test_loss: 0.407
Done
Summary
The work above, inserting the SE-Net channel attention mechanism into a DenseNet-series network and completing monkeypox recognition, involves the following techniques and modules:
1. **DenseNet**: the backbone architecture. The defining feature of DenseNet (Densely Connected Convolutional Networks) is dense connectivity: the input of every layer is the concatenation of the outputs of all preceding layers. This encourages feature reuse, alleviates vanishing gradients, and to some extent reduces the number of parameters. DenseNet121 consists of several dense blocks and transition layers that progressively extract image features.
2. **SE-Net channel attention**: SE-Net (Squeeze-and-Excitation Networks) is a channel attention module. It first applies global average pooling to the input feature map (the Squeeze step), collapsing the spatial dimensions to obtain per-channel global information; it then passes this through two fully connected layers (the Excitation step) to learn inter-channel dependencies and produce channel attention weights; finally, it multiplies these weights with the original feature map, strengthening the responses of important channels and suppressing the unimportant ones.
3. **Data preprocessing**: the monkeypox images are resized to a fixed size (224×224), converted to tensors, and normalized with predefined means and standard deviations, so that the inputs match what the model expects and training and generalization improve.
4. **Model training and optimization**: the cross-entropy loss (`nn.CrossEntropyLoss`) measures the discrepancy between predictions and labels, and stochastic gradient descent (SGD) updates the parameters. During training, the training dataloader is traversed, the loss is computed, and back-propagation gradually adjusts the parameters to minimize it; after every epoch the test set is evaluated to monitor accuracy and loss.
5. **Model evaluation and visualization**: after training, accuracy and loss curves for the training and test sets show how performance evolves; the accuracy curves reflect classification ability on both sets, while the loss curves show how the loss decreases during training. A single image can also be classified to check the model in practice (see the sketch after this list).
6. **Model saving and loading**: the trained parameters are saved to a file with `torch.save` so they can be loaded later with `torch.load` for prediction or further analysis, which matters for model reuse and deployment.
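As mentioned in item 5, the trained model can also classify a single image. A minimal sketch, assuming a hypothetical image path ./data/monkey/Monkeypox/example.jpg and reusing the train_transforms pipeline defined earlier:
from PIL import Image
img_path = './data/monkey/Monkeypox/example.jpg'  # hypothetical path; replace with a real image
img = Image.open(img_path).convert('RGB')
img_tensor = train_transforms(img).unsqueeze(0).to(device)  # same preprocessing as training, plus a batch dimension
model.eval()
with torch.no_grad():
    pred = model(img_tensor)
print('Predicted class:', classeNames[pred.argmax(1).item()])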