Python Training Camp Check-in, Day 54: The Inception Network and Some Reflections


1. Introduction to the Inception network

The Inception network, also known as GoogLeNet, is a classic convolutional neural network architecture proposed by the Google team in 2014. Its core design idea is parallel multi-scale fusion: within a single layer it applies several convolution kernels of different sizes (1x1, 3x3, 5x5) together with a pooling operation, extracts image features at different scales, and then fuses those features, obtaining a richer representation without adding excessive computation.
The Inception module is the basic building block of the network. At the same stride, a smaller kernel focuses on fine, local detail (and, when no padding is used, discards fewer border pixels), while a larger kernel aggregates information from a wider neighborhood around each pixel.
A typical Inception module contains the following parallel branches:

  • 1x1 convolution branch: reduces dimensionality, cutting the cost of the subsequent convolutions while extracting local features (no spatial downsampling, but the channel count can be changed)
  • 3x3 convolution branch: captures medium-scale features
  • 5x5 convolution branch: captures larger-scale features
  • Pooling branch: usually max or average pooling, used to retain more global, contextual information

2. Inception network architecture

2.1 Defining the Inception module

import torch
import torch.nn as nn

class Inception(nn.Module):
    def __init__(self, in_channels):
        """
        Inception模块初始化,实现多尺度特征并行提取与融合
        
        参数:
            in_channels: 输入特征图的通道数
        """
        super(Inception, self).__init__()
        
        # 1x1 convolution branch: reduces dimensionality and mixes information across channels
        # Cuts the cost of later operations while keeping local feature information
        self.branch1x1 = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=1),  # reduce to 64 channels
            nn.ReLU()  # non-linear activation
        )
        
        # 3x3 branch: 1x1 convolution for dimensionality reduction, then a 3x3 convolution for medium-scale features
        # Reducing the channels first keeps the 3x3 convolution cheap
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_channels, 96, kernel_size=1),  # reduce to 96 channels
            nn.ReLU(),
            nn.Conv2d(96, 128, kernel_size=3, padding=1),  # 3x3 convolution, spatial size unchanged
            nn.ReLU()
        )
        
        # 5x5 branch: 1x1 convolution for dimensionality reduction, then a 5x5 convolution for large-scale features
        # The larger receptive field captures more global structure
        self.branch5x5 = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=1),  # aggressive reduction to 16 channels
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, padding=2),  # 5x5 convolution, spatial size unchanged
            nn.ReLU()
        )
        
        # Pooling branch: pooling retains context and adds translation invariance,
        # then a 1x1 convolution adjusts the channel count
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),  # 3x3 max pooling, spatial size unchanged
            nn.Conv2d(in_channels, 32, kernel_size=1),  # reduce to 32 channels
            nn.ReLU()
        )

    def forward(self, x):
        """
        前向传播函数,并行计算四个分支并在通道维度拼接
        
        参数:
            x: 输入特征图,形状为[batch_size, in_channels, height, width]
        
        返回:
            拼接后的特征图,形状为[batch_size, 256, height, width]
        """
        # 注意,这里是并行计算四个分支
        branch1x1 = self.branch1x1(x)  # output shape: [batch_size, 64, height, width]
        branch3x3 = self.branch3x3(x)  # output shape: [batch_size, 128, height, width]
        branch5x5 = self.branch5x5(x)  # output shape: [batch_size, 32, height, width]
        branch_pool = self.branch_pool(x)  # output shape: [batch_size, 32, height, width]
        
        # Concatenate the four branch outputs along the channel dimension (dim=1)
        # Total channels: 64 + 128 + 32 + 32 = 256
        outputs = [branch1x1, branch3x3, branch5x5, branch_pool]
        return torch.cat(outputs, dim=1)

The module above maps [B, C, H, W] --> [B, 256, H, W]:

model = Inception(in_channels=64)
x = torch.randn(32, 64, 28, 28)
output = model(x)
print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")

The different kernel sizes in the Inception module are paired with matching padding so that every branch produces a feature map of the same spatial size. This is a deliberate design choice: the branch outputs must be spatially aligned before they can be concatenated along the channel dimension.
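
A quick way to check this is the standard output-size formula out = (in + 2*padding - kernel) / stride + 1 (integer division). The small sketch below (the helper name conv_out_size is just for illustration) confirms that kernel sizes 1, 3 and 5 with paddings 0, 1 and 2 all keep a 28x28 input at 28x28:

def conv_out_size(in_size, kernel, stride=1, padding=0):
    # standard convolution output-size formula
    return (in_size + 2 * padding - kernel) // stride + 1

for k, p in [(1, 0), (3, 1), (5, 2)]:
    print(k, p, conv_out_size(28, kernel=k, padding=p))  # prints 28 for every branch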

2.2 Feature fusion methods

Concatenation (concat), which increases the channel count, is a classic feature-fusion method: the number of channels grows, the spatial size (H, W) stays the same, and the values in each channel remain independent, with no addition involved. It is equivalent to placing different feature maps side by side to form a "thicker" feature tensor.
In deep learning, features can also be fused in the following ways (a small sketch comparing them follows the list):

1. Element-wise addition: feature maps of the same shape are added position by position, as in a residual connection:

output = x + self.residual_block(x)

It changes neither the spatial size nor the channel count and is computationally cheap, but the inputs must have identical shapes.
2. Element-wise multiplication: multiplication assigns weights to features, suppressing irrelevant ones and emphasizing important ones, as in attention mechanisms and gating (e.g. the forget and input gates of an LSTM):

attention = self.ChannelAttention(features)  # produce per-channel weights
weighted_features = features * attention  # element-wise multiplication
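
A minimal, self-contained sketch of the three fusion styles and the shapes they produce (the tensors are random placeholders):

import torch

a = torch.randn(2, 64, 28, 28)
b = torch.randn(2, 64, 28, 28)

fused_cat = torch.cat([a, b], dim=1)  # channels add up: [2, 128, 28, 28]
fused_add = a + b                     # shape unchanged: [2, 64, 28, 28]
gate = torch.sigmoid(b)               # weights in (0, 1)
fused_mul = a * gate                  # element-wise re-weighting: [2, 64, 28, 28]
print(fused_cat.shape, fused_add.shape, fused_mul.shape)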

2.3 Defining the InceptionNet network

class InceptionNet(nn.Module):
    def __init__(self, num_classes=10):
        super(InceptionNet, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        )

        self.inception1 = Inception(64)
        self.inception2 = Inception(256)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.inception1(x)
        x = self.inception2(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

# Create a network instance
model = InceptionNet()
# Create a random input tensor simulating image data; here we assume a 3-channel image of size 224x224
input_tensor = torch.randn(1, 3, 224, 224)
# Forward pass
output = model(input_tensor)
print(output.shape)

torch.Size([1, 10])

3. Variants of the convolution kernel

3.1 Receptive field

The receptive field is the region of the original input image that a neuron in a convolutional neural network (CNN) corresponds to. Put simply, each pixel of an output feature map is computed from a specific region of the input image, and the size of that region is that pixel's receptive field.
Suppose a 3x3 kernel is convolved over a 5x5 image with stride 1: every output pixel is computed from a 3x3 region of the input, so the receptive field of this layer is 3x3. If another 3x3 convolution (stride 1) is stacked on top, each pixel of the second layer fuses a 3x3 region of the first layer's output, and each of those positions in turn corresponds to a 3x3 region of the original image, so the receptive field of the second layer grows to 5x5 (3 + 3 - 1 = 5).
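
The "+ (k - 1)" growth per stride-1 layer generalizes to RF_l = RF_(l-1) + (k_l - 1) * (product of the strides of the earlier layers). A tiny calculator (a sketch; receptive_field is an illustrative helper, not library code):

def receptive_field(layers):
    # layers: list of (kernel_size, stride) tuples, shallowest layer first
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) * accumulated stride
        jump *= s
    return rf

print(receptive_field([(3, 1), (3, 1)]))          # 5: two 3x3 convs match one 5x5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7: three 3x3 convs match one 7x7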

So, for the same receptive field, smaller kernels have two clear advantages:

  • fewer parameters, hence less computation
  • more non-linearity (the signal passes through more activation functions), which tends to improve the fit

This is why networks such as VGG replace large kernels with stacks of 3x3 convolutions, balancing performance against complexity; the quick parameter count below illustrates the first point.
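
A quick check of the parameter saving (a sketch; the 64-channel setting is an arbitrary illustration):

import torch.nn as nn

one_5x5 = nn.Conv2d(64, 64, kernel_size=5)
two_3x3 = nn.Sequential(nn.Conv2d(64, 64, kernel_size=3), nn.Conv2d(64, 64, kernel_size=3))

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(one_5x5))  # 64*64*25 + 64 = 102464
print(count(two_3x3))  # 2 * (64*64*9 + 64) = 73856, roughly 28% fewer parameters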

3.2 Variants of convolution

Take dilated convolution as an example.

Dilated convolution (also called atrous or expanded convolution) is an "upgrade" of standard convolution: gaps (holes) are inserted between the kernel elements, and the dilation rate d controls the size of the gap.
Standard convolution (d=1): the kernel elements are packed tightly and cover adjacent positions of the input feature map.
Dilated convolution (d>1): d-1 holes are inserted between kernel elements, which enlarges the effective receptive field without adding parameters (only the sampling interval changes). In other words, there is no need to enlarge the kernel or stack many layers: adjusting d alone enlarges the receptive field quickly, and when the dilation rate is increased layer by layer the receptive field can even grow exponentially with depth.
Compared with pooling or other downsampling, dilated convolution does not discard spatial information: it enlarges the receptive field while keeping the feature-map size, which makes it particularly suitable for tasks that need precise pixel- or object-level localization, such as semantic segmentation and object detection.
Different designs therefore serve different tasks. Being able to capture information at several scales adds little for image classification, where only the class of the whole image matters; for object detection, however, the small-scale branches are very useful for detecting small objects.
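
The effective kernel size of a dilated convolution is k_eff = k + (k - 1)(d - 1); a quick check (a sketch with an illustrative helper name):

def effective_kernel(k, d):
    # k: kernel size, d: dilation rate
    return k + (k - 1) * (d - 1)

print(effective_kernel(3, 1))  # 3: standard convolution
print(effective_kernel(3, 2))  # 5: covers the same area as a 5x5 kernel, with only 9 weights
print(effective_kernel(3, 4))  # 9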

3.3 A dilated-convolution example

In practice this amounts to a single extra argument in the code:
self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=2, dilation=2)  

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),  # convert to tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # normalize
])

# Load the CIFAR-10 dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=128, shuffle=True)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = DataLoader(testset, batch_size=128, shuffle=False)

# Define a CNN model containing a dilated convolution
class SimpleCNNWithDilation(nn.Module):
    def __init__(self):
        super(SimpleCNNWithDilation, self).__init__()
        # Layer 1: ordinary 3x3 convolution, captures basic features
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # Layer 2: dilated convolution, dilation=2, enlarged receptive field (equivalent to a plain 5x5 convolution)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=2, dilation=2)
        # Layer 3: ordinary 3x3 convolution, back to dense sampling
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        
        self.pool = nn.MaxPool2d(2, 2)  # pooling layer
        self.relu = nn.ReLU()
        
        # Fully connected layers; for CIFAR-10: 32x32 input -> 16x16 after the first pool -> 8x8 after the second, so 64*8*8 features
        self.fc1 = nn.Linear(64 * 8 * 8, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        # Input: [batch, 3, 32, 32]
        x = self.conv1(x)  # [batch, 16, 32, 32]
        x = self.relu(x)
        x = self.pool(x)   # [batch, 16, 16, 16]
        
        x = self.conv2(x)  # [batch, 32, 16, 16] (dilation=2 with padding=2 keeps the size)
        x = self.relu(x)
        x = self.pool(x)   # [batch, 32, 8, 8]
        
        x = self.conv3(x)  # [batch, 64, 8, 8]
        x = self.relu(x)
        
        x = x.view(-1, 64 * 8 * 8)  # flatten
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Initialize the model, loss function and optimizer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleCNNWithDilation().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training function
def train(epoch):
    model.train()
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        
        optimizer.zero_grad()
        
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        if i % 100 == 99:  # print every 100 batches
            print(f'Epoch: {epoch + 1}, Batch: {i + 1}, Loss: {running_loss / 100:.3f}')
            running_loss = 0.0

# Test function
def test():
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data in testloader:
            images, labels = data[0].to(device), data[1].to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f'Accuracy on test set: {100 * correct / total:.2f}%')

# Training & testing loop
for epoch in range(5):  # a quick 5-epoch demo
    train(epoch)
    test()

Homework: an assignment with a slightly academic flavour:

  1. Train the Inception network on CIFAR-10 and observe its accuracy
  2. Ablation study: add a residual mechanism and a CBAM module and ablate each one separately

Accuracy test
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import time

# Font settings for the plots
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# Data preprocessing (same as the earlier code)
transform = transforms.Compose([
    transforms.Resize(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load the CIFAR-10 dataset (same as the earlier code)
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

class InceptionModule(nn.Module):
    """基础Inception模块(与用户代码一致,明确注释各分支作用)"""
    def __init__(self, in_channels):
        super(InceptionModule, self).__init__()
        # 1x1 convolution branch: cuts computation while keeping spatial information
        self.branch1x1 = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True)
        )
        # 3x3 convolution branch: captures local features
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),  # reduce dimensionality
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 48, kernel_size=3, padding=1),  # 3x3 convolution
            nn.BatchNorm2d(48),
            nn.ReLU(inplace=True)
        )
        # 5x5 convolution branch: captures features over a larger area
        self.branch5x5 = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=1),  # reduce dimensionality
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 24, kernel_size=5, padding=2),  # 5x5 convolution
            nn.BatchNorm2d(24),
            nn.ReLU(inplace=True)
        )
        # Pooling branch: preserves translation invariance
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, 32, kernel_size=1),  # adjust the channel count
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        # Concatenate the branch outputs along the channel dimension
        branch1x1 = self.branch1x1(x)
        branch3x3 = self.branch3x3(x)
        branch5x5 = self.branch5x5(x)
        branch_pool = self.branch_pool(x)
        return torch.cat([branch1x1, branch3x3, branch5x5, branch_pool], dim=1)

class InceptionNet(nn.Module):
    """完整Inception网络(优化结构注释)"""
    def __init__(self, num_classes=10):
        super(InceptionNet, self).__init__()
        # Initial convolution layer: extracts basic features
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)  # downsampling
        )
        # Stacked Inception modules
        self.inception2 = InceptionModule(64)  # 64 input channels
        self.inception3 = InceptionModule(136)  # the previous module outputs 32+48+24+32 = 136 channels
        # Transition convolution: adjusts the channel count
        self.conv4 = nn.Sequential(
            nn.Conv2d(136, 256, kernel_size=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True)
        )
        # Global average pooling + fully connected layer
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.inception2(x)
        x = self.inception3(x)
        x = self.conv4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

def train_model(model, criterion, optimizer, scheduler, num_epochs, device):
    """通用训练函数(保留用户代码优点,增加注释)"""
    train_losses = []
    test_accuracies = []
    start_time = time.time()

    for epoch in range(num_epochs):
        # Training phase
        model.train()
        running_loss = 0.0
        for inputs, labels in trainloader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()  # clear gradients
            outputs = model(inputs)  # forward pass
            loss = criterion(outputs, labels)  # compute the loss
            loss.backward()  # backward pass
            optimizer.step()  # update the parameters
            running_loss += loss.item()
        epoch_loss = running_loss / len(trainloader)
        train_losses.append(epoch_loss)

        # Testing phase
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            for inputs, labels in testloader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
        accuracy = 100 * correct / total
        test_accuracies.append(accuracy)

        # Learning-rate scheduling
        if scheduler:
            scheduler.step()
        print(f'Epoch {epoch+1}/{num_epochs}, Loss: {epoch_loss:.4f}, Acc: {accuracy:.2f}%')

    print(f"训练完成,总耗时: {time.time() - start_time:.2f}秒")
    return train_losses, test_accuracies

def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"使用设备: {device}")

    # Initialize the model, loss function and optimizer
    model = InceptionNet().to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)

    # Train and collect the results
    print("=== Training the Inception network on CIFAR-10 ===")
    train_losses, test_accuracies = train_model(
        model, criterion, optimizer, scheduler, num_epochs=30, device=device
    )

    # Save the results (for later analysis)
    np.savez("inception_cifar10_results.npz",
             train_loss=train_losses,
             test_acc=test_accuracies)

    # Plot the training curves
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(train_losses)
    plt.title('Training loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')

    plt.subplot(1, 2, 2)
    plt.plot(test_accuracies)
    plt.title('Test accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy (%)')
    plt.tight_layout()
    plt.savefig('inception_cifar10_performance.png')
    plt.show()

if __name__ == "__main__":
    main()

Using device: cuda
=== Training the Inception network on CIFAR-10 ===
Epoch 1/30, Loss: 1.3307, Acc: 60.50%
Epoch 2/30, Loss: 0.9765, Acc: 66.57%
Epoch 3/30, Loss: 0.8367, Acc: 69.83%
Epoch 4/30, Loss: 0.7323, Acc: 72.99%
Epoch 5/30, Loss: 0.6591, Acc: 74.77%
Epoch 6/30, Loss: 0.5996, Acc: 71.90%
Epoch 7/30, Loss: 0.5522, Acc: 74.97%
Epoch 8/30, Loss: 0.5092, Acc: 78.50%
Epoch 9/30, Loss: 0.4714, Acc: 78.51%
Epoch 10/30, Loss: 0.4357, Acc: 78.03%
Epoch 11/30, Loss: 0.4050, Acc: 79.16%
Epoch 12/30, Loss: 0.3756, Acc: 81.18%
Epoch 13/30, Loss: 0.3460, Acc: 80.92%
Epoch 14/30, Loss: 0.3219, Acc: 81.87%
Epoch 15/30, Loss: 0.3029, Acc: 82.27%
Epoch 16/30, Loss: 0.2856, Acc: 82.63%
Epoch 17/30, Loss: 0.2689, Acc: 82.85%
Epoch 18/30, Loss: 0.2584, Acc: 83.06%
Epoch 19/30, Loss: 0.2503, Acc: 83.01%
Epoch 20/30, Loss: 0.2464, Acc: 83.26%
Epoch 21/30, Loss: 0.2449, Acc: 83.15%
Epoch 22/30, Loss: 0.2446, Acc: 83.14%
Epoch 23/30, Loss: 0.2484, Acc: 83.12%
...
Epoch 28/30, Loss: 0.2702, Acc: 80.49%
Epoch 29/30, Loss: 0.2728, Acc: 80.38%
Epoch 30/30, Loss: 0.2759, Acc: 81.40%
Training finished, total time: 899.64s

Residual ablation experiment
# Residual block (mitigates vanishing gradients in deeper networks)
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(ResidualBlock, self).__init__()
        # Main path: an Inception module followed by a 1x1 convolution that adjusts the channels
        self.main_path = nn.Sequential(
            InceptionModule(in_channels),  # reuse the basic Inception module; it always outputs 136 channels, so out_channels must be 136 here
            nn.Conv2d(out_channels, out_channels, kernel_size=1),
            nn.BatchNorm2d(out_channels)
        )
        # Shortcut path: if the channel counts differ, use a 1x1 convolution to match them
        self.shortcut = nn.Sequential()
        if in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        out = self.main_path(x)
        shortcut = self.shortcut(x)
        return nn.ReLU(inplace=True)(out + shortcut)  # residual connection + activation

# CBAM attention module (cf. the Day 49 code; strengthens feature selection)
class CBAM(nn.Module):
    def __init__(self, in_channels, ratio=16):
        super(CBAM, self).__init__()
        # Channel attention (simplified here: global average pooling only, without the max-pooling branch)
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, in_channels//ratio, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels//ratio, in_channels, kernel_size=1),
            nn.Sigmoid()
        )
        # Spatial attention (concatenates channel-wise average and max features)
        self.spatial_att = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid()
        )

    def forward(self, x):
        # Channel attention weighting
        avg_out = self.channel_att(x)
        x = x * avg_out
        # Spatial attention weighting
        avg = torch.mean(x, dim=1, keepdim=True)
        max_val = torch.max(x, dim=1, keepdim=True)[0]
        concat = torch.cat([avg, max_val], dim=1)
        x = x * self.spatial_att(concat)
        return x

class InceptionAblation(nn.Module):
    def __init__(self, variant='base', num_classes=10):
        super(InceptionAblation, self).__init__()
        self.variant = variant  # model variant: 'base' / 'residual' / 'cbam'

        # Initial convolution layer (same as the basic Inception network)
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )

        # Choose the feature extractor according to the variant
        if variant == 'residual':
            # Residual variant: replace the Inception modules with residual blocks
            self.features = nn.Sequential(
                ResidualBlock(64, 136),  # 64 input channels, 136 output channels (matching the basic Inception)
                ResidualBlock(136, 136)   # stack a second residual block for stronger feature extraction
            )
        elif variant == 'cbam':
            # CBAM variant: each Inception module is followed by a CBAM attention block
            self.features = nn.Sequential(
                InceptionModule(64),
                CBAM(136),  # the basic Inception module outputs 136 channels, which CBAM takes as input
                InceptionModule(136),
                CBAM(136)
            )
        else:
            # Base variant: plain Inception modules
            self.features = nn.Sequential(
                InceptionModule(64),
                InceptionModule(136)
            )

        # Output layers (same as the basic Inception network)
        self.conv4 = nn.Sequential(
            nn.Conv2d(136, 256, kernel_size=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True)
        )
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.features(x)
        x = self.conv4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"使用设备: {device}")
    criterion = nn.CrossEntropyLoss()
    results = {}

    # Experiment 1: basic Inception (no residual / CBAM)
    print("=== Training basic Inception ===")
    model_base = InceptionAblation(variant='base').to(device)
    optimizer_base = optim.Adam(model_base.parameters(), lr=0.001)
    scheduler_base = optim.lr_scheduler.CosineAnnealingLR(optimizer_base, T_max=20)
    train_loss_base, test_acc_base = train_model(
        model_base, criterion, optimizer_base, scheduler_base, num_epochs=30, device=device
    )
    results['base'] = (train_loss_base, test_acc_base)

    # Experiment 2: Inception + residual mechanism
    print("=== Training Inception + residual ===")
    model_res = InceptionAblation(variant='residual').to(device)
    optimizer_res = optim.Adam(model_res.parameters(), lr=0.001)
    scheduler_res = optim.lr_scheduler.CosineAnnealingLR(optimizer_res, T_max=20)
    train_loss_res, test_acc_res = train_model(
        model_res, criterion, optimizer_res, scheduler_res, num_epochs=30, device=device
    )
    results['residual'] = (train_loss_res, test_acc_res)

    # Experiment 3: Inception + CBAM module
    print("=== Training Inception + CBAM ===")
    model_cbam = InceptionAblation(variant='cbam').to(device)
    optimizer_cbam = optim.Adam(model_cbam.parameters(), lr=0.001)
    scheduler_cbam = optim.lr_scheduler.CosineAnnealingLR(optimizer_cbam, T_max=20)
    train_loss_cbam, test_acc_cbam = train_model(
        model_cbam, criterion, optimizer_cbam, scheduler_cbam, num_epochs=30, device=device
    )
    results['cbam'] = (train_loss_cbam, test_acc_cbam)

    # Save the results of all variants
    np.savez("inception_ablation_results.npz", **results)

    # Plot the comparison
    plt.figure(figsize=(15, 6))
    # Training-loss comparison
    plt.subplot(1, 2, 1)
    plt.plot(results['base'][0], label='Base Inception')
    plt.plot(results['residual'][0], label='+ residual')
    plt.plot(results['cbam'][0], label='+ CBAM')
    plt.title('Training loss comparison')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    # Test-accuracy comparison
    plt.subplot(1, 2, 2)
    plt.plot(results['base'][1], label='Base Inception')
    plt.plot(results['residual'][1], label='+ residual')
    plt.plot(results['cbam'][1], label='+ CBAM')
    plt.title('Test accuracy comparison')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy (%)')
    plt.legend()
    plt.tight_layout()
    plt.savefig('inception_ablation_comparison.png')
    plt.show()

if __name__ == "__main__":
    main()

Using device: cuda
=== Training basic Inception ===
Epoch 1/30, Loss: 1.3137, Acc: 59.68%
Epoch 2/30, Loss: 0.9556, Acc: 65.51%
Epoch 3/30, Loss: 0.8212, Acc: 68.34%
Epoch 4/30, Loss: 0.7188, Acc: 68.77%
Epoch 5/30, Loss: 0.6463, Acc: 74.05%
Epoch 6/30, Loss: 0.5887, Acc: 72.49%
Epoch 7/30, Loss: 0.5346, Acc: 73.83%
Epoch 8/30, Loss: 0.4931, Acc: 77.93%
Epoch 9/30, Loss: 0.4528, Acc: 77.19%
Epoch 10/30, Loss: 0.4188, Acc: 79.68%
Epoch 11/30, Loss: 0.3886, Acc: 80.64%
Epoch 12/30, Loss: 0.3583, Acc: 81.04%
Epoch 13/30, Loss: 0.3326, Acc: 81.55%
Epoch 14/30, Loss: 0.3069, Acc: 81.78%
Epoch 15/30, Loss: 0.2837, Acc: 82.18%
Epoch 16/30, Loss: 0.2683, Acc: 83.10%
Epoch 17/30, Loss: 0.2531, Acc: 83.09%
Epoch 18/30, Loss: 0.2410, Acc: 82.98%
Epoch 19/30, Loss: 0.2333, Acc: 83.21%
Epoch 20/30, Loss: 0.2290, Acc: 83.11%
Epoch 21/30, Loss: 0.2264, Acc: 83.24%
Epoch 22/30, Loss: 0.2278, Acc: 83.23%
Epoch 23/30, Loss: 0.2305, Acc: 83.19%
...
Epoch 28/30, Loss: 0.2324, Acc: 81.72%
Epoch 29/30, Loss: 0.2393, Acc: 81.16%
Epoch 30/30, Loss: 0.2437, Acc: 78.45%
Training finished, total time: 1004.23s

@浙大疏锦行