Python Training Camp Check-in, Day 54: The Inception Network and Some Reflections


1. Introduction to the Inception network

The Inception network, also known as GoogLeNet, is a classic convolutional neural network architecture proposed by the Google team in 2014. Its core design idea is parallel multi-scale fusion: within a single layer it applies several convolution kernels of different sizes (1x1, 3x3, 5x5) together with a pooling operation, extracts image features at different scales, and then fuses those features, obtaining a richer representation without adding excessive computation.
The Inception module is the basic building block of the network. At the same stride, a smaller kernel focuses on fine, local detail (and, when no padding is used, discards fewer border pixels), while a larger kernel aggregates information from a wider neighborhood around each pixel.
A typical Inception module contains the following parallel branches:

  • 1x1 convolution branch: reduces dimensionality, cutting the cost of the subsequent convolutions while extracting local features (no spatial downsampling, but the channel count can be changed)
  • 3x3 convolution branch: captures medium-scale features
  • 5x5 convolution branch: captures larger-scale features
  • Pooling branch: usually max or average pooling, used to retain more global, contextual information

2. Inception network architecture

2.1 Defining the Inception module

import torch
import torch.nn as nn

class Inception(nn.Module):
    def __init__(self, in_channels):
        """
        Inception模块初始化,实现多尺度特征并行提取与融合
        
        参数:
            in_channels: 输入特征图的通道数
        """
        super(Inception, self).__init__()
        
        # 1x1 convolution branch: reduces dimensionality and mixes information across channels
        # Cuts the cost of later operations while keeping local feature information
        self.branch1x1 = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=1),  # reduce to 64 channels
            nn.ReLU()  # non-linear activation
        )
        
        # 3x3 branch: 1x1 convolution for dimensionality reduction, then a 3x3 convolution for medium-scale features
        # Reducing the channels first keeps the 3x3 convolution cheap
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_channels, 96, kernel_size=1),  # reduce to 96 channels
            nn.ReLU(),
            nn.Conv2d(96, 128, kernel_size=3, padding=1),  # 3x3 convolution, spatial size unchanged
            nn.ReLU()
        )
        
        # 5x5 branch: 1x1 convolution for dimensionality reduction, then a 5x5 convolution for large-scale features
        # The larger receptive field captures more global structure
        self.branch5x5 = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=1),  # aggressive reduction to 16 channels
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, padding=2),  # 5x5 convolution, spatial size unchanged
            nn.ReLU()
        )
        
        # Pooling branch: pooling retains context and adds translation invariance,
        # then a 1x1 convolution adjusts the channel count
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),  # 3x3 max pooling, spatial size unchanged
            nn.Conv2d(in_channels, 32, kernel_size=1),  # reduce to 32 channels
            nn.ReLU()
        )

    def forward(self, x):
        """
        前向传播函数,并行计算四个分支并在通道维度拼接
        
        参数:
            x: 输入特征图,形状为[batch_size, in_channels, height, width]
        
        返回:
            拼接后的特征图,形状为[batch_size, 256, height, width]
        """
        # 注意,这里是并行计算四个分支
        branch1x1 = self.branch1x1(x)  # output shape: [batch_size, 64, height, width]
        branch3x3 = self.branch3x3(x)  # output shape: [batch_size, 128, height, width]
        branch5x5 = self.branch5x5(x)  # output shape: [batch_size, 32, height, width]
        branch_pool = self.branch_pool(x)  # output shape: [batch_size, 32, height, width]
        
        # Concatenate the four branch outputs along the channel dimension (dim=1)
        # Total channels: 64 + 128 + 32 + 32 = 256
        outputs = [branch1x1, branch3x3, branch5x5, branch_pool]
        return torch.cat(outputs, dim=1)

The module above maps [B, C, H, W] --> [B, 256, H, W]:

model = Inception(in_channels=64)
x = torch.randn(32, 64, 28, 28)
output = model(x)
print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")

The different kernel sizes in the Inception module are paired with matching padding so that every branch produces a feature map of the same spatial size. This is a deliberate design choice: the branch outputs must be spatially aligned before they can be concatenated along the channel dimension.
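
A quick way to check this is the standard output-size formula out = (in + 2*padding - kernel) / stride + 1 (integer division). The small sketch below (the helper name conv_out_size is just for illustration) confirms that kernel sizes 1, 3 and 5 with paddings 0, 1 and 2 all keep a 28x28 input at 28x28:

def conv_out_size(in_size, kernel, stride=1, padding=0):
    # standard convolution output-size formula
    return (in_size + 2 * padding - kernel) // stride + 1

for k, p in [(1, 0), (3, 1), (5, 2)]:
    print(k, p, conv_out_size(28, kernel=k, padding=p))  # prints 28 for every branch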

2.2 Feature fusion methods

Concatenation (concat), which increases the channel count, is a classic feature-fusion method: the number of channels grows, the spatial size (H, W) stays the same, and the values in each channel remain independent, with no addition involved. It is equivalent to placing different feature maps side by side to form a "thicker" feature tensor.
In deep learning, features can also be fused in the following ways (a small sketch comparing them follows the list):

1. Element-wise addition: feature maps of the same shape are added position by position, as in a residual connection:

output = x + self.residual_block(x)

It changes neither the spatial size nor the channel count and is computationally cheap, but the inputs must have identical shapes.
2. Element-wise multiplication: multiplication assigns weights to features, suppressing irrelevant ones and emphasizing important ones, as in attention mechanisms and gating (e.g. the forget and input gates of an LSTM):

attention = self.ChannelAttention(features)  # produce per-channel weights
weighted_features = features * attention  # element-wise multiplication
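
A minimal, self-contained sketch of the three fusion styles and the shapes they produce (the tensors are random placeholders):

import torch

a = torch.randn(2, 64, 28, 28)
b = torch.randn(2, 64, 28, 28)

fused_cat = torch.cat([a, b], dim=1)  # channels add up: [2, 128, 28, 28]
fused_add = a + b                     # shape unchanged: [2, 64, 28, 28]
gate = torch.sigmoid(b)               # weights in (0, 1)
fused_mul = a * gate                  # element-wise re-weighting: [2, 64, 28, 28]
print(fused_cat.shape, fused_add.shape, fused_mul.shape)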

2.3 Defining the InceptionNet network

class InceptionNet(nn.Module):
    def __init__(self, num_classes=10):
        super(InceptionNet, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        )

        self.inception1 = Inception(64)
        self.inception2 = Inception(256)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.inception1(x)
        x = self.inception2(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

# Create a network instance
model = InceptionNet()
# Create a random input tensor simulating image data; here we assume a 3-channel image of size 224x224
input_tensor = torch.randn(1, 3, 224, 224)
# Forward pass
output = model(input_tensor)
print(output.shape)

torch.Size([1, 10])

3. Variants of the convolution kernel

3.1 Receptive field

The receptive field is the region of the original input image that a neuron in a convolutional neural network (CNN) corresponds to. Put simply, each pixel of an output feature map is computed from a specific region of the input image, and the size of that region is that pixel's receptive field.
Suppose a 3x3 kernel is convolved over a 5x5 image with stride 1: every output pixel is computed from a 3x3 region of the input, so the receptive field of this layer is 3x3. If another 3x3 convolution (stride 1) is stacked on top, each pixel of the second layer fuses a 3x3 region of the first layer's output, and each of those positions in turn corresponds to a 3x3 region of the original image, so the receptive field of the second layer grows to 5x5 (3 + 3 - 1 = 5).
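
The "+ (k - 1)" growth per stride-1 layer generalizes to RF_l = RF_(l-1) + (k_l - 1) * (product of the strides of the earlier layers). A tiny calculator (a sketch; receptive_field is an illustrative helper, not library code):

def receptive_field(layers):
    # layers: list of (kernel_size, stride) tuples, shallowest layer first
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) * accumulated stride
        jump *= s
    return rf

print(receptive_field([(3, 1), (3, 1)]))          # 5: two 3x3 convs match one 5x5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7: three 3x3 convs match one 7x7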

So, for the same receptive field, smaller kernels have two clear advantages:

  • fewer parameters, hence less computation
  • more non-linearity (the signal passes through more activation functions), which tends to improve the fit

This is why networks such as VGG replace large kernels with stacks of 3x3 convolutions, balancing performance against complexity; the quick parameter count below illustrates the first point.
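
A quick check of the parameter saving (a sketch; the 64-channel setting is an arbitrary illustration):

import torch.nn as nn

one_5x5 = nn.Conv2d(64, 64, kernel_size=5)
two_3x3 = nn.Sequential(nn.Conv2d(64, 64, kernel_size=3), nn.Conv2d(64, 64, kernel_size=3))

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(one_5x5))  # 64*64*25 + 64 = 102464
print(count(two_3x3))  # 2 * (64*64*9 + 64) = 73856, roughly 28% fewer parameters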

3.2 Variants of convolution

Take dilated convolution as an example.

Dilated convolution (also called atrous or expanded convolution) is an "upgrade" of standard convolution: gaps (holes) are inserted between the kernel elements, and the dilation rate d controls the size of the gap.
Standard convolution (d=1): the kernel elements are packed tightly and cover adjacent positions of the input feature map.
Dilated convolution (d>1): d-1 holes are inserted between kernel elements, which enlarges the effective receptive field without adding parameters (only the sampling interval changes). In other words, there is no need to enlarge the kernel or stack many layers: adjusting d alone enlarges the receptive field quickly, and when the dilation rate is increased layer by layer the receptive field can even grow exponentially with depth.
Compared with pooling or other downsampling, dilated convolution does not discard spatial information: it enlarges the receptive field while keeping the feature-map size, which makes it particularly suitable for tasks that need precise pixel- or object-level localization, such as semantic segmentation and object detection.
Different designs therefore serve different tasks. Being able to capture information at several scales adds little for image classification, where only the class of the whole image matters; for object detection, however, the small-scale branches are very useful for detecting small objects.
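
The effective kernel size of a dilated convolution is k_eff = k + (k - 1)(d - 1); a quick check (a sketch with an illustrative helper name):

def effective_kernel(k, d):
    # k: kernel size, d: dilation rate
    return k + (k - 1) * (d - 1)

print(effective_kernel(3, 1))  # 3: standard convolution
print(effective_kernel(3, 2))  # 5: covers the same area as a 5x5 kernel, with only 9 weights
print(effective_kernel(3, 4))  # 9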

3.3 A dilated-convolution example

In practice this amounts to a single extra argument in the code:
self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=2, dilation=2)  

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),  # convert to tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # normalize
])

# Load the CIFAR-10 dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=128, shuffle=True)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = DataLoader(testset, batch_size=128, shuffle=False)

# Define a CNN model containing a dilated convolution
class SimpleCNNWithDilation(nn.Module):
    def __init__(self):
        super(SimpleCNNWithDilation, self).__init__()
        # Layer 1: ordinary 3x3 convolution, captures basic features
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # Layer 2: dilated convolution, dilation=2, enlarged receptive field (equivalent to a plain 5x5 convolution)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=2, dilation=2)
        # Layer 3: ordinary 3x3 convolution, back to dense sampling
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        
        self.pool = nn.MaxPool2d(2, 2)  # pooling layer
        self.relu = nn.ReLU()
        
        # Fully connected layers; for CIFAR-10: 32x32 input -> 16x16 after the first pool -> 8x8 after the second, so 64*8*8 features
        self.fc1 = nn.Linear(64 * 8 * 8, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        # Input: [batch, 3, 32, 32]
        x = self.conv1(x)  # [batch, 16, 32, 32]
        x = self.relu(x)
        x = self.pool(x)   # [batch, 16, 16, 16]
        
        x = self.conv2(x)  # [batch, 32, 16, 16] (dilation=2 with padding=2 keeps the size)
        x = self.relu(x)
        x = self.pool(x)   # [batch, 32, 8, 8]
        
        x = self.conv3(x)  # [batch, 64, 8, 8]
        x = self.relu(x)
        
        x = x.view(-1, 64 * 8 * 8)  # flatten
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Initialize the model, loss function and optimizer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleCNNWithDilation().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training function
def train(epoch):
    model.train()
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        
        optimizer.zero_grad()
        
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        if i % 100 == 99:  # print every 100 batches
            print(f'Epoch: {epoch + 1}, Batch: {i + 1}, Loss: {running_loss / 100:.3f}')
            running_loss = 0.0

# Test function
def test():
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data in testloader:
            images, labels = data[0].to(device), data[1].to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f'Accuracy on test set: {100 * correct / total:.2f}%')

# Training & testing loop
for epoch in range(5):  # a quick 5-epoch demo
    train(epoch)
    test()

Homework: an assignment with a slightly academic flavour:

  1. Train the Inception network on CIFAR-10 and observe its accuracy
  2. Ablation study: add a residual mechanism and a CBAM module and ablate each one separately

Accuracy test
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import time

# Font settings for the plots
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# Data preprocessing (same as the earlier code)
transform = transforms.Compose([
    transforms.Resize(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load the CIFAR-10 dataset (same as the earlier code)
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

class InceptionModule(nn.Module):
    """基础Inception模块(与用户代码一致,明确注释各分支作用)"""
    def __init__(self, in_channels):
        super(InceptionModule, self).__init__()
        # 1x1 convolution branch: cuts computation while keeping spatial information
        self.branch1x1 = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True)
        )
        # 3x3 convolution branch: captures local features
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),  # reduce dimensionality
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 48, kernel_size=3, padding=1),  # 3x3 convolution
            nn.BatchNorm2d(48),
            nn.ReLU(inplace=True)
        )
        # 5x5 convolution branch: captures features over a larger area
        self.branch5x5 = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=1),  # reduce dimensionality
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 24, kernel_size=5, padding=2),  # 5x5 convolution
            nn.BatchNorm2d(24),
            nn.ReLU(inplace=True)
        )
        # Pooling branch: preserves translation invariance
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, 32, kernel_size=1),  # adjust the channel count
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        # Concatenate the branch outputs along the channel dimension
        branch1x1 = self.branch1x1(x)
        branch3x3 = self.branch3x3(x)
        branch5x5 = self.branch5x5(x)
        branch_pool = self.branch_pool(x)
        return torch.cat([branch1x1, branch3x3, branch5x5, branch_pool], dim=1)

class InceptionNet(nn.Module):
    """完整Inception网络(优化结构注释)"""
    def __init__(self, num_classes=10):
        super(InceptionNet, self).__init__()
        # Initial convolution layer: extracts basic features
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)  # downsampling
        )
        # Stacked Inception modules
        self.inception2 = InceptionModule(64)  # 64 input channels
        self.inception3 = InceptionModule(136)  # the previous module outputs 32+48+24+32 = 136 channels
        # Transition convolution: adjusts the channel count
        self.conv4 = nn.Sequential(
            nn.Conv2d(136, 256, kernel_size=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True)
        )
        # Global average pooling + fully connected layer
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.inception2(x)
        x = self.inception3(x)
        x = self.conv4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

def train_model(model, criterion, optimizer, scheduler, num_epochs, device):
    """通用训练函数(保留用户代码优点,增加注释)"""
    train_losses = []
    test_accuracies = []
    start_time = time.time()

    for epoch in range(num_epochs):
        # Training phase
        model.train()
        running_loss = 0.0
        for inputs, labels in trainloader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()  # clear gradients
            outputs = model(inputs)  # forward pass
            loss = criterion(outputs, labels)  # compute the loss
            loss.backward()  # backward pass
            optimizer.step()  # update the parameters
            running_loss += loss.item()
        epoch_loss = running_loss / len(trainloader)
        train_losses.append(epoch_loss)

        # Testing phase
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            for inputs, labels in testloader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
        accuracy = 100 * correct / total
        test_accuracies.append(accuracy)

        # Learning-rate scheduling
        if scheduler:
            scheduler.step()
        print(f'Epoch {epoch+1}/{num_epochs}, Loss: {epoch_loss:.4f}, Acc: {accuracy:.2f}%')

    print(f"训练完成,总耗时: {time.time() - start_time:.2f}秒")
    return train_losses, test_accuracies

def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"使用设备: {device}")

    # Initialize the model, loss function and optimizer
    model = InceptionNet().to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)

    # Train and collect the results
    print("=== Training the Inception network on CIFAR-10 ===")
    train_losses, test_accuracies = train_model(
        model, criterion, optimizer, scheduler, num_epochs=30, device=device
    )

    # Save the results (for later analysis)
    np.savez("inception_cifar10_results.npz",
             train_loss=train_losses,
             test_acc=test_accuracies)

    # Plot the training curves
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(train_losses)
    plt.title('Training loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')

    plt.subplot(1, 2, 2)
    plt.plot(test_accuracies)
    plt.title('Test accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy (%)')
    plt.tight_layout()
    plt.savefig('inception_cifar10_performance.png')
    plt.show()

if __name__ == "__main__":
    main()

Using device: cuda
=== Training the Inception network on CIFAR-10 ===
Epoch 1/30, Loss: 1.3307, Acc: 60.50%
Epoch 2/30, Loss: 0.9765, Acc: 66.57%
Epoch 3/30, Loss: 0.8367, Acc: 69.83%
Epoch 4/30, Loss: 0.7323, Acc: 72.99%
Epoch 5/30, Loss: 0.6591, Acc: 74.77%
Epoch 6/30, Loss: 0.5996, Acc: 71.90%
Epoch 7/30, Loss: 0.5522, Acc: 74.97%
Epoch 8/30, Loss: 0.5092, Acc: 78.50%
Epoch 9/30, Loss: 0.4714, Acc: 78.51%
Epoch 10/30, Loss: 0.4357, Acc: 78.03%
Epoch 11/30, Loss: 0.4050, Acc: 79.16%
Epoch 12/30, Loss: 0.3756, Acc: 81.18%
Epoch 13/30, Loss: 0.3460, Acc: 80.92%
Epoch 14/30, Loss: 0.3219, Acc: 81.87%
Epoch 15/30, Loss: 0.3029, Acc: 82.27%
Epoch 16/30, Loss: 0.2856, Acc: 82.63%
Epoch 17/30, Loss: 0.2689, Acc: 82.85%
Epoch 18/30, Loss: 0.2584, Acc: 83.06%
Epoch 19/30, Loss: 0.2503, Acc: 83.01%
Epoch 20/30, Loss: 0.2464, Acc: 83.26%
Epoch 21/30, Loss: 0.2449, Acc: 83.15%
Epoch 22/30, Loss: 0.2446, Acc: 83.14%
Epoch 23/30, Loss: 0.2484, Acc: 83.12%
...
Epoch 28/30, Loss: 0.2702, Acc: 80.49%
Epoch 29/30, Loss: 0.2728, Acc: 80.38%
Epoch 30/30, Loss: 0.2759, Acc: 81.40%
Training finished, total time: 899.64s

Residual ablation experiment
# Residual block (mitigates vanishing gradients in deeper networks)
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(ResidualBlock, self).__init__()
        # Main path: an Inception module followed by a 1x1 convolution that adjusts the channels
        self.main_path = nn.Sequential(
            InceptionModule(in_channels),  # reuse the basic Inception module; it always outputs 136 channels, so out_channels must be 136 here
            nn.Conv2d(out_channels, out_channels, kernel_size=1),
            nn.BatchNorm2d(out_channels)
        )
        # Shortcut path: if the channel counts differ, use a 1x1 convolution to match them
        self.shortcut = nn.Sequential()
        if in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        out = self.main_path(x)
        shortcut = self.shortcut(x)
        return nn.ReLU(inplace=True)(out + shortcut)  # residual connection + activation

# CBAM attention module (cf. the Day 49 code; strengthens feature selection)
class CBAM(nn.Module):
    def __init__(self, in_channels, ratio=16):
        super(CBAM, self).__init__()
        # Channel attention (simplified here: global average pooling only, without the max-pooling branch)
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, in_channels//ratio, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels//ratio, in_channels, kernel_size=1),
            nn.Sigmoid()
        )
        # Spatial attention (concatenates channel-wise average and max features)
        self.spatial_att = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid()
        )

    def forward(self, x):
        # Channel attention weighting
        avg_out = self.channel_att(x)
        x = x * avg_out
        # Spatial attention weighting
        avg = torch.mean(x, dim=1, keepdim=True)
        max_val = torch.max(x, dim=1, keepdim=True)[0]
        concat = torch.cat([avg, max_val], dim=1)
        x = x * self.spatial_att(concat)
        return x

class InceptionAblation(nn.Module):
    def __init__(self, variant='base', num_classes=10):
        super(InceptionAblation, self).__init__()
        self.variant = variant  # model variant: 'base' / 'residual' / 'cbam'

        # Initial convolution layer (same as the basic Inception network)
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )

        # Choose the feature extractor according to the variant
        if variant == 'residual':
            # Residual variant: replace the Inception modules with residual blocks
            self.features = nn.Sequential(
                ResidualBlock(64, 136),  # 64 input channels, 136 output channels (matching the basic Inception)
                ResidualBlock(136, 136)   # stack a second residual block for stronger feature extraction
            )
        elif variant == 'cbam':
            # CBAM variant: each Inception module is followed by a CBAM attention block
            self.features = nn.Sequential(
                InceptionModule(64),
                CBAM(136),  # the basic Inception module outputs 136 channels, which CBAM takes as input
                InceptionModule(136),
                CBAM(136)
            )
        else:
            # Base variant: plain Inception modules
            self.features = nn.Sequential(
                InceptionModule(64),
                InceptionModule(136)
            )

        # Output layers (same as the basic Inception network)
        self.conv4 = nn.Sequential(
            nn.Conv2d(136, 256, kernel_size=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True)
        )
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.features(x)
        x = self.conv4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"使用设备: {device}")
    criterion = nn.CrossEntropyLoss()
    results = {}

    # Experiment 1: basic Inception (no residual / CBAM)
    print("=== Training basic Inception ===")
    model_base = InceptionAblation(variant='base').to(device)
    optimizer_base = optim.Adam(model_base.parameters(), lr=0.001)
    scheduler_base = optim.lr_scheduler.CosineAnnealingLR(optimizer_base, T_max=20)
    train_loss_base, test_acc_base = train_model(
        model_base, criterion, optimizer_base, scheduler_base, num_epochs=30, device=device
    )
    results['base'] = (train_loss_base, test_acc_base)

    # Experiment 2: Inception + residual mechanism
    print("=== Training Inception + residual ===")
    model_res = InceptionAblation(variant='residual').to(device)
    optimizer_res = optim.Adam(model_res.parameters(), lr=0.001)
    scheduler_res = optim.lr_scheduler.CosineAnnealingLR(optimizer_res, T_max=20)
    train_loss_res, test_acc_res = train_model(
        model_res, criterion, optimizer_res, scheduler_res, num_epochs=30, device=device
    )
    results['residual'] = (train_loss_res, test_acc_res)

    # Experiment 3: Inception + CBAM module
    print("=== Training Inception + CBAM ===")
    model_cbam = InceptionAblation(variant='cbam').to(device)
    optimizer_cbam = optim.Adam(model_cbam.parameters(), lr=0.001)
    scheduler_cbam = optim.lr_scheduler.CosineAnnealingLR(optimizer_cbam, T_max=20)
    train_loss_cbam, test_acc_cbam = train_model(
        model_cbam, criterion, optimizer_cbam, scheduler_cbam, num_epochs=30, device=device
    )
    results['cbam'] = (train_loss_cbam, test_acc_cbam)

    # Save the results of all variants
    np.savez("inception_ablation_results.npz", **results)

    # Plot the comparison
    plt.figure(figsize=(15, 6))
    # Training-loss comparison
    plt.subplot(1, 2, 1)
    plt.plot(results['base'][0], label='Base Inception')
    plt.plot(results['residual'][0], label='+ residual')
    plt.plot(results['cbam'][0], label='+ CBAM')
    plt.title('Training loss comparison')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    # Test-accuracy comparison
    plt.subplot(1, 2, 2)
    plt.plot(results['base'][1], label='Base Inception')
    plt.plot(results['residual'][1], label='+ residual')
    plt.plot(results['cbam'][1], label='+ CBAM')
    plt.title('Test accuracy comparison')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy (%)')
    plt.legend()
    plt.tight_layout()
    plt.savefig('inception_ablation_comparison.png')
    plt.show()

if __name__ == "__main__":
    main()

Using device: cuda
=== Training basic Inception ===
Epoch 1/30, Loss: 1.3137, Acc: 59.68%
Epoch 2/30, Loss: 0.9556, Acc: 65.51%
Epoch 3/30, Loss: 0.8212, Acc: 68.34%
Epoch 4/30, Loss: 0.7188, Acc: 68.77%
Epoch 5/30, Loss: 0.6463, Acc: 74.05%
Epoch 6/30, Loss: 0.5887, Acc: 72.49%
Epoch 7/30, Loss: 0.5346, Acc: 73.83%
Epoch 8/30, Loss: 0.4931, Acc: 77.93%
Epoch 9/30, Loss: 0.4528, Acc: 77.19%
Epoch 10/30, Loss: 0.4188, Acc: 79.68%
Epoch 11/30, Loss: 0.3886, Acc: 80.64%
Epoch 12/30, Loss: 0.3583, Acc: 81.04%
Epoch 13/30, Loss: 0.3326, Acc: 81.55%
Epoch 14/30, Loss: 0.3069, Acc: 81.78%
Epoch 15/30, Loss: 0.2837, Acc: 82.18%
Epoch 16/30, Loss: 0.2683, Acc: 83.10%
Epoch 17/30, Loss: 0.2531, Acc: 83.09%
Epoch 18/30, Loss: 0.2410, Acc: 82.98%
Epoch 19/30, Loss: 0.2333, Acc: 83.21%
Epoch 20/30, Loss: 0.2290, Acc: 83.11%
Epoch 21/30, Loss: 0.2264, Acc: 83.24%
Epoch 22/30, Loss: 0.2278, Acc: 83.23%
Epoch 23/30, Loss: 0.2305, Acc: 83.19%
...
Epoch 28/30, Loss: 0.2324, Acc: 81.72%
Epoch 29/30, Loss: 0.2393, Acc: 81.16%
Epoch 30/30, Loss: 0.2437, Acc: 78.45%
Training finished, total time: 1004.23s

@浙大疏锦行