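PyTorch Deep Learning Framework, 60-Day Advanced Study Plan - Day 35: Model Interpretability
Today we take a close look at a crucial topic in machine learning and deep learning: model interpretability.
As deep learning models grow more complex, they are often treated as "black boxes": we know the inputs and outputs but very little about what happens in between. In many fields, especially high-stakes ones such as healthcare, finance, and law, knowing the model's prediction is not enough. We also need to understand why the model made it. That is why model interpretability matters.
We focus on the principles behind two local explanation methods, LIME and SHAP, and compare the visualizations produced by gradient-based and perturbation-based methods. These methods help us see which parts of the input drive a deep neural network's decisions.
Learning objectives
- Understand why model interpretability matters and its basic concepts
- Work through the local approximation principles behind LIME and SHAP
- Apply LIME and SHAP to PyTorch models
- Understand the principles and implementation of the main gradient-based explanation methods
- Compare the visualizations produced by gradient-based and perturbation-based methods
- Hands-on cases: explaining an image classifier and a text classifier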
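1. Overview of model interpretability
1.1 What is model interpretability?
Model interpretability, also discussed under the name explainable AI (XAI), refers to the ability of humans to understand, trust, and effectively manage an AI system. It answers the core question: "Why did the model make this prediction?"
1.2 Why model interpretability matters
Model interpretability matters for several reasons:
- Compliance and regulation: in many industries, regulators require that a system's decision process be transparent and explainable
- Building trust: users are more willing to trust a system they can understand
- Debugging and improvement: understanding model behavior helps find and fix problems
- Scientific discovery: explaining a model can reveal hidden patterns and relationships in the data
- Ethical considerations: making sure decisions rest on reasonable criteria rather than on bias or other irrelevant factors
1.3 Types of model interpretability
Interpretability methods can be classified along several dimensions: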
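Dimension | Type | Description |
---|---|---|
Scope | Global | Explains the behavior of the model as a whole |
Scope | Local | Explains the prediction for one specific sample |
Approach | Intrinsic | The model itself is interpretable (e.g. linear models, decision trees) |
Approach | Post hoc | Explains a complex model after it has been trained |
Model dependence | Model-specific | Designed for a particular kind of model (e.g. gradient-based methods) |
Model dependence | Model-agnostic | Works with any kind of model (e.g. LIME) |
Form of explanation | Feature importance | Shows each feature's contribution to the prediction |
Form of explanation | Rule extraction | Extracts decision rules from the model |
Form of explanation | Example-based | Explains a prediction through similar samples |
Form of explanation | Visualization | Explains the model's decision graphically |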
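Today we concentrate on the local explanation methods LIME and SHAP, and on comparing gradient-based with perturbation-based methods.
2. How local approximation methods work
2.1 Why local approximation?
We want to understand complex deep learning models, but explaining an entire model directly is usually hard. Local approximation methods take a narrower route: instead of explaining the whole model, they explain its behavior around one specific input. The key insight is that even when a model's behavior over the whole feature space is hard to describe, within a small enough region it can be approximated by a simple model such as a linear one.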
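The local-approximation insight can be checked numerically on a toy function. The sketch below is only an illustration: `np.sin` stands in for a nonlinear model, and the helper name `linear_fit_error` and the two radii are assumptions made for this example. It fits a straight line over a wide and a narrow neighborhood of the same point and compares the worst-case error.

```python
import numpy as np

def linear_fit_error(f, x0, radius, n=200, seed=0):
    """Fit a straight line to f on [x0 - radius, x0 + radius]
    and return the maximum absolute approximation error."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(x0 - radius, x0 + radius, size=n)
    ys = f(xs)
    slope, intercept = np.polyfit(xs, ys, deg=1)
    return float(np.max(np.abs(ys - (slope * xs + intercept))))

f = np.sin  # stand-in for a nonlinear black-box model
wide = linear_fit_error(f, x0=1.0, radius=3.0)
narrow = linear_fit_error(f, x0=1.0, radius=0.1)
print(f"max error, wide neighborhood:   {wide:.4f}")
print(f"max error, narrow neighborhood: {narrow:.4f}")
```

The narrow fit should be far more accurate than the wide one, which is the property that local explanation methods rely on.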
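2.2 How LIME (Local Interpretable Model-agnostic Explanations) works
The basic idea of LIME: for a single prediction, learn a local, interpretable model that approximates the original model's behavior around that point.
The LIME workflow:
- Pick the instance to explain
- Create samples around that instance by perturbation
- Run the original complex model on the perturbed samples
- Weight the samples by their distance to the original instance
- Train a simple interpretable model (e.g. a linear model) on the weighted samples
- Read the explanation off the simple model's parameters (e.g. the coefficients of the linear model)
Formally, LIME solves the following optimization problem: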
$$\xi(x) = \underset{g \in G}{\arg\min} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$$
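where:
- $\xi(x)$ is the explanation for instance $x$
- $f$ is the original complex model
- $g$ is an interpretable model drawn from the class of interpretable models $G$
- $\pi_x$ is the weighting function that defines the neighborhood of instance $x$
- $\mathcal{L}$ is the loss that measures how poorly $g$ approximates $f$ within that neighborhood
- $\Omega$ measures the complexity of $g$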
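The workflow and objective above can be sketched for tabular inputs with NumPy alone. This is a minimal sketch, not the `lime` library's implementation. The Gaussian perturbation, the exponential kernel used for $\pi_x$, the ridge penalty standing in for $\Omega(g)$, and the names `lime_tabular` and `black_box` are all assumptions made for this illustration.

```python
import numpy as np

def lime_tabular(f, x, num_samples=2000, noise_scale=0.5,
                 kernel_width=0.75, ridge=1e-3, seed=0):
    """Fit a weighted linear surrogate g(z) = b + c . (z - x) around x."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Perturb the instance to create neighboring samples
    Z = x + rng.normal(scale=noise_scale, size=(num_samples, d))
    # Query the black-box model on the perturbed samples
    y = f(Z)
    # Proximity weights pi_x: samples closer to x count more
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # Weighted least squares (the loss L) with a small ridge penalty
    # (a simple stand-in for the complexity term Omega)
    A = np.hstack([np.ones((num_samples, 1)), Z - x])
    AtW = A.T * w
    reg = ridge * np.eye(d + 1)
    reg[0, 0] = 0.0  # do not penalize the intercept
    coef = np.linalg.solve(AtW @ A + reg, AtW @ y)
    return coef[1:], coef[0]  # per-feature weights, intercept

def black_box(Z):
    """Hypothetical nonlinear model; the third feature is ignored."""
    logits = 3.0 * Z[:, 0] - 2.0 * Z[:, 1] + Z[:, 0] * Z[:, 1]
    return 1.0 / (1.0 + np.exp(-logits))

x = np.array([0.0, 0.0, 0.0])
weights, intercept = lime_tabular(black_box, x)
print("local feature weights:", np.round(weights, 3))
```

The surrogate's coefficients play the role of the explanation: the first feature should get a positive weight, the second a negative one, and the unused third feature a weight near zero.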
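2.3 How SHAP (SHapley Additive exPlanations) works
SHAP builds on the Shapley value from cooperative game theory. It treats a prediction as a game in which each feature is a player and the payout is the prediction (relative to a baseline). The Shapley value tells us how much each feature contributes to the final prediction.
How SHAP works:
- Consider every possible coalition of features (each feature present or absent)
- For each feature, compute its marginal contribution in every coalition that does not already contain it
- The weighted average of those marginal contributions is the feature's Shapley value
Formally, the Shapley value of a feature is: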
$$\phi_i(f, x) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left[ f_x(S \cup \{i\}) - f_x(S) \right]$$
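where:
- $\phi_i$ is the Shapley value of feature $i$
- $N$ is the set of all features
- $S$ is a subset of features that does not contain feature $i$
- $f_x(S)$ is the model's prediction when only the features in $S$ take their values from $x$ (the remaining features are averaged out or set to a baseline)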
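For a handful of features the formula can be evaluated exactly by enumerating every subset. The sketch below does this in plain Python. The toy `model`, the all-zero `baseline` used to represent "absent" features, and the helper names are assumptions made for this illustration. The `shap` library uses more efficient estimators.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all subsets S of N \\ {i}."""
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # |S|! (|N| - |S| - 1)! / |N|!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                S = frozenset(subset)
                phi[i] += weight * (value_fn(S | {i}) - value_fn(S))
    return phi

x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]  # assumed value for "absent" features

def model(z):
    # Hypothetical model: one additive term and one interaction term
    return 2.0 * z[0] + z[1] * z[2]

def value_fn(S):
    # f_x(S): features in S come from x, the rest from the baseline
    z = [x[j] if j in S else baseline[j] for j in range(len(x))]
    return model(z)

phi = shapley_values(value_fn, len(x))
print(phi)  # approximately [4.0, 1.5, 1.5]
```

Feature 0 receives its full additive effect, the interaction between features 1 and 2 is split evenly, and the values sum to `model(x) - model(baseline)`. The enumeration visits every subset, so its cost grows exponentially with the number of features.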
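Computing this sum exactly requires evaluating every feature subset, which is exponential in the number of features. SHAP therefore comes in several variants, including KernelSHAP, TreeSHAP, and DeepSHAP, that provide more efficient computation for different kinds of models.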
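3. Gradient-based and perturbation-based methods
3.1 How gradient-based methods work
Gradient-based methods use the model's gradients to explain a prediction. The intuition: if a tiny change in an input feature causes a large change in the output, that feature is important to the prediction.
Common gradient-based methods include:
- Vanilla Gradients: use the magnitude of the gradient of the output with respect to each input feature as its importance
- Grad × Input: multiply the gradient by the input value, which better reflects the feature's actual contribution
- Integrated Gradients: integrate the gradients along a path from a baseline to the input, which mitigates gradient saturation
- Grad-CAM: combine the feature maps of the last convolutional layer with their gradients to produce a class activation heatmap
- SmoothGrad: add noise around the input and average the resulting gradients to reduce noise in the saliency map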
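As a small check of the Integrated Gradients idea, the sketch below applies it to a tiny randomly initialized network and verifies the completeness property: the attributions should sum to the difference between the output at the input and at the baseline. The network, the all-zero baseline, the midpoint Riemann sum, and the step count are assumptions made for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical small model with a single scalar output
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
model.eval()

def integrated_gradients(model, x, baseline, steps=200):
    """Approximate the path integral with a midpoint Riemann sum."""
    alphas = (torch.arange(steps, dtype=x.dtype) + 0.5) / steps
    # All interpolation points at once: shape [steps, n_features]
    path = baseline + alphas.unsqueeze(1) * (x - baseline)
    path.requires_grad_(True)
    out = model(path).sum()
    grads, = torch.autograd.grad(out, path)
    # (input - baseline) times the average gradient along the path
    return (x - baseline) * grads.mean(dim=0)

x = torch.tensor([1.0, -2.0, 0.5, 3.0])
baseline = torch.zeros(4)
attr = integrated_gradients(model, x, baseline)

with torch.no_grad():
    delta = (model(x.unsqueeze(0)) - model(baseline.unsqueeze(0))).item()
print("attributions:", attr)
print("sum of attributions:", attr.sum().item(), "output difference:", delta)
```

Vanilla gradients do not have this property: they describe the slope at one point, while Integrated Gradients accounts for the whole path from the baseline.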
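3.2 How perturbation-based methods work
Perturbation-based methods explain a model by changing the input and observing how the output changes. If changing a feature changes the prediction significantly, that feature is important to the prediction.
Common perturbation-based methods include:
- Feature ablation: remove or mask specific features and observe the change in the prediction
- Feature replacement: replace features with random values or baseline values
- LIME: as described above, create local samples by perturbation, then train an interpretable model
- SHAP: compute contributions by considering all possible feature coalitions (KernelSHAP does this by perturbation; variants such as DeepSHAP rely on backpropagation instead)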
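Feature ablation on images is often done by occluding one patch at a time. The sketch below shows the idea with NumPy. The function `toy_score`, which only looks at the central window of the image, is a hypothetical stand-in for a real model's class score, and the patch size and fill value are arbitrary choices.

```python
import numpy as np

def occlusion_map(score_fn, image, patch=4, fill=0.0):
    """Mask one patch at a time and record how much the score drops."""
    h, w = image.shape
    base = score_fn(image)
    heat = np.zeros((h, w))
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # A large drop means the masked patch mattered to the score
            heat[top:top + patch, left:left + patch] = base - score_fn(occluded)
    return heat

def toy_score(image):
    # Hypothetical "model": mean brightness of the central 8x8 window
    return float(image[4:12, 4:12].mean())

image = np.ones((16, 16))
heat = occlusion_map(toy_score, image)
print(heat[::4, ::4])  # one value per patch
```

Only the four central patches change the score, so only they light up in the map. Each patch costs one extra model evaluation, which is why perturbation methods become expensive for large inputs.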
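3.3 Gradient-based vs. perturbation-based methods
Method | Advantages | Disadvantages | Typical use |
---|---|---|---|
Gradient-based | Computationally efficient; reuses backpropagation | Can be affected by vanishing/exploding gradients; raw gradient maps can be hard to read | Fast explanations when model gradients are accessible |
Gradient-based | Fine-grained, pixel- or feature-level attributions | Sensitive to gradient saturation | High-dimensional inputs such as image classification |
Gradient-based | Few model passes (one forward/backward pass for Vanilla Gradients and Grad-CAM; SmoothGrad and Integrated Gradients need tens) | Explanations can be noisy | Environments with limited compute |
Perturbation-based | Conceptually simple and intuitive | Expensive: requires many model evaluations | Robust explanations where longer compute time is acceptable |
Perturbation-based | Model-agnostic; works with any black-box model | Results depend on the choice of perturbation strategy | Cases that need a model-agnostic explanation |
Perturbation-based | More stable explanations that measure the model's actual response to input changes | Hard to compute in high-dimensional feature spaces | Low- to medium-dimensional inputs such as tabular data and text |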
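Next, we use code examples to explore how these methods are implemented and how their results compare.
4. Hands-on: LIME and SHAP
First, let's apply LIME and SHAP in code and compare their explanations.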
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from lime import lime_image
from skimage.segmentation import mark_boundaries
import requests
from io import BytesIO
# Select the device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# Load a pretrained ResNet18 (the `weights` argument replaces the deprecated
# `pretrained=True` in torchvision >= 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model = model.to(device)
model.eval()
# ImageNet class labels
LABELS_URL = 'https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json'
response = requests.get(LABELS_URL)
labels = response.json()
# Image preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Download an example image
def download_image(url):
    response = requests.get(url)
    response.raise_for_status()
    # Force 3-channel RGB so grayscale or RGBA files do not break Normalize
    img = Image.open(BytesIO(response.content)).convert('RGB')
    return img
# Pick a picture of a dog to explain
img_url = "https://farm1.staticflickr.com/7/11792994_7fc91b7d60.jpg"  # a picture of a dog
img = download_image(img_url)
# Preprocess the image
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0).to(device)
# Run the model
with torch.no_grad():
    output = model(input_batch)
# Read off the prediction
_, predicted_idx = torch.max(output, 1)
predicted_label = labels[predicted_idx.item()]
confidence = torch.nn.functional.softmax(output, dim=1)[0, predicted_idx.item()].item()
print(f"Predicted: {predicted_label} with confidence {confidence:.2f}")
# Prediction function required by LIME: takes a batch of HxWx3 arrays
# (pixel values 0-255) and returns class probabilities
def predict_fn(images):
    # Convert each array to a PIL image and apply the model's preprocessing
    batch = torch.stack([preprocess(Image.fromarray(im.astype('uint8'))) for im in images])
    batch = batch.to(device)
    # Run the model
    with torch.no_grad():
        output = model(batch)
    # Return softmax probabilities as a NumPy array
    probs = torch.nn.functional.softmax(output, dim=1).cpu().numpy()
    return probs
# Explain the prediction with LIME
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    np.array(img.convert('RGB')),
    predict_fn,
    top_labels=5,
    hide_color=0,
    num_samples=1000
)
# Get the explanation for the predicted class
temp, mask = explanation.get_image_and_mask(
    predicted_idx.item(),
    positive_only=True,
    num_features=5,
    hide_rest=True
)
# Visualize the explanation
plt.figure(figsize=(12, 6))
# Original image
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.title(f"Original - Predicted: {predicted_label}\nConfidence: {confidence:.2f}")
plt.axis('off')
# LIME explanation
plt.subplot(1, 2, 2)
plt.imshow(mark_boundaries(temp / 255.0, mask))
plt.title("LIME Explanation\n(Positive regions contributing to prediction)")
plt.axis('off')
plt.tight_layout()
plt.savefig('lime_explanation.png')
plt.show()
# Compare explanations across classes
plt.figure(figsize=(15, 10))
# Take the top-3 predicted classes
top_probs, top_indices = torch.topk(torch.nn.functional.softmax(output, dim=1)[0], 3)
top_labels = [labels[idx.item()] for idx in top_indices]
for i, (idx, label, prob) in enumerate(zip(top_indices, top_labels, top_probs)):
    temp, mask = explanation.get_image_and_mask(
        idx.item(),
        positive_only=True,
        num_features=5,
        hide_rest=True
    )
    plt.subplot(1, 3, i+1)
    plt.imshow(mark_boundaries(temp / 255.0, mask))
    plt.title(f"{label}\nProbability: {prob.item():.2f}")
    plt.axis('off')
plt.tight_layout()
plt.savefig('lime_multiclass_explanation.png')
plt.show()
# Regions with positive vs. negative contributions
temp, mask = explanation.get_image_and_mask(
    predicted_idx.item(),
    positive_only=False,
    num_features=10,
    hide_rest=False,
    min_weight=0.01  # keep only regions whose weight magnitude exceeds 0.01
)
plt.figure(figsize=(10, 5))
plt.imshow(mark_boundaries(temp / 255.0, mask))
plt.title(f"Positive (green) and Negative (red) regions for {predicted_label}")
plt.axis('off')
plt.savefig('lime_positive_negative.png')
plt.show()
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import shap
import requests
from io import BytesIO
# Select the device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# Load a pretrained ResNet18 (the `weights` argument replaces the deprecated
# `pretrained=True` in torchvision >= 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model = model.to(device)
model.eval()
# ImageNet class labels
LABELS_URL = 'https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json'
response = requests.get(LABELS_URL)
labels = response.json()
# Image preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Download an example image
def download_image(url):
    response = requests.get(url)
    response.raise_for_status()
    # Force 3-channel RGB so grayscale or RGBA files do not break Normalize
    img = Image.open(BytesIO(response.content)).convert('RGB')
    return img
# Pick a picture of a dog to explain
img_url = "https://farm1.staticflickr.com/7/11792994_7fc91b7d60.jpg"  # a picture of a dog
img = download_image(img_url)
# Preprocess the image
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0).to(device)
# Run the model
with torch.no_grad():
    output = model(input_batch)
# Read off the prediction
_, predicted_idx = torch.max(output, 1)
predicted_label = labels[predicted_idx.item()]
confidence = torch.nn.functional.softmax(output, dim=1)[0, predicted_idx.item()].item()
print(f"Predicted: {predicted_label} with confidence {confidence:.2f}")
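# Wrapper around the model for SHAP explainers that take a plain function
# (not used by GradientExplainer below, which receives the model directly)
def predict(imgs):
    # imgs is a 4-D array [batch_size, height, width, channels] with values in [0, 1];
    # it is converted to the PyTorch layout [batch_size, channels, height, width]
    preprocessed_imgs = []
    for img_array in imgs:
        # Convert to a PIL image, then apply the preprocessing
        img_pil = Image.fromarray((img_array * 255).astype(np.uint8))
        preprocessed_imgs.append(preprocess(img_pil))
    batch = torch.stack(preprocessed_imgs).to(device)
    with torch.no_grad():
        output = model(batch)
    # Return the raw model outputs (no softmax applied)
    return output.cpu().numpy()
# Image to explain: a normalized tensor for the model and a [0, 1] array for display.
# The display copy uses the same Resize + CenterCrop as `preprocess`, so the
# heatmaps line up with the pixels the model actually saw.
test_tensor = preprocess(img).unsqueeze(0).to(device)
crop = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224)])
test_img = np.array(crop(img).convert('RGB')) / 255.0
# Background (reference) set. Ideally this is a sample of real images.
# It must differ from the image being explained: GradientExplainer scales the
# gradients by (input - background), so using the test image itself as the only
# background sample yields all-zero SHAP values. As a simplification we use two
# synthetic references: a black image and the ImageNet mean color (all zeros
# after normalization).
black = preprocess(Image.new('RGB', (224, 224))).unsqueeze(0).to(device)
background = torch.cat([black, torch.zeros_like(test_tensor)])
# Create the SHAP explainer
explainer = shap.GradientExplainer(model, background)
# Compute SHAP values (slow: attributions are computed for all 1000 classes;
# the `ranked_outputs` argument of shap_values can restrict this)
shap_values = explainer.shap_values(test_tensor)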
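# Visualize the SHAP values
# Older shap releases return a list with one [batch, C, H, W] array per class.
# Newer releases return a single array with a trailing class axis (assumed
# layout: [batch, C, H, W, classes]). Normalize to the list form.
if isinstance(shap_values, np.ndarray):
    shap_values = [shap_values[..., k] for k in range(shap_values.shape[-1])]
# We look at the SHAP values of the predicted class
pred_class_idx = predicted_idx.item()
# Transpose each class's SHAP values into the layout used for plotting:
# [channel, height, width] -> [height, width, channel]
shap_values_for_pred = [sv[0].transpose(1, 2, 0) for sv in shap_values]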
plt.figure(figsize=(12, 6))
# Original image
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.title(f"Original - Predicted: {predicted_label}\nConfidence: {confidence:.2f}")
plt.axis('off')
# SHAP explanation
plt.subplot(1, 2, 2)
# Plot the SHAP values with matplotlib directly
abs_shap = np.abs(shap_values_for_pred[pred_class_idx]).sum(axis=2)
abs_shap = abs_shap / (np.max(abs_shap) + 1e-12)  # normalize; epsilon avoids 0/0
plt.imshow(test_img)  # show the image
plt.imshow(abs_shap, cmap='jet', alpha=0.6)  # overlay the SHAP heatmap
plt.title(f"SHAP Explanation for {predicted_label}")
plt.axis('off')
plt.tight_layout()  # lay out before saving so the saved file matches the display
plt.savefig('shap_explanation.png')
plt.show()
# Compare across classes
plt.figure(figsize=(15, 10))
# Take the top-3 predicted classes
top_probs, top_indices = torch.topk(torch.nn.functional.softmax(output, dim=1)[0], 3)
top_labels = [labels[idx.item()] for idx in top_indices]
for i, (idx, label, prob) in enumerate(zip(top_indices, top_labels, top_probs)):
    plt.subplot(1, 3, i+1)
    abs_shap = np.abs(shap_values_for_pred[idx.item()]).sum(axis=2)
    abs_shap = abs_shap / (np.max(abs_shap) + 1e-12)  # normalize
    plt.imshow(test_img)  # show the image
    plt.imshow(abs_shap, cmap='jet', alpha=0.6)  # overlay the SHAP heatmap
    plt.title(f"{label}\nProbability: {prob.item():.2f}")
    plt.axis('off')
plt.tight_layout()
plt.savefig('shap_multiclass_explanation.png')
plt.show()
# Per-channel breakdown of the SHAP values (contribution of each color channel)
plt.figure(figsize=(15, 5))
channels = ['Red', 'Green', 'Blue']
for i, channel in enumerate(channels):
    plt.subplot(1, 3, i+1)
    channel_shap = shap_values_for_pred[pred_class_idx][:, :, i]
    # Symmetric color limits so positive and negative values are both visible
    vmax = np.max(np.abs(channel_shap))
    plt.imshow(channel_shap, cmap='coolwarm', vmin=-vmax, vmax=vmax)
    plt.title(f"{channel} Channel SHAP Values\nfor {predicted_label}")
    plt.colorbar()
    plt.axis('off')
plt.tight_layout()
plt.savefig('shap_channels.png')
plt.show()
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import requests
from io import BytesIO
import cv2
# Select the device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# Load a pretrained ResNet18 (the `weights` argument replaces the deprecated
# `pretrained=True` in torchvision >= 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model = model.to(device)
model.eval()
# ImageNet class labels
LABELS_URL = 'https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json'
response = requests.get(LABELS_URL)
labels = response.json()
# Image preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Download an example image
def download_image(url):
    response = requests.get(url)
    response.raise_for_status()
    # Force 3-channel RGB so grayscale or RGBA files do not break Normalize
    img = Image.open(BytesIO(response.content)).convert('RGB')
    return img
# Pick a picture of a dog to explain
img_url = "https://farm1.staticflickr.com/7/11792994_7fc91b7d60.jpg"  # a picture of a dog
img = download_image(img_url)
# Preprocess the image
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0).to(device)
input_batch.requires_grad_()  # track gradients with respect to the input
# Run the model (no torch.no_grad() here: we need the graph for backpropagation)
output = model(input_batch)
# Read off the prediction
_, predicted_idx = torch.max(output, 1)
predicted_label = labels[predicted_idx.item()]
confidence = torch.nn.functional.softmax(output, dim=1)[0, predicted_idx.item()].item()
print(f"Predicted: {predicted_label} with confidence {confidence:.2f}")
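# 1. Vanilla Gradients
# Clear any previous gradient
if input_batch.grad is not None:
    input_batch.grad.zero_()
# Backpropagate the logit of the predicted class
output[0, predicted_idx.item()].backward()
# Copy the gradient: later backward passes accumulate into input_batch.grad,
# and on CPU .numpy() would otherwise share memory with it
vanilla_grads = input_batch.grad.detach().clone().cpu().numpy()[0]  # [3, 224, 224]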
# 2. Grad-CAM
# Capture the feature maps of the last convolutional layer and their gradients
last_conv_layer = model.layer4[-1].conv2  # last convolutional layer of ResNet18
last_conv_output = None
last_grads = None
# Hook functions
def save_output_hook(module, input, output):
    global last_conv_output
    last_conv_output = output
def save_grad_hook(module, grad_input, grad_output):
    global last_grads
    last_grads = grad_output[0]
# Register the hooks (register_full_backward_hook replaces the deprecated
# register_backward_hook)
handle_output = last_conv_layer.register_forward_hook(save_output_hook)
handle_grad = last_conv_layer.register_full_backward_hook(save_grad_hook)
# Run the forward and backward passes again with the hooks in place
model.zero_grad()
output = model(input_batch)
output[0, predicted_idx.item()].backward()
# Remove the hooks
handle_output.remove()
handle_grad.remove()
# Compute Grad-CAM: weight each feature map by its spatially averaged gradient
gradcam = torch.mean(last_grads, dim=[2, 3], keepdim=True)
gradcam = torch.mul(last_conv_output, gradcam)
gradcam = torch.sum(gradcam, dim=1, keepdim=True)
gradcam = torch.nn.functional.relu(gradcam)  # ReLU keeps only positive evidence
gradcam = torch.nn.functional.interpolate(
    gradcam,
    size=(224, 224),
    mode='bilinear',
    align_corners=False
)
# Convert to NumPy and normalize to [0, 1] (epsilon guards against a constant map)
gradcam_np = gradcam.detach().cpu().numpy()[0, 0]
gradcam_np = (gradcam_np - gradcam_np.min()) / (gradcam_np.max() - gradcam_np.min() + 1e-12)
# 3. SmoothGrad
def smooth_grad(image, model, target_class_idx, param_n=50, param_sigma=0.15):
    """
    SmoothGrad: average the gradients of several noisy copies of the input
    to reduce noise in the saliency map.
    """
    image = image.detach()  # work on a copy that is cut off from earlier graphs
    mean_grads = torch.zeros_like(image)
    # Noise standard deviation as a fraction of the input's value range
    sigma = param_sigma * (image.max() - image.min()).item()
    for _ in range(param_n):
        # Add Gaussian noise
        noise = torch.randn_like(image) * sigma
        noisy_image = (image + noise).requires_grad_()
        # Forward pass
        output = model(noisy_image)
        # Backward pass for the target class
        model.zero_grad()
        output[0, target_class_idx].backward()
        # Accumulate the gradient
        mean_grads += noisy_image.grad
    # Average over the noisy samples
    mean_grads /= param_n
    return mean_grads.cpu().numpy()[0]  # [3, 224, 224]
# Compute SmoothGrad
smoothgrad = smooth_grad(input_batch, model, predicted_idx.item(), param_n=25)
# 4. Integrated Gradients
def integrated_gradients(image, model, target_class_idx, steps=50):
    """
    Integrated Gradients: integrate the gradients along the straight path
    from a baseline to the input (approximated by a Riemann sum).
    """
    # Detach so every interpolated input is a leaf tensor with its own .grad
    image = image.detach()
    # Baseline: all zeros in normalized space, i.e. an image filled with the
    # ImageNet mean color (not a black image)
    baseline = torch.zeros_like(image)
    # Gradient at each interpolation step
    grads = []
    for i in range(steps + 1):
        scaled_input = (baseline + (float(i) / steps) * (image - baseline)).requires_grad_()
        output = model(scaled_input)
        model.zero_grad()
        output[0, target_class_idx].backward()
        grads.append(scaled_input.grad.detach())
    # Average gradient along the path
    avg_grads = torch.stack(grads).mean(dim=0)
    # Integrated Gradients = (input - baseline) * average gradient
    integrated_grad = (image - baseline) * avg_grads
    return integrated_grad.cpu().numpy()[0]  # [3, 224, 224]
# Compute Integrated Gradients
ig = integrated_gradients(input_batch, model, predicted_idx.item(), steps=25)
# 5. Grad × Input
grad_times_input = vanilla_grads * input_tensor.cpu().numpy()
# Visualize all gradient-based methods
plt.figure(figsize=(20, 16))
# Original image
plt.subplot(3, 2, 1)
plt.imshow(img)
plt.title(f"Original - Predicted: {predicted_label}\nConfidence: {confidence:.2f}")
plt.axis('off')
# Vanilla Gradients
plt.subplot(3, 2, 2)
# Take absolute values and sum over channels
abs_grads = np.abs(vanilla_grads).sum(axis=0)
# Normalize to [0, 1]
abs_grads = (abs_grads - abs_grads.min()) / (abs_grads.max() - abs_grads.min() + 1e-12)
plt.imshow(abs_grads, cmap='jet')
plt.title("Vanilla Gradients")
plt.axis('off')
# Grad-CAM
plt.subplot(3, 2, 3)
# RGB array of the image, using the same Resize + CenterCrop as `preprocess`
# so the heatmap lines up with the pixels the model saw
img_array = np.array(transforms.CenterCrop(224)(transforms.Resize(256)(img.convert('RGB'))))
# Build the heatmap
heatmap = cv2.applyColorMap(np.uint8(255 * gradcam_np), cv2.COLORMAP_JET)
heatmap = cv2.cvtColor(heatmap, cv2.COLOR_BGR2RGB)
# Overlay the heatmap on the image
superimposed_img = heatmap * 0.4 + img_array * 0.6
superimposed_img = superimposed_img / np.max(superimposed_img)
plt.imshow(superimposed_img)
plt.title("Grad-CAM")
plt.axis('off')
# SmoothGrad
plt.subplot(3, 2, 4)
abs_smoothgrad = np.abs(smoothgrad).sum(axis=0)
abs_smoothgrad = (abs_smoothgrad - abs_smoothgrad.min()) / (abs_smoothgrad.max() - abs_smoothgrad.min() + 1e-12)
plt.imshow(abs_smoothgrad, cmap='jet')
plt.title("SmoothGrad")
plt.axis('off')
# Integrated Gradients
plt.subplot(3, 2, 5)
abs_ig = np.abs(ig).sum(axis=0)
abs_ig = (abs_ig - abs_ig.min()) / (abs_ig.max() - abs_ig.min() + 1e-12)
plt.imshow(abs_ig, cmap='jet')
plt.title("Integrated Gradients")
plt.axis('off')
# Grad × Input
plt.subplot(3, 2, 6)
abs_grad_input = np.abs(grad_times_input).sum(axis=0)
abs_grad_input = (abs_grad_input - abs_grad_input.min()) / (abs_grad_input.max() - abs_grad_input.min() + 1e-12)
plt.imshow(abs_grad_input, cmap='jet')
plt.title("Grad × Input")
plt.axis('off')
plt.tight_layout()
plt.savefig('gradient_methods_comparison.png')
plt.show()
# Comparison figure: each method's heatmap overlaid on the original image
plt.figure(figsize=(20, 10))
# Heatmaps to overlay
methods = {
    "Vanilla Gradients": abs_grads,
    "Grad-CAM": gradcam_np,
    "SmoothGrad": abs_smoothgrad,
    "Integrated Gradients": abs_ig,
    "Grad × Input": abs_grad_input
}
# Show the original image
plt.subplot(2, 3, 1)
plt.imshow(img_array)
plt.title("Original Image")
plt.axis('off')
# Show the overlay for each method
for i, (method_name, heatmap_data) in enumerate(methods.items(), start=2):
    plt.subplot(2, 3, i)
    if method_name == "Grad-CAM":
        # The Grad-CAM overlay was already built above
        superimposed = superimposed_img
    else:
        # Build the heatmap overlay for the other methods
        heatmap = cv2.applyColorMap(np.uint8(255 * heatmap_data), cv2.COLORMAP_JET)
        heatmap = cv2.cvtColor(heatmap, cv2.COLOR_BGR2RGB)
        superimposed = heatmap * 0.4 + img_array * 0.6
        superimposed = superimposed / np.max(superimposed)
    plt.imshow(superimposed)
    plt.title(method_name)
    plt.axis('off')
plt.tight_layout()
plt.savefig('gradient_methods_overlay.png')
plt.show()
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from matplotlib.patches import FancyArrowPatch
import pandas as pd
# Build the comparison table (qualitative, author-assigned ratings)
comparison_data = {
    'Method': ['LIME', 'SHAP', 'Vanilla Gradients', 'GradCAM', 'SmoothGrad', 'Integrated Gradients'],
    'Type': ['Perturbation', 'Perturbation', 'Gradient', 'Gradient', 'Gradient', 'Gradient'],
    'Model Agnostic': ['✓', '✓', '✗', '✗', '✗', '✗'],
    'Local/Global': ['Local', 'Local', 'Local', 'Local', 'Local', 'Local'],
    'Computational Cost': ['High', 'High', 'Low', 'Low', 'Medium', 'Medium'],
    'Image': ['✓', '✓', '✓', '✓', '✓', '✓'],
    'Text': ['✓', '✓', '✓', '✓', '✓', '✓'],
    'Tabular': ['✓', '✓', '✓', '✗', '✓', '✓'],
    'Interpretability': ['High', 'High', 'Medium', 'High', 'Medium', 'Medium'],
    'Stability': ['Medium', 'High', 'Low', 'Medium', 'Medium', 'High']
}
df = pd.DataFrame(comparison_data)
print("Comparison of explanation methods:")
print(df.to_string(index=False))
# Save as CSV
df.to_csv('explanation_methods_comparison.csv', index=False)
# Draw flowcharts for LIME, SHAP, and the gradient-based methods
plt.figure(figsize=(14, 10))
# Create a directed graph
G = nx.DiGraph()
# LIME nodes
G.add_node("Instance", pos=(0, 5))
G.add_node("Perturbed\nSamples", pos=(2, 5))
G.add_node("Model\nPredictions", pos=(4, 5))
G.add_node("Weighted\nSamples", pos=(6, 5))
G.add_node("Train Local\nModel", pos=(8, 5))
G.add_node("LIME\nExplanation", pos=(10, 5))
# LIME edges
G.add_edge("Instance", "Perturbed\nSamples")
G.add_edge("Perturbed\nSamples", "Model\nPredictions")
G.add_edge("Model\nPredictions", "Weighted\nSamples")
G.add_edge("Weighted\nSamples", "Train Local\nModel")
G.add_edge("Train Local\nModel", "LIME\nExplanation")
# SHAP nodes
G.add_node("Instance_S", pos=(0, 2))
G.add_node("Feature\nCoalitions", pos=(2, 2))
G.add_node("Model\nPredictions_S", pos=(4, 2))
G.add_node("Calculate\nMarginal\nContributions", pos=(6, 2))
G.add_node("Average\nContributions", pos=(8, 2))
G.add_node("SHAP\nValues", pos=(10, 2))
# SHAP edges
G.add_edge("Instance_S", "Feature\nCoalitions")
G.add_edge("Feature\nCoalitions", "Model\nPredictions_S")
G.add_edge("Model\nPredictions_S", "Calculate\nMarginal\nContributions")
G.add_edge("Calculate\nMarginal\nContributions", "Average\nContributions")
G.add_edge("Average\nContributions", "SHAP\nValues")
# Gradient-method nodes
G.add_node("Instance_G", pos=(0, -1))
G.add_node("Forward\nPass", pos=(2, -1))
G.add_node("Model\nOutput", pos=(4, -1))
G.add_node("Backward\nPass", pos=(6, -1))
G.add_node("Process\nGradients", pos=(8, -1))
G.add_node("Gradient\nVisualization", pos=(10, -1))
# Gradient-method edges
G.add_edge("Instance_G", "Forward\nPass")
G.add_edge("Forward\nPass", "Model\nOutput")
G.add_edge("Model\nOutput", "Backward\nPass")
G.add_edge("Backward\nPass", "Process\nGradients")
G.add_edge("Process\nGradients", "Gradient\nVisualization")
# Group labels for the perturbation-based and gradient-based rows
plt.text(-1, 6, "Perturbation-based: LIME workflow", fontsize=14, fontweight='bold')
plt.text(-1, 3, "Perturbation-based: SHAP workflow", fontsize=14, fontweight='bold')
plt.text(-1, 0, "Gradient-based workflow", fontsize=14, fontweight='bold')
# Node positions
pos = nx.get_node_attributes(G, 'pos')
# Draw nodes and labels
nx.draw_networkx_nodes(G, pos, node_size=3000, node_color="lightblue", alpha=0.8, node_shape="s")
nx.draw_networkx_labels(G, pos, font_size=10, font_weight="bold")
# Draw each edge as a slightly curved arrow
for u, v in G.edges():
    arrow = FancyArrowPatch(pos[u], pos[v],
                            connectionstyle="arc3,rad=0.1",
                            arrowstyle="-|>",
                            color="gray",
                            lw=1.5)
    plt.gca().add_patch(arrow)
# Finish the figure
plt.axis('off')
plt.title("Workflows of the explanation methods", fontsize=16)
plt.tight_layout()
plt.savefig('explanation_methods_flowchart.png', dpi=300, bbox_inches='tight')
plt.show()
# Conceptual chart of the explanation methods
plt.figure(figsize=(10, 8))
# Methods to plot
methods = ['LIME', 'SHAP', 'Vanilla\nGradients', 'GradCAM', 'SmoothGrad', 'Integrated\nGradients']
# Author-assigned scores on each dimension (scale 1-5, not measured values)
interpretability = [4.5, 4.7, 2.5, 4.0, 3.0, 3.5]  # interpretability score (1-5)
computational_efficiency = [2.0, 1.5, 5.0, 4.5, 3.5, 3.0]  # computational efficiency score (1-5)
stability = [3.0, 4.5, 2.0, 3.5, 3.8, 4.2]  # stability score (1-5)
# Bubble size encodes stability
sizes = [s * 100 for s in stability]
# Color encodes the method type: red = perturbation-based, blue = gradient-based
colors = ['#FF9999', '#FF9999', '#66B2FF', '#66B2FF', '#66B2FF', '#66B2FF']
# Scatter plot
plt.scatter(computational_efficiency, interpretability, s=sizes, c=colors, alpha=0.6)
# Label each method
for i, method in enumerate(methods):
    plt.annotate(method, (computational_efficiency[i], interpretability[i]),
                 xytext=(5, 5), textcoords='offset points')
# Legend
plt.scatter([], [], s=300, c='#FF9999', alpha=0.6, label='Perturbation-based')
plt.scatter([], [], s=300, c='#66B2FF', alpha=0.6, label='Gradient-based')
sizes_legend = [200, 300, 400]
for s in sizes_legend:
    plt.scatter([], [], s=s, c='gray', alpha=0.4, label=f'Stability: {s/100:.1f}')
plt.legend(scatterpoints=1, frameon=False, labelspacing=1)
# Axes
plt.xlim(0, 6)
plt.ylim(0, 6)
plt.xlabel('Computational efficiency', fontsize=12)
plt.ylabel('Interpretability', fontsize=12)
plt.title('Comparison of explanation methods', fontsize=14)
# Grid
plt.grid(True, linestyle='--', alpha=0.7)
plt.tight_layout()
plt.savefig('explanation_methods_comparison_chart.png', dpi=300)
plt.show()
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
import lime.lime_text
import shap
import re
import string
import seaborn as sns
# Fix the random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)
# Toy data: movie-review sentiment analysis (positive/negative)
movie_reviews = [
# Positive reviews
"This movie was fantastic and I thoroughly enjoyed every moment of it.",
"The acting was superb and the plot kept me engaged throughout.",
"I loved this film, it was a masterpiece with incredible visuals.",
"Great direction and outstanding performances by the entire cast.",
"One of the best movies I've seen this year, highly recommended!",
"The screenplay was brilliant and the dialogue was witty and intelligent.",
"Amazing character development, I connected with all the main roles.",
"The cinematography was breathtaking, every frame was like a painting.",
"This film exceeded all my expectations, a true cinematic gem.",
"The musical score perfectly complemented the emotional journey of the story.",
# Negative reviews
"This movie was terrible, I wasted two hours of my life.",
"The acting was wooden and the plot had too many holes.",
"I hated this film, it was boring and predictable.",
"Poor direction and disappointing performances across the board.",
"One of the worst movies I've seen, don't waste your money.",
"The screenplay was awful and the dialogue felt unnatural.",
"Shallow characters, I couldn't relate to any of them.",
"The visual effects were cheap looking and distracting.",
"This film fell far short of expectations, a complete letdown.",
"The soundtrack was annoying and didn't fit the mood of the scenes."
]
# Labels: 1 = positive, 0 = negative
labels = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    movie_reviews, labels, test_size=0.3, random_state=42
)
# Text preprocessing
def preprocess_text(text):
    # Lowercase
    text = text.lower()
    # Replace punctuation with spaces (re.escape keeps characters such as
    # ']' and '\' from being misread inside the character class)
    text = re.sub(f'[{re.escape(string.punctuation)}]', ' ', text)
    # Collapse repeated whitespace
    text = re.sub(r'\s+', ' ', text).strip()
    return text
# Apply the preprocessing
X_train_processed = [preprocess_text(text) for text in X_train]
X_test_processed = [preprocess_text(text) for text in X_test]
# Turn the text into bag-of-words feature vectors with CountVectorizer
vectorizer = CountVectorizer(max_features=1000)
X_train_vectorized = vectorizer.fit_transform(X_train_processed)
X_test_vectorized = vectorizer.transform(X_test_processed)
# Feature names (the vocabulary)
feature_names = vectorizer.get_feature_names_out()
# Convert the data to PyTorch tensors
X_train_tensor = torch.FloatTensor(X_train_vectorized.toarray())
X_test_tensor = torch.FloatTensor(X_test_vectorized.toarray())
y_train_tensor = torch.LongTensor(y_train)
y_test_tensor = torch.LongTensor(y_test)
# Define the neural network
class SentimentClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)
        # nn.ReLU as a module (rather than F.relu) so that hook-based
        # explainers such as shap.DeepExplainer can see the nonlinearity
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.3)
    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x
# Model hyperparameters
input_size = X_train_tensor.shape[1]  # number of features
hidden_size = 64
output_size = 2  # binary classification: positive/negative
# Initialize the model
model = SentimentClassifier(input_size, hidden_size, output_size)
# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Training settings
batch_size = 4
epochs = 100
# Yield mini-batches manually
def get_batches(X, y, batch_size):
    for i in range(0, len(X), batch_size):
        yield X[i:i+batch_size], y[i:i+batch_size]
# Training loop
model.train()
for epoch in range(epochs):
    epoch_loss = 0.0
    num_batches = 0
    for X_batch, y_batch in get_batches(X_train_tensor, y_train_tensor, batch_size):
        optimizer.zero_grad()
        outputs = model(X_batch)
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        num_batches += 1
    if (epoch+1) % 10 == 0:
        # criterion returns the mean loss per batch, so average over batches
        print(f'Epoch {epoch+1}/{epochs}, Loss: {epoch_loss/num_batches:.4f}')
# Evaluate the model
model.eval()
with torch.no_grad():
    outputs = model(X_test_tensor)
    _, predicted = torch.max(outputs, 1)
    accuracy = accuracy_score(y_test, predicted.numpy())
    print(f'Test Accuracy: {accuracy:.4f}')
# Prediction function used by LIME: raw texts in, class probabilities out
def predict_proba(texts):
    # Preprocess
    processed_texts = [preprocess_text(text) for text in texts]
    # Vectorize
    vectorized = vectorizer.transform(processed_texts).toarray()
    # Convert to a tensor
    inputs = torch.FloatTensor(vectorized)
    # Predict
    model.eval()
    with torch.no_grad():
        outputs = model(inputs)
        probs = F.softmax(outputs, dim=1).numpy()
    return probs
# 1. LIME explanation
# Create the LIME explainer
explainer = lime.lime_text.LimeTextExplainer(class_names=["Negative", "Positive"])
# Pick a test sample to explain
idx = 0  # first sample in the test set
text_instance = X_test[idx]
true_label = y_test[idx]
print(f"\nExplaining: '{text_instance}'")
print(f"True label: {'Positive' if true_label == 1 else 'Negative'}")
# Generate the LIME explanation
num_features = 6  # number of features to show
lime_exp = explainer.explain_instance(
    text_instance,
    predict_proba,
    num_features=num_features,
    num_samples=1000
)
# Visualize the LIME explanation (as_pyplot_figure creates its own figure,
# so we resize that figure instead of opening an empty one first)
fig = lime_exp.as_pyplot_figure()
fig.set_size_inches(10, 4)
plt.title("LIME Explanation")
plt.tight_layout()
plt.savefig('lime_text_explanation.png')
plt.show()
# Extract the LIME weights (for class 1, "Positive")
lime_weights = dict(lime_exp.as_list())
lime_words = list(lime_weights.keys())
lime_values = list(lime_weights.values())
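# 2. SHAP explanation
# Background dataset (a subset of the training data)
background_samples = X_train_processed[:10]  # first 10 samples as background
background_vectorized = vectorizer.transform(background_samples).toarray()
background = torch.FloatTensor(background_vectorized)
# Function wrapper around the model (for explainers that take a plain function;
# DeepExplainer below receives the model directly)
def f(x):
    model.eval()
    with torch.no_grad():
        outputs = model(torch.FloatTensor(x))
    return outputs.numpy()
# Create the SHAP explainer
explainer = shap.DeepExplainer(model, background)
# Compute SHAP values
shap_values = explainer.shap_values(X_test_tensor[idx:idx+1])
# Older shap releases return a list with one [batch, features] array per class.
# Newer releases return a single array with a trailing class axis (assumed
# layout: [batch, features, classes]). Normalize to the list form.
if isinstance(shap_values, np.ndarray):
    shap_values = [shap_values[..., k] for k in range(shap_values.shape[-1])]
# Extract feature importances
feature_importance = np.abs(shap_values[1][0])  # SHAP values for the Positive class
top_indices = np.argsort(feature_importance)[-num_features:]  # most important features
top_features = [feature_names[i] for i in top_indices]
top_values = [feature_importance[i] for i in top_indices]
# Visualize the SHAP values
plt.figure(figsize=(10, 4))
plt.barh(top_features, top_values)
plt.title("SHAP Feature Importance (for Positive class)")
plt.xlabel("SHAP Value (Absolute)")
plt.tight_layout()
plt.savefig('shap_text_explanation.png')
plt.show()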
# Compare the LIME and SHAP results
plt.figure(figsize=(12, 10))
# LIME
plt.subplot(2, 1, 1)
colors = ['green' if x > 0 else 'red' for x in lime_values]
plt.barh(lime_words, lime_values, color=colors)
plt.title("LIME Explanation")
plt.xlabel("Weight (> 0 supports the Positive class)")
# SHAP
plt.subplot(2, 1, 2)
plt.barh(top_features, top_values, color='blue')
plt.title("SHAP Feature Importance")
plt.xlabel("SHAP Value (Absolute)")
plt.tight_layout()
plt.savefig('lime_vs_shap_text.png')
plt.show()
# Feature importances for all test samples
shap_values_all = explainer.shap_values(X_test_tensor)
# Newer shap releases return one array with a trailing class axis (assumed
# layout: [batch, features, classes]); convert it to the per-class list form
if isinstance(shap_values_all, np.ndarray):
    shap_values_all = [shap_values_all[..., k] for k in range(shap_values_all.shape[-1])]
# Feature-importance matrix for the heatmap
feature_imp_matrix = np.abs(shap_values_all[1])  # SHAP values for the Positive class
# Select the 10 most important features
mean_imp = np.mean(feature_imp_matrix, axis=0)
top_10_features_idx = np.argsort(mean_imp)[-10:]
top_10_features = [feature_names[i] for i in top_10_features_idx]
# Matrix to visualize
vis_matrix = feature_imp_matrix[:, top_10_features_idx]
plt.figure(figsize=(12, 8))
sns.heatmap(vis_matrix, cmap='viridis', xticklabels=top_10_features)
plt.title("SHAP Feature Importance Across Test Samples")
plt.xlabel("Top Features")
plt.ylabel("Test Samples")
plt.tight_layout()
plt.savefig('shap_feature_importance_heatmap.png')
plt.show()
# Summary statistics
print("\nTop Features by Mean SHAP Importance:")
for i, feature in enumerate(top_10_features[::-1]):
    print(f"{i+1}. {feature}: {mean_imp[top_10_features_idx[::-1][i]]:.4f}")