Running AI on Qualcomm Phones: Face Transformation Algorithm

Environment Setup

Phone

Test phone model: Redmi K60 Pro

Processor: Snapdragon 8 Gen 2 Mobile Platform

RAM: 8.0 GB, LPDDR5X-8400, 67.0 GB/s

Cameras: 16 MP front; 50 MP + 8 MP + 2 MP rear

AI compute: NPU 48 TOPS (INT8); GPU 1536 ALUs x 2 x 680 MHz = 2.089 TFLOPS
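The GPU and memory figures above can be checked with simple arithmetic. The sketch below assumes that each ALU performs one fused multiply-add (2 FLOPs) per clock and that the LPDDR5X interface is 64 bits wide; neither assumption is stated in the original, and the function names are illustrative only.

```python
# Peak-throughput arithmetic behind the spec-sheet numbers.
# Assumptions (not from the original): one fused multiply-add (2 FLOPs)
# per ALU per clock, and a 64-bit-wide LPDDR5X memory interface.

def gpu_peak_tflops(alus: int, flops_per_alu_per_clock: int, clock_hz: float) -> float:
    """Theoretical peak in TFLOPS."""
    return alus * flops_per_alu_per_clock * clock_hz / 1e12

def memory_bandwidth_gbs(transfers_per_sec: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s."""
    return transfers_per_sec * bus_width_bits / 8 / 1e9

gpu_tflops = gpu_peak_tflops(1536, 2, 680e6)
bandwidth = memory_bandwidth_gbs(8400e6, 64)
print(f"GPU peak: {gpu_tflops:.3f} TFLOPS")
print(f"Memory bandwidth: {bandwidth:.1f} GB/s")
```

Under these assumptions the GPU figure reproduces the quoted 2.089 TFLOPS, and the bandwidth comes out at about 67 GB/s, close to the quoted 67.0 GB/s. Both are theoretical peaks, not measured throughput.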

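Note: any phone will work; the more powerful the phone, the faster the demo runs.

Software

App: AidLux 2.0

System environment: Ubuntu 20.04.3 LTS

Note: code runs more smoothly once you are logged in to AidLux. Keep the AidLux app in the foreground while code is running so that the system does not reclaim the process, and keep the screen on, because the phone usually goes to sleep some time after the screen turns off. To keep the app resident in the background, grant it the required permissions.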

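Algorithm Demo

Code Walkthrough

This code uses the AidLite inference engine to build a real-time face transformation (face swap) application based on computer vision. It combines face detection, landmark localization, image warping, and blending. The sections below cover the overall architecture first and then the core functions.

Overall Architecture

The code consists of the following parts:

Face detection module: uses the BlazeFace model to find faces in the video stream
Landmark detection module: locates 468 landmarks on the face
Face warping module: aligns the faces using Delaunay triangulation and per-triangle affine transforms
Image blending module: seamlessly blends the warped source face into the target face
User interaction module: provides a UI for choosing among different target face images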

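Core Functions

Face Detection and Preprocessing

# Face detection with BlazeFace (simplified excerpt of the full listing in the example code)
def blazeface(raw_box_tensor, raw_score_tensor, anchors):
    net = BlazeFace()
    # Decode bounding boxes and scores
    detections = net.tensors_to_detections(raw_box_tensor, raw_score_tensor, anchors)
    # Weighted non-maximum suppression merges overlapping detections
    return [net.weighted_non_max_suppression(d) for d in detections]

The TFLite model face_detection_front.tflite detects faces and returns bounding boxes with keypoint coordinates; non-maximum suppression then merges overlapping detections.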
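The non-maximum suppression step can be illustrated in isolation. The following is a minimal NumPy sketch of score-weighted NMS in the spirit of the full listing, not a copy of it: the function names `iou` and `weighted_nms`, the 5-column detection layout, and the sample boxes are assumptions made for the example.

```python
import numpy as np

def iou(box, others):
    """IoU between one [ymin, xmin, ymax, xmax] box and an (N, 4) array of boxes."""
    y1 = np.maximum(box[0], others[:, 0])
    x1 = np.maximum(box[1], others[:, 1])
    y2 = np.minimum(box[2], others[:, 2])
    x2 = np.minimum(box[3], others[:, 3])
    # Clamp each side at zero so disjoint boxes contribute no intersection.
    inter = np.clip(y2 - y1, 0, None) * np.clip(x2 - x1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    union = area + areas - inter
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)

def weighted_nms(dets, iou_thresh=0.3):
    """dets: (N, 5) rows of [ymin, xmin, ymax, xmax, score]."""
    order = np.argsort(-dets[:, 4])  # highest score first
    merged = []
    while order.size > 0:
        overlaps = iou(dets[order[0], :4], dets[order, :4]) > iou_thresh
        overlaps[0] = True  # the top box always belongs to its own group
        group = dets[order[overlaps]]
        weights = group[:, 4:5]
        # Score-weighted average of the overlapping boxes.
        box = (group[:, :4] * weights).sum(axis=0) / weights.sum()
        merged.append(np.append(box, group[:, 4].mean()))
        order = order[~overlaps]
    return np.array(merged)

dets = np.array([
    [0.10, 0.10, 0.50, 0.50, 0.9],
    [0.12, 0.12, 0.52, 0.52, 0.8],
    [0.60, 0.60, 0.90, 0.90, 0.7],
])
result = weighted_nms(dets)
print(result)
```

The first two boxes overlap heavily and are merged into one averaged box, while the third is kept as a separate face. Averaging rather than discarding overlapping boxes tends to reduce box jitter between frames.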

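Face Landmark Localization

# Detect the 468 face landmarks
model_path1 = "models/face_landmark.tflite"
mesh = fast_interpreter1.get_output_tensor(0)
# 468 (x, y, z) points, divided by the 192-pixel input size to normalize them
mesh = mesh.reshape(468, 3) / 192

The face_landmark.tflite model locates the eyes, mouth, nose, and other key regions, which gives the subsequent face warping its control points.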
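To use the landmarks for warping, the normalized mesh has to be mapped back into the frame from which the face crop was taken. A minimal sketch, assuming a square crop described by a top-left corner and a side length; `landmarks_to_image`, `roi_origin`, and `roi_size` are hypothetical names, and the input is synthetic rather than real model output.

```python
import numpy as np

def landmarks_to_image(raw_output, roi_origin, roi_size, model_size=192):
    """Map raw landmark output to integer pixel coordinates in the full frame.

    raw_output: flat array of 468 * 3 values in model-input pixels.
    roi_origin: (x_min, y_min) of the square face crop in the frame.
    roi_size:   side length of that crop in pixels.
    """
    mesh = np.asarray(raw_output, dtype=np.float32).reshape(468, 3) / model_size
    points = mesh[:, :2] * roi_size + np.asarray(roi_origin, dtype=np.float32)
    return np.rint(points).astype(np.int32)

# Synthetic stand-in for the model output: every landmark at the crop center.
fake_output = np.full(468 * 3, 96.0, dtype=np.float32)
pts = landmarks_to_image(fake_output, roi_origin=(100, 50), roi_size=300)
print(pts.shape, pts[0])
```

Rounding to integers at this point is convenient because OpenCV drawing functions expect integer pixel coordinates.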

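Face Warping and Blending

# Per-triangle affine warp used with the Delaunay triangulation
def applyAffineTransform(src, srcTri, dstTri, size):
    # Compute the affine transform matrix that maps srcTri onto dstTri
    warpMat = cv2.getAffineTransform(np.float32(srcTri), np.float32(dstTri))
    # Warp the source patch; warpTriangle then masks it and copies it into the output
    dst = cv2.warpAffine(src, warpMat, (size[0], size[1]))
    return dst

The face region is divided into a triangle mesh, an affine transform is applied to each triangle, and cv2.seamlessClone then blends the result seamlessly.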
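Three point pairs determine a 2D affine transform exactly, which is why triangles are the natural unit for this kind of warp. The sketch below solves for the 2x3 matrix with plain NumPy to show the underlying linear system; `affine_from_triangles` is a hypothetical helper for illustration, and the triangle coordinates are made up.

```python
import numpy as np

def affine_from_triangles(src_tri, dst_tri):
    """Return the 2x3 matrix M with M @ [x, y, 1] = [x', y'] for each vertex pair."""
    src = np.asarray(src_tri, dtype=np.float64)
    dst = np.asarray(dst_tri, dtype=np.float64)
    # Each row of A is [x, y, 1]; solve A @ M.T = dst for the six unknowns.
    A = np.hstack([src, np.ones((3, 1))])
    return np.linalg.solve(A, dst).T

src_tri = [(0, 0), (10, 0), (0, 10)]
dst_tri = [(5, 5), (25, 5), (5, 35)]
M = affine_from_triangles(src_tri, dst_tri)
print(M)

# Any point inside the source triangle maps with the same matrix.
centroid = M @ np.array([10 / 3, 10 / 3, 1.0])
print(centroid)
```

Here the source centroid lands on the destination centroid, as expected for an affine map. A degenerate triangle with collinear vertices makes the system singular, so `np.linalg.solve` raises `LinAlgError` in that case.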

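User Interface

# Build the UI for choosing a target face
class MyApp(App):
    def main(self):
        # Create the camera widget and the clickable thumbnails
        self.img1 = Image('/res:' + os.getcwd() + '/' + back_img_path[0], height=80, margin='10px')
        self.img1.onclick.do(self.on_img1_clicked)

A graphical interface lets the user choose among different target face images; clicking a thumbnail switches the target.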
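The full listing defines one click handler per thumbnail, and the handlers differ only in the image index. One way to express that pattern is a handler factory. The sketch below runs without any UI library: `select_face`, `load_image`, and `state` are hypothetical names, and `load_image` is a stand-in for `cv2.imread`.

```python
# Sketch of a handler factory; select_face, load_image and state are
# hypothetical names, not part of the original code or of any UI library.
back_img_path = ('models/rs.jpeg', 'models/wy.jpeg', 'models/zyx.jpeg')
state = {'faceimg': None, 'mod': -1}

def load_image(path):
    # Stand-in for cv2.imread so the sketch runs without OpenCV.
    return f"<image {path}>"

def select_face(index):
    """Build a click handler bound to one entry of back_img_path."""
    def handler(widget=None):
        state['faceimg'] = load_image(back_img_path[index])
        state['mod'] = index
    return handler

handlers = [select_face(i) for i in range(len(back_img_path))]
handlers[1]()  # simulate a click on the second thumbnail
print(state)
```

Passing `index` as a function argument binds it at creation time, which avoids the common late-binding pitfall of closures created inside a loop.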

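Model Roles

The code uses two TFLite models:

face_detection_front.tflite
- Type: face detection model
- Role: locates faces in the input image and outputs a bounding box plus 6 keypoints per face (eyes, nose tip, mouth center, and ear tragions)
- Technical characteristics:
  - Lightweight design suited to real-time applications
  - Uses an anchor mechanism (896 anchors in this code) to detect faces at different scales
  - Input is a 128x128 image; output contains bounding box coordinates and keypoint positions
face_landmark.tflite
- Type: face landmark detection model
- Role: detects 468 fine-grained landmarks covering the eyebrows, eyes, nose, mouth, and face contour
- Technical characteristics:
  - Outputs 468 3D points that describe the face shape in detail
  - Useful for face alignment, expression analysis, and similar downstream tasks
  - Input is a 192x192 image; output is 468 3D coordinates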
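Each detection emitted by the face detector is a row of 16 values that must be decoded against its anchor: 4 values for the box and 6 x 2 for the keypoints. The sketch below decodes a single row in the way the `_decode_boxes` method of the full listing does. The anchor layout `[x_center, y_center, w, h]` is inferred from that method, and `decode_detection` and the sample values are illustrative assumptions.

```python
import numpy as np

def decode_detection(raw, anchor, scale=128.0):
    """Decode one 16-value detection row against its anchor.

    raw:    [dx, dy, w, h, kx0, ky0, ..., kx5, ky5] in input pixels.
    anchor: [x_center, y_center, w, h], normalized to [0, 1].
    """
    raw = np.asarray(raw, dtype=np.float32)
    ax, ay, aw, ah = anchor
    x_c = raw[0] / scale * aw + ax
    y_c = raw[1] / scale * ah + ay
    w = raw[2] / scale * aw
    h = raw[3] / scale * ah
    # Box order follows the listing: [ymin, xmin, ymax, xmax].
    box = np.array([y_c - h / 2, x_c - w / 2, y_c + h / 2, x_c + w / 2])
    keypoints = raw[4:].reshape(6, 2) / scale * [aw, ah] + [ax, ay]
    return box, keypoints

anchor = (0.5, 0.5, 1.0, 1.0)
raw = [0.0, 0.0, 64.0, 64.0] + [12.8, -12.8] * 6
box, keypoints = decode_detection(raw, anchor)
print(box)
print(keypoints[0])
```

All results are normalized to the 0 to 1 range of the padded input image, so they still have to be multiplied by the image size before cropping.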

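Application Scenarios

This face transformation application fits the following scenarios:

Entertainment and social media
- Short-video effects
- Real-time filters on social platforms
- Novelty photo editing tools
Film production and advertising
- Face replacement in film visual effects
- Celebrity face-swap effects in advertisements
- Facial expression transfer for virtual streamers
Education and demonstration
- Teaching demos of computer vision principles
- Showcasing face image processing techniques
- Case studies of applied machine learning models
Specialized industry applications
- Face simulation in security applications
- Facial expression synchronization in virtual reality
- Simulating facial deformities and previewing reconstruction in medicine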

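Technical Characteristics and Advantages

Real-time performance: lightweight TFLite models and a streamlined pipeline allow face transformation in real time
Robustness: Delaunay triangulation and seamless cloning help keep the result stable across expressions and head angles
Ease of use: a graphical interface lets users pick a target face with one click
Extensibility: the models are separate from the application logic, so a more accurate model or a new feature can be swapped in easily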
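Whether the pipeline is real-time on a given phone is best checked by measuring it. The sketch below keeps a rolling average of per-frame processing time; `FpsMeter` is a hypothetical helper, and the loop body is a stand-in for one iteration of the detection, landmark, and warping steps.

```python
import time
from collections import deque

class FpsMeter:
    """Rolling average of frames per second over the last `window` frames."""

    def __init__(self, window=30):
        self.durations = deque(maxlen=window)

    def add(self, seconds):
        self.durations.append(seconds)

    @property
    def fps(self):
        total = sum(self.durations)
        return len(self.durations) / total if total > 0 else 0.0

meter = FpsMeter(window=3)
for _ in range(5):
    start = time.perf_counter()
    sum(i * i for i in range(10_000))  # stand-in for one frame of processing
    meter.add(time.perf_counter() - start)
print(f"{meter.fps:.1f} FPS over the last {len(meter.durations)} frames")
```

In the full listing, the `start_time = time.time()` call at the top of the main loop marks the natural place to take such a measurement.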

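The application combines computer vision and machine learning techniques and demonstrates the core pipeline of modern face processing, with solid practical value and room for extension.

Example Code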

import cv2
import math
import sys
import numpy as np
import os
import subprocess
import time
from cvs import *
import aidlite

# Paths of the background (target face) images
back_img_path = ('models/rs.jpeg', 'models/wy.jpeg', 'models/zyx.jpeg', 'models/monkey.jpg', 'models/star2.jpg', 'models/star1.jpg', 'models/star3.jpg', 'models/star4.jpg')

# Load the first target face image
faceimg = cv2.imread(back_img_path[0])
mod = -1
bfirstframe = True

# Read landmark points from a file
def readPoints(path):
    # List of (x, y) landmark tuples
    points = []
    # Each line of the file holds one "x y" pair
    with open(path) as file:
        for line in file:
            x, y = line.split()
            points.append((int(x), int(y)))
    return points

# Apply an affine transform
def applyAffineTransform(src, srcTri, dstTri, size):
    # Compute the affine transform matrix
    warpMat = cv2.getAffineTransform(np.float32(srcTri), np.float32(dstTri))
    # Warp the source image with it
    dst = cv2.warpAffine(src, warpMat, (size[0], size[1]), None, flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)
    return dst

# Check whether a point lies inside a rectangle given as (x, y, w, h)
def rectContains(rect, point):
    if point[0] < rect[0]:
        return False
    elif point[1] < rect[1]:
        return False
    elif point[0] > rect[0] + rect[2]:
        return False
    elif point[1] > rect[1] + rect[3]:
        return False
    return True

# Compute the Delaunay triangulation of a point set
def calculateDelaunayTriangles(rect, points):
    # Create the Subdiv2D object
    subdiv = cv2.Subdiv2D(rect)
    # Insert the points; Subdiv2D raises cv2.error for points outside rect, so skip those
    for p in points:
        try:
            subdiv.insert((float(p[0]), float(p[1])))
        except cv2.error:
            continue
    # Get the list of triangles
    triangleList = subdiv.getTriangleList()
    delaunayTri = []
    for t in triangleList:
        pt = [(t[0], t[1]), (t[2], t[3]), (t[4], t[5])]
        # Keep only triangles whose three vertices all lie inside the rectangle
        if rectContains(rect, pt[0]) and rectContains(rect, pt[1]) and rectContains(rect, pt[2]):
            ind = []
            # Map each vertex back to the index of the matching input point
            for j in range(0, 3):
                for k in range(0, len(points)):
                    if (abs(pt[j][0] - points[k][0]) < 1.0 and abs(pt[j][1] - points[k][1]) < 1.0):
                        ind.append(k)
            # Keep the triangle only if exactly three indices were found
            if len(ind) == 3:
                delaunayTri.append((ind[0], ind[1], ind[2]))
    return delaunayTri

# Warp one triangle from img1 and blend it into img2
def warpTriangle(img1, img2, t1, t2):
    # Bounding rectangle of each triangle
    r1 = cv2.boundingRect(np.float32([t1]))
    r2 = cv2.boundingRect(np.float32([t2]))
    # Offset the vertices so they are relative to each bounding rectangle
    t1Rect = []
    t2Rect = []
    t2RectInt = []
    for i in range(0, 3):
        t1Rect.append(((t1[i][0] - r1[0]), (t1[i][1] - r1[1])))
        t2Rect.append(((t2[i][0] - r2[0]), (t2[i][1] - r2[1])))
        t2RectInt.append(((t2[i][0] - r2[0]), (t2[i][1] - r2[1])))
    # Build a mask that is 1 inside the destination triangle
    mask = np.zeros((r2[3], r2[2], 3), dtype=np.float32)
    cv2.fillConvexPoly(mask, np.int32(t2RectInt), (1.0, 1.0, 1.0), 16, 0)
    # Apply the affine transform to the small rectangular patch
    img1Rect = img1[r1[1]:r1[1] + r1[3], r1[0]:r1[0] + r1[2]]
    size = (r2[2], r2[3])
    img2Rect = applyAffineTransform(img1Rect, t1Rect, t2Rect, size)
    img2Rect = img2Rect * mask
    # Copy the warped triangle into the output image
    img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] * ((1.0, 1.0, 1.0) - mask)
    img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] + img2Rect

# Face swap: warp the face in img1 onto the face in img2
def faceswap(points1, points2, img1, img2):
    img1Warped = np.copy(img2)
    # Find the convex hull of the target landmarks
    hull1 = []
    hull2 = []
    hullIndex = cv2.convexHull(np.array(points2, dtype=np.float32), returnPoints=False)
    for i in range(0, len(hullIndex)):
        hull1.append(points1[int(hullIndex[i][0])])
        hull2.append(points2[int(hullIndex[i][0])])
    # Compute the Delaunay triangulation of the hull points
    sizeImg2 = img2.shape
    rect = (0, 0, sizeImg2[1], sizeImg2[0])
    dt = calculateDelaunayTriangles(rect, hull2)
    # Without triangles there is nothing to warp; report failure instead of exiting the process
    if len(dt) == 0:
        return None
    # Apply the affine warp to each Delaunay triangle
    for i in range(0, len(dt)):
        t1 = []
        t2 = []
        for j in range(0, 3):
            t1.append(hull1[dt[i][j]])
            t2.append(hull2[dt[i][j]])
        warpTriangle(img1, img1Warped, t1, t2)
    # Build the blending mask from the hull
    hull8U = []
    for i in range(0, len(hull2)):
        hull8U.append((hull2[i][0], hull2[i][1]))
    mask = np.zeros(img2.shape, dtype=img2.dtype)
    cv2.fillConvexPoly(mask, np.int32(hull8U), (255, 255, 255))
    r = cv2.boundingRect(np.float32([hull2]))
    center = ((r[0] + int(r[2] / 2), r[1] + int(r[3] / 2)))
    # Seamless cloning; return None if OpenCV rejects the inputs
    try:
        output = cv2.seamlessClone(np.uint8(img1Warped), img2, mask, center, cv2.NORMAL_CLONE)
    except cv2.error:
        return None
    return output

# Preprocess an image for the TFLite landmark model
def preprocess_image_for_tflite32(image, model_image_size=192):
    # Convert from BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize to the model input size
    image = cv2.resize(image, (model_image_size, model_image_size))
    # Add a batch dimension
    image = np.expand_dims(image, axis=0)
    # Normalize from [0, 255] to [-1, 1]
    image = (2.0 / 255.0) * image - 1.0
    # Convert to float32
    image = image.astype('float32')
    return image

# Pad an image to a square and preprocess it for the detection model
def preprocess_img_pad(img, image_size=128):
    # Shape of the image as an array
    shape = np.r_[img.shape]
    # Number of pixels to pad along each axis to make the image square
    pad_all = (shape.max() - shape[:2]).astype('uint32')
    pad = pad_all // 2
    # Pad the original (BGR) image
    img_pad_ori = np.pad(
        img,
        ((pad[0], pad_all[0] - pad[0]), (pad[1], pad_all[1] - pad[1]), (0, 0)),
        mode='constant')
    # Convert from BGR to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Pad the RGB image
    img_pad = np.pad(
        img,
        ((pad[0], pad_all[0] - pad[0]), (pad[1], pad_all[1] - pad[1]), (0, 0)),
        mode='constant')
    # Resize to the model input size
    img_small = cv2.resize(img_pad, (image_size, image_size))
    # Add a batch dimension
    img_small = np.expand_dims(img_small, axis=0)
    # Normalize from [0, 255] to [-1, 1]
    img_small = (2.0 / 255.0) * img_small - 1.0
    # Convert to float32
    img_small = img_small.astype('float32')
    return img_pad_ori, img_small, pad

# Draw the detected face boxes and return the last box as (x_min, y_min, x_max, y_max)
def plot_detections(img, detections, with_keypoints=True):
    output_img = img
    print(img.shape)
    x_min = 0
    x_max = 0
    y_min = 0
    y_max = 0
    print("Found %d face(s)" % len(detections))
    for i in range(len(detections)):
        # Compute the face box in pixels
        ymin = detections[i][0] * img.shape[0]
        xmin = detections[i][1] * img.shape[1]
        ymax = detections[i][2] * img.shape[0]
        xmax = detections[i][3] * img.shape[1]
        w = int(xmax - xmin)
        h = int(ymax - ymin)
        h = max(w, h)
        h = h * 1.5
        x = (xmin + xmax) / 2.
        y = (ymin + ymax) / 2.
        xmin = x - h / 2.
        xmax = x + h / 2.
        ymin = y - h / 2. - 0.08 * h
        ymax = y + h / 2. - 0.08 * h
        x_min = int(xmin)
        y_min = int(ymin)
        x_max = int(xmax)
        y_max = int(ymax)
        p1 = (int(xmin), int(ymin))
        p2 = (int(xmax), int(ymax))
        # Draw the face box
        cv2.rectangle(output_img, p1, p2, (0, 255, 255), 2, 1)
    return x_min, y_min, x_max, y_max

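# Draw the face mesh
def draw_mesh(image, mesh, mark_size=2, line_width=1):
    # Side length of the (square) image
    image_size = image.shape[0]
    # Convert normalized mesh coordinates to image coordinates
    mesh = mesh * image_size
    # Draw the landmarks; OpenCV drawing functions need integer pixel coordinates
    for point in mesh:
        cv2.circle(image, (int(point[0]), int(point[1])),
                   mark_size, (0, 255, 128), -1)
    # Build the eye contours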
    left_eye_contour = np.array([mesh[33][0:2],
                                 mesh[7][0:2],
                                 mesh[163][0:2],
                                 mesh[144][0:2],
                                 mesh[145][0:2],
                                 mesh[153][0:2],
                                 mesh[154][0:2],
                                 mesh[155][0:2],
                                 mesh[133][0:2],
                                 mesh[173][0:2],
                                 mesh[157][0:2],
                                 mesh[158][0:2],
                                 mesh[159][0:2],
                                 mesh[160][0:2],
                                 mesh[161][0:2],
                                 mesh[246][0:2]]).astype(np.int32)
    right_eye_contour = np.array([mesh[263][0:2],
                                  mesh[249][0:2],
                                  mesh[390][0:2],
                                  mesh[373][0:2],
                                  mesh[374][0:2],
                                  mesh[380][0:2],
                                  mesh[381][0:2],
                                  mesh[382][0:2],
                                  mesh[362][0:2],
                                  mesh[398][0:2],
                                  mesh[384][0:2],
                                  mesh[385][0:2],
                                  mesh[386][0:2],
                                  mesh[387][0:2],
                                  mesh[388][0:2],
                                  mesh[466][0:2]]).astype(np.int32)
    # Draw the eye contour lines
    cv2.polylines(image, [left_eye_contour, right_eye_contour], False,
                  (255, 255, 255), line_width, cv2.LINE_AA)

# Collect the landmarks as image coordinates
def getkeypoint(image, mesh, landmark_point):
    # Side length of the (square) image
    image_size = image.shape[0]
    # Convert normalized mesh coordinates to image coordinates
    mesh = mesh * image_size
    # Append each landmark to the list
    for point in mesh:
        landmark_point.append((point[0], point[1]))
    return image

# Draw the landmarks and the facial feature lines
def draw_landmarks(image, mesh, landmark_point):
    # Side length of the (square) image
    image_size = image.shape[0]
    # Convert normalized mesh coordinates to image coordinates
    mesh = mesh * image_size
    # Draw the landmarks; OpenCV drawing functions need integer pixel coordinates
    for point in mesh:
        px = (int(point[0]), int(point[1]))
        landmark_point.append(px)
        cv2.circle(image, px, 2, (255, 255, 0), -1)
    if len(landmark_point) > 0:
        # Left eyebrow
        cv2.line(image, landmark_point[55], landmark_point[65], (0, 0, 255), 2)
        cv2.line(image, landmark_point[65], landmark_point[52], (0, 0, 255), 2)
        cv2.line(image, landmark_point[52], landmark_point[53], (0, 0, 255), 2)
        cv2.line(image, landmark_point[53], landmark_point[46], (0, 0, 255), 2)
        # Right eyebrow
        cv2.line(image, landmark_point[285], landmark_point[295], (0, 0, 255), 2)
        cv2.line(image, landmark_point[295], landmark_point[282], (0, 0, 255), 2)
        cv2.line(image, landmark_point[282], landmark_point[283], (0, 0, 255), 2)
        cv2.line(image, landmark_point[283], landmark_point[276], (0, 0, 255), 2)
        # Left eye
        cv2.line(image, landmark_point[133], landmark_point[173], (0, 0, 255), 2)
        cv2.line(image, landmark_point[173], landmark_point[157], (0, 0, 255), 2)
        cv2.line(image, landmark_point[157], landmark_point[158], (0, 0, 255), 2)
        cv2.line(image, landmark_point[158], landmark_point[159], (0, 0, 255), 2)
        cv2.line(image, landmark_point[159], landmark_point[160], (0, 0, 255), 2)
        cv2.line(image, landmark_point[160], landmark_point[161], (0, 0, 255), 2)
        cv2.line(image, landmark_point[161], landmark_point[246], (0, 0, 255), 2)
        cv2.line(image, landmark_point[246], landmark_point[163], (0, 0, 255), 2)
        cv2.line(image, landmark_point[163], landmark_point[144], (0, 0, 255), 2)
        cv2.line(image, landmark_point[144], landmark_point[145], (0, 0, 255), 2)
        cv2.line(image, landmark_point[145], landmark_point[153], (0, 0, 255), 2)
        cv2.line(image, landmark_point[153], landmark_point[154], (0, 0, 255), 2)
        cv2.line(image, landmark_point[154], landmark_point[155], (0, 0, 255), 2)
        cv2.line(image, landmark_point[155], landmark_point[133], (0, 0, 255), 2)
        # Right eye
        cv2.line(image, landmark_point[362], landmark_point[398], (0, 0, 255), 2)
        cv2.line(image, landmark_point[398], landmark_point[384], (0, 0, 255), 2)
        cv2.line(image, landmark_point[384], landmark_point[385], (0, 0, 255), 2)
        cv2.line(image, landmark_point[385], landmark_point[386], (0, 0, 255), 2)
        cv2.line(image, landmark_point[386], landmark_point[387], (0, 0, 255), 2)
        cv2.line(image, landmark_point[387], landmark_point[388], (0, 0, 255), 2)
        cv2.line(image, landmark_point[388], landmark_point[466], (0, 0, 255), 2)
        cv2.line(image, landmark_point[466], landmark_point[390], (0, 0, 255), 2)
        cv2.line(image, landmark_point[390], landmark_point[373], (0, 0, 255), 2)
        cv2.line(image, landmark_point[373], landmark_point[374], (0, 0, 255), 2)
        cv2.line(image, landmark_point[374], landmark_point[380], (0, 0, 255), 2)
        cv2.line(image, landmark_point[380], landmark_point[381], (0, 0, 255), 2)
        cv2.line(image, landmark_point[381], landmark_point[382], (0, 0, 255), 2)
        cv2.line(image, landmark_point[382], landmark_point[362], (0, 0, 255), 2)
        # Mouth
        cv2.line(image, landmark_point[308], landmark_point[415], (0, 0, 255), 2)
        cv2.line(image, landmark_point[415], landmark_point[310], (0, 0, 255), 2)
        cv2.line(image, landmark_point[310], landmark_point[311], (0, 0, 255), 2)
        cv2.line(image, landmark_point[311], landmark_point[312], (0, 0, 255), 2)
        cv2.line(image, landmark_point[312], landmark_point[13], (0, 0, 255), 2)
        cv2.line(image, landmark_point[13], landmark_point[82], (0, 0, 255), 2)
        cv2.line(image, landmark_point[82], landmark_point[81], (0, 0, 255), 2)
        cv2.line(image, landmark_point[81], landmark_point[80], (0, 0, 255), 2)
        cv2.line(image, landmark_point[80], landmark_point[191], (0, 0, 255), 2)
        cv2.line(image, landmark_point[191], landmark_point[78], (0, 0, 255), 2)
        cv2.line(image, landmark_point[78], landmark_point[95], (0, 0, 255), 2)
        cv2.line(image, landmark_point[95], landmark_point[88], (0, 0, 255), 2)
        cv2.line(image, landmark_point[88], landmark_point[178], (0, 0, 255), 2)
        cv2.line(image, landmark_point[178], landmark_point[87], (0, 0, 255), 2)
        cv2.line(image, landmark_point[87], landmark_point[14], (0, 0, 255), 2)
        cv2.line(image, landmark_point[14], landmark_point[317], (0, 0, 255), 2)
        cv2.line(image, landmark_point[317], landmark_point[402], (0, 0, 255), 2)
        cv2.line(image, landmark_point[402], landmark_point[318], (0, 0, 255), 2)
        cv2.line(image, landmark_point[318], landmark_point[324], (0, 0, 255), 2)
        cv2.line(image, landmark_point[324], landmark_point[308], (0, 0, 255), 2)
    return image

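# BlazeFace face detection model
class BlazeFace():
    def __init__(self):
        # Number of classes
        self.num_classes = 1
        # Number of anchors
        self.num_anchors = 896
        # Values per detection: 4 box coordinates + 6 keypoints x 2
        self.num_coords = 16
        # Raw scores are clipped to [-thresh, thresh] before the sigmoid
        self.score_clipping_thresh = 100.0
        # Scale factor for x coordinates
        self.x_scale = 128.0
        # Scale factor for y coordinates
        self.y_scale = 128.0
        # Scale factor for heights
        self.h_scale = 128.0
        # Scale factor for widths
        self.w_scale = 128.0
        # Minimum score for a detection to be kept
        self.min_score_thresh = 0.75
        # IoU threshold above which detections are merged
        self.min_suppression_threshold = 0.3

    # Sigmoid function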
    def sigmoid(self, inX):
        if inX >= 0:
            return 1.0 / (1 + np.exp(-inX))
        else:
            return np.exp(inX) / (1 + np.exp(inX))

    # Convert the raw output tensors into detections
    def tensors_to_detections(self, raw_box_tensor, raw_score_tensor, anchors):
        assert len(raw_box_tensor.shape) == 3
        assert raw_box_tensor.shape[1] == self.num_anchors
        assert raw_box_tensor.shape[2] == self.num_coords
        assert len(raw_score_tensor.shape) == 3
        assert raw_score_tensor.shape[1] == self.num_anchors
        assert raw_score_tensor.shape[2] == self.num_classes
        assert raw_box_tensor.shape[0] == raw_score_tensor.shape[0]
        # Decode the bounding boxes
        detection_boxes = self._decode_boxes(raw_box_tensor, anchors)
        # Clip the raw scores
        thresh = self.score_clipping_thresh
        raw_score_tensor = raw_score_tensor.clip(-thresh, thresh)
        # Convert raw scores to probabilities with a sigmoid
        detection_scores = 1 / (1 + np.exp(-raw_score_tensor)).squeeze(axis=-1)
        # Drop detections whose score is below the threshold
        mask = detection_scores >= self.min_score_thresh
        output_detections = []
        for i in range(raw_box_tensor.shape[0]):
            boxes = detection_boxes[i, mask[i]]
            scores = np.expand_dims(detection_scores[i, mask[i]], axis=-1)
            output_detections.append(np.concatenate((boxes, scores), axis=-1))
        return output_detections

    # Decode bounding boxes
    def _decode_boxes(self, raw_boxes, anchors):
        boxes = np.zeros(raw_boxes.shape)
        # Box center coordinates
        x_center = raw_boxes[..., 0] / self.x_scale * anchors[:, 2] + anchors[:, 0]
        y_center = raw_boxes[..., 1] / self.y_scale * anchors[:, 3] + anchors[:, 1]
        # Box width and height
        w = raw_boxes[..., 2] / self.w_scale * anchors[:, 2]
        h = raw_boxes[..., 3] / self.h_scale * anchors[:, 3]
        # Top-left and bottom-right corners
        boxes[..., 0] = y_center - h / 2.  # ymin
        boxes[..., 1] = x_center - w / 2.  # xmin
        boxes[..., 2] = y_center + h / 2.  # ymax
        boxes[..., 3] = x_center + w / 2.  # xmax
        # Keypoint coordinates
        for k in range(6):
            offset = 4 + k * 2
            keypoint_x = raw_boxes[..., offset] / self.x_scale * anchors[:, 2] + anchors[:, 0]
            keypoint_y = raw_boxes[..., offset + 1] / self.y_scale * anchors[:, 3] + anchors[:, 1]
            boxes[..., offset] = keypoint_x
            boxes[..., offset + 1] = keypoint_y
        return boxes

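    # Weighted non-maximum suppression
    def weighted_non_max_suppression(self, detections):
        if len(detections) == 0: return []
        output_detections = []
        # Sort by score, highest first
        remaining = np.argsort(-detections[:, 16])
        while len(remaining) > 0:
            detection = detections[remaining[0]]
            # IoU between the top-scoring box and all remaining boxes (itself included)
            first_box = detection[:4]
            other_boxes = detections[remaining, :4]
            ious = overlap_similarity(first_box, other_boxes)
            # Boxes overlapping above the threshold are merged; the rest stay for the next round
            mask = ious > self.min_suppression_threshold
            # The top box always belongs to its own group, so the loop is guaranteed to progress
            mask[0] = True
            overlapping = remaining[mask]
            remaining = remaining[~mask]
            # Score-weighted average of the overlapping detections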
            weighted_detection = detection.copy()
            if len(overlapping) > 1:
                coordinates = detections[overlapping, :16]
                scores = detections[overlapping, 16:17]
                total_score = scores.sum()
                weighted = (coordinates * scores).sum(axis=0) / total_score
                weighted_detection[:16] = weighted
                weighted_detection[16] = total_score / len(overlapping)
            output_detections.append(weighted_detection)
        return output_detections

# BlazeFace face detection entry point
def blazeface(raw_output_a, raw_output_b, anchors):
    # The 896-element tensor holds the scores; the other one holds the boxes
    if raw_output_a.size == 896:
        raw_score_tensor = raw_output_a
        raw_box_tensor = raw_output_b
    else:
        raw_score_tensor = raw_output_b
        raw_box_tensor = raw_output_a
    assert (raw_score_tensor.size == 896)
    assert (raw_box_tensor.size == 896 * 16)
    # Reshape the output tensors
    raw_score_tensor = raw_score_tensor.reshape(1, 896, 1)
    raw_box_tensor = raw_box_tensor.reshape(1, 896, 16)
    net = BlazeFace()
    # Post-process the raw predictions
    detections = net.tensors_to_detections(raw_box_tensor, raw_score_tensor, anchors)
    # Non-maximum suppression
    filtered_detections = []
    for i in range(len(detections)):
        faces = net.weighted_non_max_suppression(detections[i])
        if len(faces) > 0:
            faces = np.stack(faces)
        filtered_detections.append(faces)
    return filtered_detections

# Convert detections from padded (letterboxed) image coordinates to original image coordinates
def convert_to_orig_points(results, orig_dim, letter_dim):
    # Scale factor used by the letterbox resize
    inter_scale = min(letter_dim / orig_dim[0], letter_dim / orig_dim[1])
    inter_h, inter_w = int(inter_scale * orig_dim[0]), int(inter_scale * orig_dim[1])
    # Normalized padding offsets
    offset_x, offset_y = (letter_dim - inter_w) / 2.0 / letter_dim, (letter_dim - inter_h) / 2.0 / letter_dim
    scale_x, scale_y = letter_dim / inter_w, letter_dim / inter_h
    # Remove the padding offset and rescale
    results[:, 0:2] = (results[:, 0:2] - [offset_x, offset_y]) * [scale_x, scale_y]
    results[:, 2:4] = results[:, 2:4] * [scale_x, scale_y]
    results[:, 4:16:2] = (results[:, 4:16:2] - offset_x) * scale_x
    results[:, 5:17:2] = (results[:, 5:17:2] - offset_y) * scale_y
    # Convert from the 0-1 range to original image pixels
    results[:, 0:16:2] *= orig_dim[1]
    results[:, 1:17:2] *= orig_dim[0]
    return results.astype(np.int32)

# Intersection over union (IoU) between one box and a set of boxes
def overlap_similarity(box, other_boxes):
    def area(A):
        return max(0.0, A[2] - A[0]) * max(0.0, A[3] - A[1])

    def intersect(A, B):
        # Clamp each side at zero so disjoint boxes never yield a positive area
        side_a = max(0.0, min(A[2], B[2]) - max(A[0], B[0]))
        side_b = max(0.0, min(A[3], B[3]) - max(A[1], B[1]))
        return side_a * side_b

    def union(A, B):
        return area(A) + area(B) - intersect(A, B)

    def iou(A, B):
        u = union(A, B)
        return intersect(A, B) / u if u > 0 else 0.0

    ret = np.array([iou(box, b) for b in other_boxes])
    return ret

# Custom application class
class MyApp(App):
    def __init__(self, *args):
        super(MyApp, self).__init__(*args)

    # Refresh the camera widget when idle
    def idle(self):
        self.aidcam0.update()

    # Build the UI
    def main(self):
        # Vertical container for the whole page
        main_container = VBox(width=360, height=680, style={'margin': '0px auto'})
        # Camera widget, created once and added to the container once
        self.aidcam0 = OpencvVideoWidget(self, width=340, height=400)
        self.aidcam0.style['margin'] = '10px'
        self.aidcam0.identifier = 'aidcam0'
        main_container.append(self.aidcam0)
        # Prompt label
        self.lbl = Label('Click a picture to choose the celebrity face you like:')
        main_container.append(self.lbl)
        # First row of thumbnails
        bottom_container = HBox(width=360, height=130, style={'margin': '0px auto'})
        # Thumbnail images
        self.img1 = Image('/res:' + os.getcwd() + '/' + back_img_path[0], height=80, margin='10px')
        self.img1.onclick.do(self.on_img1_clicked)
        bottom_container.append(self.img1)
        self.img2 = Image('/res:' + os.getcwd() + '/' + back_img_path[1], height=80, margin='10px')
        self.img2.onclick.do(self.on_img2_clicked)
        bottom_container.append(self.img2)
        self.img3 = Image('/res:' + os.getcwd() + '/' + back_img_path[2], height=80, margin='10px')
        self.img3.onclick.do(self.on_img3_clicked)
        bottom_container.append(self.img3)
        self.img4 = Image('/res:' + os.getcwd() + '/' + back_img_path[3], height=80, margin='10px')
        self.img4.onclick.do(self.on_img4_clicked)
        bottom_container.append(self.img4)
        # Second row of thumbnails
        bt_container = HBox(width=360, height=130, style={'margin': '0px auto'})
        self.img11 = Image('/res:' + os.getcwd() + '/' + back_img_path[4], height=80, margin='10px')
        self.img11.onclick.do(self.on_img11_clicked)
        bt_container.append(self.img11)
        self.img22 = Image('/res:' + os.getcwd() + '/' + back_img_path[5], height=80, margin='10px')
        self.img22.onclick.do(self.on_img22_clicked)
        bt_container.append(self.img22)
        self.img33 = Image('/res:' + os.getcwd() + '/' + back_img_path[6], height=80, margin='10px')
        self.img33.onclick.do(self.on_img33_clicked)
        bt_container.append(self.img33)
        self.img44 = Image('/res:' + os.getcwd() + '/' + back_img_path[7], height=80, margin='10px')
        self.img44.onclick.do(self.on_img44_clicked)
        bt_container.append(self.img44)
        main_container.append(bottom_container)
        main_container.append(bt_container)
        return main_container

    # Shared logic for the thumbnail handlers: load the chosen target face and record its index
    def _select_face(self, index):
        global faceimg, mod
        faceimg = cv2.imread(back_img_path[index])
        mod = index

    # Click handlers for the eight thumbnails
    def on_img1_clicked(self, widget):
        self._select_face(0)

    def on_img2_clicked(self, widget):
        self._select_face(1)

    def on_img3_clicked(self, widget):
        self._select_face(2)

    def on_img4_clicked(self, widget):
        self._select_face(3)

    def on_img11_clicked(self, widget):
        self._select_face(4)

    def on_img22_clicked(self, widget):
        self._select_face(5)

    def on_img33_clicked(self, widget):
        self._select_face(6)

    def on_img44_clicked(self, widget):
        self._select_face(7)

    # Button handlers that only switch the mode
    def on_button_pressed1(self, widget):
        global mod
        mod = 0

    def on_button_pressed2(self, widget):
        global mod
        mod = 1

    def on_button_pressed3(self, widget):
        global mod
        mod = 2

# Find the device number of a USB camera, if one is attached
def get_cap_id():
    try:
        # List the video4linux devices and use awk to extract the numbers of the USB-backed ones
        cmd = "ls -l /sys/class/video4linux | awk -F ' -> ' '/usb/{sub(/.*video/, \"\", $2); print $2}'"
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        output = result.stdout.strip().split()
        # Convert the captured numbers to integers and return the smallest
        video_numbers = list(map(int, output))
        if video_numbers:
            return min(video_numbers)
        else:
            return None
    except Exception as e:
        print(f"Error: {e}")
        return None

# Main processing function: runs the face transformation loop
def process():
    cvs.setCustomUI()
    # Initialize the face detection model
    inShape = [[1, 128, 128, 3]]
    outShape = [[1, 896, 16], [1, 896, 1]]
    model_path = "models/face_detection_front.tflite"
    model = aidlite.Model.create_instance(model_path)
    if model is None:
        print("Failed to create the face_detection_front model!")
        return
    model.set_model_properties(inShape, aidlite.DataType.TYPE_FLOAT32, outShape, aidlite.DataType.TYPE_FLOAT32)
    config = aidlite.Config.create_instance()
    config.implement_type = aidlite.ImplementType.TYPE_FAST
    config.framework_type = aidlite.FrameworkType.TYPE_TFLITE
    config.accelerate_type = aidlite.AccelerateType.TYPE_CPU
    config.number_of_threads = 4
    fast_interpreter = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model, config)
    if fast_interpreter is None:
        print("face_detection_front: build_interpretper_from_model_and_config failed!")
        return
    result = fast_interpreter.init()
    if result != 0:
        print("face_detection_front: interpreter init failed!")
        return
    result = fast_interpreter.load_model()
    if result != 0:
        print("face_detection_front: interpreter load_model failed!")
        return
    print("face_detection_front model loaded successfully!")
    # Initialize the face landmark model
    model_path1 = "models/face_landmark.tflite"
    inShape1 = [[1 * 192 * 192 * 3]]
    outShape1 = [[1 * 1404 * 4], [1 * 4]]
    model1 = aidlite.Model.create_instance(model_path1)
    if model1 is None:
        print("Failed to create the face_landmark model!")
        return
    model1.set_model_properties(inShape1, aidlite.DataType.TYPE_FLOAT32, outShape1, aidlite.DataType.TYPE_FLOAT32)
    config1 = aidlite.Config.create_instance()
    config1.implement_type = aidlite.ImplementType.TYPE_FAST
    config1.framework_type = aidlite.FrameworkType.TYPE_TFLITE
    config1.accelerate_type = aidlite.AccelerateType.TYPE_GPU
    config1.number_of_threads = 4
    fast_interpreter1 = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model1, config1)
    if fast_interpreter1 is None:
        print("face_landmark: build_interpretper_from_model_and_config failed!")
        return
    result = fast_interpreter1.init()
    if result != 0:
        print("face_landmark: interpreter init failed!")
        return
    result = fast_interpreter1.load_model()
    if result != 0:
        print("face_landmark: interpreter load_model failed!")
        return
    print("face_landmark model loaded successfully!")
    # Load the anchors
    anchors = np.load('models/anchors.npy').astype(np.float32)
    # 0: rear camera, 1: front camera
    camid = 1
    capId = get_cap_id()
    if capId is None:
        print("Using the MIPI camera")
    else:
        print("Using the USB camera")
        camid = -1
    cap = cvs.VideoCapture(camid)
    bFace = False
    x_min, y_min, x_max, y_max = (0, 0, 0, 0)
    fface = 0.0
    global bfirstframe
    bfirstframe = True
    facepath = "Biden.jpeg"
    global faceimg
    faceimg = cv2.resize(faceimg, (256, 256))
    roi_orifirst = faceimg
    padfaceimg = faceimg
    fpoints = []
    spoints = []
    global mod
    mod = -1
    while True:
        # Read a frame
        frame = cvs.read()
        if frame is None:
            continue
        if camid == 1:
            frame = cv2.flip(frame, 1)
        if mod > -1 or bfirstframe:
            x_min, y_min, x_max, y_max = (0, 0, 0, 0)
            faceimg = cv2.resize(faceimg, (256, 256))
            frame = faceimg
            bFace = False
            roi_orifirst = faceimg
            padfaceimg = faceimg
            bfirstframe = True
            fpoints = []
            spoints = []
        # Record the start time
        start_time = time.time()
        # Pad and preprocess the image
        img_pad, img, pad = preprocess_img_pad(frame, 128)
        if not bFace:
            # Set the input data
            result = fast_interpreter.set_input_tensor(0, img.data)
            if result != 0:
                print("face_detection_front model: interpreter set_input_tensor() failed")
            # Run inference
            result = fast_interpreter.invoke()
            if result != 0:
                print("face_detection_front model: interpreter invoke() failed")
            # Fetch the output data
            raw_boxes = fast_interpreter.get_output_tensor(0)
            if raw_boxes is None:
                print("sample: face_detection_front model interpreter->get_output_tensor(0) failed!")
            classificators = fast_interpreter.get_output_tensor(1)
            if classificators is None:
                print("sample: face_detection_front model interpreter->get_output_tensor(1) failed!")
            # Run face detection (decode the boxes, then apply non-maximum suppression)
            detections = blazeface(raw_boxes, classificators, anchors)[0]
            if len(detections) > 0:
                bFace = True
        if bFace:
            for i in range(len(detections)):
                # Convert the normalized face box to pixel coordinates
                ymin = detections[i][0] * img_pad.shape[0]
                xmin = detections[i][1] * img_pad.shape[1]
                ymax = detections[i][2] * img_pad.shape[0]
                xmax = detections[i][3] * img_pad.shape[1]
                w = int(xmax - xmin)
                h = int(ymax - ymin)
                # Use a square crop whose side is 1.5x the longer box side
                h = max(w, h)
                h = h * 1.5
                x = (xmin + xmax) / 2.
                y = (ymin + ymax) / 2.
                xmin = x - h / 2.
                xmax = x + h / 2.
                # Shift the crop upwards by 8% of its side
                ymin = y - h / 2. - 0.08 * h
                ymax = y + h / 2. - 0.08 * h
                x_min = int(xmin)
                y_min = int(ymin)
                x_max = int(xmax)
                y_max = int(ymax)
                # Clamp the crop to the padded image bounds
                x_min = max(0, x_min)
                y_min = max(0, y_min)
                x_max = min(img_pad.shape[1], x_max)
                y_max = min(img_pad.shape[0], y_max)
                roi_ori = img_pad[y_min:y_max, x_min:x_max]
                roi = preprocess_image_for_tflite32(roi_ori, 192)
                # Set the input data
                result = fast_interpreter1.set_input_tensor(0, roi.data)
                if result != 0:
                    print("face_landmark model: interpreter set_input_tensor() failed")
                # Run inference
                result = fast_interpreter1.invoke()
                if result != 0:
                    print("face_landmark model: interpreter invoke() failed")
                # Fetch the output data
                mesh = fast_interpreter1.get_output_tensor(0)
                if mesh is None:
                    print("sample: face_landmark model interpreter->get_output_tensor(0) failed!")
                stride8 = fast_interpreter1.get_output_tensor(1)
                if stride8 is None:
                    print("sample: face_landmark model interpreter->get_output_tensor(1) failed!")
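                # The second model output is used here as a per-frame face score
                ffacetmp = stride8[0]
                print('fface:', abs(fface - ffacetmp))
                # A jump of more than 0.5 between frames forces a fresh detection pass
                if abs(fface - ffacetmp) > 0.5:
                    bFace = False
                fface = ffacetmp
                spoints = []
                # Reshape to 468 (x, y, z) landmarks and normalize by the 192-pixel input size
                mesh = mesh.reshape(468, 3) / 192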
                if bfirstframe:
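                    # Get the keypoints of the reference face (first frame)
                    getkeypoint(roi_ori, mesh, fpoints)
                    roi_orifirst = roi_ori.copy()
                    bfirstframe = False
                    mod = -1
                else:
                    # Get the keypoints of the face in the current frame
                    getkeypoint(roi_ori, mesh, spoints)
                    # Apply the face transformation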
                    roi_ori = faceswap(fpoints, spoints, roi_orifirst, roi_ori)
                    if roi_ori is None:
                        continue
                    img_pad[y_min:y_max, x_min:x_max] = roi_ori
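                # Crop the padded image back to the original frame size (row center first, then column center)
                shape = frame.shape
                cy, cx = img_pad.shape[0] / 2, img_pad.shape[1] / 2
                frame = img_pad[int(cy - shape[0] / 2):int(cy + shape[0] / 2), int(cx - shape[1] / 2):int(cx + shape[1] / 2)]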
        # Compute the processing time
        t = (time.time() - start_time)
        lbs = 'Fps: ' + str(int(100 / t) / 100.) + " ~~ Time:" + str(t * 1000) + "ms"
        cvs.setLbs(lbs)
        # Display the frame
        cvs.imshow(frame)
        # Sleep for 1 ms
        time.sleep(0.001)

if __name__ == '__main__':
    initcv(startcv, MyApp)
    process()
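
The bounding-box handling inside the loop of the listing (normalized detection, square crop enlarged 1.5x, shifted up by 8% of its side, then clamped to the padded image) can be read more easily when isolated. The following is only a minimal sketch: the function name `expand_face_box` and its parameters are hypothetical and do not appear in the original code, although the default constants mirror the ones used in the loop.

```python
def expand_face_box(det, img_h, img_w, scale=1.5, shift_up=0.08):
    """Turn a normalized (ymin, xmin, ymax, xmax) box into a clamped square crop.

    Hypothetical helper: it mirrors the arithmetic of the main loop,
    but is not part of the original listing.
    """
    # Scale the normalized coordinates to pixels
    ymin, xmin = det[0] * img_h, det[1] * img_w
    ymax, xmax = det[2] * img_h, det[3] * img_w
    # Square side: the longer box side, enlarged by `scale`
    side = max(int(xmax - xmin), int(ymax - ymin)) * scale
    cx = (xmin + xmax) / 2.0
    cy = (ymin + ymax) / 2.0
    x_min = int(cx - side / 2.0)
    x_max = int(cx + side / 2.0)
    # Move the crop upwards by a fraction of its side
    y_min = int(cy - side / 2.0 - shift_up * side)
    y_max = int(cy + side / 2.0 - shift_up * side)
    # Clamp to the image bounds
    return max(0, x_min), max(0, y_min), min(img_w, x_max), min(img_h, y_max)


if __name__ == "__main__":
    # A box in the top-left quarter of a 200x200 image gets clamped at 0
    print(expand_face_box((0.0, 0.0, 0.5, 0.5), 200, 200))
```

The returned tuple is ordered `(x_min, y_min, x_max, y_max)`, so a crop would be taken as `img[y_min:y_max, x_min:x_max]`, the same slicing order the listing uses for `roi_ori`.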

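
After the face region is written back, the listing cuts the padded image down to the original frame size by slicing around the image center. The sketch below illustrates that round trip with NumPy. `preprocess_img_pad` is not shown in this section, so `pad_to_square` is an assumed stand-in (symmetric zero padding to a square), and both function names are hypothetical.

```python
import numpy as np


def pad_to_square(frame):
    """Assumed stand-in for the padding step: center the frame on a black square."""
    h, w = frame.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    padded = np.zeros((side, side) + frame.shape[2:], dtype=frame.dtype)
    padded[top:top + h, left:left + w] = frame
    return padded


def center_crop(padded, out_h, out_w):
    """Cut an out_h x out_w window around the center of the padded image."""
    cy, cx = padded.shape[0] / 2, padded.shape[1] / 2
    return padded[int(cy - out_h / 2):int(cy + out_h / 2),
                  int(cx - out_w / 2):int(cx + out_w / 2)]


if __name__ == "__main__":
    frame = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
    padded = pad_to_square(frame)
    restored = center_crop(padded, frame.shape[0], frame.shape[1])
    print(padded.shape, restored.shape)
```

Under that padding assumption the crop recovers the original frame exactly, which is why the loop can paste the transformed face into the padded image and still display a frame of the camera's native size.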