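Note: to avoid copyright infringement, all sample images in this article were generated with AIGC.
AI color matching (AI追色) is one of the more popular features of the 像素蛋糕 (PixCake) software. This article looks at the likely technical principles behind it.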
Feature analysis
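AI color matching can be understood as a variant, or an upgraded version, of color transfer. Its main use is to transfer the color information of a reference image onto the user's image, producing a result with a similar color style. In PixCake, the interface looks like this:
AI color matching in PixCake has two modes: full-image color matching and local color matching.
Full-image color matching, shown in the interface above, has three parameters: an overall strength parameter, a tone parameter, and a color parameter.
Local color matching covers the subject and the background, plus finer regions: facial skin, body skin, hair, eyebrows, eye whites, pupils, lips, teeth, mouth interior, clothing, and associated objects. Each of these has two parameters, tone and color, as shown below:
The full-image color matching result in PixCake looks like this: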
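Based on the interface above and on functional testing, the characteristics of PixCake's AI color matching can be summarized as follows:
1. Color matching consists of two functions: tone and color. Tone can be understood as light and shadow and is related to brightness; color is the chromatic information.
2. AI color matching splits the image into two parts, background and subject. The subject is the portrait region (obtained by portrait segmentation or matting).
3. The subject region supports finer adjustment. It currently offers color matching for 11 sub-regions: facial skin, body skin, hair, eyebrows, eye whites, pupils, lips, teeth, mouth interior, clothing, and associated objects.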
Algorithm approach
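PixCake officially states that it uses AI color matching technology, which suggests that the tone and color transfer may be done by an AI model. Overall, the feature is built on portrait segmentation plus fine-grained human-attribute segmentation. In short, it is color transfer plus segmentation, that is, color transfer restricted to segmented target regions. Papers and patents on this kind of technique have existed for a long time, so it is not new. PixCake applied it to the concrete scenario of studio photo retouching and, together with the corresponding packaging and promotion, achieved a fairly good market response.
Here I present, from a personal perspective, a relatively simple implementation that simulates PixCake's functionality and results. Note that this is only a simulation: the AI model is a black box, so a true replica is basically impossible.
My approach is as follows:
1. Modify the classic color transfer algorithm from the paper "Color Transfer between Images" so that tone and color can be adjusted independently. Here tone is simplified to a brightness adjustment (readers can explore dedicated tone-adjustment algorithms on their own), which corresponds to adjusting the L channel and the a/b channels of the classic method separately.
2. Add target segmentation so that the algorithm transfers color and tone within a given region. For this test, three segmentation regions are used for validation: background, skin, and other.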
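The two steps above can be written compactly. The following is a sketch in notation introduced here purely for illustration (the symbols $s_L$, $s_C$ and $m$ are assumptions, not taken from the paper): $\mu$ and $\sigma$ are the per-channel mean and standard deviation in Lab space over the region of interest, the subscript tgt denotes the image being edited (matching the `tgt_` variables in the code), ref denotes the reference image, $s_L, s_C \in [0, 1]$ are the tone and color strengths, and $m \in [0, 1]$ is the mask weight of the pixel.

```latex
\begin{aligned}
\hat{c} &= \mu^{c}_{\mathrm{ref}}
          + \frac{\sigma^{c}_{\mathrm{ref}}}{\sigma^{c}_{\mathrm{tgt}}}
            \bigl(c - \mu^{c}_{\mathrm{tgt}}\bigr),
          \qquad c \in \{L, a, b\} \\
L' &= (1 - s_L)\,L + s_L\,\hat{L} \\
a' &= (1 - s_C)\,a + s_C\,\hat{a}, \qquad
b' = (1 - s_C)\,b + s_C\,\hat{b} \\
\mathrm{RGB}_{\mathrm{out}} &= (1 - m)\,\mathrm{RGB}_{\mathrm{in}}
          + m\,\mathrm{RGB}(L', a', b')
\end{aligned}
```

With $s_L = s_C = 1$ and $m = 1$ everywhere, this reduces to the classic mean and standard deviation matching.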
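Validation:
Note: this article generally uses C for algorithm validation.
1. Implement the classic "Color Transfer between Images" algorithm (the listing below works in CIE Lab, whereas the paper uses the lαβ space):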
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
// Coefficient matrix for the RGB -> XYZ conversion
// (kept for reference; the functions below inline these values)
#define RGB_TO_XYZ_MATRIX \
    0.412453, 0.357580, 0.180423, \
    0.212671, 0.715160, 0.072169, \
    0.019334, 0.119193, 0.950227
// Coefficient matrix for the XYZ -> RGB conversion
#define XYZ_TO_RGB_MATRIX \
    3.240479, -1.537150, -0.498535, \
    -0.969256, 1.875992, 0.041556, \
    0.055648, -0.204043, 1.057311
// Reference white point (D65)
#define REF_X 0.95047
#define REF_Y 1.00000
#define REF_Z 1.08883
// Convert sRGB (components in [0, 1]) to CIE Lab
void rgb_to_lab(double r, double g, double b, double *L, double *a, double *b_) {
    // sRGB -> linear RGB
    double r_lin = r > 0.04045 ? pow((r + 0.055) / 1.055, 2.4) : r / 12.92;
    double g_lin = g > 0.04045 ? pow((g + 0.055) / 1.055, 2.4) : g / 12.92;
    double b_lin = b > 0.04045 ? pow((b + 0.055) / 1.055, 2.4) : b / 12.92;
    // Linear RGB -> XYZ
    double X = r_lin * 0.412453 + g_lin * 0.357580 + b_lin * 0.180423;
    double Y = r_lin * 0.212671 + g_lin * 0.715160 + b_lin * 0.072169;
    double Z = r_lin * 0.019334 + g_lin * 0.119193 + b_lin * 0.950227;
    // Normalize XYZ by the reference white point
    X /= REF_X;
    Y /= REF_Y;
    Z /= REF_Z;
    // XYZ -> Lab
    double fX = X > 0.008856 ? pow(X, 1.0/3.0) : (7.787 * X) + (16.0/116.0);
    double fY = Y > 0.008856 ? pow(Y, 1.0/3.0) : (7.787 * Y) + (16.0/116.0);
    double fZ = Z > 0.008856 ? pow(Z, 1.0/3.0) : (7.787 * Z) + (16.0/116.0);
    *L = (116.0 * fY) - 16.0;
    *a = 500.0 * (fX - fY);
    *b_ = 200.0 * (fY - fZ);
}
// Convert CIE Lab back to sRGB (components clamped to [0, 1])
void lab_to_rgb(double L, double a, double b_, double *r, double *g, double *b) {
    // Lab -> XYZ
    double fY = (L + 16.0) / 116.0;
    double fX = a / 500.0 + fY;
    double fZ = fY - b_ / 200.0;
    double Y = pow(fY, 3.0) > 0.008856 ? pow(fY, 3.0) : (fY - 16.0/116.0) / 7.787;
    double X = pow(fX, 3.0) > 0.008856 ? pow(fX, 3.0) : (fX - 16.0/116.0) / 7.787;
    double Z = pow(fZ, 3.0) > 0.008856 ? pow(fZ, 3.0) : (fZ - 16.0/116.0) / 7.787;
    // Undo the white-point normalization
    X *= REF_X;
    Y *= REF_Y;
    Z *= REF_Z;
    // XYZ -> linear RGB
    double r_lin = X * 3.240479 + Y * -1.537150 + Z * -0.498535;
    double g_lin = X * -0.969256 + Y * 1.875992 + Z * 0.041556;
    double b_lin = X * 0.055648 + Y * -0.204043 + Z * 1.057311;
    // Linear RGB -> sRGB
    *r = r_lin > 0.0031308 ? 1.055 * pow(r_lin, 1.0/2.4) - 0.055 : 12.92 * r_lin;
    *g = g_lin > 0.0031308 ? 1.055 * pow(g_lin, 1.0/2.4) - 0.055 : 12.92 * g_lin;
    *b = b_lin > 0.0031308 ? 1.055 * pow(b_lin, 1.0/2.4) - 0.055 : 12.92 * b_lin;
    // Clamp to the [0, 1] range
    *r = (*r < 0.0) ? 0.0 : (*r > 1.0) ? 1.0 : *r;
    *g = (*g < 0.0) ? 0.0 : (*g > 1.0) ? 1.0 : *g;
    *b = (*b < 0.0) ? 0.0 : (*b > 1.0) ? 1.0 : *b;
}
// Compute the per-channel mean and standard deviation of a BGRA image in Lab space
void compute_lab_stats(unsigned char* bgraData, int width, int height, int stride,
                       double *meanL, double *meana, double *meanb,
                       double *stdL, double *stda, double *stdb) {
    double sumL = 0.0, suma = 0.0, sumb = 0.0;
    double sumL2 = 0.0, suma2 = 0.0, sumb2 = 0.0;
    int count = 0;
    for (int y = 0; y < height; y++) {
        unsigned char* row = bgraData + y * stride;
        for (int x = 0; x < width; x++) {
            unsigned char* pixel = row + x * 4;
            double b = pixel[0] / 255.0;
            double g = pixel[1] / 255.0;
            double r = pixel[2] / 255.0;
            double L, a, b_;
            rgb_to_lab(r, g, b, &L, &a, &b_);
            sumL += L;
            suma += a;
            sumb += b_;
            sumL2 += L * L;
            suma2 += a * a;
            sumb2 += b_ * b_;
            count++;
        }
    }
    if (count == 0) {
        *meanL = *meana = *meanb = 0.0;
        *stdL = *stda = *stdb = 0.0;
        return;
    }
    // Means
    *meanL = sumL / count;
    *meana = suma / count;
    *meanb = sumb / count;
    // Sample standard deviations. Guard against count == 1 (division by zero)
    // and against tiny negative variances caused by rounding.
    double denom = (count > 1) ? (double)(count - 1) : 1.0;
    double varL = (sumL2 - sumL * sumL / count) / denom;
    double vara = (suma2 - suma * suma / count) / denom;
    double varb = (sumb2 - sumb * sumb / count) / denom;
    *stdL = sqrt(varL > 0.0 ? varL : 0.0);
    *stda = sqrt(vara > 0.0 ? vara : 0.0);
    *stdb = sqrt(varb > 0.0 ? varb : 0.0);
}
// Main color transfer function.
// bgraData is modified in place; ratio is the transfer strength in 0..100.
void color_transfer(unsigned char* bgraData, int width, int height, int stride,
                    unsigned char* referenceData, int mWidth, int mHeight, int mStride, int ratio) {
    if (!bgraData || !referenceData || width <= 0 || height <= 0 || mWidth <= 0 || mHeight <= 0 ||
        ratio < 0 || ratio > 100) {
        return;
    }
    // Statistics of the target image (the image being modified)
    double tgt_meanL, tgt_meana, tgt_meanb;
    double tgt_stdL, tgt_stda, tgt_stdb;
    compute_lab_stats(bgraData, width, height, stride,
                      &tgt_meanL, &tgt_meana, &tgt_meanb,
                      &tgt_stdL, &tgt_stda, &tgt_stdb);
    // Statistics of the reference image
    double ref_meanL, ref_meana, ref_meanb;
    double ref_stdL, ref_stda, ref_stdb;
    compute_lab_stats(referenceData, mWidth, mHeight, mStride,
                      &ref_meanL, &ref_meana, &ref_meanb,
                      &ref_stdL, &ref_stda, &ref_stdb);
    // Transfer strength as a factor in [0, 1]
    double scale = ratio / 100.0;
    // Apply the color transfer to every pixel
    for (int y = 0; y < height; y++) {
        unsigned char* row = bgraData + y * stride;
        for (int x = 0; x < width; x++) {
            unsigned char* pixel = row + x * 4;
            double b = pixel[0] / 255.0;
            double g = pixel[1] / 255.0;
            double r = pixel[2] / 255.0;
            // Convert to Lab space
            double L, a, b_;
            rgb_to_lab(r, g, b, &L, &a, &b_);
            // Match the mean and standard deviation of the reference image
            double newL = ref_meanL + (ref_stdL / (tgt_stdL > 0 ? tgt_stdL : 1.0)) * (L - tgt_meanL);
            double newa = ref_meana + (ref_stda / (tgt_stda > 0 ? tgt_stda : 1.0)) * (a - tgt_meana);
            double newb = ref_meanb + (ref_stdb / (tgt_stdb > 0 ? tgt_stdb : 1.0)) * (b_ - tgt_meanb);
            // Blend with the original Lab values according to the strength.
            // Note: the b channel must be blended with b_ (Lab), not b (blue).
            double new_r, new_g, new_b;
            newL = newL * scale + (1.0 - scale) * L;
            newa = newa * scale + (1.0 - scale) * a;
            newb = newb * scale + (1.0 - scale) * b_;
            // Convert back to RGB space (lab_to_rgb clamps to [0, 1])
            lab_to_rgb(newL, newa, newb, &new_r, &new_g, &new_b);
            // Write the pixel back, rounding to the nearest integer
            pixel[0] = (unsigned char)(new_b * 255.0 + 0.5);
            pixel[1] = (unsigned char)(new_g * 255.0 + 0.5);
            pixel[2] = (unsigned char)(new_r * 255.0 + 0.5);
        }
    }
}
Result:
As can be seen, applying the classic algorithm directly gives a rather poor result, and the gap from PixCake is considerable.
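For reference, the per-channel mapping at the core of the classic algorithm reduces to a few lines. The sketch below isolates it for a single value; the function name `transfer_channel` and the 0-to-1 `strength` parameter are illustrative assumptions, not part of the listing above.

```c
/* Maps one channel value from the source statistics to the reference
   statistics, then blends with the original value by `strength` in [0, 1].
   A zero source deviation falls back to a divisor of 1, as in the listing. */
double transfer_channel(double value,
                        double src_mean, double src_std,
                        double ref_mean, double ref_std,
                        double strength) {
    double gain = ref_std / (src_std > 0.0 ? src_std : 1.0);
    double matched = ref_mean + gain * (value - src_mean);
    return value + strength * (matched - value);
}
```

For example, with source statistics (mean 40, standard deviation 10) and reference statistics (mean 60, standard deviation 20), a value of 50 maps to 60 + 2 × 10 = 80 at full strength and to 65 at half strength.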
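2. Next, extend the algorithm to transfer color and brightness within a segmentation mask region, and add a brightness (tone) parameter luminance_ratio and a color parameter chrominance_ratio: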
// The #include lines, the matrix and white-point macros, and rgb_to_lab() / lab_to_rgb()
// are identical to the first listing above and are reused unchanged.
// Compute the mask-weighted mean and standard deviation in Lab space.
// maskData is a single-channel 8-bit mask whose rows are `width` bytes apart.
void compute_lab_stats_mask(unsigned char* bgraData, int width, int height, int stride, unsigned char* maskData,
                            double *meanL, double *meana, double *meanb,
                            double *stdL, double *stda, double *stdb) {
    double sumL = 0.0, suma = 0.0, sumb = 0.0;
    double sumL2 = 0.0, suma2 = 0.0, sumb2 = 0.0;
    double wsum = 0.0; // total mask weight
    for (int y = 0; y < height; y++) {
        unsigned char* row = bgraData + y * stride;
        unsigned char* maskRow = maskData + y * width;
        for (int x = 0; x < width; x++) {
            // Use the mask value as the weight of this pixel
            double weight = maskRow[x] / 255.0;
            if (weight > 0) {
                unsigned char* pixel = row + x * 4;
                double b = pixel[0] / 255.0;
                double g = pixel[1] / 255.0;
                double r = pixel[2] / 255.0;
                double L, a, b_;
                rgb_to_lab(r, g, b, &L, &a, &b_);
                sumL += L * weight;
                suma += a * weight;
                sumb += b_ * weight;
                sumL2 += L * L * weight;
                suma2 += a * a * weight;
                sumb2 += b_ * b_ * weight;
                wsum += weight;
            }
        }
    }
    if (wsum <= 0.0) {
        *meanL = *meana = *meanb = 0.0;
        *stdL = *stda = *stdb = 0.0;
        return;
    }
    // Weighted means: normalize by the total weight, not by the pixel count,
    // otherwise soft (partially transparent) mask pixels bias the result
    *meanL = sumL / wsum;
    *meana = suma / wsum;
    *meanb = sumb / wsum;
    // Weighted (population) standard deviations, clamped against rounding error
    double varL = sumL2 / wsum - (*meanL) * (*meanL);
    double vara = suma2 / wsum - (*meana) * (*meana);
    double varb = sumb2 / wsum - (*meanb) * (*meanb);
    *stdL = sqrt(varL > 0.0 ? varL : 0.0);
    *stda = sqrt(vara > 0.0 ? vara : 0.0);
    *stdb = sqrt(varb > 0.0 ? varb : 0.0);
}
// Main mask-based color transfer function.
// bgraData is modified in place. maskData and referenceMaskData are single-channel
// 8-bit masks with row strides of width and mWidth bytes respectively.
// luminance_ratio (tone) and chrominance_ratio (color) are strengths in 0..100.
void color_transfer_mask(unsigned char* bgraData, int width, int height, int stride, unsigned char* maskData,
                         unsigned char* referenceData, int mWidth, int mHeight, int mStride, unsigned char* referenceMaskData,
                         int luminance_ratio, int chrominance_ratio) {
    if (!bgraData || !referenceData || !maskData || !referenceMaskData ||
        width <= 0 || height <= 0 || mWidth <= 0 || mHeight <= 0 ||
        luminance_ratio < 0 || luminance_ratio > 100 ||
        chrominance_ratio < 0 || chrominance_ratio > 100) {
        return;
    }
    // Mask-weighted statistics of the target image (the image being modified)
    double tgt_meanL, tgt_meana, tgt_meanb;
    double tgt_stdL, tgt_stda, tgt_stdb;
    compute_lab_stats_mask(bgraData, width, height, stride, maskData,
                           &tgt_meanL, &tgt_meana, &tgt_meanb,
                           &tgt_stdL, &tgt_stda, &tgt_stdb);
    // Mask-weighted statistics of the reference image
    double ref_meanL, ref_meana, ref_meanb;
    double ref_stdL, ref_stda, ref_stdb;
    compute_lab_stats_mask(referenceData, mWidth, mHeight, mStride, referenceMaskData,
                           &ref_meanL, &ref_meana, &ref_meanb,
                           &ref_stdL, &ref_stda, &ref_stdb);
    // Transfer strengths in [0, 1], controlled separately for luminance and chrominance
    double luminance_scale = luminance_ratio / 100.0;
    double chrominance_scale = chrominance_ratio / 100.0;
    // Transfer every pixel, then blend with the original according to the mask value
    for (int y = 0; y < height; y++) {
        unsigned char* row = bgraData + y * stride;
        unsigned char* maskRow = maskData + y * width;
        for (int x = 0; x < width; x++) {
            // Mask value converted to a weight in [0, 1]
            double mask_weight = maskRow[x] / 255.0;
            // Original pixel values
            unsigned char* pixel = row + x * 4;
            double orig_b = pixel[0] / 255.0;
            double orig_g = pixel[1] / 255.0;
            double orig_r = pixel[2] / 255.0;
            // Convert to Lab space
            double L, a, b_;
            rgb_to_lab(orig_r, orig_g, orig_b, &L, &a, &b_);
            // Luminance transfer (mean and standard deviation matching)
            double newL = ref_meanL +
                          (ref_stdL / (tgt_stdL > 0 ? tgt_stdL : 1.0)) * (L - tgt_meanL);
            // Chrominance transfer
            double newa = ref_meana +
                          (ref_stda / (tgt_stda > 0 ? tgt_stda : 1.0)) * (a - tgt_meana);
            double newb = ref_meanb +
                          (ref_stdb / (tgt_stdb > 0 ? tgt_stdb : 1.0)) * (b_ - tgt_meanb);
            // Blend with the original Lab values using the two strengths
            double trans_r, trans_g, trans_b;
            newL = newL * luminance_scale + (1.0 - luminance_scale) * L;
            newa = newa * chrominance_scale + (1.0 - chrominance_scale) * a;
            newb = newb * chrominance_scale + (1.0 - chrominance_scale) * b_;
            // Convert back to RGB space
            lab_to_rgb(newL, newa, newb, &trans_r, &trans_g, &trans_b);
            // Blend the original and transferred colors according to the mask weight
            double final_r = orig_r + mask_weight * (trans_r - orig_r);
            double final_g = orig_g + mask_weight * (trans_g - orig_g);
            double final_b = orig_b + mask_weight * (trans_b - orig_b);
            // Clamp to the [0, 1] range
            final_r = (final_r < 0.0) ? 0.0 : (final_r > 1.0) ? 1.0 : final_r;
            final_g = (final_g < 0.0) ? 0.0 : (final_g > 1.0) ? 1.0 : final_g;
            final_b = (final_b < 0.0) ? 0.0 : (final_b > 1.0) ? 1.0 : final_b;
            // Write the pixel back, rounding to the nearest integer
            pixel[0] = (unsigned char)(final_b * 255.0 + 0.5);
            pixel[1] = (unsigned char)(final_g * 255.0 + 0.5);
            pixel[2] = (unsigned char)(final_r * 255.0 + 0.5);
        }
    }
}
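3. Segment the original image to obtain a background mask, a skin mask, and a mask for the remaining regions, as shown below:
Note that if you do not want to process the facial features, you can exclude them.
4. Call the color transfer algorithm from step 2 in the order background, other, skin, feeding each result image in as the input of the next call. The purpose is to reduce unnatural colors along region boundaries caused by inaccurate segmentation. If your segmentation is accurate, the masks can instead be processed in parallel. The test result is as follows:
5. On the result of step 4, add a global strength parameter k and alpha-blend the original image with the result.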
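Steps 4 and 5 can be sketched on a toy buffer. In the example below, a single float channel stands in for the BGRA image, and `masked_pass` is a hypothetical stand-in for the masked transfer of step 2 (it simply pulls samples toward a target value); all function names here are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for one masked transfer pass (step 2): pulls each
   sample toward `target` in proportion to its mask weight (0..255). */
static void masked_pass(float *img, const unsigned char *mask,
                        size_t n, float target) {
    for (size_t i = 0; i < n; i++) {
        float w = mask[i] / 255.0f;
        img[i] += w * (target - img[i]);
    }
}

/* Step 4: run the passes one after another on the same buffer, so every
   pass starts from the output of the previous one. */
void run_passes_in_order(float *img, size_t n,
                         const unsigned char *const *masks,
                         const float *targets, size_t num_masks) {
    for (size_t k = 0; k < num_masks; k++) {
        masked_pass(img, masks[k], n, targets[k]);
    }
}

/* Step 5: alpha-blend the processed buffer with the untouched original.
   k is the global strength in 0..100 (values outside are clamped). */
void blend_with_original(const float *orig, float *result, size_t n, int k) {
    if (k < 0) k = 0;
    if (k > 100) k = 100;
    float alpha = k / 100.0f;
    for (size_t i = 0; i < n; i++) {
        result[i] = orig[i] + alpha * (result[i] - orig[i]);
    }
}
```

Because the passes are chained, a sample on a soft boundary that belongs partly to two masks ends up between the two region results instead of being written twice from the same input. The caller keeps a copy of the original buffer before step 4 so that step 5 has something to blend against.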
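This completes a simple implementation of color matching. It exposes three parameters that correspond to those in PixCake: the global strength k, the tone parameter, and the color parameter. For finer segmentation regions, a fine-grained segmentation model such as SAM can be used; this is not covered in detail here. With more segmentation regions, each region mask can be given its own parameters for controlling the effect.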
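One possible way to organize per-region parameters is a small lookup table. The sketch below is only an illustration: the `RegionParams` type, the `find_region_params` helper, and the numeric values are all hypothetical and are neither tuned nor taken from 像素蛋糕.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-region settings: one tone and one color strength
   (0..100) for each segmentation mask. */
typedef struct {
    const char *region;     /* label of the segmentation region */
    int luminance_ratio;    /* tone strength, 0..100 */
    int chrominance_ratio;  /* color strength, 0..100 */
} RegionParams;

/* Example values only, listed in the processing order of step 4. */
static const RegionParams kRegionParams[] = {
    { "background", 100, 100 },
    { "other",       80, 100 },
    { "skin",        50,  70 },
};

/* Returns the settings for `region`, or NULL if the label is unknown. */
const RegionParams *find_region_params(const char *region) {
    size_t n = sizeof(kRegionParams) / sizeof(kRegionParams[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(kRegionParams[i].region, region) == 0) {
            return &kRegionParams[i];
        }
    }
    return NULL;
}
```

Each entry would supply the luminance_ratio and chrominance_ratio arguments for one masked transfer call, so adding a region such as hair or clothing only means adding a mask and a table row.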
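Finally, here are the comparison images:
Note: there is a slight blue artifact in the hair region, caused by insufficient segmentation accuracy. PixCake's segmentation is very well done.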