Restoring JSON Data to a PPT File: A Python Automation Tool Explained

Published: 2025-04-02

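Introduction

In the previous post, we implemented a parser that converts a PPT file into a JSON structure. Now we build its inverse: a tool that generates a PPT file from JSON data. This is useful for automated report generation, style reuse, and data-driven PPT creation. This post walks through the implementation and its key steps.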


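Core Code Walkthrough

1. Color and alignment conversion functions

These helpers convert the colors and alignment values stored in the JSON into objects that the python-pptx API accepts.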

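hex_to_rgb(color)

Converts a hex color string (such as #FF0000) into an RGBColor object. A list or tuple of three integers, such as [255, 0, 0], is also accepted:

def hex_to_rgb(color):
    if not color:
        return None
    if isinstance(color, str):
        # "#FF0000" -> RGBColor(0xFF, 0x00, 0x00)
        return RGBColor.from_string(color.lstrip("#"))
    return RGBColor(*color)

get_alignment(alignment_str)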

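Converts an alignment string (such as "PP_ALIGN.CENTER") into the corresponding enum value; unrecognized values fall back to left alignment: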

def get_alignment(alignment_str):
    if alignment_str == "PP_ALIGN.LEFT":
        return PP_ALIGN.LEFT
    elif alignment_str == "PP_ALIGN.CENTER":
        return PP_ALIGN.CENTER
    elif alignment_str == "PP_ALIGN.RIGHT":
        return PP_ALIGN.RIGHT
    return PP_ALIGN.LEFT
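
The color conversion can be illustrated without python-pptx. The following is a minimal sketch, assuming colors are written as #RRGGBB strings as in the sample JSON of this post. The helper names parse_hex_color and format_hex_color are hypothetical and not part of the original tool; the tuple returned by the first one can be unpacked into RGBColor(*...).

```python
def parse_hex_color(hex_color):
    """Convert '#RRGGBB' (leading '#' optional) to an (r, g, b) tuple of ints."""
    value = hex_color.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_color!r}")
    # int(..., 16) raises ValueError if a pair is not valid hexadecimal
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def format_hex_color(rgb):
    """Inverse of parse_hex_color: (255, 0, 0) -> '#FF0000'."""
    r, g, b = rgb
    return f"#{r:02X}{g:02X}{b:02X}"


print(parse_hex_color("#FF0000"))         # (255, 0, 0)
print(format_hex_color((255, 255, 255)))  # #FFFFFF
```

Keeping both directions side by side makes it easy to check that whatever encoding the parser writes is the one the generator reads back.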

2. Creating shapes: create_shape

Creates the matching PPT shape for each shape type in the JSON data (text box, table, picture, and so on).

Key logic:
def create_shape(slide, shape_data):
    shape_type = shape_data["type"]
    left = Emu(shape_data["left"])
    top = Emu(shape_data["top"])
    width = Emu(shape_data["width"])
    height = Emu(shape_data["height"])

    if shape_type == MSO_SHAPE_TYPE.TEXT_BOX:
        shape = slide.shapes.add_textbox(left, top, width, height)
    elif shape_type == MSO_SHAPE_TYPE.TABLE:
        table_data = shape_data.get("table") or [[]]
        rows = len(table_data)
        cols = max(1, max(len(row) for row in table_data))
        # Return the graphic frame (not its .table) so callers can reach shape.table
        shape = slide.shapes.add_table(rows, cols, left, top, width, height)
    elif shape_type == MSO_SHAPE_TYPE.PICTURE:
        image_path = "path/to/your/image.png"  # replace with the actual path
        shape = slide.shapes.add_picture(image_path, left, top, width, height)
    else:
        shape = slide.shapes.add_shape(
            MSO_SHAPE.RECTANGLE,  # default shape
            left,
            top,
            width,
            height
        )
    # Rotation is not meaningful for a table frame, so only set it on other shapes
    rotation = shape_data.get("rotation", 0)
    if rotation and shape_type != MSO_SHAPE_TYPE.TABLE:
        shape.rotation = rotation
    return shape
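
The position and size values passed to Emu are English Metric Units, which are hard to read at a glance. The sketch below converts them to inches and points using the standard EMU ratios (914400 per inch, 12700 per point). The helper names are hypothetical, and the sample values are taken from the JSON example in this post.

```python
EMU_PER_INCH = 914400
EMU_PER_PT = 12700


def emu_to_inches(emu):
    """Convert EMU to inches."""
    return emu / EMU_PER_INCH


def inches_to_emu(inches):
    """Convert inches to the nearest whole EMU."""
    return round(inches * EMU_PER_INCH)


def emu_to_pt(emu):
    """Convert EMU to points (useful for line widths)."""
    return emu / EMU_PER_PT


box = {"left": 1143000, "top": 1143000, "width": 6858000, "height": 1683600}
for key, value in box.items():
    print(f"{key}: {emu_to_inches(value):.2f} in")
```

For the sample text box this gives a left and top offset of 1.25 in and a width of 7.50 in.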

3. Applying styles: apply_style

Applies the fill and line (border) styles from the JSON to a shape.

Fill:
fill.solid()  # fill.type is read-only in python-pptx; call fill.background() for no fill
fill.fore_color.rgb = hex_to_rgb(fill_data["color"])
Line:
line.color.rgb = hex_to_rgb(line_data["color"])
line.width = Emu(line_data["width"])
line.dash_style = getattr(MSO_LINE_DASH_STYLE, line_data["dash_style"])

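4. Applying text styles: apply_text_style

Builds the text frame content from the font and paragraph settings in the JSON.

Example:
def apply_text_style(text_frame, text_style_data):
    for index, paragraph_data in enumerate(text_style_data.get("paragraphs", [])):
        # A new text frame already contains one empty paragraph; reuse it first
        paragraph = text_frame.paragraphs[0] if index == 0 else text_frame.add_paragraph()
        paragraph.alignment = get_alignment(paragraph_data.get("alignment"))
        runs = paragraph_data.get("runs", [])
        if not any(run_data.get("text") for run_data in runs):
            # No run carries text: fall back to the paragraph-level text
            paragraph.text = paragraph_data.get("text", "")
        for run_data in runs:
            run = paragraph.add_run()
            run.text = run_data.get("text", "")
            font_data = run_data.get("font", {})
            font = run.font
            if font_data.get("name"):
                font.name = font_data["name"]
            if font_data.get("size"):
                font.size = Pt(font_data["size"])
            font.bold = font_data.get("bold", False)
            color = hex_to_rgb(font_data.get("color"))
            if color is not None:
                font.color.rgb = color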

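5. Main function: json_to_pptx

Reads the JSON file, then iterates over every slide and every shape to rebuild the PPT.

Key steps:
  1. Create a blank slide: slide_layout = prs.slide_layouts[6] (index 6 is the blank layout in the default template).
  2. Iterate over the shape data:
    for shape_data in slide_data["shapes"]:
        shape = create_shape(slide, shape_data)
        apply_style(shape, {"fill": shape_data.get("fill"), "line": shape_data.get("line")})
        if "text_style" in shape_data and hasattr(shape, "text_frame"):
            apply_text_style(shape.text_frame, shape_data["text_style"])
        if "table" in shape_data and hasattr(shape, "table"):
            # Fill in the table contents
            table = shape.table
            for row_idx, row in enumerate(shape_data["table"]):
                for col_idx, cell_data in enumerate(row):
                    cell = table.cell(row_idx, col_idx)
                    cell.text = cell_data["text"]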
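
Because the generator indexes several keys directly, a malformed JSON file fails partway through with a KeyError. The sketch below checks the input up front. It assumes the slides/shapes layout shown in this post, and the required keys are the ones create_shape reads without a default. The function name find_missing_keys is hypothetical.

```python
REQUIRED_SHAPE_KEYS = ("type", "left", "top", "width", "height")


def find_missing_keys(data):
    """Return (slide_index, shape_index, missing_key) for every required key that is absent."""
    problems = []
    for slide_idx, slide in enumerate(data.get("slides", [])):
        for shape_idx, shape in enumerate(slide.get("shapes", [])):
            for key in REQUIRED_SHAPE_KEYS:
                if key not in shape:
                    problems.append((slide_idx, shape_idx, key))
    return problems


sample = {
    "slides": [
        {"shapes": [
            {"type": 17, "left": 0, "top": 0, "width": 100, "height": 100},
            {"type": 17, "left": 0, "top": 0, "width": 100},  # height is missing
        ]}
    ]
}
print(find_missing_keys(sample))  # [(0, 1, 'height')]
```

Running a check like this before json_to_pptx lets the tool report all problems at once rather than stopping at the first bad shape.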
    

Usage Example

1. Input JSON structure

Suppose we have the following JSON fragment (output of the parser from the previous post). The type value 17 corresponds to MSO_SHAPE_TYPE.TEXT_BOX:

{
  "slides": [
    {
      "shapes": [
        {
          "type": 17,
          "left": 1143000,
          "top": 1143000,
          "width": 6858000,
          "height": 1683600,
          "fill": {"type": "MSO_FILL.SOLID", "color": "#FF0000"},
          "text_style": {
            "paragraphs": [
              {
                "text": "Hello World",
                "alignment": "PP_ALIGN.CENTER",
                "runs": [
                  {
                    "text": "Hello World",
                    "font": {
                      "name": "Arial",
                      "size": 24,
                      "color": "#FFFFFF"
                    }
                  }
                ]
              }
            ]
          }
        }
      ]
    }
  ]
}

2. Generating the PPT

Running the code produces a PPT file containing a text box with a red fill.



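Key Caveats

1. Image paths

The image path is hard-coded in the code:

image_path = "path/to/your/image.png"

Set the path dynamically from the image information in the JSON, or add logic that maps shapes to image paths.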
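
One way to do this is to look up a file name stored on the shape. The sketch below is only an illustration: the "image" key, the image_dir folder, and the function name resolve_image_path are all assumptions, since this post does not show how the parser records pictures.

```python
from pathlib import Path


def resolve_image_path(shape_data, image_dir, fallback=None):
    """Return the path of the shape's image inside image_dir, or fallback if unavailable."""
    name = shape_data.get("image")  # hypothetical key; adapt to the parser's real output
    if name:
        # Keep only the file name so the JSON cannot point outside image_dir
        candidate = Path(image_dir) / Path(name).name
        if candidate.is_file():
            return str(candidate)
    return fallback


print(resolve_image_path({}, "images", fallback="placeholder.png"))  # placeholder.png
```

In create_shape, the hard-coded assignment could then be replaced by a call to this helper, skipping the picture when it returns None.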

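2. Merged table cells

The walkthrough code only fills in cell text and does not handle cells that span rows or columns. The logic needs to be extended; for a row span, merge the cell with the last cell it covers:

row_span = cell_data.get("row_span", 1)
if row_span > 1:
    cell.merge(table.cell(row_idx + row_span - 1, col_idx))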
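
The span arithmetic can be worked out separately from python-pptx. The sketch below computes which cell pairs to merge from the table data. The "row_span" key comes from the snippet above, while "col_span" and the function name plan_merges are assumptions added for illustration.

```python
def plan_merges(table_data):
    """Return [((row, col), (last_row, last_col)), ...] for cells spanning more than one cell."""
    n_rows = len(table_data)
    merges = []
    for row_idx, row in enumerate(table_data):
        for col_idx, cell_data in enumerate(row):
            row_span = max(1, cell_data.get("row_span", 1))
            col_span = max(1, cell_data.get("col_span", 1))  # hypothetical key
            if row_span == 1 and col_span == 1:
                continue
            last_row = row_idx + row_span - 1
            last_col = col_idx + col_span - 1
            if last_row >= n_rows or last_col >= len(row):
                raise ValueError(f"span of cell ({row_idx}, {col_idx}) exceeds the table")
            merges.append(((row_idx, col_idx), (last_row, last_col)))
    return merges


table_data = [
    [{"text": "A", "row_span": 2}, {"text": "B"}],
    [{"text": ""}, {"text": "C"}],
    [{"text": "D"}, {"text": "E"}],
]
print(plan_merges(table_data))  # [((0, 0), (1, 0))]
```

Each pair can then be applied with table.cell(*start).merge(table.cell(*end)).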

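3. Shape type compatibility

  • Unsupported shapes: lines, arrows, and similar shapes require extending create_shape.
  • Default shape: any shape other than a text box, table, or picture is created as a rectangle.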
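
One possible way to make create_shape easier to extend is a small registry keyed by shape type, with the rectangle as the fallback. This is a design sketch rather than the original implementation: the builders only return labels, whereas real ones would call the matching python-pptx method (for example slide.shapes.add_textbox). The numeric codes are the MSO_SHAPE_TYPE values for TEXT_BOX (17) and LINE (9).

```python
BUILDERS = {}


def register(shape_type):
    """Decorator that registers a builder function for one shape type code."""
    def decorator(func):
        BUILDERS[shape_type] = func
        return func
    return decorator


@register(17)  # MSO_SHAPE_TYPE.TEXT_BOX
def build_text_box(shape_data):
    return "textbox"


@register(9)  # MSO_SHAPE_TYPE.LINE
def build_line(shape_data):
    return "connector"


def build_default(shape_data):
    # Mirrors the current behavior: unknown types become a rectangle
    return "rectangle"


def build(shape_data):
    """Dispatch to the registered builder, falling back to the default."""
    return BUILDERS.get(shape_data["type"], build_default)(shape_data)


print(build({"type": 9}))  # connector
print(build({"type": 1}))  # rectangle
```

Supporting a new shape type then means adding one decorated function instead of another elif branch.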

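Application Scenarios

  1. Automated report generation
    Combine database or API data to generate standardized reports dynamically (for example, monthly financial reports or project progress updates).

  2. Style reuse
    Once a PPT template has been parsed into JSON, new PPT files that follow the same conventions can be generated quickly.

  3. Content migration
    Migrate content from legacy PPT files into a new template, or export across platforms (for example, from PPTX to Google Slides).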


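Full Toolchain Demo

Combining the parser from the previous post with the generator from this one gives a two-way PPT ↔ JSON conversion:

# Step 1: parse an existing PPT into JSON
python parse_pptx.py input.pptx > parsed.json

# Step 2: edit the JSON data
# For example: change text content or adjust styles

# Step 3: generate the new PPT
# The script listed at the end of this post reads presentation_info.json and writes
# reconstructed.pptx, so save the edited JSON under that name or change the paths
# in its __main__ block
python generate_pptx.py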
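
Step 2 can also be scripted instead of edited by hand. The sketch below replaces a piece of text in every paragraph and run, assuming the slides/shapes/text_style/paragraphs/runs layout shown in this post. The function name replace_text is hypothetical.

```python
import json


def replace_text(data, old, new):
    """Replace old with new in every paragraph and run text; return the number of changes."""
    count = 0
    for slide in data.get("slides", []):
        for shape in slide.get("shapes", []):
            paragraphs = shape.get("text_style", {}).get("paragraphs", [])
            for paragraph in paragraphs:
                # A paragraph and each of its runs may carry a "text" field
                for node in [paragraph, *paragraph.get("runs", [])]:
                    text = node.get("text")
                    if isinstance(text, str) and old in text:
                        node["text"] = text.replace(old, new)
                        count += 1
    return count


data = json.loads(
    '{"slides": [{"shapes": [{"text_style": {"paragraphs": '
    '[{"text": "Hello World", "runs": [{"text": "Hello World"}]}]}}]}]}'
)
print(replace_text(data, "World", "Team"))  # 2
```

To use it on a real file, load the data with json.load, call replace_text, and write the result back with json.dump before running the generator.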

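Summary

With the code in this post, developers can restore structured JSON data to a PPT file, enabling automated content generation and style reuse. Combined with the parser from the previous post, the toolchain can be used for:

  • Data-driven PPT creation: generate dynamic reports from live data.
  • Style standardization: ensure every PPT conforms to the corporate template.
  • Version control: bring PPT content into a version control system such as Git.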
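
For the version-control use case, it helps if the same content always serializes to the same text, so that diffs show only real changes. The following is a minimal sketch using the standard json module; the function name dump_canonical is hypothetical.

```python
import json


def dump_canonical(data):
    """Serialize with sorted keys and fixed indentation so diffs stay small and stable."""
    return json.dumps(data, ensure_ascii=False, indent=2, sort_keys=True) + "\n"


a = {"width": 6858000, "left": 1143000}
b = {"left": 1143000, "width": 6858000}
print(dump_canonical(a) == dump_canonical(b))  # True
```

Setting ensure_ascii=False keeps non-ASCII slide text readable in the committed file.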

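Possible future extensions include:

  • Support for more shape types (such as lines and SmartArt).
  • Smart layout adjustment: adapt the layout to the content.
  • API integration: combine with AI models to generate content and render it directly as a PPT.

Python and the python-pptx library make automated PPT processing highly flexible.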

from pptx import Presentation
from pptx.util import Emu, Pt
from pptx.enum.shapes import MSO_SHAPE, MSO_SHAPE_TYPE
from pptx.enum.text import PP_ALIGN
from pptx.enum.dml import MSO_FILL, MSO_LINE_DASH_STYLE
from pptx.dml.color import RGBColor
import json


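def hex_to_rgb(color):
    if not color:
        return None
    if isinstance(color, str):
        # "#FF0000" -> RGBColor(0xFF, 0x00, 0x00)
        return RGBColor.from_string(color.lstrip("#"))
    return RGBColor(*color)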


def get_alignment(alignment_str):
    if alignment_str == "PP_ALIGN.LEFT":
        return PP_ALIGN.LEFT
    elif alignment_str == "PP_ALIGN.CENTER":
        return PP_ALIGN.CENTER
    elif alignment_str == "PP_ALIGN.RIGHT":
        return PP_ALIGN.RIGHT
    return PP_ALIGN.LEFT


def create_shape(slide, shape_data):
    shape_type = shape_data["type"]
    left = Emu(shape_data["left"])
    top = Emu(shape_data["top"])
    width = Emu(shape_data["width"])
    height = Emu(shape_data["height"])

    if shape_type == MSO_SHAPE_TYPE.TEXT_BOX:
        shape = slide.shapes.add_textbox(left, top, width, height)
    elif shape_type == MSO_SHAPE_TYPE.TABLE:
        table_data = shape_data.get("table") or [[]]
        rows = len(table_data)
        cols = max(1, max(len(row) for row in table_data))
        # Return the graphic frame (not its .table) so callers can reach shape.table
        shape = slide.shapes.add_table(rows, cols, left, top, width, height)
    elif shape_type == MSO_SHAPE_TYPE.PICTURE:
        image_path = "path/to/your/image.png"  # replace with the actual image path
        shape = slide.shapes.add_picture(image_path, left, top, width, height)
    else:
        shape = slide.shapes.add_shape(
            MSO_SHAPE.RECTANGLE,  # default shape type
            left,
            top,
            width,
            height
        )

    # Rotation is not meaningful for a table frame, so only set it on other shapes
    rotation = shape_data.get("rotation", 0)
    if rotation and shape_type != MSO_SHAPE_TYPE.TABLE:
        shape.rotation = rotation
    return shape


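def apply_style(shape, style_data):
    fill_data = style_data.get("fill") or {}
    line_data = style_data.get("line") or {}

    # Fill style (pictures and table frames expose no fill, so they are skipped)
    if fill_data and hasattr(shape, "fill"):
        fill = shape.fill
        fill_type_str = fill_data.get("type")
        color = hex_to_rgb(fill_data.get("color"))
        if fill_type_str == "MSO_FILL.SOLID" and color is not None:
            fill.solid()  # fill.type is read-only; solid() switches the fill type
            fill.fore_color.rgb = color
        elif fill_type_str == "MSO_FILL.BACKGROUND":
            fill.background()

    # Line (border) style
    if line_data and hasattr(shape, "line"):
        line = shape.line
        color = hex_to_rgb(line_data.get("color"))
        if color is not None:
            line.color.rgb = color
        if line_data.get("width") is not None:
            line.width = Emu(line_data["width"])
        # Dash style (example): only the first word of the stored value is used
        if line_data.get("dash_style"):
            dash_style_str = line_data["dash_style"]
            line.dash_style = getattr(MSO_LINE_DASH_STYLE, dash_style_str.split(" ")[0])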


def apply_text_style(text_frame, text_style_data):
    for index, paragraph_data in enumerate(text_style_data.get("paragraphs", [])):
        # A new text frame already contains one empty paragraph; reuse it first
        paragraph = text_frame.paragraphs[0] if index == 0 else text_frame.add_paragraph()
        paragraph.level = paragraph_data.get("level", 0)
        paragraph.alignment = get_alignment(paragraph_data.get("alignment"))

        runs = paragraph_data.get("runs", [])
        if not any(run_data.get("text") for run_data in runs):
            # No run carries text: fall back to the paragraph-level text
            paragraph.text = paragraph_data.get("text", "")
        for run_data in runs:
            run = paragraph.add_run()
            run.text = run_data.get("text", "")
            font_data = run_data.get("font", {})
            font = run.font
            if font_data.get("name"):
                font.name = font_data["name"]
            if font_data.get("size"):
                font.size = Pt(font_data["size"])
            font.bold = font_data.get("bold", False)
            font.italic = font_data.get("italic", False)
            color = hex_to_rgb(font_data.get("color"))
            if color is not None:
                font.color.rgb = color


def json_to_pptx(json_path, output_pptx):
    prs = Presentation()
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    for slide_data in data["slides"]:
        slide_layout = prs.slide_layouts[6]  # use the blank layout
        slide = prs.slides.add_slide(slide_layout)

        for shape_data in slide_data["shapes"]:
            shape = create_shape(slide, shape_data)
            apply_style(shape, {
                "fill": shape_data.get("fill"),
                "line": shape_data.get("line")
            })

            if "text_style" in shape_data and hasattr(shape, "text_frame"):
                text_frame = shape.text_frame
                apply_text_style(text_frame, shape_data["text_style"])

            if "table" in shape_data and hasattr(shape, "table"):
                table = shape.table
                for row_idx, row in enumerate(shape_data["table"]):
                    for col_idx, cell_data in enumerate(row):
                        cell = table.cell(row_idx, col_idx)
                        cell.text = cell_data["text"]
                        row_span = cell_data.get("row_span", 1)
                        if row_span > 1:
                            # Merge down to the last row covered by the span
                            cell.merge(table.cell(row_idx + row_span - 1, col_idx))

    prs.save(output_pptx)


if __name__ == "__main__":
    input_json = "presentation_info.json"
    output_pptx = "reconstructed.pptx"
    json_to_pptx(input_json, output_pptx)