Study: Day 13 - Data Visualization with the Matplotlib Module

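Yesterday's recap:

Using SciencePlots themes
How to apply a theme:
- Globally: plt.style.use
- Locally: with plt.style.context
Font handling:
- Option 1: set a global font: plt.rcParams['font.sans-serif'] = ['font name']
- Where to find the fonts available for the global setting
- Option 2: a global font plus per-element font overrides (important)
- Option 3: render the chart text with LaTeX (for reference)
Saving figures:
- Output formats: png/pdf/svg
- DPI setting (200-300)
- Cropping the white margins
- Making the background transparent
Creating the canvas: plt.figure
- Figure size
- Figure resolution
Creating subplots:
- plt.subplots (important)
- plt.subplot (important)
- plt.axes (for reference)
Drawing basic line charts:
- How to draw a line chart
- Line chart parameters
- plt.fill_between: filling an area
- plt.legend: legend parameters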
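
As a quick refresher on two of the recap items (local styles and saving figures), here is a minimal sketch. The built-in "ggplot" style and the toy data are assumptions chosen only so the snippet runs anywhere; they are not from the course material.

```python
import io

import matplotlib.pyplot as plt

# Apply a built-in style only inside this block (local use).
with plt.style.context("ggplot"):
    fig, ax = plt.subplots(figsize=(6, 4), dpi=100)
    ax.plot([0, 1, 2, 3], [1, 3, 2, 4], label="demo")
    ax.legend()

# Save to an in-memory buffer; pass a file name such as "demo.png" instead
# to write to disk. The `format` argument (or the file extension) selects
# png/pdf/svg.
buffer = io.BytesIO()
fig.savefig(
    buffer,
    format="png",
    dpi=300,              # output resolution
    bbox_inches="tight",  # crop the white margins
    pad_inches=0.0,
    transparent=True,     # transparent background
)
png_bytes = buffer.getvalue()
plt.close(fig)
```

Writing to a buffer keeps the example free of side effects; in practice you would usually pass a path.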

1.1 Advanced Line Charts

1.1.1 Advanced Line Chart: Dual-Y-Axis Line Chart

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import sys

print('Python version:', sys.version)
print('Pandas version:', pd.__version__)
print('Numpy version:', np.__version__)
print('Matplotlib version:', matplotlib.__version__)

Python version: 3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:20:11) [MSC v.1938 64 bit (AMD64)]
Pandas version: 2.2.2
Numpy version: 1.26.4
Matplotlib version: 3.9.2

data = pd.read_csv('./data/economics.csv')

data.head()

	date	pce	pop	psavert	uempmed	unemploy
0	1967-07-01	507.4	198712	12.5	4.5	2944
1	1967-08-01	510.5	198911	12.5	4.7	2945
2	1967-09-01	516.3	199113	11.7	4.6	2958
3	1967-10-01	512.9	199311	12.5	4.9	3143
4	1967-11-01	518.1	199498	12.5	4.7	3066

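import matplotx

plt.rcParams['font.sans-serif'] = ['STsong']  # a font with Chinese glyphs, needed for the labels below
plt.rcParams['font.size'] = 15
plt.style.use(matplotx.styles.pitaya_smoothie['light'])

fig = plt.figure(figsize=(16, 9), dpi=100)

# Left y-axis: personal saving rate ('个人存储率')
plt.plot(data['date'], data['psavert'], label='个人存储率', color='#ff7f0e', linewidth=1.2, linestyle='--')
plt.ylim(0, 18)
plt.ylabel('个人存储率', color='#ff7f0e')
plt.tick_params(axis='y', colors='#ff7f0e')  # color the y-axis ticks and tick labels
# Show every 40th date. Set this before twinx(): the twin Axes hides its own
# x-axis, so a rotation applied after twinx() would not be visible.
plt.xticks(np.arange(0, len(data), 40), rotation=45)

plt.title('个人存储率与失业人数对比折线图')  # "Personal saving rate vs. number of unemployed"
# plt.legend(loc='upper right',ncols=2,labelcolor='linecolor')
plt.twinx()  # create a second Axes that shares the x-axis

# Right y-axis: number of unemployed ('失业人数')
plt.plot(data['date'], data['unemploy'], label='失业人数', color='red', linewidth=1.2)
plt.ylim(0, 18000)
plt.ylabel('失业人数', color='red')
plt.tick_params(axis='y', colors='red')  # color the y-axis ticks and tick labels

# plt.legend(loc='upper left',ncols=2,labelcolor='linecolor')

# A figure-level legend collects the entries from both Axes
fig.legend(loc=1, prop={'family': 'STsong'}, bbox_to_anchor=(0.26, 0.87), labelcolor='linecolor', shadow=True,
           fancybox=True, frameon=True)
plt.show()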

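Sharing the x-axis: plt.twinx() creates a second Axes that shares the x-axis. Whatever is plotted before the call uses the left y-axis; whatever is plotted after it uses the right y-axis.
Legend: when one figure holds two separate plots, add the legend through the figure object (fig.legend) so that it covers both.
Tick styling: use plt.tick_params to style the axis ticks and tick labels.
Typical problems to check for in a dual-Y-axis chart:
Both lines have the same color.
The x-axis label, y-axis label, title, or legend is missing.
The font has not been handled.
The grid lines of the two axes are not aligned.
The axes are confusing: it is unclear which axis belongs to which line.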
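
The same ideas can be written with Matplotlib's object-oriented interface, which makes it explicit which Axes each call targets. The following is only a sketch on made-up data (the series `rate` and `count` are hypothetical); it colors each y-axis like its line and builds one legend from explicit handles.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, only for illustration.
x = np.arange(12)
rate = 10 + 2 * np.sin(x / 2)   # small-scale series (left axis)
count = 5000 + 300 * x          # large-scale series (right axis)

fig, ax_left = plt.subplots(figsize=(8, 4))
ax_right = ax_left.twinx()      # second Axes sharing the same x-axis

(line_rate,) = ax_left.plot(x, rate, color="tab:orange", linestyle="--", label="rate")
(line_count,) = ax_right.plot(x, count, color="tab:red", label="count")

# Color each y-axis like its line so the reader can tell them apart.
ax_left.set_ylabel("rate", color=line_rate.get_color())
ax_left.tick_params(axis="y", colors=line_rate.get_color())
ax_right.set_ylabel("count", color=line_count.get_color())
ax_right.tick_params(axis="y", colors=line_count.get_color())

# One legend for both Axes, built from explicit handles.
legend = ax_left.legend(handles=[line_rate, line_count], loc="upper left")
```

Call plt.show() afterwards to display the figure. Passing the handles explicitly is one alternative to fig.legend when you want the legend placed relative to an Axes rather than the figure.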

1.1.2 Advanced Line Chart: Two Subplots

# Create a figure of 20x9 inches at 100 DPI
plt.figure(figsize=(20, 9), dpi=100)

# First subplot (position 1 in a 2-row, 1-column grid)
plt.subplot(211)
# Plot the personal saving rate ('个人存储率')
plt.plot(data['date'], data['psavert'], label='个人存储率',
         color='red', linewidth=1.2, linestyle=(0, (3, 1, 1, 1, 1, 1)))

# x-axis ticks: direction and label padding
plt.tick_params(axis='x', direction='out', pad=14)  # ticks point outward, labels padded by 14 points
plt.xticks(np.arange(0, data.shape[0], 50))  # one x tick every 50 rows

# Limit the y-axis to 0..18
plt.ylim(0, 18)
# y-axis label and its color
plt.ylabel('个人存储率', fontdict={'family': 'STsong'}, color='red')
# y-axis tick label color and tick direction
plt.tick_params(axis='y', labelcolor='red', direction='in')
# Show grid lines
plt.grid()

# Chart title: "Personal saving rate vs. number of unemployed"
plt.title('个人存储率与失业人数对比折线图', fontdict={'family': 'STsong', 'size': 20})

# Second subplot (position 2 in a 2-row, 1-column grid)
plt.subplot(212)
# Plot the number of unemployed ('失业人数')
plt.plot(data['date'], data['unemploy'], label='失业人数',
         color='blue', linewidth=1.2, linestyle=(0, (5, 1)))

# x-axis tick positions
plt.xticks(np.arange(0, data.shape[0], 50))  # one x tick every 50 rows
plt.ylim(0, 18000)  # limit the y-axis to 0..18000
# y-axis label and its color
plt.ylabel('失业人数', fontdict={'family': 'STsong'}, color='blue')
# y-axis tick label color and tick direction
plt.tick_params(axis='y', labelcolor='blue', direction='in')
# Hide the bottom x ticks and labels; draw the ticks on the top edge instead
plt.tick_params(axis='x', bottom=False, labelbottom=False, direction='out', top=True)
# Show grid lines
plt.grid()

# Display the chart
plt.show()
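
For comparison, a two-panel layout can also be built with plt.subplots and sharex=True, so that both panels stay aligned on one x-axis. This is a sketch on made-up data (`series_a` and `series_b` are hypothetical), not the author's code.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, only for illustration.
x = np.arange(100)
series_a = 10 + np.sin(x / 10)
series_b = 8000 + 50 * x

# Two stacked panels that share one x-axis.
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(10, 6), sharex=True)

ax_top.plot(x, series_a, color="red")
ax_top.set_ylabel("series A", color="red")
ax_top.grid(True)

ax_bottom.plot(x, series_b, color="blue")
ax_bottom.set_ylabel("series B", color="blue")
ax_bottom.set_xticks(np.arange(0, len(x), 25))  # shared, so it applies to both panels
ax_bottom.grid(True)

fig.suptitle("Two series in separate panels")
```

Call plt.show() afterwards to display the figure. With sharex=True the tick positions only need to be set once.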

1.1.3 Advanced Line Chart: Stacked Area Chart

data = pd.read_csv('./data/store.csv')

data.head()

	Day	Email	Union Ads	Video Ads	Direct	Search Engine
0	周一	120	220	150	320	820
1	周二	132	182	232	332	932
2	周三	101	191	201	301	901
3	周四	134	234	154	334	934
4	周五	90	290	190	390	1290

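The first line can be drawn directly from its own data.
The second line is the second series plus the first.
The third line is the sum of the first three series.
…

In general, the Nth line is the sum of the first N series.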
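
The rule above is a row-wise running total. A tiny worked example on made-up numbers (the frame `df` is hypothetical) shows what cumsum(axis=1) produces:

```python
import pandas as pd

# Made-up numbers, only for illustration.
df = pd.DataFrame(
    {"Day": ["Mon", "Tue"], "A": [1, 2], "B": [10, 20], "C": [100, 200]}
)

# Row-wise running total over the numeric columns: column N becomes the
# sum of the first N series, i.e. the upper boundary of band N.
stacked = df.iloc[:, 1:].cumsum(axis=1)
print(stacked)
```

For the first row the result is 1, 1 + 10 = 11, and 1 + 10 + 100 = 111.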

data.iloc[:, 1:] = data.iloc[:, 1:].cumsum(axis=1)

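# Pair each column with the one stacked on top of it: (lower boundary, upper boundary).
# The leading 0 is a placeholder: the first band is filled up from y=0.
lists = list(zip([0, 'Email', 'Union Ads', 'Video Ads', 'Direct'],
                 ['Email', 'Union Ads', 'Video Ads', 'Direct', 'Search Engine']
                 ))
colors = ['#91a1df', '#c6e8ad', '#fbe79d', '#f3a19e', '#b2dff4']

plt.figure(figsize=(12, 5), dpi=100)

for index, item in enumerate(data.columns[1:]):
    plt.plot(data['Day'], data[item], lw=1)  # boundary line of the band
    if index == 0:
        # First band: from the x-axis up to the first series
        plt.fill_between(data['Day'],
                         y1=0,
                         y2=data[lists[index][1]],
                         color=colors[index])
    else:
        # Later bands: between the previous and the current cumulative series
        plt.fill_between(data['Day'],
                         y1=data[lists[index][0]],
                         y2=data[lists[index][1]],
                         color=colors[index])

# The day names are Chinese, so the tick labels need a font with Chinese glyphs
plt.xticks(fontproperties='STsong')
plt.show()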
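
Matplotlib also ships a stackplot function that does the stacking itself, so it takes the raw (not cumulative) series. The sketch below uses a few values from the first rows of the table above with English day names; the reduced data set is an assumption made to keep the example short.

```python
import matplotlib.pyplot as plt
import numpy as np

days = ["Mon", "Tue", "Wed"]
x = np.arange(len(days))
email = [120, 132, 101]
union_ads = [220, 182, 191]
direct = [320, 332, 301]

fig, ax = plt.subplots(figsize=(8, 4))
# stackplot stacks the raw series internally; no cumsum is needed.
bands = ax.stackplot(x, email, union_ads, direct,
                     labels=["Email", "Union Ads", "Direct"])
ax.set_xticks(x)
ax.set_xticklabels(days)
ax.legend(loc="upper left")
```

Call plt.show() afterwards to display the figure. The manual fill_between loop is still useful when each band needs its own boundary line or styling.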

1.2 Bar Charts

Bar charts are mainly used to compare multiple data series.

1.2.1 Basic Bar Chart: Single Data Series

data = pd.read_csv('./data/stack_data_bar.csv')

data

	Country	Pensions	Income	Health	Other services
0	Italy	14	5	9	3
1	Spain	8	6	7	8
2	United States	15	4	7	1
3	France	11	5	8	3
4	Germany	9	7	7	2
5	Sweden	6	6	8	4
6	China	7	3	8	1
7	Britain	8	5	6	3

import matplotx

plt.style.use(matplotx.styles.pitaya_smoothie['light'])
plt.rcParams['font.sans-serif'] = ['Times New Roman']
plt.rcParams['font.size'] = 15
plt.figure(figsize=(16, 9), dpi=100)
plt.bar(x=data['Country'],  # x-axis data
        height=data['Pensions'],  # bar heights
        width=0.5,  # bar width
        color='#ff7f0e',  # bar fill color
        edgecolor='black',  # bar border color
        linewidth=1,  # bar border width
        alpha=0.8,  # bar transparency
        yerr=[0.3] * data.shape[0],  # add error bars
        ecolor='red',  # error bar color
        label='Pensions',  # legend label
        align='center',  # center each bar on its x position
        # bottom=[2,3,4,6,5,4,2,1]  # starting height of each bar, typically used for waterfall charts
        )
plt.xlabel('Country')
# plt.ylabel('Pensions')
plt.title('Basic column chart - various parameter settings', fontsize=20)
for i in range(len(data)):
    plt.text(data['Country'][i],  # x coordinate
             data['Pensions'][i] / 2,  # y coordinate: half the bar height
             data['Pensions'][i],  # text: the actual value
             ha='center',
             va='bottom'
             )
plt.show()

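Show the value on each bar
Add a line on top of the bars
Bar color / bar border / border thickness and color / bar width / spacing between bars / transparency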
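
Two of the items above (a value on each bar and a line over the bars) can be sketched as follows. Axes.bar_label is available in Matplotlib 3.4 and later; the three-country subset is taken from the table above purely to keep the example short.

```python
import matplotlib.pyplot as plt

countries = ["Italy", "Spain", "France"]
pensions = [14, 8, 11]

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(countries, pensions, width=0.5, color="#ff7f0e",
              edgecolor="black", linewidth=1, alpha=0.8)

# Write each bar's value above it (Axes.bar_label needs Matplotlib >= 3.4).
labels = ax.bar_label(bars, padding=3)

# Overlay a line through the bar tops.
(line,) = ax.plot(countries, pensions, color="black", marker="o")
```

Call plt.show() afterwards to display the figure. Compared with a manual plt.text loop, bar_label computes the label positions from the bars themselves.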

1.2.2 Advanced Bar Chart: Rounded Bars

colors = ['#464ca6','#6472ea','#7fc9da','#eec568','#df8342']
from matplotlib.patches import FancyBboxPatch

plt.rcParams['font.sans-serif'] = 'Times New Roman'
plt.rcParams['font.size'] = 14
plt.figure(figsize=(6, 6), dpi=100)
rects = plt.bar(x=data['Country'][:5],  # x-axis data
                height=data['Pensions'][:5],  # bar heights
                width=.4,  # bar width
                color=colors,  # bar colors
                label='Pensions',  # legend label
                align='center',  # center each bar on its x position
                bottom=[1.5] * 5  # lift the bars to leave room for the tick labels
                )

ax = plt.gca()  # get the current Axes
# Remove the rectangular bars; they are only used for their geometry
for rect in rects:
    rect.remove()
# Draw a rounded patch in place of each removed bar
for index, rect in enumerate(rects):
    bb = rect.get_bbox()
    patch = FancyBboxPatch((bb.xmin, bb.ymin),  # lower-left corner
                           abs(bb.width), abs(bb.height),
                           boxstyle="Round, pad=0, rounding_size=0.1",  # patch style
                           ec=colors[index], fc=colors[index], linewidth=2,
                           mutation_aspect=4,
                           mutation_scale=2.2,
                           )
    ax.add_patch(patch)
ax.set_facecolor('#1a1c41')
plt.ylim(0, 26)
plt.text(x=-.34, y=21,
         color='white',
         s='Uses Excel, Python, SQL, power BI to share \n the whole process data analysis and correct solutions')
plt.text(x=- 0.34, y=24, s='Doughnut Classification', color='white', fontweight='bold')
plt.tick_params(axis='y', labelleft=False, left=False)
plt.tick_params(axis='x', bottom=False, pad=-18, )
plt.xticks(color='white')
for i in range(5):
    plt.text(data['Country'][i],  # x coordinate
             data['Pensions'][i] + 3,  # y coordinate: above the bar top (bars start at 1.5)
             data['Pensions'][i],  # text: the actual value
             ha='center',
             va='bottom',
             color='white'
             )
# plt.savefig('./figure/fig-s-1.png',dpi=200,bbox_inches='tight',pad_inches=0.0)
plt.show()

# Helper that builds a rounded patch for one bar
def get_round_rect(rect, ec='black', fc='#96c8d6'):
    """Return a rounded FancyBboxPatch covering the same area as the bar `rect`."""
    bb = rect.get_bbox()
    patch = FancyBboxPatch((bb.xmin, bb.ymin),  # lower-left corner
                           abs(bb.width), abs(bb.height),
                           boxstyle="Round, pad=0, rounding_size=0.05",  # patch style
                           ec=ec, fc=fc, linewidth=1,
                           mutation_aspect=4,
                           mutation_scale=1,
                           )
    return patch

1.2.3 Multi-Series Bar Chart

x = np.arange(len(data))

with plt.style.context(matplotx.styles.pitaya_smoothie['light']):

    plt.figure(figsize=(16, 9), dpi=100)
    # Two bars per country: Pensions spans x-0.4..x, Income spans x..x+0.4
    rects1 = plt.bar(x - 0.2, data['Pensions'], width=0.4)
    rects2 = plt.bar(x, data['Income'], align='edge', width=0.4)

    # Replace every rectangular bar with a rounded patch
    ax = plt.gca()
    for rect in zip(rects1, rects2):
        rect[0].remove()
        rect[1].remove()
        patch1 = get_round_rect(rect[0])
        patch2 = get_round_rect(rect[1], fc='#ff7f0e')
        ax.add_patch(patch1)
        ax.add_patch(patch2)

    plt.xlabel('Country')
    plt.ylabel('Value')
    # Title: "Advanced bar chart: grouped bar chart"
    plt.title('柱形图进阶-分组柱形图',fontdict={'family':'STsong','size':20})
    plt.xticks(ticks=x, labels=data['Country'])
    # Write each value above the center of its bar
    for i in range(len(data)):
        plt.text(x[i]-0.2,data['Pensions'][i],data['Pensions'][i],ha='center',va='bottom')
        plt.text(x[i]+0.2,data['Income'][i],data['Income'][i],ha='center',va='bottom')
    plt.legend(['Pensions', 'Income'])
    plt.show()
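
The hard-coded offsets of 0.2 generalize to any number of series. The helper below is a hypothetical sketch (`group_offsets` is not part of Matplotlib or of the original code) that spreads n bars symmetrically around each tick.

```python
import numpy as np


def group_offsets(n_series, group_width=0.8):
    """Return (offsets, bar_width) for n_series bars centered on each tick."""
    bar_width = group_width / n_series
    # Bar centers are spread symmetrically around 0.
    offsets = (np.arange(n_series) - (n_series - 1) / 2) * bar_width
    return offsets, bar_width


offsets, bar_width = group_offsets(2)
print(offsets, bar_width)
```

For two series and a group width of 0.8 this yields centers at -0.2 and +0.2 with a bar width of 0.4, matching the chart above. Series k would then be drawn with plt.bar(x + offsets[k], values_k, width=bar_width).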

1.3 Advanced Bar Chart: Stacked Bar Chart

data = pd.read_csv('./data/奖牌.csv')
x = np.arange(4)
x

array([0, 1, 2, 3])

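Task: plot the gold, silver, and bronze medal counts of the four most recent Olympic Games, first as a grouped bar chart and then as a stacked bar chart.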

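plt.rcParams['font.sans-serif'] = ['STsong']
plt.rcParams['font.size'] = 15
plt.figure(figsize=(16, 9),dpi=100)
# Rows 6 to 9 are taken here as the four most recent Games.
# Columns: '金牌' = gold, '银牌' = silver, '铜牌' = bronze.
# Each bar is 0.2 wide; per group the order is bronze, gold, silver.
rects1 = plt.bar(x,data['金牌'][6:10],color='red',width=0.2)  # gold: centered on x
rects2 = plt.bar(x+0.1,data['银牌'][6:10],color='#ff7f0e',align='edge',width=0.2)  # silver: left edge at x+0.1
rects3 = plt.bar(x-0.2,data['铜牌'][6:10],color='blue',width=0.2)  # bronze: centered on x-0.2
plt.show()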

plt.figure(figsize=(16, 9),dpi=100)

# Stack the series with `bottom`: bronze ('铜牌') first, then silver ('银牌'), then gold ('金牌').
# '赛事' is the event (Games) column.
plt.bar(data['赛事'],data['铜牌'])
plt.bar(data['赛事'],data['银牌'],bottom=data['铜牌'])
plt.bar(data['赛事'],data['金牌'],bottom=data['铜牌']+data['银牌'])
plt.xticks(rotation=45)
for i in range(len(data)):
    bronze, silver, gold = data['铜牌'][i], data['银牌'][i], data['金牌'][i]
    # Place each label at the vertical middle of its own segment
    plt.text(data['赛事'][i], bronze + silver + gold / 2, gold, ha='center', va='center')
    plt.text(data['赛事'][i], bronze + silver / 2, silver, ha='center', va='center')
    plt.text(data['赛事'][i], bronze / 2, bronze, ha='center', va='center')
plt.legend(['铜牌','银牌','金牌'])  # bronze, silver, gold
plt.show()
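
The stacking pattern can also be written as a loop with a running `bottom`, which scales to any number of series. This is a sketch on made-up medal counts (the `events` and `medals` values are hypothetical); it uses Axes.bar_label with label_type="center", available in Matplotlib 3.4 and later, to center each value in its segment.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up medal counts, only for illustration.
events = ["Games A", "Games B", "Games C"]
medals = {
    "bronze": np.array([10, 12, 9]),
    "silver": np.array([20, 18, 22]),
    "gold": np.array([30, 25, 28]),
}

fig, ax = plt.subplots(figsize=(6, 4))
bottom = np.zeros(len(events))
for name, counts in medals.items():
    bars = ax.bar(events, counts, bottom=bottom, label=name)
    # label_type="center" puts each value in the middle of its own segment.
    ax.bar_label(bars, label_type="center")
    bottom = bottom + counts  # the next series starts where this one ends
ax.legend()
```

Call plt.show() afterwards to display the figure. After the loop, `bottom` holds the total height of each stack.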
