[Python] Advanced Python: Multithreading

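Introduction

In modern software development, execution efficiency matters. Whether a program is processing large volumes of data, responding to user interaction, or communicating with external systems, it often needs to carry out several tasks at once. Python, a powerful and approachable language, offers several approaches to concurrent programming, and multithreading is one of the most widely used.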

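1. Overview of Multithreading

1.1 Basic Concepts

Thread: the smallest unit of execution that the operating system can schedule. A thread lives inside a process and is the unit that actually does the process's work.
Multithreading: running multiple threads within one process, each of which can perform a different task.
Concurrency: multiple tasks take turns executing in small time slices, which gives the impression that they run simultaneously.
Parallelism: multiple tasks truly execute at the same instant, which requires multiple CPU cores.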

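1.2 Advantages of Multithreading

Better responsiveness: in GUI applications, long-running work can run off the main thread so the interface does not freeze.
Higher throughput: several I/O operations (such as network requests or file reads and writes) can be in flight at the same time.
Resource sharing: threads share the memory space of their process, so exchanging data between them is cheaper than between processes.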

2. Multithreading in Python

The Python standard library provides the threading module for multithreaded programming.

Creating Threads

import threading
import time

def worker(name, delay):
    print(f"Thread {name} started")
    time.sleep(delay)
    print(f"Thread {name} finished")

# Create the threads
t1 = threading.Thread(target=worker, args=("A", 2))
t2 = threading.Thread(target=worker, args=("B", 3))

# Start the threads
t1.start()
t2.start()

# Wait for both threads to finish
t1.join()
t2.join()

print("All threads have finished")
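A Thread object discards whatever its target function returns, so results have to be handed back some other way. The following is a minimal sketch of one option, assuming a hypothetical `square` task and a preallocated `results` list in which each thread writes only to its own slot; neither name comes from the original example.

```python
import threading

def square(index, value, results):
    # Each thread writes only to its own slot, so no two threads
    # ever modify the same list element.
    results[index] = value * value

values = [2, 3, 4, 5]
results = [None] * len(values)

threads = [
    threading.Thread(target=square, args=(i, v, results))
    for i, v in enumerate(values)
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # the results are complete only after every thread has been joined

print(results)  # [4, 9, 16, 25]
```

Reading `results` before the `join()` calls return could show slots that are still `None`.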

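3. Thread Synchronization and Communication

The biggest challenge in multithreading is contention for shared resources. When multiple threads read and modify the same data at the same time, updates can be lost and the data can end up inconsistent.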
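To make the problem concrete, here is a small sketch of a lost update. In real programs the bad interleaving happens only occasionally, so this example uses a `threading.Barrier` to force it on every run; the `unsafe_deposit` function and the `balance` variable are illustrative names, not part of the original article.

```python
import threading

NUM_THREADS = 5
balance = 0

# The barrier holds every thread after its read until all threads have read,
# which forces the unlucky interleaving deterministically.
barrier = threading.Barrier(NUM_THREADS)

def unsafe_deposit(amount):
    global balance
    current = balance            # 1. read the shared value
    barrier.wait()               # 2. wait until every thread has read the same value
    balance = current + amount   # 3. write back a result based on a stale read

threads = [
    threading.Thread(target=unsafe_deposit, args=(1,))
    for _ in range(NUM_THREADS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 1, not 5: four of the five updates were lost
```

Every thread read `0` before any thread wrote, so each write stored `0 + 1`. Making the read-modify-write sequence atomic is what the synchronization primitives in this chapter are for.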

3.1 Using `Lock` (Mutex)

import threading

# Shared resource
counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100000):
        with lock:  # acquires the lock on entry and releases it on exit
            counter += 1

# Create and start several threads
threads = []
for i in range(5):
    t = threading.Thread(target=increment)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Final count: {counter}")  # Expected: 500000

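3.2 Using `RLock` (Reentrant Lock)

An RLock can be acquired again by the thread that already holds it. It must be released once per acquisition before another thread can take it, which makes it a good fit for recursive functions that hold the lock across calls.

import threading

lock = threading.RLock()

def recursive_func(n):
    with lock:  # the same thread re-acquires the lock on every recursive call
        if n > 0:
            print(f"Recursive call: {n}")
            recursive_func(n - 1)

recursive_func(3)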
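The difference from a plain Lock can be checked directly. The sketch below, which uses a hypothetical helper named `try_nested`, attempts a second acquisition from the thread that already holds the lock and uses a timeout so the plain Lock case fails instead of hanging forever.

```python
import threading

def try_nested(lock):
    """Return True if the holding thread can acquire `lock` a second time."""
    with lock:
        # The timeout prevents a permanent self-deadlock with a plain Lock.
        acquired_again = lock.acquire(timeout=0.1)
        if acquired_again:
            lock.release()
        return acquired_again

print(try_nested(threading.Lock()))   # False: a plain Lock is not reentrant
print(try_nested(threading.RLock()))  # True: an RLock tracks its owning thread
```

Without the timeout, the nested `acquire()` on a plain Lock would block indefinitely, which is exactly the deadlock that RLock avoids.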

3.3 Using `Condition` (Condition Variable)

A condition variable lets threads coordinate: one thread waits until another signals that some shared state has changed.

import threading
import time

condition = threading.Condition()
items = []

def producer():
    for i in range(5):
        with condition:
            items.append(i)
            print(f"Producer added: {i}")
            condition.notify()  # wake up a waiting consumer
        time.sleep(0.1)

def consumer():
    while True:
        with condition:
            while not items:
                condition.wait()  # release the lock and wait for a notification
            item = items.pop(0)
            print(f"Consumer took: {item}")
            if item == 4:
                break

# Start the threads
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)

t1.start()
t2.start()

t1.join()
t2.join()
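For the producer-consumer pattern specifically, the standard library's `queue.Queue` handles the locking and waiting internally. This is a different technique from the Condition-based code above, sketched here only for comparison; the `SENTINEL` marker and the `consumed` list are illustrative names.

```python
import queue
import threading

SENTINEL = object()  # unique marker that tells the consumer to stop
q = queue.Queue()
consumed = []

def producer():
    for i in range(5):
        q.put(i)          # put() is thread-safe; no explicit lock is needed
    q.put(SENTINEL)

def consumer():
    while True:
        item = q.get()    # blocks until an item is available
        if item is SENTINEL:
            break
        consumed.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start()
t2.start()
t1.join()
t2.join()

print(consumed)  # [0, 1, 2, 3, 4]
```

A sentinel object replaces the `item == 4` check, so the consumer does not need to know in advance which value is the last one.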

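4. Thread Pools

When a program would otherwise create and destroy threads frequently, a thread pool reuses a fixed set of worker threads. This avoids repeated thread-creation overhead and caps the number of tasks running at once.

from concurrent.futures import ThreadPoolExecutor
import requests  # third-party package: pip install requests
import time

def fetch_url(url):
    # A timeout keeps one stalled request from blocking a worker forever
    response = requests.get(url, timeout=10)
    return f"{url}: {response.status_code}"

urls = [
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/2",
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/3"
]

# Run the requests through a thread pool
start_time = time.time()

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in the same order as the input URLs
    results = list(executor.map(fetch_url, urls))

for result in results:
    print(result)

print(f"Total time: {time.time() - start_time:.2f} seconds")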

Advantages:

Reuses threads, which reduces creation overhead
Caps the number of concurrent tasks
Provides a simpler API
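The first two advantages can be observed without any network access. The sketch below is a rough illustration, assuming a hypothetical `task` function that records how many tasks run at once and which worker threads execute them; `time.sleep` stands in for real I/O.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

active = 0            # number of tasks running right now
peak = 0              # highest value `active` ever reached
thread_names = set()  # names of the worker threads that ran at least one task
state_lock = threading.Lock()

def task(n):
    global active, peak
    with state_lock:
        active += 1
        peak = max(peak, active)
        thread_names.add(threading.current_thread().name)
    time.sleep(0.02)  # simulated I/O wait
    with state_lock:
        active -= 1
    return n

with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(task, range(10)))

print(results)                   # [0, 1, ..., 9], in input order
print(peak, len(thread_names))   # neither value exceeds max_workers
```

Ten tasks are served by at most three threads, and no more than three tasks are ever in progress at the same moment.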

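5. The Limits of Python Multithreading: the GIL

5.1 What Is the GIL?

The Global Interpreter Lock (GIL) is a mutex in the CPython interpreter that ensures only one thread executes Python bytecode at any given moment.

5.2 Impact of the GIL

CPU-bound tasks: threads cannot run Python bytecode in parallel, so multithreading yields little or no speedup.
I/O-bound tasks: a thread releases the GIL while it waits on I/O, so multithreading remains effective.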
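The two cases can be compared with a rough timing sketch. The functions `io_task`, `cpu_task`, `run_sequential`, and `run_threaded` are illustrative names, `time.sleep` stands in for a real I/O wait, and the exact numbers will vary by machine and interpreter build.

```python
import threading
import time

def io_task():
    time.sleep(0.2)  # stands in for a network or disk wait; sleeping releases the GIL

def cpu_task():
    total = 0
    for i in range(1_000_000):  # pure-Python arithmetic holds the GIL while it runs
        total += i * i

def run_sequential(task, n):
    start = time.perf_counter()
    for _ in range(n):
        task()
    return time.perf_counter() - start

def run_threaded(task, n):
    start = time.perf_counter()
    threads = [threading.Thread(target=task) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

io_seq = run_sequential(io_task, 4)
io_thr = run_threaded(io_task, 4)
cpu_seq = run_sequential(cpu_task, 4)
cpu_thr = run_threaded(cpu_task, 4)

print(f"I/O-bound: sequential {io_seq:.2f}s, threaded {io_thr:.2f}s")
print(f"CPU-bound: sequential {cpu_seq:.2f}s, threaded {cpu_thr:.2f}s")
```

The threaded I/O run should finish in roughly the time of a single wait, since the four sleeps overlap. On a standard CPython build with the GIL enabled, the threaded CPU run typically takes about as long as the sequential one.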

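5.3 Working Around the GIL

Use the multiprocessing module: each process has its own interpreter and its own GIL.
Use C extensions such as NumPy, which can release the GIL while running native code.
Use a Python implementation without a GIL, such as Jython or IronPython. Note that PyPy has a GIL of its own, so switching to PyPy does not make threads run in parallel.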
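As a minimal sketch of the multi-process option, the example below uses `concurrent.futures.ProcessPoolExecutor`, which is built on multiprocessing and mirrors the thread pool API. The `sum_of_squares` and `run_parallel` functions are hypothetical names chosen for illustration.

```python
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares(n):
    # Pure-Python CPU-bound work; each call runs in a separate process
    return sum(i * i for i in range(n))

def run_parallel(inputs, max_workers=2):
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        # Arguments and results are pickled and sent between processes
        return list(executor.map(sum_of_squares, inputs))

if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing
    # this module, unguarded code would try to spawn processes recursively.
    print(run_parallel([10, 100, 1000]))  # [285, 328350, 332833500]
```

Because data has to be serialized across process boundaries, this approach pays off mainly when each task does enough computation to outweigh that transfer cost.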

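6. Best Practices and Caveats

6.1 When to Use Multithreading

Suitable: I/O-bound tasks (network requests, file operations, database queries)
Suitable: keeping the interface responsive in GUI applications
Not suitable: CPU-bound tasks (use multiple processes instead)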

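6.2 Safety Guidelines

Protect shared mutable data with a lock whenever more than one thread may modify it
Avoid deadlocks by always acquiring multiple locks in the same fixed order
Hold locks for as short a time as possible
Use the with statement so that locks are released even if an exception is raised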
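The fixed-order rule can be sketched with a classic bank transfer. The `Account` class and the `transfer` and `move` functions are hypothetical, and ordering by `account_id` is just one possible way to define a global lock order.

```python
import threading

class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always lock the account with the smaller id first, whatever the
    # direction of the transfer, so two opposite transfers cannot each
    # hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

a = Account(1, 1000)
b = Account(2, 1000)

def move(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

t1 = threading.Thread(target=move, args=(a, b))
t2 = threading.Thread(target=move, args=(b, a))
t1.start()
t2.start()
t1.join()
t2.join()

print(a.balance, b.balance)  # 1000 1000
```

If `transfer` instead locked `src` first and `dst` second, the two threads could each take one lock and then wait forever for the other.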

6.3 Debugging Tips

Use threading.current_thread() to see which thread is running
Use threading.active_count() to see how many threads are alive
Use logging to record what each thread does
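These three tips combine naturally. The sketch below gives each thread an explicit name and lets the logging module print it through the `%(threadName)s` format field; the `task` function and the `worker-N` names are illustrative.

```python
import logging
import threading

# %(threadName)s inserts the name of the thread that emitted each record
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(threadName)s] %(message)s",
)

seen_names = []

def task():
    current = threading.current_thread()
    seen_names.append(current.name)
    logging.info("running; active threads: %d", threading.active_count())

threads = [threading.Thread(target=task, name=f"worker-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen_names))  # ['worker-0', 'worker-1', 'worker-2']
```

Note that `threading.active_count()` includes the main thread, so the logged count is one higher than the number of workers still running.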

7. A Practical Example: Concurrent Downloader

import requests  # third-party package: pip install requests
from concurrent.futures import ThreadPoolExecutor
import time

def download_file(url, filename):
    try:
        # stream=True avoids loading the whole file into memory at once
        response = requests.get(url, stream=True, timeout=30)
        response.raise_for_status()  # treat HTTP error statuses as failures
        with open(filename, 'wb') as f:
            for chunk in response.iter_content(8192):
                f.write(chunk)
        print(f"Download finished: {filename}")
    except Exception as e:
        print(f"Download failed {filename}: {e}")

# Files to download
files = [
    ("https://example.com/file1.zip", "file1.zip"),
    ("https://example.com/file2.zip", "file2.zip"),
    ("https://example.com/file3.zip", "file3.zip"),
]

start_time = time.time()

# Leaving the with block waits for every submitted task to finish
with ThreadPoolExecutor(max_workers=3) as executor:
    for url, filename in files:
        executor.submit(download_file, url, filename)

print(f"All download tasks finished, elapsed: {time.time() - start_time:.2f} seconds")
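The downloader above catches and prints its own errors, so the calling code cannot tell which files failed. One possible variant lets exceptions propagate and inspects each future instead. The sketch below uses a hypothetical `fake_download` function in place of the network call so that it runs offline; the file names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_download(name):
    # Stand-in for a real download: names ending in ".bad" simulate a failure
    if name.endswith(".bad"):
        raise OSError(f"cannot fetch {name}")
    return len(name)

names = ["a.zip", "b.bad", "c.zip"]
succeeded, failed = [], []

with ThreadPoolExecutor(max_workers=3) as executor:
    future_to_name = {executor.submit(fake_download, n): n for n in names}
    for future in as_completed(future_to_name):
        name = future_to_name[future]
        try:
            future.result()  # re-raises any exception raised inside the worker
        except OSError:
            failed.append(name)
        else:
            succeeded.append(name)

print(sorted(succeeded))  # ['a.zip', 'c.zip']
print(failed)             # ['b.bad']
```

`as_completed` yields futures in the order they finish, which is why the success list is sorted before printing.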

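8. Summary

Python multithreading is a powerful tool for I/O-bound work. After reading this article, you should know:

How to create and manage threads
How the thread synchronization primitives work (Lock, RLock, Condition)
How to use a thread pool to improve performance
What limits the GIL imposes
Best practices for multithreaded code

Although the GIL limits what Python threads can do for CPU-bound tasks, multithreading remains a practical and effective way to improve efficiency in I/O-bound scenarios.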
