Fetching a novel with Python and msedgedriver
Disclaimer: this is for learning and practicing the technique only.
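import os
import time

from lxml import etree
from selenium import webdriver
from selenium.webdriver.edge.options import Options
from selenium.webdriver.edge.service import Service

# Run Edge without opening a visible window.
options = Options()
options.add_argument("--headless")

# Path to the msedgedriver executable (relative to the working directory).
driver_path = "msedgedriver.exe"
service = Service(executable_path=driver_path)
driver = webdriver.Edge(service=service, options=options)

url_prefix = "https://www.l*d*k*s*.com"
# Page to start downloading from.
url = url_prefix + "/html/91/91737/1100841.html"

# Make sure the output directory exists before appending to the file.
os.makedirs("./mjts", exist_ok=True)

try:
    while True:
        driver.get(url)
        time.sleep(1)  # give the page a moment to render

        tree = etree.HTML(driver.page_source)
        content = "\n".join(tree.xpath("//div[@id='content']/p/text()"))
        title = tree.xpath("//div[@class='bookname']/h1/text()")[0]
        # The third link in the navigation bar is used as the "next" link.
        next_url = tree.xpath("//div[@class='bookname']/div[@class='bottem1']/a[3]/@href")[0]

        print(f"Downloading《{title}》...")
        with open("./mjts/mjts.txt", "a", encoding="utf-8") as file:
            file.write(title + "\n\n" + content + "\n\n")

        # Stop when the "next" link points back to the book's index page.
        if next_url == "/html/91/91737/":
            break

        url = url_prefix + next_url

        # Pages whose title lacks "章" ("chapter") move on without the extra pause.
        if "章" not in title:
            continue
        time.sleep(2)  # pause between chapters
finally:
    # Always close the browser, even if a page fails to parse.
    driver.quit()

print("Download complete")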
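The loop's pagination logic (join the site prefix with the "next" link, stop once that link is the book's index path) can be pulled into small pure functions and checked without launching a browser. The following is only a sketch using the standard library; the names `build_next_url` and `is_last_chapter` and the `example.invalid` host are hypothetical and not part of the script above.

```python
from urllib.parse import urljoin

# Index path of the book, the same value the script compares against.
INDEX_PATH = "/html/91/91737/"


def build_next_url(current_url: str, next_href: str) -> str:
    """Resolve the 'next' href against the page it was found on."""
    return urljoin(current_url, next_href)


def is_last_chapter(next_href: str, index_path: str = INDEX_PATH) -> bool:
    """True when the 'next' link points back to the book's index page."""
    return next_href.strip() == index_path


if __name__ == "__main__":
    # Placeholder host, used only for illustration.
    page = "https://example.invalid/html/91/91737/1100841.html"
    print(build_next_url(page, "/html/91/91737/1100842.html"))
    print(is_last_chapter("/html/91/91737/"))
```

Compared with plain string concatenation, `urljoin` also handles a site that emits a relative href such as `1100842.html`.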
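1. Open the page of the novel you want to download, press F12, and read the browser version from the User-Agent request header, then download the matching msedgedriver.exe. Download link: [msedgedriver download mirror](https://registry.npmmirror.com/binary.html?path=edgedriver/).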
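2. Put the downloaded msedgedriver.exe wherever the Python script can find it at run time. Since `driver_path` is the relative path `msedgedriver.exe`, the directory you run the script from works.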
3. Finally, there are many ways to fetch a novel; this is just one of them.
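Step 1 comes down to reading the `Edg/<version>` token out of the User-Agent header and picking the driver build with the same version. As a rough sketch of that lookup: the helper names `edge_version` and `major_version` are hypothetical, and the sample User-Agent string is illustrative rather than taken from a real session.

```python
import re
from typing import Optional

# Chromium-based Edge identifies itself with an "Edg/<version>" token.
_EDGE_TOKEN = re.compile(r"\bEdg/(\d+(?:\.\d+)*)")


def edge_version(user_agent: str) -> Optional[str]:
    """Return the full Edge version from a User-Agent string, or None."""
    match = _EDGE_TOKEN.search(user_agent)
    return match.group(1) if match else None


def major_version(version: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    # Illustrative User-Agent value; copy the real one from the F12 network tab.
    ua = (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.2210.91"
    )
    version = edge_version(ua)
    if version is not None:
        print(version, major_version(version))
```

The version string returned here is what to look for in the driver download listing.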