Requirement
In practice, you often need to convert a web page into a PDF file on the server side and save it.
Code
requirements.txt
playwright
convert_pdf.py
import argparse
import sys

from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright, url: str, path: str, timeout: int) -> None:
    """Render the page at `url` and save it to `path` as an A4 PDF."""
    browser = playwright.chromium.launch()
    try:
        context = browser.new_context()
        page = context.new_page()
        page.goto(url=url, timeout=timeout)  # timeout is in milliseconds
        page.emulate_media(media="print")  # render with the page's print styles
        page.pdf(path=path, format="A4", outline=True,
                 margin=dict(top="35px", right="35px", bottom="35px", left="35px"))
    finally:
        browser.close()  # close the browser even if navigation or rendering fails


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Convert a web page to PDF")
    parser.add_argument("-u", "--url", type=str, required=True, help="URL of the page to convert")
    parser.add_argument("-p", "--path", type=str, required=True, help="path of the PDF file to save")
    parser.add_argument("-t", "--timeout", type=int, default=30000,
                        help="navigation timeout in milliseconds (default: 30000)")
    args = parser.parse_args()
    if args.timeout < 1000:
        # Reject values under one second; the unit is milliseconds
        print("error: Please enter the correct timeout period in milliseconds.", file=sys.stderr)
        sys.exit(1)  # non-zero exit status so callers can detect the failure
    with sync_playwright() as playwright:
        run(playwright, url=args.url, path=args.path, timeout=args.timeout)
pip install -r requirements.txt
playwright install
Once the installation finishes, run the following command to convert a page:
python .\convert_pdf.py -u https://www.baidu.com --path ./page8.pdf
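Since the goal is server-side conversion, the script may also be called from another Python process instead of by hand. The following is only a minimal sketch under some assumptions: convert_pdf.py sits in the working directory, it exits with a non-zero status when conversion fails, and `build_command` and `convert` are hypothetical helper names that are not part of the script above.

```python
import subprocess
import sys


def build_command(url, path, timeout_ms=30000, script="convert_pdf.py"):
    """Build the argument list for convert_pdf.py using its -u, -p and -t flags."""
    # sys.executable reuses the interpreter (and installed packages) of the caller.
    return [sys.executable, script, "-u", url, "-p", path, "-t", str(timeout_ms)]


def convert(url, path, timeout_ms=30000):
    """Run the converter in a child process and report whether it exited with status 0."""
    try:
        # Allow extra time beyond the page timeout for browser startup and PDF rendering.
        result = subprocess.run(
            build_command(url, path, timeout_ms),
            timeout=timeout_ms / 1000 + 30,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Only print the command here; call convert(...) to actually run it.
    print(build_command("https://www.baidu.com", "./page8.pdf"))
```

Running the converter in a separate process keeps a crashed or hung browser from taking the calling service down with it; the 30-second allowance on top of the page timeout is an arbitrary choice and should be tuned for the actual server.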