python: xmlhelper

Date: 2023-07-16 22:46:08
Tags: xml python doc xmlhelper url child news root

 

xml:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
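
Before the full helper script, here is a minimal sketch of how the sample above could be read with `xml.etree.ElementTree`. It is not part of the original script: the function name `summarize_countries` and the embedded `SAMPLE_XML` string (a shortened copy of the sample) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document above (two of the three countries).
SAMPLE_XML = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""


def summarize_countries(xml_text):
    """Return one dict per <country> with its name, rank and neighbor names.

    Hypothetical helper, written only to illustrate attribute and text access.
    """
    root = ET.fromstring(xml_text)
    summary = []
    for country in root.findall('country'):
        summary.append({
            # Attributes are read with .get()
            'name': country.get('name'),
            # Child element text is read with .findtext()
            'rank': int(country.findtext('rank')),
            'neighbors': [n.get('name') for n in country.findall('neighbor')],
        })
    return summary


if __name__ == '__main__':
    for entry in summarize_countries(SAMPLE_XML):
        print(entry)
```

Attributes such as `name` live on the element itself, while values such as `rank` are the text of child elements, so the two are read through different calls.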

  

# encoding: utf-8
# Copyright 2023 涂聚文有限公司
# License information:
# Description: pip install requests
# Author    : geovindu,Geovin Du 涂聚文.
# IDE       : PyCharm 2023.1 python 311
# Datetime  : 2023/7/16 22:17
# User      : geovindu
# Product   : PyCharm
# Project   : pythonTkinterDemo
# File      : XmlHelper.py
# explain   : learning


from xml.dom import minidom
import xml.etree.ElementTree as ET
import csv
import requests
import os
import sys

def readXml(url):
    tree = ET.parse(url)
    root = tree.getroot()
    for child in root:
        print(child.tag, child.attrib)


def writeXml(url):
    # Instantiate the Document tree
    doc = minidom.Document()
    # Create the root node; an XML document must have a root element
    root_node = doc.createElement('root')
    # Attach the element to the document tree
    doc.appendChild(root_node)

    # Create a child element
    c_node1 = doc.createElement('movie')
    root_node.appendChild(c_node1)
    # Set an attribute on the element to store data
    c_node1.setAttribute('shelf', 'New Arrivals')
    # Second-level child node
    c_node2 = doc.createElement('type')
    c_node1.appendChild(c_node2)
    # Text is also created through the DOM: a text node is treated as a child node
    c_text = doc.createTextNode("War, Thriller")
    c_node2.appendChild(c_text)
    try:
        with open(url, 'w', encoding='UTF-8') as f:
            # The first argument is the target file object
            doc.writexml(f, indent='', addindent='\t', newl='\n', encoding='UTF-8')
    except Exception as e:
        print('Error:', e)


def loadRSS():
    # url of rss feed
    url = 'http://www.hindustantimes.com/rss/topnews/rssfeed.xml'

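    # creating HTTP response object from given url
    # (timeout avoids hanging forever; raise_for_status stops on HTTP errors)
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()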

    # saving the xml file
    with open('topnewsfeed.xml', 'wb') as f:
        f.write(resp.content)


def parseXML(xmlfile):
    # create element tree object
    tree = ET.parse(xmlfile)

    # get root element
    root = tree.getroot()

    # create empty list for news items
    newsitems = []

    # iterate news items
    for item in root.findall('./channel/item'):

        # empty news dictionary
        news = {}

        # iterate child elements of item
        for child in item:

            # special checking for namespace object content:media
            if child.tag == '{http://search.yahoo.com/mrss/}content':
                news['media'] = child.attrib.get('url', '')
            else:
                # keep text as str (Python 3); empty elements have text None
                news[child.tag] = child.text or ''

        # append news dictionary to news items list
        newsitems.append(news)

    # return news items list
    return newsitems


def savetoCSV(newsitems, filename):
    # specifying the fields for csv file
    fields = ['guid', 'title', 'pubDate', 'description', 'link', 'media']

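    # writing to csv file (newline='' is required by the csv module)
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        # creating a csv dict writer object;
        # extrasaction='ignore' skips item tags that are not listed in fields
        writer = csv.DictWriter(csvfile, fieldnames=fields, extrasaction='ignore')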

        # writing headers (field names)
        writer.writeheader()

        # writing data rows
        writer.writerows(newsitems)


def main():
    # load rss from web to update existing xml file
    loadRSS()

    # parse xml file
    newsitems = parseXML('topnewsfeed.xml')

    # store news items in a csv file
    savetoCSV(newsitems, 'geovindu.csv')


if __name__ == '__main__':
    main()
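
The `parseXML` and `savetoCSV` steps can be tried without network access. The sketch below is not from the original script: the feed text in `SAMPLE_RSS` is invented for illustration, and `items_from_rss` and `rows_to_csv_text` are hypothetical names. It follows the same approach, matching the namespaced `media:content` tag and writing rows with `csv.DictWriter`, but works on an in-memory string and returns the CSV as text.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical feed content, made up for illustration only.
SAMPLE_RSS = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example headline</title>
      <link>http://example.com/story</link>
      <media:content url="http://example.com/image.jpg"/>
    </item>
  </channel>
</rss>"""

# ElementTree expands a prefixed tag to the form {namespace-uri}localname.
MEDIA_CONTENT = '{http://search.yahoo.com/mrss/}content'
FIELDS = ['title', 'link', 'media']


def items_from_rss(xml_text):
    """Collect each <item> as a dict of tag -> text, plus the media URL."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall('./channel/item'):
        row = {}
        for child in item:
            if child.tag == MEDIA_CONTENT:
                row['media'] = child.get('url', '')
            else:
                # Empty elements have text None; store an empty string instead.
                row[child.tag] = child.text or ''
        items.append(row)
    return items


def rows_to_csv_text(rows):
    """Write the rows to an in-memory CSV and return it as a string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == '__main__':
    print(rows_to_csv_text(items_from_rss(SAMPLE_RSS)))
```

Keeping values as `str` rather than encoded bytes matters under Python 3, where the `csv` module would otherwise write the `b'...'` representation into the file.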

  

From: https://www.cnblogs.com/geovindu/p/17558751.html
