Need help extracting this XML node - Excel connection string in Python

Time: 2024-07-25 13:33:01 | Views: 18

Tags: python excel xml

I have a Python program that opens Excel (XLSX) files and tries to find the <connection> node.

This is the complete XML from the connections.xml file.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<connections 
    xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" 
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" 
    mc:Ignorable="xr16" 
    xmlns:xr16="http://schemas.microsoft.com/office/spreadsheetml/2017/revision16">
    <connection 
        id="1" xr16:uid="{#####}" keepAlive="1" 
        name="Query - CargoData_small" 
        description="Connection to the 'CargoData_small' query in the workbook." 
        type="5" refreshedVersion="7" background="1">
        <dbPr connection="Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=CargoData_small;Extended Properties=&quot;&quot;" 
            command="SELECT * FROM [CargoData_small]"/>
    </connection>
</connections>

I am trying to find the <dbPr> node, but I am stuck on the child nodes in my code, shown below:

def checkfile(filename):
    if zipfile.is_zipfile(filename):
        zf = zipfile.ZipFile(filename, 'r')
        if "xl/connections.xml" in zf.namelist():
            print(filename)
            xml = zf.read('xl/connections.xml')
            root = parseString(xml)
            connections = root.getElementsByTagName('connection')
            try:
                for con in connections:
                    for child in con.childNodes:

                        # there are no 'children'
                        for children in child.childNodes:
                            dsn = children.attributes.values()[0].nodeValue
                            sql = children.attributes.values()[1].nodeValue
                            writeoutput(filename, dsn, sql )
            except:
                pass

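So what happens is that I get the "child" value, but I cannot find the dbPr part.

This is the error I receive:

TypeError: 'dict_values' object is not subscriptable

I am using PyCharm as my IDE.

Thanks

Here is an improved way to extract the Excel connection string with Python: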

import zipfile
from xml.dom.minidom import parseString

def checkfile(filename):
    if zipfile.is_zipfile(filename):
        with zipfile.ZipFile(filename, 'r') as zf:
            if "xl/connections.xml" in zf.namelist():
                print(filename)
                xml = zf.read('xl/connections.xml')
                root = parseString(xml)
                connections = root.getElementsByTagName('connection')
                for con in connections:
                    dbpr = con.getElementsByTagName('dbPr')[0]
                    dsn = dbpr.getAttribute('connection')
                    sql = dbpr.getAttribute('command')
                    writeoutput(filename, dsn, sql)

def writeoutput(filename, dsn, sql):
    # Process the extracted information as needed
    print(f"File: {filename}")
    print(f"DSN: {dsn}")
    print(f"SQL: {sql}")
    print("-" * 20)

# Example usage
checkfile("your_excel_file.xlsx")

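Explanation:

Access dbPr directly: The code now uses con.getElementsByTagName('dbPr')[0] to get the dbPr element instead of looping over all child nodes. This searches inside con (the current connection element) for dbPr elements and takes the first match, which is the only one in the sample XML. The original nested loop went one level too deep: dbPr is a direct child of connection and has no children of its own.
Use getAttribute: The getAttribute method retrieves the connection and command attribute values by name, instead of indexing into attributes.values(). In Python 3 that call returns a dict_values view, which cannot be indexed, hence the TypeError.
Error handling: The code does not explicitly handle a connection element that has no dbPr child; in that case the [0] lookup raises IndexError. If some files may lack connection information, add error handling.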
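As a minimal sketch of the error handling mentioned above, the variant below skips any connection element that has no dbPr child instead of raising IndexError. The helper name extract_connections and the second, dbPr-less connection in the sample are assumptions made for illustration; they are not part of the original answer or the original file.

```python
from xml.dom.minidom import parseString


def extract_connections(xml_bytes):
    """Return (connection, command) tuples, skipping elements without <dbPr>.

    Hypothetical helper for illustration; it takes the raw bytes of
    xl/connections.xml (for example from ZipFile.read).
    """
    root = parseString(xml_bytes)
    results = []
    for con in root.getElementsByTagName('connection'):
        dbprs = con.getElementsByTagName('dbPr')
        if not dbprs:
            # No <dbPr> child: nothing to extract for this connection.
            continue
        dbpr = dbprs[0]
        results.append((dbpr.getAttribute('connection'),
                        dbpr.getAttribute('command')))
    return results


# Sample input: the first connection mirrors the question's XML (shortened);
# the second one is invented to exercise the "no dbPr" path.
sample = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<connections xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
    <connection id="1" name="Query - CargoData_small" type="5">
        <dbPr connection="Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=CargoData_small"
              command="SELECT * FROM [CargoData_small]"/>
    </connection>
    <connection id="2" name="No dbPr here" type="4"/>
</connections>"""

for dsn, sql in extract_connections(sample):
    print(dsn)
    print(sql)
```

Returning a list rather than calling writeoutput directly keeps the parsing step separate from the reporting step, so checkfile could pass each tuple on to writeoutput.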

With these changes, the code should extract the Excel connection string and avoid the earlier error.
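For reference, the earlier TypeError can be reproduced in isolation. The sketch below uses a made-up one-element document (the attribute values dsn-value and sql-value are placeholders) to show that indexing attributes.values() fails in Python 3, while wrapping it in list() or using getAttribute works.

```python
from xml.dom.minidom import parseString

# Placeholder document: a single element with the two attributes of interest.
doc = parseString('<dbPr connection="dsn-value" command="sql-value"/>')
node = doc.documentElement

values = node.attributes.values()  # a dict view in Python 3, not a list
try:
    values[0]
    indexing_failed = False
except TypeError:
    # Same failure as in the question: the view is not subscriptable.
    indexing_failed = True

# Workaround 1: materialize the view before indexing (relies on attribute order).
first_value = list(values)[0].nodeValue

# Workaround 2: look the attribute up by name, independent of order.
by_name = node.getAttribute('connection')

print(indexing_failed, first_value, by_name)
```

Looking attributes up by name is the more robust of the two, since it does not depend on the order in which the attributes appear in the file.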

From： 78791591
