Handling Invalid Characters in an XML String (zz.IS2120.BG57IV3)

Posted: 2023-06-09 17:03:09
Tags: XML, Handling, string, invalidChars, invalid, XmlElement, Characters, xmlDoc


There are five predefined entity references in XML:

Entity    Character   Name
&lt;      <           less than
&gt;      >           greater than
&amp;     &           ampersand
&apos;    '           apostrophe
&quot;    "           quotation mark

Note:
 Strictly speaking, only the characters "<" and "&" are illegal in XML character data. Apostrophes, quotation marks, and greater-than signs are legal there, but it is a good habit to replace them as well. Inside an attribute value, the quote character that delimits the value must also be escaped.



Recipe 15.7. Handling Invalid Characters in an XML String


Problem

You are creating an XML string. Before adding a tag containing a text element, you want to check it to determine whether the string contains any of the following invalid characters:


< > " ' &



If any of these characters are encountered, you want them to be replaced with their escaped form:


&lt; &gt; &quot; &apos; &amp;
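
If you build the XML text by hand rather than through XmlWriter or XmlDocument, the replacement described above can be sketched with the framework's SecurityElement.Escape method, which substitutes all five characters. This is only an illustration: the class name EscapeSketch and the helper EscapeForXml are made up for this example and are not part of the recipe.

```csharp
using System;
using System.Security;

public static class EscapeSketch
{
    // Hypothetical helper name. SecurityElement.Escape is the framework call
    // that replaces <, >, ", ' and & with their entity references.
    public static string EscapeForXml(string text)
    {
        return SecurityElement.Escape(text);
    }

    public static void Main()
    {
        // Prints: &lt;&gt;&quot;&apos;&amp;
        Console.WriteLine(EscapeForXml("<>\"'&"));
    }
}
```

The approaches in the Solution below are usually preferable, because the XML classes then decide what needs escaping in each context.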




Solution

There are different ways to accomplish this, depending on which XML-creation approach you are using. If you are using XmlWriter, the WriteCData, WriteString, WriteAttributeString, WriteValue, and WriteElementString methods take care of this for you. If you are using XmlDocument and XmlElements, the XmlElement.InnerText property escapes these characters for you.

There are two ways to handle this with an XmlWriter. The WriteCData method wraps the invalid character text in a CDATA section, as shown in the creation of the InvalidChars1 element in the example that follows. The other way is to use the WriteElementString method, which automatically escapes the text for you, as shown in the creation of the InvalidChars2 element:


// Set up a string with our invalid chars.
string invalidChars = @"<>\&'";

XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;

using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
{
    writer.WriteStartElement("Root");

    // Wrap the text in a CDATA section.
    writer.WriteStartElement("InvalidChars1");
    writer.WriteCData(invalidChars);
    writer.WriteEndElement();

    // Let the writer escape the text.
    writer.WriteElementString("InvalidChars2", invalidChars);

    writer.WriteEndElement();
}



The output from this is:


<?xml version="1.0" encoding="IBM437"?>
<Root>
  <InvalidChars1><![CDATA[<>\&']]></InvalidChars1>
  <InvalidChars2>&lt;&gt;\&amp;'</InvalidChars2>
</Root>
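
The Solution also lists WriteAttributeString among the methods that escape for you. A minimal sketch of that case follows, assuming a hypothetical Root element with a Value attribute (neither name is from the recipe). Rather than asserting the exact markup the writer emits, it checks that the attribute reads back unchanged.

```csharp
using System;
using System.Text;
using System.Xml;

public static class AttributeSketch
{
    // Hypothetical helper: writes <Root Value="..."/> and returns the markup.
    public static string WriteAttribute(string value)
    {
        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("Root");
            // The writer escapes the attribute value for us.
            writer.WriteAttributeString("Value", value);
            writer.WriteEndElement();
        }
        return sb.ToString();
    }

    // Hypothetical helper: parses the markup and returns the attribute value.
    public static string ReadAttribute(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("Value");
    }

    public static void Main()
    {
        string invalidChars = @"<>\&'";
        string xml = WriteAttribute(invalidChars);
        Console.WriteLine(xml);
        // The parsed attribute should equal the original string.
        Console.WriteLine(ReadAttribute(xml) == invalidChars);
    }
}
```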



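There are two ways you can handle this problem with XmlDocument and XmlElement. The first way is to put the text you are adding to the XML element inside a CDATA section, either by appending a node created with XmlDocument.CreateCDataSection (shown here) or by assigning the wrapped text to the InnerXml property of the XmlElement (shown in the full listing later):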


// Set up a string with our invalid chars.
string invalidChars = @"<>\&'";

XmlElement invalidElement1 = xmlDoc.CreateElement("InvalidChars1");
invalidElement1.AppendChild(xmlDoc.CreateCDataSection(invalidChars));



The second way is to let the XmlElement class escape the data for you by assigning the text directly to the InnerText property:


// Set up a string with our invalid chars.
string invalidChars = @"<>\&'";

XmlElement invalidElement2 = xmlDoc.CreateElement("InvalidChars2");
invalidElement2.InnerText = invalidChars;



The whole XmlDocument is created with these XmlElements in the following code:


public static void HandlingInvalidChars()
{
    // Set up a string with our invalid chars.
    string invalidChars = @"<>\&'";
    XmlDocument xmlDoc = new XmlDocument();

    // Create a root node for the document.
    XmlElement root = xmlDoc.CreateElement("Root");
    xmlDoc.AppendChild(root);

    // Create the first invalid character node.
    XmlElement invalidElement1 = xmlDoc.CreateElement("InvalidChars1");
    // Wrap the invalid chars in a CDATA section and use the
    // InnerXml property to assign the value as it doesn't
    // escape the values, just passes in the text provided.
    invalidElement1.InnerXml = "<![CDATA[" + invalidChars + "]]>";
    // Append the element to the root node.
    root.AppendChild(invalidElement1);

    // Create the second invalid character node.
    XmlElement invalidElement2 = xmlDoc.CreateElement("InvalidChars2");
    // Add the invalid chars directly using the InnerText
    // property to assign the value as it will automatically
    // escape the values.
    invalidElement2.InnerText = invalidChars;
    // Append the element to the root node.
    root.AppendChild(invalidElement2);

    Console.WriteLine("Generated XML with Invalid Chars:\r\n{0}", xmlDoc.OuterXml);
    Console.WriteLine();
}



The XML created by this procedure (and output to the console) looks like this:


Generated XML with Invalid Chars:
<Root><InvalidChars1><![CDATA[<>\&']]></InvalidChars1><InvalidChars2>&lt;&gt;\&amp;'</InvalidChars2></Root>




Discussion

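The CDATA node allows you to represent the items in the text section as character data, not as escaped XML, for ease of entry. Normally these characters would need to be in their escaped format (&lt; for < and so on), but the CDATA node allows you to enter them as regular text. The one sequence a CDATA section cannot contain is "]]>", because it ends the section.

When the CDATA tag is used in conjunction with the InnerXml property of the XmlElement class, you can submit characters that would normally need to be escaped first. The XmlElement class also has an InnerText property that automatically escapes any markup in the string assigned to it, so you can add these characters without escaping them yourself.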
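
Whichever form you choose, a reader of the document sees the same text. The sketch below illustrates this under a few assumptions: the class RoundTripSketch and its helper names are invented for this example, and each element is serialized through OuterXml and reparsed with LoadXml before its InnerText is read.

```csharp
using System;
using System.Xml;

public static class RoundTripSketch
{
    // Hypothetical helper: stores the text in a CDATA section, then reparses it.
    public static string ViaCData(string text)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement element = doc.CreateElement("InvalidChars1");
        element.AppendChild(doc.CreateCDataSection(text));
        return Reload(element.OuterXml);
    }

    // Hypothetical helper: lets InnerText escape the text, then reparses it.
    public static string ViaInnerText(string text)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement element = doc.CreateElement("InvalidChars2");
        element.InnerText = text;
        return Reload(element.OuterXml);
    }

    // Parses the serialized element and returns its text content.
    private static string Reload(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string invalidChars = @"<>\&'";
        // Both comparisons should hold: the original text survives either way.
        Console.WriteLine(ViaCData(invalidChars) == invalidChars);
        Console.WriteLine(ViaInnerText(invalidChars) == invalidChars);
    }
}
```

The practical difference is in the serialized markup: CDATA keeps the text readable as typed, while InnerText produces entity references.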



See Also

See the "XmlDocument Class," "XmlWriter Class," "XmlElement Class," and "CDATA Sections" topics in the MSDN documentation.

From: https://blog.51cto.com/u_16156420/6449442
