xml
What actually are PCDATA and CDATA?
From WIKI: PCDATA. Simply speaking, PCDATA stands for Parsed Character Data. That means the characters are to be parsed by the XML, XHTML, or HTML parser (&lt; will be changed to <, <p> will be taken to mean a paragraph tag, etc.). Compare that with CDATA, where the characters are not to be parsed by …
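A minimal sketch of the difference, using Python's standard `xml.etree.ElementTree`. The sample document and its element names (`parsed`, `literal`) are made up for illustration, and the example covers only CDATA sections inside element content, not the CDATA attribute type used in DTDs:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: one element holds ordinary parsed
# character data, the other wraps its text in a CDATA section.
doc = (
    "<root>"
    "<parsed>1 &lt; 2</parsed>"
    "<literal><![CDATA[1 < 2 &lt; 3]]></literal>"
    "</root>"
)

root = ET.fromstring(doc)

# In parsed character data the parser expands the entity reference.
parsed_text = root.find("parsed").text

# Inside a CDATA section the characters are passed through untouched,
# so "&lt;" stays as the five literal characters.
literal_text = root.find("literal").text

print(parsed_text)
print(literal_text)
```

The first element yields a real `<` character, while the second keeps both the raw `<` and the unexpanded `&lt;`.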
Is there an XSLT buddy available somewhere?
XSLT IDEs (Integrated Development Environments): XSelerator (the one I’ve been using for 6-7 years). Free, has a debugger for MSXML, and has IntelliSense for both XSLT 1.0 and XSLT 2.0, plus some dynamic IntelliSense. The debugger has breakpoints and data breakpoints, and visualizes temporary trees, variables, test conditions, current output, etc. VS2008: a good XML …
invalid byte 2 of 2-byte UTF-8 sequence
Most commonly this happens when the input is ISO-8859-x (Latin-x, such as Latin-1) but the parser thinks it is getting UTF-8. Certain sequences of Latin-1 characters (two consecutive characters with accents or umlauts) form something that is invalid as UTF-8: the first byte announces a multi-byte sequence, and the second byte then has unexpected high-order bits. This can easily occur when …
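The mismatch can be reproduced outside any XML parser. The sketch below is an illustration in Python, not the parser that produced the original error message; the helper name `try_decode` is hypothetical. It encodes two accented characters as Latin-1 and then tries to read the bytes back as UTF-8:

```python
# Two consecutive accented characters. In Latin-1 each "É" (U+00C9)
# is the single byte 0xC9, whose bit pattern 110xxxxx looks like the
# lead byte of a 2-byte UTF-8 sequence.
latin1_bytes = "\u00c9\u00c9".encode("latin-1")  # b'\xc9\xc9'


def try_decode(data: bytes, encoding: str) -> str:
    """Decode the bytes, or report why decoding failed (hypothetical helper)."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


# Read as UTF-8, the second 0xC9 is not a valid continuation byte
# (those must look like 10xxxxxx), so decoding fails.
as_utf8 = try_decode(latin1_bytes, "utf-8")

# Read with the encoding the bytes were actually written in, it works.
as_latin1 = try_decode(latin1_bytes, "latin-1")

print(as_utf8)
print(as_latin1)
```

The usual remedy is the same in any language: either declare the real encoding in the XML declaration or convert the bytes to UTF-8 before handing them to the parser.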
How to rename the XML attribute that is generated after serializing a List of objects
The most reliable way is to declare an outermost DTO class:

    using System.Collections.Generic;
    using System.Xml.Serialization;

    [XmlRoot("myOuterElement")]
    public class MyOuterMessage
    {
        // Each list entry is written as an <item> element.
        [XmlElement("item")]
        public List<TestObject> Items { get; set; }
    }

and serialize that (i.e. put your list into another object). You can avoid a wrapper class, but I wouldn’t:

    class Program
    {
        static void Main()
        {
            XmlSerializer ser = new XmlSerializer(typeof(List<Foo>), new XmlRootAttribute("Flibble"));
            …
How do I detect XML parsing errors when using JavaScript’s DOMParser in a cross-browser way?
This is the best solution I’ve come up with. I attempt to parse a string that is intentionally invalid XML and observe the namespace of the resulting <parsererror> element. Then, when parsing actual XML, I can use getElementsByTagNameNS to detect the same kind of <parsererror> element and throw a JavaScript Error. // My function that …
How do I use a default namespace in an lxml xpath query?
Something like this should work:

    import lxml.etree as et

    # Bind a prefix of your choice to the document's default namespace.
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    tree = et.fromstring(xml)
    for node in tree.xpath('//atom:entry', namespaces=ns):
        print(node)

See also http://lxml.de/xpathxslt.html#namespaces-and-prefixes.

Alternative:

    # Match on the local name only, ignoring the namespace.
    for node in tree.xpath("//*[local-name() = 'entry']"):
        print(node)
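For a version that runs without installing lxml, the same prefix-mapping idea can be tried with the standard library's `xml.etree.ElementTree`, which accepts a namespace dictionary in `find` and `findall`. This is a sketch with a made-up Atom feed, and ElementTree supports only a limited XPath subset, so it is not a drop-in replacement for `tree.xpath`:

```python
import xml.etree.ElementTree as ET

# Made-up sample feed that uses a default (unprefixed) namespace.
feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <entry><title>First</title></entry>
  <entry><title>Second</title></entry>
</feed>"""

# The prefix "atom" is arbitrary; only the URI has to match the document.
ns = {"atom": "http://www.w3.org/2005/Atom"}

root = ET.fromstring(feed)

# Without the prefix the query finds nothing, because the elements
# really live in the Atom namespace.
unqualified = root.findall(".//entry")

# With the prefix bound to the namespace URI the entries are found.
entries = root.findall(".//atom:entry", ns)
titles = [entry.find("atom:title", ns).text for entry in entries]

print(len(unqualified), titles)
```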
What Is the Difference Between a Tag and an Element?
Tags mark the start and end of an element.

    <foo>: start tag
    </foo>: end tag
    <foo></foo>: element

See the specification: "Each XML document contains one or more elements, the boundaries of which are either delimited by start-tags and end-tags, or, for empty elements, by an empty-element tag." See also section 5 of …
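One way to see the distinction in practice is an event-based parse, where each tag produces an event while the element is the whole unit between them. The sketch below uses `iterparse` from Python's standard `xml.etree.ElementTree` on a made-up two-element document:

```python
import io
import xml.etree.ElementTree as ET

# Made-up document: <foo> has a start tag and an end tag,
# <bar/> is a single empty-element tag.
doc = b"<foo><bar/>text</foo>"

# Record one (event, name) pair per tag boundary the parser reports.
events = [
    (event, elem.tag)
    for event, elem in ET.iterparse(io.BytesIO(doc), events=("start", "end"))
]

# Count elements in the finished tree, including the root.
element_count = len(list(ET.fromstring(doc).iter()))

print(events)
print(element_count)
```

The parser reports four boundaries but the tree contains only two elements: the empty-element tag `<bar/>` yields both a start and an end event, yet it is still a single element.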