You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

48 lines
5.6 KiB

This file contains invisible Unicode characters!

This file contains invisible Unicode characters that may be processed differently from what appears below. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to reveal hidden characters.

<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Parsing the file</title><meta name="generator" content="DocBook XSL Stylesheets V1.61.2"><link rel="home" href="index.html" title="Libxml Tutorial"><link rel="up" href="index.html" title="Libxml Tutorial"><link rel="previous" href="ar01s02.html" title="Data Types"><link rel="next" href="ar01s04.html" title="Retrieving Element Content"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Parsing the file</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a> </td><th width="60%" align="center"> </th><td width="20%" align="right"> <a accesskey="n" href="ar01s04.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="xmltutorialparsing"></a>Parsing the file</h2></div></div><div></div></div><p><a class="indexterm" name="fileparsing"></a>
Parsing the file requires only the name of the file and a single
function call, plus error checking. Full code: <a href="apc.html" title="C. Code for Keyword Example">Appendix C, <i>Code for Keyword Example</i></a></p><p>
</p><pre class="programlisting">
<a name="declaredoc"></a><img src="images/callouts/1.png" alt="1" border="0"> xmlDocPtr doc;
<a name="declarenode"></a><img src="images/callouts/2.png" alt="2" border="0"> xmlNodePtr cur;
<a name="parsefile"></a><img src="images/callouts/3.png" alt="3" border="0"> doc = xmlParseFile(docname);
<a name="checkparseerror"></a><img src="images/callouts/4.png" alt="4" border="0"> if (doc == NULL ) {
fprintf(stderr,"Document not parsed successfully. \n");
return;
}
<a name="getrootelement"></a><img src="images/callouts/5.png" alt="5" border="0"> cur = xmlDocGetRootElement(doc);
<a name="checkemptyerror"></a><img src="images/callouts/6.png" alt="6" border="0"> if (cur == NULL) {
fprintf(stderr,"empty document\n");
xmlFreeDoc(doc);
return;
}
<a name="checkroottype"></a><img src="images/callouts/7.png" alt="7" border="0"> if (xmlStrcmp(cur-&gt;name, (const xmlChar *) "story")) {
fprintf(stderr,"document of the wrong type, root node != story");
xmlFreeDoc(doc);
return;
}
</pre><p>
</p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><a href="#declaredoc"><img src="images/callouts/1.png" alt="1" border="0"></a> </td><td valign="top" align="left"><p>Declare the pointer that will point to your parsed document.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#declarenode"><img src="images/callouts/2.png" alt="2" border="0"></a> </td><td valign="top" align="left"><p>Declare a node pointer (you'll need this in order to
interact with individual nodes).</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkparseerror"><img src="images/callouts/4.png" alt="4" border="0"></a> </td><td valign="top" align="left"><p>Check to see that the document was successfully parsed. If it
was not, <span class="application">libxml</span> will at this point
register an error and stop.
</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Note</th></tr><tr><td colspan="2" align="left" valign="top"><p><a class="indexterm" name="id2525337"></a>
One common example of an error at this point is improper
handling of encoding. The <span class="acronym">XML</span> standard requires
documents stored with an encoding other than UTF-8 or UTF-16 to
contain an explicit declaration of their encoding. If the
declaration is there, <span class="application">libxml</span> will
automatically perform the necessary conversion to UTF-8 for
you. More information on <span class="acronym">XML's</span> encoding
requirements is contained in the <a href="http://www.w3.org/TR/REC-xml#charencoding" target="_top">standard</a>.</p></td></tr></table></div><p>
</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#getrootelement"><img src="images/callouts/5.png" alt="5" border="0"></a> </td><td valign="top" align="left"><p>Retrieve the document's root element.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkemptyerror"><img src="images/callouts/6.png" alt="6" border="0"></a> </td><td valign="top" align="left"><p>Check to make sure the document actually contains something.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkroottype"><img src="images/callouts/7.png" alt="7" border="0"></a> </td><td valign="top" align="left"><p>In our case, we need to make sure the document is the right
type. "story" is the root type of the documents used in this
tutorial.</p></td></tr></table></div><p>
<a class="indexterm" name="id2525415"></a>
</p></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="index.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="ar01s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Data Types </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Retrieving Element Content</td></tr></table></div></body></html>