<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Parsing the file</title><meta name="generator" content="DocBook XSL Stylesheets V1.61.2"><link rel="home" href="index.html" title="Libxml Tutorial"><link rel="up" href="index.html" title="Libxml Tutorial"><link rel="previous" href="ar01s02.html" title="Data Types"><link rel="next" href="ar01s04.html" title="Retrieving Element Content"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Parsing the file</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="ar01s04.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="xmltutorialparsing"></a>Parsing the file</h2></div></div><div></div></div><p><a class="indexterm" name="fileparsing"></a> Parsing the file requires only the name of the file and a single function call, plus error checking. The full code is in <a href="apc.html" title="C.&nbsp;Code for Keyword Example">Appendix&nbsp;C, <i>Code for Keyword Example</i></a>.</p><p> </p><pre class="programlisting">
        <a name="declaredoc"></a><img src="images/callouts/1.png" alt="1" border="0"> xmlDocPtr doc;
        <a name="declarenode"></a><img src="images/callouts/2.png" alt="2" border="0"> xmlNodePtr cur;

        <a name="parsefile"></a><img src="images/callouts/3.png" alt="3" border="0"> doc = xmlParseFile(docname);

        <a name="checkparseerror"></a><img src="images/callouts/4.png" alt="4" border="0"> if (doc == NULL) {
                fprintf(stderr, "Document not parsed successfully.\n");
                return;
        }

        <a name="getrootelement"></a><img src="images/callouts/5.png" alt="5" border="0"> cur = xmlDocGetRootElement(doc);

        <a name="checkemptyerror"></a><img src="images/callouts/6.png" alt="6" border="0"> if (cur == NULL) {
                fprintf(stderr, "empty document\n");
                xmlFreeDoc(doc);
                return;
        }

        <a name="checkroottype"></a><img src="images/callouts/7.png" alt="7" border="0"> if (xmlStrcmp(cur-&gt;name, (const xmlChar *) "story")) {
                fprintf(stderr, "document of the wrong type, root node != story\n");
                xmlFreeDoc(doc);
                return;
        }
</pre><p> </p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><a href="#declaredoc"><img src="images/callouts/1.png" alt="1" border="0"></a> </td><td valign="top" align="left"><p>Declare the pointer that will point to your parsed document.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#declarenode"><img src="images/callouts/2.png" alt="2" border="0"></a> </td><td valign="top" align="left"><p>Declare a node pointer (you'll need this in order to interact with individual nodes).</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#parsefile"><img src="images/callouts/3.png" alt="3" border="0"></a> </td><td valign="top" align="left"><p>Parse the file named by docname. The call returns NULL if the file could not be parsed.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkparseerror"><img src="images/callouts/4.png" alt="4" border="0"></a> </td><td valign="top" align="left"><p>Check that the document was parsed successfully. If it was not, <span class="application">libxml</span> reports the error and parsing stops, so doc is NULL at this point. </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Note</th></tr><tr><td colspan="2" align="left" valign="top"><p><a class="indexterm" name="id2525337"></a> A common cause of errors at this point is improper handling of encoding.
The <span class="acronym">XML</span> standard requires documents stored with an encoding other than UTF-8 or UTF-16 to contain an explicit declaration of their encoding. If the declaration is there, <span class="application">libxml</span> will automatically perform the necessary conversion to UTF-8 for you. More information on <span class="acronym">XML's</span> encoding requirements is contained in the <a href="http://www.w3.org/TR/REC-xml#charencoding" target="_top">standard</a>.</p></td></tr></table></div><p> </p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#getrootelement"><img src="images/callouts/5.png" alt="5" border="0"></a> </td><td valign="top" align="left"><p>Retrieve the document's root element.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkemptyerror"><img src="images/callouts/6.png" alt="6" border="0"></a> </td><td valign="top" align="left"><p>Check that the document is not empty, that is, that it has a root element.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkroottype"><img src="images/callouts/7.png" alt="7" border="0"></a> </td><td valign="top" align="left"><p>In our case, we need to make sure the document is the right type. "story" is the name of the root element of the documents used in this tutorial.</p></td></tr></table></div><p> <a class="indexterm" name="id2525415"></a> </p></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="index.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="ar01s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Data Types&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;Retrieving Element Content</td></tr></table></div></body></html>