<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Parsing the file</title><meta name="generator" content="DocBook XSL Stylesheets V1.61.2"><link rel="home" href="index.html" title="Libxml Tutorial"><link rel="up" href="index.html" title="Libxml Tutorial"><link rel="previous" href="ar01s02.html" title="Data Types"><link rel="next" href="ar01s04.html" title="Retrieving Element Content"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Parsing the file</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="ar01s04.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="xmltutorialparsing"></a>Parsing the file</h2></div></div><div></div></div><p><a class="indexterm" name="fileparsing"></a>
Parsing the file requires only the name of the file and a single
      function call, plus error checking. The full code is in <a href="apc.html" title="C.&nbsp;Code for Keyword Example">Appendix&nbsp;C, <i>Code for Keyword Example</i></a>.</p><p>
    </p><pre class="programlisting">
	<a name="declaredoc"></a><img src="images/callouts/1.png" alt="1" border="0"> xmlDocPtr doc;
	<a name="declarenode"></a><img src="images/callouts/2.png" alt="2" border="0"> xmlNodePtr cur;

	<a name="parsefile"></a><img src="images/callouts/3.png" alt="3" border="0"> doc = xmlParseFile(docname);
	
	<a name="checkparseerror"></a><img src="images/callouts/4.png" alt="4" border="0"> if (doc == NULL) {
		fprintf(stderr,"Document not parsed successfully.\n");
		return;
	}

	<a name="getrootelement"></a><img src="images/callouts/5.png" alt="5" border="0"> cur = xmlDocGetRootElement(doc);
	
	<a name="checkemptyerror"></a><img src="images/callouts/6.png" alt="6" border="0"> if (cur == NULL) {
		fprintf(stderr,"empty document\n");
		xmlFreeDoc(doc);
		return;
	}
	
	<a name="checkroottype"></a><img src="images/callouts/7.png" alt="7" border="0"> if (xmlStrcmp(cur-&gt;name, (const xmlChar *) "story")) {
		fprintf(stderr,"document of the wrong type, root node != story\n");
		xmlFreeDoc(doc);
		return;
	}

</pre><p>
      </p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><a href="#declaredoc"><img src="images/callouts/1.png" alt="1" border="0"></a> </td><td valign="top" align="left"><p>Declare the pointer that will point to your parsed document.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#declarenode"><img src="images/callouts/2.png" alt="2" border="0"></a> </td><td valign="top" align="left"><p>Declare a node pointer (you'll need this in order to
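	  interact with individual nodes).</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#parsefile"><img src="images/callouts/3.png" alt="3" border="0"></a> </td><td valign="top" align="left"><p>Parse the file whose name is in docname and store the
	  resulting document in doc.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkparseerror"><img src="images/callouts/4.png" alt="4" border="0"></a> </td><td valign="top" align="left"><p>Check that the document was parsed successfully. If it
	    was not, xmlParseFile returns NULL and <span class="application">libxml</span>
	    reports the error (by default on standard error); the code then
	    prints its own message and returns.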
	    </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Note</th></tr><tr><td colspan="2" align="left" valign="top"><p><a class="indexterm" name="id2525337"></a>
One common example of an error at this point is improper
	    handling of encoding. The <span class="acronym">XML</span> standard requires
	    documents stored with an encoding other than UTF-8 or UTF-16 to
	    contain an explicit declaration of their encoding. If the
	    declaration is there, <span class="application">libxml</span> will
	    automatically perform the necessary conversion to UTF-8 for
		you. More information on <span class="acronym">XML's</span> encoding
		requirements is contained in the <a href="http://www.w3.org/TR/REC-xml#charencoding" target="_top">standard</a>.</p></td></tr></table></div><p>
	  </p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#getrootelement"><img src="images/callouts/5.png" alt="5" border="0"></a> </td><td valign="top" align="left"><p>Retrieve the document's root element.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkemptyerror"><img src="images/callouts/6.png" alt="6" border="0"></a> </td><td valign="top" align="left"><p>Check to make sure the document actually contains something.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#checkroottype"><img src="images/callouts/7.png" alt="7" border="0"></a> </td><td valign="top" align="left"><p>In our case, we need to make sure the document is the right
	  type. "story" is the root type of the documents used in this
	  tutorial.</p></td></tr></table></div><p>
      <a class="indexterm" name="id2525415"></a>
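    </p><p>The checks above can be collected into a single function, as in
      the following sketch. It is not the tutorial's own code (that is in
      Appendix&nbsp;C): the function name check_story and its result codes
      are hypothetical, and it reads the document from a string with
      xmlReadMemory instead of from a file with xmlParseFile, so that it
      can be tried without a file on disk.</p><pre class="programlisting">
#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

/* Hypothetical result codes for this sketch; they are not part of libxml. */
enum {
	STORY_OK = 0,
	STORY_PARSE_ERROR = 1,
	STORY_EMPTY = 2,
	STORY_WRONG_ROOT = 3
};

/* Parse a document held in a string and apply the same three checks:
 * parsed at all, has a root element, root element is named "story". */
int check_story(const char *text)
{
	xmlDocPtr doc;
	xmlNodePtr cur;

	/* Assumption: the caller reports errors itself, so libxml's own
	 * error and warning messages are switched off here. */
	doc = xmlReadMemory(text, (int) strlen(text), "noname.xml", NULL,
			    XML_PARSE_NOERROR | XML_PARSE_NOWARNING);
	if (doc == NULL)
		return STORY_PARSE_ERROR;

	cur = xmlDocGetRootElement(doc);
	if (cur == NULL) {
		xmlFreeDoc(doc);
		return STORY_EMPTY;
	}

	/* xmlStrcmp returns 0 when the two strings are equal. */
	if (xmlStrcmp(cur-&gt;name, (const xmlChar *) "story")) {
		xmlFreeDoc(doc);
		return STORY_WRONG_ROOT;
	}

	xmlFreeDoc(doc);
	return STORY_OK;
}
</pre><p>Under these assumptions, check_story returns STORY_OK for a
      well-formed document whose root element is story, STORY_WRONG_ROOT
      for a well-formed document with any other root element, and
      STORY_PARSE_ERROR for text that is not well-formed. When building,
      the flags printed by xml2-config --cflags --libs are normally what
      the compiler and linker need.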
    </p></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ar01s02.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="index.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="ar01s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Data Types&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;Retrieving Element Content</td></tr></table></div></body></html>
