<html> <head> <title>pcre2serialize specification</title> </head> <body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB"> <h1>pcre2serialize man page</h1> <p> Return to the <a href="index.html">PCRE2 index page</a>. </p> <p> This page is part of the PCRE2 HTML documentation. It was generated automatically from the original man page. If there is any nonsense in it, please consult the man page, in case the conversion went wrong. <br> <ul> <li><a name="TOC1" href="#SEC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a> <li><a name="TOC2" href="#SEC2">SECURITY CONCERNS</a> <li><a name="TOC3" href="#SEC3">SAVING COMPILED PATTERNS</a> <li><a name="TOC4" href="#SEC4">RE-USING PRECOMPILED PATTERNS</a> <li><a name="TOC5" href="#SEC5">AUTHOR</a> <li><a name="TOC6" href="#SEC6">REVISION</a> </ul> <br><a name="SEC1" href="#TOC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a><br> <P> <b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b> <b> int32_t <i>number_of_codes</i>, const uint8_t *<i>bytes</i>,</b> <b> pcre2_general_context *<i>gcontext</i>);</b> <br> <br> <b>int32_t pcre2_serialize_encode(const pcre2_code **<i>codes</i>,</b> <b> int32_t <i>number_of_codes</i>, uint8_t **<i>serialized_bytes</i>,</b> <b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b> <br> <br> <b>void pcre2_serialize_free(uint8_t *<i>bytes</i>);</b> <br> <br> <b>int32_t pcre2_serialize_get_number_of_codes(const uint8_t *<i>bytes</i>);</b> <br> <br> If you are running an application that uses a large number of regular expression patterns, it may be useful to store them in a precompiled form instead of having to compile them every time the application is run. However, if you are using the just-in-time optimization feature, it is not possible to save and reload the JIT data, because it is position-dependent. 
The host on which the patterns are reloaded must be running the same version of PCRE2, with the same code unit width, and must also have the same endianness, pointer width and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be reloaded using the 8-bit library. </P> <P> Note that "serialization" in PCRE2 does not convert compiled patterns to an abstract format like Java or .NET serialization. The serialized output is really just a bytecode dump, which is why it can only be reloaded in the same environment as the one that created it. Hence the restrictions mentioned above. Applications that are not statically linked with a fixed version of PCRE2 must be prepared to recompile patterns from their sources, in order to be immune to PCRE2 upgrades. </P> <br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br> <P> The facility for saving and restoring compiled patterns is intended for use within individual applications. As such, the data supplied to <b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from arbitrary external sources. There is only some simple consistency checking, not complete validation of what is being re-loaded. Corrupted data may cause undefined results. For example, if the length field of a pattern in the serialized data is corrupted, the deserializing code may read beyond the end of the byte stream that is passed to it. </P> <br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br> <P> Before compiled patterns can be saved they must be serialized, which in PCRE2 means converting the pattern to a stream of bytes. A single byte stream may contain any number of compiled patterns, but they must all use the same character tables. A single copy of the tables is included in the byte stream (its size is 1088 bytes). 
For more details of character tables, see the <a href="pcre2api.html#localesupport">section on locale support</a> in the <a href="pcre2api.html"><b>pcre2api</b></a> documentation.
</P>
<P>
The function <b>pcre2_serialize_encode()</b> creates a serialized byte stream from a list of compiled patterns. Its first two arguments specify the list, being a pointer to a vector of pointers to compiled patterns, and the length of the vector. The third and fourth arguments point to variables that receive a pointer to the created byte stream and its length, respectively. The final argument is a pointer to a general context, which can be used to specify custom memory management functions. If this argument is NULL, <b>malloc()</b> is used to obtain memory for the byte stream. The yield of the function is the number of serialized patterns, or one of the following negative error codes:
<pre>
  PCRE2_ERROR_BADDATA       the number of patterns is zero or less
  PCRE2_ERROR_BADMAGIC      mismatch of id bytes in one of the patterns
  PCRE2_ERROR_MEMORY        memory allocation failed
  PCRE2_ERROR_MIXEDTABLES   the patterns do not all use the same tables
  PCRE2_ERROR_NULL          the 1st, 3rd, or 4th argument is NULL
</pre>
PCRE2_ERROR_BADMAGIC means either that a pattern's code has been corrupted, or that a slot in the vector does not point to a compiled pattern.
</P>
<P>
Once a set of patterns has been serialized you can save the data in any appropriate manner. Here is sample code that compiles two patterns and writes them to a file. It assumes that the variable <i>fd</i> refers to a file that is open for output (a <b>FILE *</b> stream, because the code calls <b>fwrite()</b>). The error checking that should be present in a real application has been omitted for simplicity.
<pre>
  int errorcode;
  uint8_t *bytes;
  PCRE2_SIZE erroroffset;
  PCRE2_SIZE bytescount;
  pcre2_code *list_of_codes[2];

  /* Compile the two patterns; PCRE2_SPTR is the library's string type. */
  list_of_codes[0] = pcre2_compile((PCRE2_SPTR)"first pattern",
    PCRE2_ZERO_TERMINATED, 0, &amp;errorcode, &amp;erroroffset, NULL);
  list_of_codes[1] = pcre2_compile((PCRE2_SPTR)"second pattern",
    PCRE2_ZERO_TERMINATED, 0, &amp;errorcode, &amp;erroroffset, NULL);

  /* Serialize both patterns into one byte stream, then write it out. */
  errorcode = pcre2_serialize_encode((const pcre2_code **)list_of_codes, 2,
    &amp;bytes, &amp;bytescount, NULL);
  errorcode = fwrite(bytes, 1, bytescount, fd);
</pre>
Note that the serialized data is binary data that may contain any of the 256 possible byte values. On systems that make a distinction between binary and non-binary data, be sure that the file is opened for binary output.
</P>
<P>
Serializing a set of patterns leaves the original data untouched, so they can still be used for matching. Their memory must eventually be freed in the usual way by calling <b>pcre2_code_free()</b>. When you have finished with the byte stream, it too must be freed by calling <b>pcre2_serialize_free()</b>. If this function is called with a NULL argument, it returns immediately without doing anything.
</P>
<br><a name="SEC4" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br>
<P>
In order to re-use a set of saved patterns you must first make the serialized byte stream available in main memory (for example, by reading from a file). The management of this memory block is up to the application. You can use the <b>pcre2_serialize_get_number_of_codes()</b> function to find out how many compiled patterns are in the serialized data without actually decoding the patterns:
<pre>
  uint8_t *bytes = &lt;serialized data&gt;;
  int32_t number_of_codes = pcre2_serialize_get_number_of_codes(bytes);
</pre>
The <b>pcre2_serialize_decode()</b> function reads a byte stream and recreates the compiled patterns in new memory blocks, setting pointers to them in a vector. The first two arguments are a pointer to a suitable vector and its length, and the third argument points to a byte stream.
The final argument is a pointer to a general context, which can be used to specify custom memory management functions for the decoded patterns. If this argument is NULL, <b>malloc()</b> and <b>free()</b> are used. After deserialization, the byte stream is no longer needed and can be discarded.
<pre>
  pcre2_code *list_of_codes[2];
  uint8_t *bytes = &lt;serialized data&gt;;
  int32_t number_of_codes =
    pcre2_serialize_decode(list_of_codes, 2, bytes, NULL);
</pre>
If the vector is not large enough for all the patterns in the byte stream, it is filled with those that fit, and the remainder are ignored. The yield of the function is the number of decoded patterns, or one of the following negative error codes:
<pre>
  PCRE2_ERROR_BADDATA            second argument is zero or less
  PCRE2_ERROR_BADMAGIC           mismatch of id bytes in the data
  PCRE2_ERROR_BADMODE            mismatch of code unit size or PCRE2 version
  PCRE2_ERROR_BADSERIALIZEDDATA  other sanity check failure
  PCRE2_ERROR_MEMORY             memory allocation failed
  PCRE2_ERROR_NULL               first or third argument is NULL
</pre>
PCRE2_ERROR_BADMAGIC may mean that the data is corrupt, or that it was compiled on a system with different endianness.
</P>
<P>
Decoded patterns can be used for matching in the usual way, and must be freed by calling <b>pcre2_code_free()</b>. However, be aware that there is a potential race issue if you are using multiple patterns that were decoded from a single byte stream in a multithreaded application. A single copy of the character tables is used by all the decoded patterns and a reference count is used to arrange for its memory to be automatically freed when the last pattern is freed, but there is no locking on this reference count. Therefore, if you want to call <b>pcre2_code_free()</b> for these patterns in different threads, you must arrange your own locking, and ensure that <b>pcre2_code_free()</b> cannot be called by two threads at the same time.
</P> <P> If a pattern was processed by <b>pcre2_jit_compile()</b> before being serialized, the JIT data is discarded and so is no longer available after a save/restore cycle. You can, however, process a restored pattern with <b>pcre2_jit_compile()</b> if you wish. </P> <br><a name="SEC5" href="#TOC1">AUTHOR</a><br> <P> Philip Hazel <br> University Computing Service <br> Cambridge, England. <br> </P> <br><a name="SEC6" href="#TOC1">REVISION</a><br> <P> Last updated: 27 June 2018 <br> Copyright © 1997-2018 University of Cambridge. <br> <p> Return to the <a href="index.html">PCRE2 index page</a>. </p>
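<P>
The sections on saving and re-using patterns leave the file handling to the application. As a rough sketch of that part only, the following standard C helpers write a byte stream to a file in binary mode and read a whole file back into memory. The names <i>save_bytes</i> and <i>load_bytes</i> are hypothetical and are not part of PCRE2; the growth strategy and the error handling are assumptions made for illustration.
</P>

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not a PCRE2 function): write a byte stream to a
   file opened in binary mode. Returns 0 on success, -1 on failure. */
int save_bytes(const char *path, const uint8_t *bytes, size_t size)
{
    FILE *f = fopen(path, "wb");   /* "b" matters where text and binary differ */
    if (f == NULL) return -1;
    size_t written = fwrite(bytes, 1, size, f);
    int close_failed = (fclose(f) != 0);
    return (written == size && !close_failed) ? 0 : -1;
}

/* Hypothetical helper (not a PCRE2 function): read a whole file into a
   malloc()ed block. On success returns the block and stores its length
   in *size_out; on failure returns NULL. The caller frees the block
   with free(). */
uint8_t *load_bytes(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    uint8_t *buf = NULL;
    size_t used = 0, cap = 0;
    for (;;) {
        if (used == cap) {
            /* Grow the buffer: start at 4096 bytes, then double. */
            size_t newcap = cap ? cap * 2 : 4096;
            uint8_t *tmp = realloc(buf, newcap);
            if (tmp == NULL) { free(buf); fclose(f); return NULL; }
            buf = tmp;
            cap = newcap;
        }
        size_t n = fread(buf + used, 1, cap - used, f);
        used += n;
        if (n == 0) break;         /* end of file or read error */
    }
    if (ferror(f)) { free(buf); fclose(f); return NULL; }
    fclose(f);
    *size_out = used;
    return buf;
}
```

<P>
A block returned by <i>load_bytes</i> could be passed as the third argument of <b>pcre2_serialize_decode()</b> and then released with <b>free()</b>, since the byte stream is no longer needed after deserialization. The helpers do not validate the file contents, so they should be used only with trusted data, as described under SECURITY CONCERNS.
</P>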