4.2.1 LOCATE02 Database Format
------------------------------
'updatedb' runs a program called 'frcode' to "front-compress" the list
of file names, which reduces the database size by a factor of 4 to 5.
Front-compression (also known as incremental encoding) works as follows.
The database entries form a list that is sorted case-insensitively (for
users' convenience). Since the list is sorted, each entry is likely to
share a prefix (initial string) with the previous entry. Each database
entry begins with an offset-differential count byte: the length of the
prefix this entry shares with the preceding entry, minus the length of
the prefix that the preceding entry shares with its own predecessor.
(The count can therefore be negative.) Following the count is a
null-terminated ASCII remainder: the part of the name that follows the
shared prefix.
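As an illustration of the scheme just described, the following Python
sketch computes the offset-differential count and the remainder for
each name in a short sorted list. It is not taken from 'frcode'; the
function names and the sample paths are hypothetical, and the byte-level
encoding of the count is left out.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def differential_entries(names):
    """Return one (count, remainder) pair per name (hypothetical helper)."""
    entries = []
    prev = b""          # previous name in the list
    prev_shared = 0     # prefix length the previous entry reused
    for name in names:
        shared = common_prefix_len(prev, name)
        # The stored count is the change in shared-prefix length.
        entries.append((shared - prev_shared, name[shared:]))
        prev, prev_shared = name, shared
    return entries


# Sample input, assumed to be sorted already.
names = [b"/usr/bin/awk", b"/usr/bin/bash", b"/usr/lib/libc.so", b"/var/log"]
for count, remainder in differential_entries(names):
    print(count, remainder)
```

For the sample list the counts are 0, 9, -4 and -4: the second name
reuses nine characters of the first, the third reuses only five (four
fewer), and the last reuses one (four fewer again).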
If the offset-differential count falls outside the range that fits in
the count byte (-127 to +127), the byte has the value 0x80 and the count
follows in a 2-byte word, with the high byte first (network byte order).
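The two count encodings can be sketched in Python as below. This is a
minimal sketch rather than the 'frcode' implementation: the names
'encode_count' and 'decode_count' are hypothetical, and it assumes that
both the single byte and the 2-byte word hold signed two's-complement
values.

```python
import struct

ESCAPE = 0x80  # marker byte: a 2-byte count follows


def encode_count(diff: int) -> bytes:
    """Encode an offset-differential count (hypothetical helper)."""
    if -127 <= diff <= 127:
        # Fits in the count byte (assumed signed, two's complement).
        return struct.pack("b", diff)
    # Otherwise: escape byte, then a signed 16-bit word, high byte first.
    # struct.pack raises struct.error if diff does not fit in 16 bits.
    return bytes([ESCAPE]) + struct.pack(">h", diff)


def decode_count(buf: bytes, pos: int = 0):
    """Return (count, position of the first byte after the count)."""
    if buf[pos] == ESCAPE:
        (diff,) = struct.unpack_from(">h", buf, pos + 1)
        return diff, pos + 3
    (diff,) = struct.unpack_from("b", buf, pos)
    return diff, pos + 1
```

Under these assumptions, 'encode_count(9)' yields the single byte 0x09,
while 'encode_count(200)' yields the three bytes 0x80 0x00 0xC8.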
Every database begins with a dummy entry for a file called
'LOCATE02'. 'locate' checks for this entry to ensure that the database
file has the correct format, and ignores it when searching.
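A reader for this layout might look like the following Python sketch.
It is an illustration only, not the code used by 'locate'. The function
name 'read_locate02' is hypothetical, and the sketch assumes that counts
are signed two's-complement values, that the dummy entry is stored with
a count of zero, and that the shared-prefix length starts again from
zero after the dummy entry.

```python
import struct


def read_locate02(data: bytes):
    """Return the file names held in a LOCATE02-style byte string."""
    names = []
    pos = 0
    prev = b""      # previously decoded name
    shared = 0      # prefix length reused from prev
    first = True
    while pos < len(data):
        # Count byte, or 0x80 followed by a 2-byte count, high byte first.
        if data[pos] == 0x80:
            (diff,) = struct.unpack_from(">h", data, pos + 1)
            pos += 3
        else:
            (diff,) = struct.unpack_from("b", data, pos)
            pos += 1
        end = data.index(b"\0", pos)    # remainder is null-terminated
        shared += diff
        name = prev[:shared] + data[pos:end]
        pos = end + 1
        if first:
            # The dummy entry identifies the format; it is not a result.
            if name != b"LOCATE02":
                raise ValueError("not a LOCATE02 database")
            first = False
            prev, shared = b"", 0       # assumption: state restarts here
            continue
        names.append(name)
        prev = name
    if first:
        raise ValueError("empty database")
    return names


sample = (b"\x00LOCATE02\x00"
          b"\x00/usr/bin/awk\x00"
          b"\x09bash\x00"
          b"\xfclib/libc.so\x00"
          b"\xfcvar/log\x00")
print(read_locate02(sample))
```

The hand-built 'sample' decodes to '/usr/bin/awk', '/usr/bin/bash',
'/usr/lib/libc.so' and '/var/log'; input whose first entry is not
'LOCATE02' is rejected with 'ValueError'.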
Databases cannot be concatenated, even if the first (dummy) entry is
trimmed from all but the first database. This is because the
offset-differential count in the first entry of the second and each
following database would be applied relative to the last entry of the
database before it, and so would be wrong.
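The concatenation problem can be reproduced with a small Python sketch.
To keep it short, entries are modelled as (count, remainder) pairs
rather than raw bytes, and the dummy entries are taken as already
trimmed; the function name 'expand' and the sample data are
hypothetical.

```python
def expand(entries):
    """Rebuild full names from (count, remainder) pairs."""
    names = []
    prev = b""      # previously rebuilt name
    shared = 0      # running shared-prefix length
    for diff, remainder in entries:
        shared += diff
        prev = prev[:shared] + remainder
        names.append(prev)
    return names


# Two databases, each correct on its own.
db_a = [(0, b"/usr/bin/awk"), (9, b"bash")]
db_b = [(0, b"/var/log"), (5, b"mail")]

print(expand(db_a))         # names of the first database
print(expand(db_b))         # names of the second database
print(expand(db_a + db_b))  # naive concatenation
```

Decoded separately, 'db_b' gives '/var/log' and '/var/mail'. After
concatenation, the count of 0 on its first entry no longer means "no
shared prefix" but "the same nine characters as before", so the last
two names come out as '/usr/bin//var/log' and '/usr/bin//var/mail'.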
In the output of 'locate --statistics', the new database format is
referred to as 'LOCATE02'.