XZ Utils 5.3.3alpha
file_info.c File Reference

Decode .xz file information into a lzma_index structure.

#include "index_decoder.h"

Data Structures

struct  lzma_file_info_coder
 

Functions

static bool fill_temp (lzma_file_info_coder *coder, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size)
 
static bool seek_to_pos (lzma_file_info_coder *coder, uint64_t target_pos, size_t in_start, size_t *in_pos, size_t in_size)
 
static lzma_ret reverse_seek (lzma_file_info_coder *coder, size_t in_start, size_t *in_pos, size_t in_size)
 
static size_t get_padding_size (const uint8_t *buf, size_t buf_size)
 Gets the number of zero-bytes at the end of the buffer.
 
static lzma_ret hide_format_error (lzma_ret ret)
 
static lzma_ret decode_index (lzma_file_info_coder *coder, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, bool update_file_cur_pos)
 
static lzma_ret file_info_decode (void *coder_ptr, const lzma_allocator *allocator, const uint8_t *restrict in, size_t *restrict in_pos, size_t in_size, uint8_t *restrict out lzma_attribute((__unused__)), size_t *restrict out_pos lzma_attribute((__unused__)), size_t out_size lzma_attribute((__unused__)), lzma_action action lzma_attribute((__unused__)))
 
static lzma_ret file_info_decoder_memconfig (void *coder_ptr, uint64_t *memusage, uint64_t *old_memlimit, uint64_t new_memlimit)
 
static void file_info_decoder_end (void *coder_ptr, const lzma_allocator *allocator)
 
static lzma_ret lzma_file_info_decoder_init (lzma_next_coder *next, const lzma_allocator *allocator, uint64_t *seek_pos, lzma_index **dest_index, uint64_t memlimit, uint64_t file_size)
 
lzma_ret lzma_file_info_decoder (lzma_stream *strm, lzma_index **dest_index, uint64_t memlimit, uint64_t file_size)
 Initialize a .xz file information decoder.
 

Detailed Description

Decode .xz file information into a lzma_index structure.

Function Documentation

◆ fill_temp()

static bool fill_temp ( lzma_file_info_coder *  coder,
const uint8_t *restrict  in,
size_t *restrict  in_pos,
size_t  in_size 
)
static

Copies data from in[*in_pos] into coder->temp until coder->temp_pos == coder->temp_size. This also keeps coder->file_cur_pos in sync with *in_pos. Returns true if more input is needed.

References lzma_file_info_coder::file_cur_pos, and lzma_bufcpy().
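
The buffering described above can be sketched roughly as follows. This is only an illustration, not the liblzma implementation: struct temp_reader and fill_temp_sketch() are hypothetical names, the 64-byte buffer size is an arbitrary assumption, and plain memcpy() stands in for the internal lzma_bufcpy() helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for lzma_file_info_coder. */
struct temp_reader {
	uint64_t file_cur_pos; /* absolute file position matching *in_pos */
	size_t temp_pos;       /* bytes already stored in temp */
	size_t temp_size;      /* bytes wanted in temp */
	uint8_t temp[64];      /* assumed size, for illustration only */
};

/* Copies as much as possible from in[*in_pos] into r->temp.
 * Returns true if temp is still not full, that is, more input is needed. */
bool
fill_temp_sketch(struct temp_reader *r, const uint8_t *in,
		size_t *in_pos, size_t in_size)
{
	assert(*in_pos <= in_size);
	assert(r->temp_pos <= r->temp_size && r->temp_size <= sizeof(r->temp));

	const size_t in_avail = in_size - *in_pos;
	const size_t wanted = r->temp_size - r->temp_pos;
	const size_t n = in_avail < wanted ? in_avail : wanted;

	if (n > 0)
		memcpy(r->temp + r->temp_pos, in + *in_pos, n);

	/* Advance all three counters together so they stay in sync. */
	r->temp_pos += n;
	*in_pos += n;
	r->file_cur_pos += n;

	return r->temp_pos < r->temp_size;
}
```

Calling the function repeatedly with successive input chunks fills the temporary buffer incrementally; the return value switches to false once temp_size bytes have been collected.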

◆ seek_to_pos()

static bool seek_to_pos ( lzma_file_info_coder *  coder,
uint64_t  target_pos,
size_t  in_start,
size_t *  in_pos,
size_t  in_size 
)
static

Seeks to the absolute file position specified by target_pos. This tries to do the seeking by only modifying *in_pos, if possible. The main benefit of this is that if one passes the whole file at once to lzma_code(), the decoder will never need to return LZMA_SEEK_NEEDED as all the seeking can be done by adjusting *in_pos in this function.

Returns true if an external seek is needed and the caller must return LZMA_SEEK_NEEDED.
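
As a rough sketch of the idea, the check below maps the bytes in[in_start] .. in[in_size] to absolute file positions and seeks inside the buffer whenever the target falls within that range. The struct seek_state type and the seek_to_pos_sketch() name are hypothetical; the sketch also assumes that *in_pos lies between in_start and in_size, and it is not meant to mirror the actual liblzma code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the coder state. */
struct seek_state {
	uint64_t file_cur_pos; /* absolute file position matching *in_pos */
	uint64_t seek_pos;     /* position the application is asked to seek to */
};

/* Returns true if the application has to seek the input file itself. */
bool
seek_to_pos_sketch(struct seek_state *s, uint64_t target_pos,
		size_t in_start, size_t *in_pos, size_t in_size)
{
	assert(in_start <= *in_pos && *in_pos <= in_size);

	/* Absolute file positions reachable inside the current buffer. */
	const uint64_t pos_min = s->file_cur_pos - (*in_pos - in_start);
	const uint64_t pos_max = s->file_cur_pos + (in_size - *in_pos);

	bool external_seek_needed;

	if (target_pos >= pos_min && target_pos <= pos_max) {
		/* Internal seek: only the buffer offset changes. */
		*in_pos = in_start + (size_t)(target_pos - pos_min);
		external_seek_needed = false;
	} else {
		/* External seek: record the target and mark the rest of
		 * the buffer as consumed. */
		s->seek_pos = target_pos;
		*in_pos = in_size;
		external_seek_needed = true;
	}

	s->file_cur_pos = target_pos;
	return external_seek_needed;
}
```

When the whole file is in the buffer, every valid target lies between pos_min and pos_max, so the external branch is never taken.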

◆ reverse_seek()

static lzma_ret reverse_seek ( lzma_file_info_coder *  coder,
size_t  in_start,
size_t *  in_pos,
size_t  in_size 
)
static

The caller sets coder->file_target_pos so that it points to the end of the desired file position. This function then determines how far backwards from that position we can seek. After seeking, fill_temp() can be used to read data into coder->temp. When fill_temp() has finished, coder->temp[coder->temp_size] will match coder->file_target_pos.

This also validates that coder->file_target_pos is sane in the sense that we aren't trying to seek too far backwards (too close to or beyond the beginning of the file).

References lzma_file_info_coder::file_target_pos, LZMA_DATA_ERROR, and LZMA_STREAM_HEADER_SIZE.
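
One way to picture the backward-seek calculation is the sketch below. It is an illustration under stated assumptions rather than the library's code: reverse_window() is a hypothetical helper, SKETCH_STREAM_HEADER_SIZE mirrors the 12-byte value of LZMA_STREAM_HEADER_SIZE, and the sketch assumes that the Stream Header at the very start of the file never needs to be read again, so the window stops short of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as LZMA_STREAM_HEADER_SIZE (12 bytes). */
#define SKETCH_STREAM_HEADER_SIZE 12

/* Computes the window [*window_start, file_target_pos) that would be read
 * into a temporary buffer of temp_capacity bytes. Returns false if the
 * target is too close to the beginning of the file to be valid. */
bool
reverse_window(uint64_t file_target_pos, size_t temp_capacity,
		uint64_t *window_start, size_t *window_size)
{
	assert(temp_capacity >= SKETCH_STREAM_HEADER_SIZE);

	/* At least a Stream Header and a Stream Footer must fit before
	 * the target position; otherwise the file cannot be valid. */
	if (file_target_pos < 2 * SKETCH_STREAM_HEADER_SIZE)
		return false;

	/* Assumption: never step back into the first Stream Header. */
	const uint64_t avail = file_target_pos - SKETCH_STREAM_HEADER_SIZE;

	*window_size = avail < temp_capacity ? (size_t)avail : temp_capacity;
	*window_start = file_target_pos - *window_size;
	return true;
}
```

A false return corresponds to the LZMA_DATA_ERROR case listed in the references above; on success the caller would seek to *window_start and then fill the temporary buffer with *window_size bytes.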

◆ get_padding_size()

static size_t get_padding_size ( const uint8_t *  buf,
size_t  buf_size 
)
static

Gets the number of zero-bytes at the end of the buffer.
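
A minimal sketch of such a trailing-zero count is shown below. The name get_padding_size_sketch() is hypothetical and the loop is one straightforward way to do it, not necessarily how liblzma writes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the zero bytes at the end of buf by scanning backwards until
 * the first non-zero byte or the start of the buffer. */
size_t
get_padding_size_sketch(const uint8_t *buf, size_t buf_size)
{
	assert(buf != NULL || buf_size == 0);

	size_t padding = 0;
	while (buf_size > 0 && buf[buf_size - 1] == 0x00) {
		--buf_size;
		++padding;
	}

	return padding;
}
```

For a buffer holding the bytes 01 02 00 00 the function returns 2, and for a buffer that ends in a non-zero byte it returns 0.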

◆ hide_format_error()

static lzma_ret hide_format_error ( lzma_ret  ret)
static

With the Stream Header at the very beginning of the file, LZMA_FORMAT_ERROR is used to tell the application that Magic Bytes didn't match. In other Stream Header/Footer fields (in the middle/end of the file) it could be a bit confusing to return LZMA_FORMAT_ERROR as we already know that there is a valid Stream Header at the beginning of the file. For those cases this function is used to convert LZMA_FORMAT_ERROR to LZMA_DATA_ERROR.

References LZMA_DATA_ERROR, and LZMA_FORMAT_ERROR.

◆ decode_index()

static lzma_ret decode_index ( lzma_file_info_coder *  coder,
const lzma_allocator *  allocator,
const uint8_t *restrict  in,
size_t *restrict  in_pos,
size_t  in_size,
bool  update_file_cur_pos 
)
static

Calls the Index decoder and updates coder->index_remaining. This is a separate function because the input can be either directly from the application or from coder->temp.

◆ lzma_file_info_decoder()

lzma_ret lzma_file_info_decoder ( lzma_stream *  strm,
lzma_index **  dest_index,
uint64_t  memlimit,
uint64_t  file_size 
)

Initialize a .xz file information decoder.

Parameters
strm        Pointer to a properly prepared lzma_stream
dest_index  Pointer to a pointer where the decoder will put the decoded lzma_index. The old value of *dest_index is ignored (not freed).
memlimit    How much memory the resulting lzma_index is allowed to require. Use UINT64_MAX to effectively disable the limiter.
file_size   Size of the input .xz file

This decoder decodes the Stream Header, Stream Footer, Index, and Stream Padding field(s) from the input .xz file and stores the resulting combined index in *dest_index. This information can be used to get the uncompressed file size with lzma_index_uncompressed_size(*dest_index) or, for example, to implement random access reading by locating the Blocks in the Streams.

To get the required information from the .xz file, lzma_code() may ask the application to seek in the input file by returning LZMA_SEEK_NEEDED and having the target file position specified in lzma_stream.seek_pos. The number of seeks required depends on the input file and on the size of the buffers the application provides. When possible, the decoder will seek backward and forward in the given buffer to avoid useless seek requests. Thus, if the application provides the whole file at once, no external seeking will be required (that is, lzma_code() won't return LZMA_SEEK_NEEDED).
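
On the application side, reacting to LZMA_SEEK_NEEDED could look roughly like the sketch below. It uses only standard C I/O and does not call liblzma; refill_after_seek() is a hypothetical helper, and the sketch assumes the input is a seekable FILE whose size fits in a long.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Repositions f at the absolute offset seek_pos and reads up to buf_size
 * fresh bytes into buf. Returns the number of bytes read, or SIZE_MAX if
 * the seek failed. */
size_t
refill_after_seek(FILE *f, uint64_t seek_pos, uint8_t *buf, size_t buf_size)
{
	assert(f != NULL && buf != NULL);

	/* fseek() takes a long, so reject offsets it cannot represent. */
	if (seek_pos > (uint64_t)LONG_MAX
			|| fseek(f, (long)seek_pos, SEEK_SET) != 0)
		return SIZE_MAX;

	return fread(buf, 1, buf_size, f);
}
```

After such a call the application would discard any input left over from before the seek, point lzma_stream.next_in at buf, set lzma_stream.avail_in to the returned byte count, and call lzma_code() again.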

The value in lzma_stream.total_in can be used to estimate how much data liblzma had to read to get the file information. However, due to seeking and the way total_in is updated, the value of total_in will be somewhat inaccurate (a little too big). Thus, total_in is a good estimate but don't expect to see the same exact value for the same file if you change the input buffer size or switch to a different liblzma version.

Valid ‘action’ arguments to lzma_code() are LZMA_RUN and LZMA_FINISH. You only need to use LZMA_RUN; LZMA_FINISH is only supported because it might be convenient for some applications. If you use LZMA_FINISH and if lzma_code() asks the application to seek, remember to reset ‘action’ back to LZMA_RUN unless you hit the end of the file again.

Possible return values from lzma_code():

  • LZMA_OK: All OK so far, more input needed
  • LZMA_SEEK_NEEDED: Provide more input starting from the absolute file position strm->seek_pos
  • LZMA_STREAM_END: Decoding was successful, *dest_index has been set
  • LZMA_FORMAT_ERROR: The input file is not in the .xz format (the expected magic bytes were not found from the beginning of the file)
  • LZMA_OPTIONS_ERROR: File looks valid but contains headers that aren't supported by this version of liblzma
  • LZMA_DATA_ERROR: File is corrupt
  • LZMA_BUF_ERROR
  • LZMA_MEM_ERROR
  • LZMA_MEMLIMIT_ERROR
  • LZMA_PROG_ERROR
Returns
  • LZMA_OK
  • LZMA_MEM_ERROR
  • LZMA_PROG_ERROR

References lzma_next_strm_init.