c - Zlib minimum deflate size -

- May 15, 2013

I'm trying to figure out if there's a way to calculate the minimum required size for an output buffer, based on the size of the input buffer.

This question is similar to "zlib, deflate: How much memory to allocate?", but it is not the same. I am asking about each chunk in isolation, rather than the entire stream.

So suppose I have two buffers, an input and an output, and I have a BUFFER_SIZE of, say, 4096 bytes. (It's just a convenient number; there's no particular reason I chose this size.)

If I deflate using:

deflate(stream, Z_PARTIAL_FLUSH)

so that each chunk is compressed and flushed to the output buffer, is there a way I can guarantee I'll have enough storage in the output buffer without needing to reallocate?

Superficially, we'd assume that the deflated data will always be smaller than the uncompressed input data (assuming we use a compression level greater than 0).

Of course, that's not always the case, especially for small inputs. For example, if we deflate a single byte, the deflated data will obviously be larger than the uncompressed data, due to the overhead of things like headers and dictionaries in the LZW stream.

Thinking about how LZW works, it would seem that if our input data is at least 256 bytes (meaning that in the worst case every single byte is different and we can't compress anything), we should realize that an input size of less than 256 bytes plus the zlib headers could potentially require a larger output buffer.

But, generally, real-world applications aren't going to be compressing sizes that small. So assuming an input/output buffer of something more like 4K, is there some way to guarantee that the output compressed data will be smaller than the input data?

(Also, I know about deflateBound, but I would rather avoid it because of the overhead.)

Or, to put it another way, is there some minimum buffer size that I can use for the input/output buffers that will guarantee that the output data (the compressed stream) will be smaller than the input data? Or is there always some pathological case that can cause the output stream to be larger than the input stream, regardless of size?

Though I can't quite make heads or tails out of your question, I can comment on parts of the question in isolation.

is there some way to guarantee that the output compressed data will be smaller than the input data?

Absolutely not. It will always be possible for the compressed output to be larger than some input. Otherwise you wouldn't be able to compress other input.

(Also, I know about deflateBound, but I would rather avoid it because of the overhead.)

Overhead? Seriously? We're talking about a fraction of a percent larger than the input buffer for reasonable sizes.

By the way, deflateBound() provides a bound on the size of the entire output stream as a function of the size of the entire input stream. It can't help you when you are in the middle of a bunch of deflate() calls with incomplete input and insufficient output space. For example, you may still have deflate output pending that is delivered by the next deflate() call, without providing any new input at all. Then the expansion ratio is infinite for that isolated call.

due to the overhead of things like headers and dictionaries in the LZW stream.

Deflate is not LZW. The approach it uses is called LZ77. It is very different from LZW, which is now obsolete. There are no "dictionaries" stored in compressed deflate data. The "dictionary" is simply the uncompressed data that precedes the data currently being compressed or decompressed.
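To illustrate what "the dictionary is the preceding data" means, here is a toy LZ77-style decoder. The token layout is invented for this sketch and is not the real deflate bit format: a token is either a literal byte or an instruction to copy some number of bytes from a given distance back in the output already produced.

```c
#include <stddef.h>

/* One toy token: a literal byte when length == 0, otherwise "copy `length`
 * bytes starting `distance` bytes back in the output". Hypothetical format. */
typedef struct {
    unsigned distance;
    unsigned length;
    unsigned char literal;
} token_t;

/* Decodes tokens into out[0..cap) and returns the number of bytes written,
 * or 0 if a token is invalid or the output does not fit. */
static size_t lz77_decode(const token_t *tok, size_t ntok,
                          unsigned char *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < ntok; i++) {
        if (tok[i].length == 0) {
            if (n == cap)
                return 0;
            out[n++] = tok[i].literal;
        } else {
            if (tok[i].distance == 0 || tok[i].distance > n ||
                tok[i].length > cap - n)
                return 0;
            /* Copy byte by byte so a match may overlap the bytes it writes. */
            for (unsigned k = 0; k < tok[i].length; k++, n++)
                out[n] = out[n - tok[i].distance];
        }
    }
    return n;
}
```

Decoding the literals 'a', 'b', 'c' followed by a copy with distance 3 and length 6 yields "abcabcabc". Nothing resembling a stored dictionary is needed, only the output produced so far.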

Or, to put it another way, is there some minimum buffer size that I can use for the input/output buffers ...

The whole idea behind the zlib interface is for you to not have to worry about what will fit in the buffers. You just keep calling deflate() or inflate() with more input data and more output space until you're done, and all will be well. It does not matter if you need to make more than one call to consume one buffer of input, or more than one call to fill one buffer of output. You just have to have loops that make more calls, provide more input when needed, and dispose of the output when needed and provide fresh output space.
