Alex Rousskov wrote:
> Thanks for the implementation and detailed description! I am
> glad the "real" numbers are at least as good as the simulated ones.
> Since your approach offers a trade-off (you trade RAM and CPU time for
> diff size), do you have any data illustrating the trade-off? For
> example, how much memory (at peak) your algorithm needs to encode the
> entire digest versus the portion of changed bits. Also, would that
> memory depend on the digest size (the number of changed bits)?
Memory requirements are at least the size of the binary tree used to
build the Huffman code. Beyond that there is a tradeoff between memory
and CPU usage: more memory means less CPU time.
Memory structures involved:
d1) The old digest
d2) The current digest
bd) Bitdistance encoding of the differences
df) Frequency histogram of the bitdistance encoding
ht) Binary tree built from the frequency histogram, used to calculate
the Huffman code
hc) The resulting Huffman code for the entries of "df"
out) The result
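To make the sizes below concrete, here is a rough sketch in C of how
these structures might be declared. All names and sizes here are
illustrative guesses, not taken from the actual prototype:

    /* Illustrative declarations only; names and sizes are guesses. */
    #define DIGEST_WORDS  (800 * 1024 / 4)  /* d1/d2 as 32-bit words */
    #define MAX_DISTANCE  (64 * 1024)       /* simulator histogram size */

    unsigned int d1[DIGEST_WORDS];  /* d1: the old digest */
    unsigned int d2[DIGEST_WORDS];  /* d2: the current digest */
    unsigned int df[MAX_DISTANCE];  /* df: one counter per distance */

    /* "bd" has no array of its own: it can be produced and consumed
     * on the fly, one distance at a time (see the sketch below). */

    struct ht_node {                /* ht: two 32-bit values = 8 bytes */
        unsigned int weight;        /* leaf frequency, or sum of children */
        int parent;                 /* index of the parent node, -1 = none */
    };
    struct ht_node ht[2 * MAX_DISTANCE];

    struct hc_entry {               /* hc: the code assigned to a distance */
        unsigned int bits;          /* code bits, right-aligned */
        unsigned int len;           /* code length in bits */
    };
    struct hc_entry hc[MAX_DISTANCE];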
Memory:
Both "d1" and "d2" needs to be read at least one time, twice if memory
is more important than CPU.
"bd" can be calculated on the fly one word at a time. The full "bd" data
can be quite large, even larger than a full digest.
"df" stores one 32bit counter for each unique bit change distance. In my
simulator it is a array of 64K words (128KB) but can in most cases be
much smaller, probably in the range of 4K-8K words. The less dense the
changes are the more memory is required.
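For illustration, a minimal sketch (using the declarations above) of
how the differences can be scanned one word at a time, feeding "df"
directly so the full "bd" data never has to be held in memory:

    /* Count the distance between consecutive changed bits in the
     * XOR of the two digests; each distance goes straight into the
     * histogram "df". */
    unsigned int distance = 0;
    unsigned int i, b;

    for (i = 0; i < DIGEST_WORDS; i++) {
        unsigned int delta = d1[i] ^ d2[i]; /* changed bits in this word */
        if (delta == 0) {
            distance += 32;                 /* whole word unchanged */
            continue;
        }
        for (b = 0; b < 32; b++) {
            if (delta & (1U << b)) {
                if (distance >= MAX_DISTANCE)     /* a real encoder would */
                    distance = MAX_DISTANCE - 1;  /* use an escape code here */
                df[distance]++;
                distance = 0;
            } else {
                distance++;
            }
        }
    }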
The size "ht" should be about 1.5*"df" entries. Size of each "ht" entry
is two 32bit integers or 8 bytes.
The size of "hc" is at least 32bits times number of "df" entries.
Theoretically it is possible to reuse the space of "df".
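A sketch of the tree build, mainly to show where the node count and
the 8-byte entries come from. A real implementation would use a heap
rather than the linear minimum scans shown here:

    /* Leaves are the nonzero entries of "df". Every merge adds one
     * internal node, so n leaves give 2n-1 nodes in total. */
    unsigned int leaves = 0, total, i;

    for (i = 0; i < MAX_DISTANCE; i++)
        if (df[i]) {
            ht[leaves].weight = df[i];
            ht[leaves].parent = -1;
            leaves++;
        }

    for (total = leaves; total < 2 * leaves - 1; total++) {
        int a = -1, b = -1;         /* the two lightest unparented nodes */
        for (i = 0; i < total; i++) {
            if (ht[i].parent != -1)
                continue;
            if (a < 0 || ht[i].weight < ht[a].weight) {
                b = a;
                a = i;
            } else if (b < 0 || ht[i].weight < ht[b].weight) {
                b = i;
            }
        }
        ht[total].weight = ht[a].weight + ht[b].weight;
        ht[total].parent = -1;
        ht[a].parent = total;
        ht[b].parent = total;
    }

    /* The code length for a leaf is its depth, i.e. the number of
     * parent links followed to reach the root; "hc" can then be
     * filled in from the lengths alone (canonical Huffman). */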
"out" is the output. It is written once. No back references, so there is
no need to lock it in memory.
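Purely as illustration, the single write pass could look like this:
codes are appended to a small bit buffer and completed bytes are
flushed immediately, so nothing already written is ever touched again.

    #include <stdio.h>

    static unsigned int bitbuf = 0; /* pending bits, right-aligned */
    static unsigned int nbits = 0;  /* number of pending bits */

    /* Append one Huffman code to the output stream. Assumes the code
     * lengths are short enough to fit the 32-bit buffer. */
    static void put_code(FILE *out, unsigned int bits, unsigned int len)
    {
        bitbuf = (bitbuf << len) | bits;
        nbits += len;
        while (nbits >= 8) {
            nbits -= 8;
            fputc((bitbuf >> nbits) & 0xff, out);
        }
    }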
As you see the "only" additional memory required compared to cd_diff is
the memory needed to compute and store the huffman code.
CPU:
The CPU usage on a typical digest diff of 800K digests using the
prototype is about 1 second on a Pentium 133. I expect it is possible
both to select the approach above that uses the least memory and to
cut the CPU time down to 0.5 seconds or less; the simulator/prototype
code is very inefficient.
The situation is similar for decompression.
/Henrik