Commit graph

41 commits

Author SHA1 Message Date
James Bowes
9ab0b0d2aa Add a note about not including the parent node into the huffman coding. 2012-09-05 12:40:18 -03:00
James Bowes
48f963e143 add diagram of on-disk format 2012-08-28 13:00:29 -03:00
James Bowes
e756d673db update algorithm 2012-08-22 11:50:53 -03:00
James Bowes
a92e9f237a formatting fixup 2012-08-22 11:38:33 -03:00
James Bowes
17397ca0c2 Add start of algorithm file 2012-08-22 11:37:32 -03:00
James Bowes
cab64fd70b remove ref to algorithm.md, since I didn't write it 2012-08-13 14:37:32 -03:00
James Bowes
8d42cced12 Add doc describing format 2012-08-13 14:33:29 -03:00
James Bowes
b733d69aa4 Update README 2012-08-13 13:00:50 -03:00
James Bowes
6c00822523 Remove extra debug spew from thing.rb 2012-08-12 09:35:26 -03:00
James Bowes
e2f492f120 Add command line modes to unpack
Add modes to print stats, dump the content sets, and check a path to see
if it matches a content set.
2012-08-11 15:02:00 -03:00
James Bowes
16345dbad2 Decoding working for C 2012-08-11 14:16:29 -03:00
James Bowes
11fd9f1f4a Add huffman decoding for C 2012-08-09 17:51:05 -03:00
James Bowes
4b82b83e02 make thing.rb executable 2012-08-08 09:45:31 -03:00
James Bowes
62fff46d90 update readme for markdown header format 2012-08-08 09:27:36 -03:00
65bfba3f62 making the ruby unpacker have the same outcome as unpack.c
Unfortunately the ruby Zlib::ZStream internals are not really accessible
like the C functions
2012-08-07 09:49:21 -04:00
9ca686aa6f adding a 'p' option, to see the parent tree format 2012-08-06 17:40:57 -04:00
34514563b0 derp 2012-08-06 17:25:05 -04:00
e994597d42 adding a #to_h method for the Node object 2012-08-06 17:21:44 -04:00
0d71eb9e15 separating output for verbosity 2012-08-06 17:04:30 -04:00
d5e899f804 get_child feels like java 2012-08-06 16:47:54 -04:00
96063631d8 correcting doc 2012-08-06 16:47:47 -04:00
9cebf811bc adding a ruby unpack'er 2012-08-06 16:31:50 -04:00
b5fd3c6008 stylistic tweaks 2012-08-06 14:51:56 -04:00
168d256fea adding a README 2012-08-06 14:49:56 -04:00
3e9789880d don't let the lookup return nil 2012-08-06 14:37:28 -04:00
28b6092ea3 adding logging to track where this nil is coming from 2012-08-06 14:32:17 -04:00
James Bowes
f4777de387 Merge branch 'vbatts/master'
Conflicts:
	thing.rb
2012-08-01 06:53:38 -03:00
James Bowes
6036500b74 Add c and d subcommands 2012-08-01 06:48:24 -03:00
50aeed0b53 show the size of the *.bin written 2012-07-31 13:47:01 -04:00
Vincent Batts
99eccf44c5 more Makefile tweaks 2012-07-30 12:13:33 -04:00
Vincent Batts
6b3bd894d6 Makefile cleanup 2012-07-30 12:09:02 -04:00
James Bowes
227e8de979 Fix bug in duplicate detection.
Each node is written to disk as a list of (path, node pointer) pairs.
The duplicate detection code was considering the node's children and the
node's name. If we only look at the children, we can find many more
duplicates.

Previous duplicate detection went from 424 nodes to 127. New duplicate
detection reduces to 48 nodes.

With this better duplicate detection, the prefix compression doesn't
appear to be useful anymore. Comment it out.

Trims an extra 40 bytes off my sample data.
2012-07-28 12:46:03 -03:00
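
The children-only duplicate detection described in commit 227e8de979 can be illustrated with a minimal sketch. This is not the code from thing.rb: the `Node` class, `dedupe`, and `count_nodes` are hypothetical names, and the sketch assumes a path trie in which each node is just a list of (path segment, child node) pairs. Two nodes are merged when those pairs match, regardless of the name their parent uses to reach them.

```ruby
# Hypothetical path trie node: maps a path segment to a child node.
class Node
  attr_reader :children

  def initialize
    @children = {}
  end

  # Insert a path such as "/foo/6/os" one segment at a time.
  def insert(path)
    node = self
    path.split("/").reject(&:empty?).each do |segment|
      node = (node.children[segment] ||= Node.new)
    end
    self
  end
end

# Merge identical subtrees bottom-up. The key is built only from the node's
# (segment, child) pairs; the node's own name is deliberately left out.
def dedupe(node, seen = {})
  node.children.keys.each do |segment|
    node.children[segment] = dedupe(node.children[segment], seen)
  end
  key = node.children.sort_by { |segment, _| segment }
            .map { |segment, child| [segment, child.object_id] }
  seen[key] ||= node
end

# Count distinct node objects reachable from the given node.
def count_nodes(node, visited = {})
  return 0 if visited[node.object_id]
  visited[node.object_id] = true
  1 + node.children.values.sum { |child| count_nodes(child, visited) }
end

tree = Node.new
%w[/foo/6/os /foo/7/os /bar/6/os /bar/7/os].each { |path| tree.insert(path) }
puts count_nodes(tree)  # prints 11
dedupe(tree)
puts count_nodes(tree)  # prints 4
```

In this toy tree the "6" and "7" nodes collapse into one because both hold only an "os" leaf, which would not happen if the node's name were part of the key. The node counts quoted in the commit message (424, 127, 48) come from the author's sample data and are not reproduced here.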
James Bowes
a8a7fd57f6 Add start of C based decoder 2012-07-28 10:51:53 -03:00
James Bowes
427caabb1b add huffman implementation 2012-07-27 16:42:02 -03:00
James Bowes
7742eeb024 POC 2012-07-27 16:41:44 -03:00
James Bowes
abfdbebe28 checkpoint 2012-07-27 14:47:20 -03:00
James Bowes
ddf7d89408 temp 2012-07-26 17:04:52 -03:00
James Bowes
a5b7fd02ac poc with de-duped full nodes 2012-07-26 16:38:10 -03:00
James Bowes
606b0ea5e6 class based 2012-07-26 14:21:16 -03:00
James Bowes
4e0f638cd2 print out original stored value 2012-07-26 13:56:45 -03:00
James Bowes
afb59bf7fa init 2012-07-26 13:18:58 -03:00