1
0
Fork 1
mirror of https://github.com/vbatts/tar-split.git synced 2025-01-09 21:37:08 +00:00
Commit graph

169 commits

Author SHA1 Message Date
862ccd05bc Merge pull request #31 from vbatts/tar-go1.6
Tar go1.6
2016-02-15 09:41:56 -05:00
c32966b9e8 archive/tar: go1.3 and go1.4 compatibility
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-02-15 09:38:46 -05:00
Joe Tsai
10db8408f6 archive/tar: document how Reader.Read handles header-only files
Commit dd5e14a7511465d20c6e95bf54c9b8f999abbbf6 ensured that no data
could be read for header-only files regardless of what the Header.Size
said. We should document this fact in Reader.Read.

Updates #13647

Change-Id: I4df9a2892bc66b49e0279693d08454bf696cfa31
Reviewed-on: https://go-review.googlesource.com/17913
Reviewed-by: Russ Cox <rsc@golang.org>
2016-02-03 07:01:09 -05:00
Joe Tsai
962540fec3 archive/tar: spell license correctly in example
Change-Id: Ice85d161f026a991953bd63ecc6ec80f8d06dfbd
Reviewed-on: https://go-review.googlesource.com/17901
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-03 07:01:09 -05:00
Joe Tsai
a04b4ddba4 archive/tar: properly parse GNU base-256 encoding
Motivation:
* Previous implementation did not detect integer overflow when
parsing a base-256 encoded field.
* Previous implementation did not treat the integer as a two's
complement value as specified by GNU.

The relevant GNU specification says:
<<<
GNU format uses two's-complement base-256 notation to store values
that do not fit into standard ustar range.
>>>

Fixes #12435

Change-Id: I4639bcffac8d12e1cb040b76bd05c9d7bc6c23a8
Reviewed-on: https://go-review.googlesource.com/17424
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-03 07:01:09 -05:00
Joe Tsai
ce5aac17f9 archive/tar: properly format GNU base-256 encoding
Motivation:
* Previous implementation silently failed when an integer overflow
occurred. Now, we report an ErrFieldTooLong.
* Previous implementation did not encode in two's complement format and was
unable to encode negative numbers.

The relevant GNU specification says:
<<<
GNU format uses two's-complement base-256 notation to store values
that do not fit into standard ustar range.
>>>

Fixes #12436

Change-Id: I09c20602eabf8ae3a7e0db35b79440a64bfaf807
Reviewed-on: https://go-review.googlesource.com/17425
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-03 06:58:30 -05:00
Joe Tsai
be9ac88117 archive/tar: convert Reader.Next to be loop based
Motivation for change:
* Recursive logic is hard to follow, since it tends to apply
things in reverse. On the other hand, the tar formats tend to
describe meta headers as affecting the next entry.
* Recursion also applies changes in the wrong order. Two test
files are attached that use multiple headers. The previous Go
behavior differs from what GNU and BSD tar do.

Change-Id: Ic1557256fc1363c5cb26570e5d0b9f65a9e57341
Reviewed-on: https://go-review.googlesource.com/14624
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-03 06:58:30 -05:00
Joe Tsai
64935a5f0f archive/tar: move parse/format methods to standalone receiver
Motivations for this change:
* It allows these functions to be used outside of Reader/Writer.
* It allows these functions to be more easily unit tested.

Change-Id: Iebe2b70bdb8744371c9ffa87c24316cbbf025b59
Reviewed-on: https://go-review.googlesource.com/15113
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-02 14:32:27 -05:00
Joe Tsai
b598ba3ee7 archive/tar: fix issues with readGNUSparseMap1x0
Motivations:
* Use of strconv.ParseInt does not properly treat integers as 64bit,
preventing this function from working properly on 32bit machines.
* Use of io.ReadFull does not properly detect truncated streams
when the file suddenly ends on a block boundary.
* The function blindly trusts user input for numEntries and allocates
memory accordingly.
* The function does not validate that numEntries is not negative,
allowing a malicious sparse file to cause a panic during make.

In general, this function was overly complicated for what it was
accomplishing and it was hard to reason that it was free from
bounds errors. Instead, it has been rewritten and relies on
bytes.Buffer.ReadString to do the main work. So long as invariants
about the number of '\n' in the buffer are maintained, it is much
easier to see why this approach is correct.

Change-Id: Ibb12c4126c26e0ea460ea063cd17af68e3cf609e
Reviewed-on: https://go-review.googlesource.com/15174
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:17:35 -05:00
Joe Tsai
7500c932c7 archive/tar: properly handle header-only "files" in Reader
Certain special type-flags, specifically 1, 2, 3, 4, 5, 6,
do not have a data section. Thus, regardless of what the size field
says, we should not attempt to read any data for these special types.

The relevant PAX and USTAR specification says:
<<<
If the typeflag field is set to specify a file to be of type 1 (a link)
or 2 (a symbolic link), the size field shall be specified as zero.
If the typeflag field is set to specify a file of type 5 (directory),
the size field shall be interpreted as described under the definition
of that record type. No data logical records are stored for types 1, 2, or 5.
If the typeflag field is set to 3 (character special file),
4 (block special file), or 6 (FIFO), the meaning of the size field is
unspecified by this volume of POSIX.1-2008, and no data logical records shall
be stored on the medium.
Additionally, for type 6, the size field shall be ignored when reading.
If the typeflag field is set to any other value, the number of logical
records written following the header shall be (size+511)/512, ignoring
any fraction in the result of the division.
>>>

Contrary to the specification, we do not assert that the size field
is zero for type 1 and 2 since we liberally accept non-conforming formats.

Change-Id: I666b601597cb9d7a50caa081813d90ca9cfc52ed
Reviewed-on: https://go-review.googlesource.com/16614
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:10:38 -05:00
Matt Layher
2424f4e367 archive/tar: make output deterministic
Replaces PID in PaxHeaders with 0.  Sorts PAX header keys before writing
them to the archive.

Fixes #12358

Change-Id: If239f89c85f1c9d9895a253fb06a47ad44960124
Reviewed-on: https://go-review.googlesource.com/13975
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <joetsai@digital-static.net>
2016-02-02 14:10:11 -05:00
Joe Tsai
bffda594f7 archive/tar: detect truncated files
Motivation:
* Reader.skipUnread never reports io.ErrUnexpectedEOF. This is strange
given that io.ErrUnexpectedEOF is given through Reader.Read if the
user manually reads the file.
* Reader.skipUnread fails to detect truncated files since io.Seeker
is lazy about reporting errors. Thus, the behavior of Reader differs
whether the input io.Reader also satisfies io.Seeker or not.

To solve this, we seek to one before the end of the data section and
always rely on at least one call to io.CopyN. If the tr.r satisfies
io.Seeker, this is guarunteed to never read more than blockSize.

Fixes #12557

Change-Id: I0ddddfc6bed0d74465cb7e7a02b26f1de7a7a279
Reviewed-on: https://go-review.googlesource.com/15175
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:09:30 -05:00
Joe Tsai
cf83c95de8 archive/tar: fix numeric overflow issues in readGNUSparseMap0x1
Motivation:
* The logic to verify the numEntries can overflow and incorrectly
pass, allowing a malicious file to allocate arbitrary memory.
* The use of strconv.ParseInt does not set the integer precision
to 64bit, causing this code to work incorrectly on 32bit machines.

Change-Id: I1b1571a750a84f2dde97cc329ed04fe2342aaa60
Reviewed-on: https://go-review.googlesource.com/15173
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:09:04 -05:00
Joe Tsai
cb423795eb archive/tar: add missing error checks to Reader.Next
A recursive call to Reader.Next did not check the error before
trying to use the result, leading to a nil pointer panic.
This specific CL addresses the immediate issue, which is the panic,
but does not solve the root issue, which is due to an integer
overflow in the base-256 parser.

Updates #12435

Change-Id: Ia908671f0f411a409a35e24f2ebf740d46734072
Reviewed-on: https://go-review.googlesource.com/15437
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:08:38 -05:00
Joe Tsai
4ad443d166 archive/tar: expand abilities of TestReader
Motivation:
* There are an increasing number of "one-off" corrupt files added
to make sure that package does not succeed or crash on them.
Instead, allow for the test to specify the error that is expected
to occur (if any).
* Also, fold in the logic to check the MD5 checksum into this
function.

The following tests are being removed:
* TestIncrementalRead: Done by TestReader by using io.CopyBuffer
with a buffer of 8. This achieves the same behavior as this test.
* TestSparseEndToEnd: Since TestReader checks the MD5 checksums
if the input corpus provides them, then this is redundant.
* TestSparseIncrementalRead: Redundant for the same reasons that
TestIncrementalRead is now redundant
* TestNegativeHdrSize: Added to TestReader corpus
* TestIssue10968: Added to TestReader corpus
* TestIssue11169: Added to TestReader corpus

With this change, code coverage did not change: 85.3%

Change-Id: I8550d48657d4dbb8f47dfc3dc280758ef73b47ec
Reviewed-on: https://go-review.googlesource.com/15176
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-02-02 14:06:30 -05:00
Joe Tsai
f0fc67b3a8 archive/tar: make Reader.Read errors persistent
If the stream is in an inconsistent state, it does not make sense
that Reader.Read can be called and possibly succeed.

Change-Id: I9d1c5a1300b2c2b45232188aa7999e350809dcf2
Reviewed-on: https://go-review.googlesource.com/15177
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-02 14:06:30 -05:00
Joe Tsai
af15385a0d archive/tar: fix bugs with sparseFileReader
The sparseFileReader is prone to two different forms of
denial-of-service attacks:
* A malicious tar file can cause an infinite loop
* A malicious tar file can cause arbitrary panics

This results because of poor error checking/handling, which this
CL fixes. While we are at it, add a plethora of unit tests to
test for possible malicious inputs.

Change-Id: I2f9446539d189f3c1738a1608b0ad4859c1be929
Reviewed-on: https://go-review.googlesource.com/15115
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:06:30 -05:00
Joe Tsai
440ba9e519 archive/tar: remove dead code with USTAR path splitting
Convert splitUSTARPath to return a bool rather than an error since
the caller never ever uses the error other than to check if it is
nil. Thus, we can remove errNameTooLong as well.

Also, fold the checking of the length <= fileNameSize and whether
the string is ASCII into the split function itself.

Lastly, remove logic to set the MAGIC since that's already done on
L200. Thus, setting the magic is redundant.

There is no overall logic change.

Updates #12638

Change-Id: I26b6992578199abad723c2a2af7f4fc078af9c17
Reviewed-on: https://go-review.googlesource.com/14723
Reviewed-by: David Symonds <dsymonds@golang.org>
Run-TryBot: David Symonds <dsymonds@golang.org>
2016-02-02 14:06:30 -05:00
b87f81631a version: mark 0.9.12 2016-01-31 01:39:10 -05:00
d50e5c9283 LICENSE: update LICENSE to BSD 3-clause
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-03 15:45:57 -05:00
0de4e9db0c Merge pull request #27 from vbatts/bench_asm
tar/asm: basic benchmark on disasm/asm of testdata
2015-12-02 14:09:21 -06:00
1501fe6002 Merge pull request #22 from tonistiigi/stream-opt
Optimize tar stream generation
2015-12-02 14:09:08 -06:00
19b7e22058 tar/asm: basic benchmark on disasm/asm of testdata
```
PASS
BenchmarkAsm-4         5         238968475 ns/op        66841059 B/op       2449 allocs/op
ok      _/home/vbatts/src/vb/tar-split/tar/asm  2.267s
```

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-02 14:36:02 -05:00
026e78012b Merge pull request #26 from vbatts/better_discard_in_test
tar/asm: remove unneeded Tee
2015-12-02 12:00:26 -06:00
2efe34695a tar/asm: remove unneeded Tee
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-02 12:56:52 -05:00
Tonis Tiigi
23b6435e6b Optimize tar stream generation
- New writeTo method allows to avoid creating extra pipe.
- Copy with a pooled buffer instead of allocating new buffer for each file.
- Avoid extra object allocations inside the loop.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2015-12-01 14:08:53 -08:00
93666d5824 Merge pull request #25 from vbatts/bench
tar/storage: adding Getter Putter benchmark
2015-12-01 14:37:10 -06:00
11281e8c09 tar/storage: adding Getter Putter benchmark
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-01 15:31:48 -05:00
fc1e47e71d Merge pull request #24 from vbatts/drop_go1.2
travis: drop go1.2
2015-12-01 14:31:13 -06:00
d80c6b3bb1 travis: drop go1.2
seems overly reasonable to support go1.3 and greater. :-)

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-01 15:26:30 -05:00
Tonis Tiigi
8b20f9161d Optimize JSON decoding
This allows to avoid extra allocations on `ReadBytes` and
decoding buffers.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2015-11-30 09:52:44 -08:00
bece0c7009 demo: docker layer checksums
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-10-16 17:05:18 -04:00
7ea74e1c31 demo: basic command
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-10-16 16:41:09 -04:00
c955161e57 Merge pull request #20 from vbatts/unicode-utf8
remove common in favor of stdlib `unicode/utf8`
2015-09-25 14:38:49 -04:00
10250c25e0 tar/asm: remove useless test
The iso-8859-1 archive is already tested round trip, and this test did
not do anything really.
2015-09-25 14:35:12 -04:00
7e38cefd4b common: remove in favor of stdlib unicode/utf8 2015-09-25 14:33:24 -04:00
7ef16e6f67 Merge pull request #19 from LK4D4/go_143
Update travis to go1.4.3
2015-09-24 16:02:39 -04:00
Alexander Morozov
27876e49c2 Update travis to go1.4.3
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-09-24 12:24:31 -07:00
8a361ef0d8 tar/storage: Sprintf is unnecessary
fmt.Sprintf() vs string() for this []byte conversion is too much and
does not provide any further safety.

https://gist.github.com/vbatts/ab17181086aed558dd3a
2015-09-24 09:51:58 -04:00
7f56c08c48 Merge pull request #18 from vbatts/iso-8859-1
Iso 8859 1
2015-09-23 15:47:23 -04:00
cde639172f tar/asm: work with non-utf8 entry names 2015-09-23 15:27:33 -04:00
032efafc29 tar/storage: work with raw (invalid utf8) names
When the entry name is not UTF-8, for example ISO-8859-1, then store the
raw bytes.
To accommodate this, we will have getters and setters for the entry's
name now. Since this most heavily affects the json marshalling, we'll
double check the sanity of the name before storing it in the JSONPacker.
2015-09-23 15:27:33 -04:00
39d06b9dc4 tar/common: get index of first invalid utf-8 char 2015-09-23 15:27:15 -04:00
2865353200 common: add a UTF-8 check helper 2015-09-23 15:27:13 -04:00
7384cf1827 Merge pull request #16 from LK4D4/go_15
Add go 1.5.1 to CI
2015-09-11 12:50:40 -04:00
Alexander Morozov
1148e7ee3b Add go 1.5.1 to CI
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-09-11 08:49:52 -07:00
414a687f83 README: usage 2015-09-03 15:01:25 -04:00
b4d27b5426 Merge pull request #15 from vbatts/go1.5
Go1.5 updates for archive/tar
2015-08-20 21:25:41 -07:00
4d4b53c78b archive/tar: don't treat multiple file system links as a tar hardlink
Do not assume that if stat shows multiple links that we should mark the
file as a hardlink in the tar format.  If the hardlink link was not
referenced, this caused a link to "/".  On an overlay file system, all
files have multiple links.

The caller must keep the inode references and set TypeLink, Size = 0,
and LinkName themselves.

Change-Id: I873b8a235bc8f8fbb271db74ee54232da36ca013
Reviewed-on: https://go-review.googlesource.com/13045
Reviewed-by: Ian Lance Taylor <iant@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Alex Brainman
3b34dbd368 archive/tar: move round-trip reading into common os file
Fixes #11426

Change-Id: I77368b0e852149ed4533e139cc43887508ac7f78
Reviewed-on: https://go-review.googlesource.com/11662
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00