NOTE: I'm not sure this is really the route I want to go here, but it
would need benchmarking to show if it's actually beneficial.
It would still be nicer to get something like this upstreamed instead.
trim down anything not used directly by tar-split.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
I intend to not make changes to this `archive/tar` that aren't from
upstream, or are not directly related to the usage by this project...
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
This is a port of commits adding RawHeader() to go-1.11 archive/tar.
In addition:
* simplify the rawBytes.Write() code in readHeader()
* ignore errors from rawBytes.Write(), as (at least for go-1.11)
it never returns an error, only panics (if the buffer grew too large)
Also, remove the internal/testenv from tar_tar.go to enable go test.
As working symlink detection is non-trivial on Windows, just skip
the test on that platform.
In addition to `go test`, I did some minimal manual testing, and
it seems this code creates tar-data.json.gz which is identical
to the one made by the old version.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit dd5e14a7511465d20c6e95bf54c9b8f999abbbf6 ensured that no data
could be read for header-only files regardless of what the Header.Size
said. We should document this fact in Reader.Read.
Updates #13647
Change-Id: I4df9a2892bc66b49e0279693d08454bf696cfa31
Reviewed-on: https://go-review.googlesource.com/17913
Reviewed-by: Russ Cox <rsc@golang.org>
Motivation:
* Previous implementation did not detect integer overflow when
parsing a base-256 encoded field.
* Previous implementation did not treat the integer as a two's
complement value as specified by GNU.
The relevant GNU specification says:
<<<
GNU format uses two's-complement base-256 notation to store values
that do not fit into standard ustar range.
>>>
Fixes#12435
Change-Id: I4639bcffac8d12e1cb040b76bd05c9d7bc6c23a8
Reviewed-on: https://go-review.googlesource.com/17424
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Motivation for change:
* Recursive logic is hard to follow, since it tends to apply
things in reverse. On the other hand, the tar formats tend to
describe meta headers as affecting the next entry.
* Recursion also applies changes in the wrong order. Two test
files are attached that use multiple headers. The previous Go
behavior differs from what GNU and BSD tar do.
Change-Id: Ic1557256fc1363c5cb26570e5d0b9f65a9e57341
Reviewed-on: https://go-review.googlesource.com/14624
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Motivations:
* Use of strconv.ParseInt does not properly treat integers as 64bit,
preventing this function from working properly on 32bit machines.
* Use of io.ReadFull does not properly detect truncated streams
when the file suddenly ends on a block boundary.
* The function blindly trusts user input for numEntries and allocates
memory accordingly.
* The function does not validate that numEntries is not negative,
allowing a malicious sparse file to cause a panic during make.
In general, this function was overly complicated for what it was
accomplishing and it was hard to reason that it was free from
bounds errors. Instead, it has been rewritten and relies on
bytes.Buffer.ReadString to do the main work. So long as invariants
about the number of '\n' in the buffer are maintained, it is much
easier to see why this approach is correct.
Change-Id: Ibb12c4126c26e0ea460ea063cd17af68e3cf609e
Reviewed-on: https://go-review.googlesource.com/15174
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Certain special type-flags, specifically 1, 2, 3, 4, 5, 6,
do not have a data section. Thus, regardless of what the size field
says, we should not attempt to read any data for these special types.
The relevant PAX and USTAR specification says:
<<<
If the typeflag field is set to specify a file to be of type 1 (a link)
or 2 (a symbolic link), the size field shall be specified as zero.
If the typeflag field is set to specify a file of type 5 (directory),
the size field shall be interpreted as described under the definition
of that record type. No data logical records are stored for types 1, 2, or 5.
If the typeflag field is set to 3 (character special file),
4 (block special file), or 6 (FIFO), the meaning of the size field is
unspecified by this volume of POSIX.1-2008, and no data logical records shall
be stored on the medium.
Additionally, for type 6, the size field shall be ignored when reading.
If the typeflag field is set to any other value, the number of logical
records written following the header shall be (size+511)/512, ignoring
any fraction in the result of the division.
>>>
Contrary to the specification, we do not assert that the size field
is zero for type 1 and 2 since we liberally accept non-conforming formats.
Change-Id: I666b601597cb9d7a50caa081813d90ca9cfc52ed
Reviewed-on: https://go-review.googlesource.com/16614
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Replaces PID in PaxHeaders with 0. Sorts PAX header keys before writing
them to the archive.
Fixes#12358
Change-Id: If239f89c85f1c9d9895a253fb06a47ad44960124
Reviewed-on: https://go-review.googlesource.com/13975
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <joetsai@digital-static.net>
Motivation:
* The logic to verify the numEntries can overflow and incorrectly
pass, allowing a malicious file to allocate arbitrary memory.
* The use of strconv.ParseInt does not set the integer precision
to 64bit, causing this code to work incorrectly on 32bit machines.
Change-Id: I1b1571a750a84f2dde97cc329ed04fe2342aaa60
Reviewed-on: https://go-review.googlesource.com/15173
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A recursive call to Reader.Next did not check the error before
trying to use the result, leading to a nil pointer panic.
This specific CL addresses the immediate issue, which is the panic,
but does not solve the root issue, which is due to an integer
overflow in the base-256 parser.
Updates #12435
Change-Id: Ia908671f0f411a409a35e24f2ebf740d46734072
Reviewed-on: https://go-review.googlesource.com/15437
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the stream is in an inconsistent state, it does not make sense
that Reader.Read can be called and possibly succeed.
Change-Id: I9d1c5a1300b2c2b45232188aa7999e350809dcf2
Reviewed-on: https://go-review.googlesource.com/15177
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
The sparseFileReader is prone to two different forms of
denial-of-service attacks:
* A malicious tar file can cause an infinite loop
* A malicious tar file can cause arbitrary panics
This results because of poor error checking/handling, which this
CL fixes. While we are at it, add a plethora of unit tests to
test for possible malicious inputs.
Change-Id: I2f9446539d189f3c1738a1608b0ad4859c1be929
Reviewed-on: https://go-review.googlesource.com/15115
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Check for errors when reading the headers following the pax headers.
Fixes#11169.
Change-Id: Ifec4a949ec8df8b49fa7cb7a67eb826fe2282ad8
Reviewed-on: https://go-review.googlesource.com/11031
Reviewed-by: Russ Cox <rsc@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
Sanity check the pax-header size field before using it.
Fixes#11167.
Change-Id: I9d5d0210c3990e6fb9434c3fe333be0d507d5962
Reviewed-on: https://go-review.googlesource.com/10954
Reviewed-by: David Symonds <dsymonds@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
Archives produced with GNU tar can have types of TypeGNULongName and
TypeGNULongLink.
These fields effectively appear like two file entries in the tar
archive. While golang's `archive/tar` transparently provide the file
name and headers and file payload, the access to the raw bytes is still
needed.
This fixes the access to the longlink header, it's payload (of the long
file path name), and the following file header and actual file payload.