1
0
Fork 1
mirror of https://github.com/vbatts/tar-split.git synced 2024-11-18 14:28:38 +00:00
Commit graph

39 commits

Author SHA1 Message Date
Joe Tsai
2c3c708698 archive/tar: centralize all information about tar header format
The Reader and Writer have hard-coded constants regarding the
offsets and lengths of certain fields in the tar format sprinkled
all over. This makes it harder to verify that the offsets are
correct since a reviewer would need to search for them throughout
the code. Instead, all information about the layout of header
fields should be centralized in one single file. This has the
advantage of being both centralized, and also acting as a form
of documentation about the header struct format.

This method was chosen over using "encoding/binary" since that
method would cause an allocation of a header struct every time
binary.Read was called. This method causes zero allocations and
its logic is no longer than if structs were declared.

Updates #12594

Change-Id: Ic7a0565d2a2cd95d955547ace3b6dea2b57fab34
Reviewed-on: https://go-review.googlesource.com/14669
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-09-23 10:51:29 -04:00
Matthew Dempsky
5753de3a3b archive/tar: style nit: s/nano_buf/nanoBuf/
Pointed out during review of golang.org/cl/22104.

Change-Id: If8842e7f8146441e918ec6a2b6e893b7cf88615c
Reviewed-on: https://go-review.googlesource.com/22120
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-09-23 10:50:19 -04:00
Matthew Dempsky
c9d534b830 all: remove unnecessary type conversions
cmd and runtime were handled separately, and I'm intentionally skipped
syscall. This is the rest of the standard library.

CL generated mechanically with github.com/mdempsky/unconvert.

Change-Id: I9e0eff886974dedc37adb93f602064b83e469122
Reviewed-on: https://go-review.googlesource.com/22104
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-09-23 10:50:19 -04:00
Brad Fitzpatrick
67735b8612 all: use new io.SeekFoo constants instead of os.SEEK_FOO
Automated change.

Fixes #15269

Change-Id: I8deb2ac0101d3f7c390467ceb0a1561b72edbb2f
Reviewed-on: https://go-review.googlesource.com/21962
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-09-23 10:50:19 -04:00
Brad Fitzpatrick
e16c946914 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-09-23 10:50:19 -04:00
Derek McGowan
e527e70d25 Fix panic in Next
readHeader should never return nil with a tr.err also nil.
To correct this, ensure tr.err never gets reset to nil followed
by a nil return.
2016-09-22 17:38:18 -07:00
c32966b9e8 archive/tar: go1.3 and go1.4 compatibility
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-02-15 09:38:46 -05:00
Joe Tsai
10db8408f6 archive/tar: document how Reader.Read handles header-only files
Commit dd5e14a7511465d20c6e95bf54c9b8f999abbbf6 ensured that no data
could be read for header-only files regardless of what the Header.Size
said. We should document this fact in Reader.Read.

Updates #13647

Change-Id: I4df9a2892bc66b49e0279693d08454bf696cfa31
Reviewed-on: https://go-review.googlesource.com/17913
Reviewed-by: Russ Cox <rsc@golang.org>
2016-02-03 07:01:09 -05:00
Joe Tsai
962540fec3 archive/tar: spell license correctly in example
Change-Id: Ice85d161f026a991953bd63ecc6ec80f8d06dfbd
Reviewed-on: https://go-review.googlesource.com/17901
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-03 07:01:09 -05:00
Joe Tsai
a04b4ddba4 archive/tar: properly parse GNU base-256 encoding
Motivation:
* Previous implementation did not detect integer overflow when
parsing a base-256 encoded field.
* Previous implementation did not treat the integer as a two's
complement value as specified by GNU.

The relevant GNU specification says:
<<<
GNU format uses two's-complement base-256 notation to store values
that do not fit into standard ustar range.
>>>

Fixes #12435

Change-Id: I4639bcffac8d12e1cb040b76bd05c9d7bc6c23a8
Reviewed-on: https://go-review.googlesource.com/17424
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-03 07:01:09 -05:00
Joe Tsai
ce5aac17f9 archive/tar: properly format GNU base-256 encoding
Motivation:
* Previous implementation silently failed when an integer overflow
occurred. Now, we report an ErrFieldTooLong.
* Previous implementation did not encode in two's complement format and was
unable to encode negative numbers.

The relevant GNU specification says:
<<<
GNU format uses two's-complement base-256 notation to store values
that do not fit into standard ustar range.
>>>

Fixes #12436

Change-Id: I09c20602eabf8ae3a7e0db35b79440a64bfaf807
Reviewed-on: https://go-review.googlesource.com/17425
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-03 06:58:30 -05:00
Joe Tsai
be9ac88117 archive/tar: convert Reader.Next to be loop based
Motivation for change:
* Recursive logic is hard to follow, since it tends to apply
things in reverse. On the other hand, the tar formats tend to
describe meta headers as affecting the next entry.
* Recursion also applies changes in the wrong order. Two test
files are attached that use multiple headers. The previous Go
behavior differs from what GNU and BSD tar do.

Change-Id: Ic1557256fc1363c5cb26570e5d0b9f65a9e57341
Reviewed-on: https://go-review.googlesource.com/14624
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-03 06:58:30 -05:00
Joe Tsai
64935a5f0f archive/tar: move parse/format methods to standalone receiver
Motivations for this change:
* It allows these functions to be used outside of Reader/Writer.
* It allows these functions to be more easily unit tested.

Change-Id: Iebe2b70bdb8744371c9ffa87c24316cbbf025b59
Reviewed-on: https://go-review.googlesource.com/15113
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-02 14:32:27 -05:00
Joe Tsai
b598ba3ee7 archive/tar: fix issues with readGNUSparseMap1x0
Motivations:
* Use of strconv.ParseInt does not properly treat integers as 64bit,
preventing this function from working properly on 32bit machines.
* Use of io.ReadFull does not properly detect truncated streams
when the file suddenly ends on a block boundary.
* The function blindly trusts user input for numEntries and allocates
memory accordingly.
* The function does not validate that numEntries is not negative,
allowing a malicious sparse file to cause a panic during make.

In general, this function was overly complicated for what it was
accomplishing and it was hard to reason that it was free from
bounds errors. Instead, it has been rewritten and relies on
bytes.Buffer.ReadString to do the main work. So long as invariants
about the number of '\n' in the buffer are maintained, it is much
easier to see why this approach is correct.

Change-Id: Ibb12c4126c26e0ea460ea063cd17af68e3cf609e
Reviewed-on: https://go-review.googlesource.com/15174
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:17:35 -05:00
Joe Tsai
7500c932c7 archive/tar: properly handle header-only "files" in Reader
Certain special type-flags, specifically 1, 2, 3, 4, 5, 6,
do not have a data section. Thus, regardless of what the size field
says, we should not attempt to read any data for these special types.

The relevant PAX and USTAR specification says:
<<<
If the typeflag field is set to specify a file to be of type 1 (a link)
or 2 (a symbolic link), the size field shall be specified as zero.
If the typeflag field is set to specify a file of type 5 (directory),
the size field shall be interpreted as described under the definition
of that record type. No data logical records are stored for types 1, 2, or 5.
If the typeflag field is set to 3 (character special file),
4 (block special file), or 6 (FIFO), the meaning of the size field is
unspecified by this volume of POSIX.1-2008, and no data logical records shall
be stored on the medium.
Additionally, for type 6, the size field shall be ignored when reading.
If the typeflag field is set to any other value, the number of logical
records written following the header shall be (size+511)/512, ignoring
any fraction in the result of the division.
>>>

Contrary to the specification, we do not assert that the size field
is zero for type 1 and 2 since we liberally accept non-conforming formats.

Change-Id: I666b601597cb9d7a50caa081813d90ca9cfc52ed
Reviewed-on: https://go-review.googlesource.com/16614
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:10:38 -05:00
Matt Layher
2424f4e367 archive/tar: make output deterministic
Replaces PID in PaxHeaders with 0.  Sorts PAX header keys before writing
them to the archive.

Fixes #12358

Change-Id: If239f89c85f1c9d9895a253fb06a47ad44960124
Reviewed-on: https://go-review.googlesource.com/13975
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <joetsai@digital-static.net>
2016-02-02 14:10:11 -05:00
Joe Tsai
bffda594f7 archive/tar: detect truncated files
Motivation:
* Reader.skipUnread never reports io.ErrUnexpectedEOF. This is strange
given that io.ErrUnexpectedEOF is given through Reader.Read if the
user manually reads the file.
* Reader.skipUnread fails to detect truncated files since io.Seeker
is lazy about reporting errors. Thus, the behavior of Reader differs
whether the input io.Reader also satisfies io.Seeker or not.

To solve this, we seek to one before the end of the data section and
always rely on at least one call to io.CopyN. If the tr.r satisfies
io.Seeker, this is guarunteed to never read more than blockSize.

Fixes #12557

Change-Id: I0ddddfc6bed0d74465cb7e7a02b26f1de7a7a279
Reviewed-on: https://go-review.googlesource.com/15175
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:09:30 -05:00
Joe Tsai
cf83c95de8 archive/tar: fix numeric overflow issues in readGNUSparseMap0x1
Motivation:
* The logic to verify the numEntries can overflow and incorrectly
pass, allowing a malicious file to allocate arbitrary memory.
* The use of strconv.ParseInt does not set the integer precision
to 64bit, causing this code to work incorrectly on 32bit machines.

Change-Id: I1b1571a750a84f2dde97cc329ed04fe2342aaa60
Reviewed-on: https://go-review.googlesource.com/15173
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:09:04 -05:00
Joe Tsai
cb423795eb archive/tar: add missing error checks to Reader.Next
A recursive call to Reader.Next did not check the error before
trying to use the result, leading to a nil pointer panic.
This specific CL addresses the immediate issue, which is the panic,
but does not solve the root issue, which is due to an integer
overflow in the base-256 parser.

Updates #12435

Change-Id: Ia908671f0f411a409a35e24f2ebf740d46734072
Reviewed-on: https://go-review.googlesource.com/15437
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:08:38 -05:00
Joe Tsai
4ad443d166 archive/tar: expand abilities of TestReader
Motivation:
* There are an increasing number of "one-off" corrupt files added
to make sure that package does not succeed or crash on them.
Instead, allow for the test to specify the error that is expected
to occur (if any).
* Also, fold in the logic to check the MD5 checksum into this
function.

The following tests are being removed:
* TestIncrementalRead: Done by TestReader by using io.CopyBuffer
with a buffer of 8. This achieves the same behavior as this test.
* TestSparseEndToEnd: Since TestReader checks the MD5 checksums
if the input corpus provides them, then this is redundant.
* TestSparseIncrementalRead: Redundant for the same reasons that
TestIncrementalRead is now redundant
* TestNegativeHdrSize: Added to TestReader corpus
* TestIssue10968: Added to TestReader corpus
* TestIssue11169: Added to TestReader corpus

With this change, code coverage did not change: 85.3%

Change-Id: I8550d48657d4dbb8f47dfc3dc280758ef73b47ec
Reviewed-on: https://go-review.googlesource.com/15176
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-02-02 14:06:30 -05:00
Joe Tsai
f0fc67b3a8 archive/tar: make Reader.Read errors persistent
If the stream is in an inconsistent state, it does not make sense
that Reader.Read can be called and possibly succeed.

Change-Id: I9d1c5a1300b2c2b45232188aa7999e350809dcf2
Reviewed-on: https://go-review.googlesource.com/15177
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2016-02-02 14:06:30 -05:00
Joe Tsai
af15385a0d archive/tar: fix bugs with sparseFileReader
The sparseFileReader is prone to two different forms of
denial-of-service attacks:
* A malicious tar file can cause an infinite loop
* A malicious tar file can cause arbitrary panics

This results because of poor error checking/handling, which this
CL fixes. While we are at it, add a plethora of unit tests to
test for possible malicious inputs.

Change-Id: I2f9446539d189f3c1738a1608b0ad4859c1be929
Reviewed-on: https://go-review.googlesource.com/15115
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-02 14:06:30 -05:00
Joe Tsai
440ba9e519 archive/tar: remove dead code with USTAR path splitting
Convert splitUSTARPath to return a bool rather than an error since
the caller never ever uses the error other than to check if it is
nil. Thus, we can remove errNameTooLong as well.

Also, fold the checking of the length <= fileNameSize and whether
the string is ASCII into the split function itself.

Lastly, remove logic to set the MAGIC since that's already done on
L200. Thus, setting the magic is redundant.

There is no overall logic change.

Updates #12638

Change-Id: I26b6992578199abad723c2a2af7f4fc078af9c17
Reviewed-on: https://go-review.googlesource.com/14723
Reviewed-by: David Symonds <dsymonds@golang.org>
Run-TryBot: David Symonds <dsymonds@golang.org>
2016-02-02 14:06:30 -05:00
4d4b53c78b archive/tar: don't treat multiple file system links as a tar hardlink
Do not assume that if stat shows multiple links that we should mark the
file as a hardlink in the tar format.  If the hardlink link was not
referenced, this caused a link to "/".  On an overlay file system, all
files have multiple links.

The caller must keep the inode references and set TypeLink, Size = 0,
and LinkName themselves.

Change-Id: I873b8a235bc8f8fbb271db74ee54232da36ca013
Reviewed-on: https://go-review.googlesource.com/13045
Reviewed-by: Ian Lance Taylor <iant@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Alex Brainman
3b34dbd368 archive/tar: move round-trip reading into common os file
Fixes #11426

Change-Id: I77368b0e852149ed4533e139cc43887508ac7f78
Reviewed-on: https://go-review.googlesource.com/11662
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Brad Fitzpatrick
27e18409b9 archive/tar: also skip header roundtrip test on nacl
Update #11426

Change-Id: I7abc4ed2241a7a3af6d57c934786f36de4f97b77
Reviewed-on: https://go-review.googlesource.com/11592
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Brad Fitzpatrick
8eee43d0df archive/tar: disable new failing test on windows and plan9
Update #11426

Change-Id: If406d2efcc81965825a63c76f5448d544ba2a740
Reviewed-on: https://go-review.googlesource.com/11590
Reviewed-by: Austin Clements <austin@google.com>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
b48c28014e archive/tar: fix round-trip attributes
The issue was identified while
working with round trip FileInfo of the headers of hardlinks. Also,
additional test cases for hard link handling.
(review carried over from http://golang.org/cl/165860043)

Fixes #9027

Change-Id: I9e3a724c8de72eb1b0fbe0751a7b488894911b76
Reviewed-on: https://go-review.googlesource.com/6790
Reviewed-by: Russ Cox <rsc@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Michael Gehring
2e5698249c archive/tar: add missing error checks
Check for errors when reading the headers following the pax headers.

Fixes #11169.

Change-Id: Ifec4a949ec8df8b49fa7cb7a67eb826fe2282ad8
Reviewed-on: https://go-review.googlesource.com/11031
Reviewed-by: Russ Cox <rsc@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Michael Gehring
69de764807 archive/tar: fix slice bounds out of range
Sanity check the pax-header size field before using it.

Fixes #11167.

Change-Id: I9d5d0210c3990e6fb9434c3fe333be0d507d5962
Reviewed-on: https://go-review.googlesource.com/10954
Reviewed-by: David Symonds <dsymonds@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Håvard Haugen
55dceefe42 archive/tar: terminate when reading malformed sparse files
Fixes #10968.

Change-Id: I027bc571a71629ac49c2a0ff101b2950af6e7531
Reviewed-on: https://go-review.googlesource.com/10482
Reviewed-by: David Symonds <dsymonds@golang.org>
Run-TryBot: David Symonds <dsymonds@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
Håvard Haugen
576b273762 archive/tar: don't panic on negative file size
Fixes #10959.
Fixes #10960.

Change-Id: I9a81a0e2b8275338d0d1c3f7f7265e0fd91f3de2
Reviewed-on: https://go-review.googlesource.com/10402
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Symonds <dsymonds@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
David du Colombier
6e38573de2 archive/tar: fix error message
Write should return ErrWriteAfterClose instead
of ErrWriteTooLong when called after Close.

Change-Id: If5ec4ef924e4c56489e0d426976f7e5fad79be9b
Reviewed-on: https://go-review.googlesource.com/9259
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-08-21 00:15:22 -04:00
4d66163297 archive/tar: a []byte copy needed for GNU LongLink 2015-08-13 15:32:17 -04:00
e46a815cbc archive/tar: fix carry-over of bytes for GNU types
Archives produced with GNU tar can have types of TypeGNULongName and
TypeGNULongLink.
These fields effectively appear like two file entries in the tar
archive. While golang's `archive/tar` transparently provide the file
name and headers and file payload, the access to the raw bytes is still
needed.

This fixes the access to the longlink header, it's payload (of the long
file path name), and the following file header and actual file payload.
2015-08-11 15:57:20 -04:00
50168a6bb3 archive/tar: cleaner reset 2015-02-20 14:49:23 -05:00
739daf3e09 looking for missing bytes 2015-02-19 18:07:22 -05:00
7cc3f4b289 archive/tar: add RawBytes()
Plumbing a means to access the raw bytes of a tar archive apart from the
file payload itself.
2015-02-19 16:49:06 -05:00
64426b0aae archive/tar: adding from go as of a9dddb53f 2015-02-11 14:08:03 +01:00