Merge pull request #38 from vbatts/travis

travis: test more go versions
archive/tar: monotonic clock adjustment
2017-03-14 11:24:38 -04:00 · 2017-03-14 11:04:10 -04:00 · 2017-03-14 08:38:13 -04:00 · 2017-03-13 18:28:54 -04:00 · 2016-09-27 02:54:18 +00:00 · 2016-09-26 19:53:52 -04:00
29 changed files with 1796 additions and 872 deletions
--- a/.travis.yml
+++ b/.travis.yml
@ -1,17 +1,17 @@
 language: go
 go:
  - tip
-  - 1.5.1
-  - 1.4.3
-  - 1.3.3
-  - 1.2.2
+  - 1.x
+  - 1.8.x
+  - 1.7.x
+  - 1.6.x
+  - 1.5.x
 # let us have pretty, fast Docker-based Travis workers!
 sudo: false
 install:
  - go get -d ./...
  - go get golang.org/x/tools/cmd/vet
 script:
  - go test -v ./...
--- a/39
+++ b/39
@ -1,19 +1,28 @@
 Copyright (c) 2015 Vincent Batts, Raleigh, NC, USA
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-The above copyright notice and this permission notice shall be included in
-all copies or substantial portions of the Software.
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
-THE SOFTWARE.
+All rights reserved.
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+1. Redistributions of source code must retain the above copyright notice, this
+list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+this list of conditions and the following disclaimer in the documentation
+and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its contributors
+may be used to endorse or promote products derived from this software without
+specific prior written permission.
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--- a/README.md
+++ b/README.md
@ -1,6 +1,7 @@
 # tar-split
 [![Build Status](https://travis-ci.org/vbatts/tar-split.svg?branch=master)](https://travis-ci.org/vbatts/tar-split)
 [![Go Report Card](https://goreportcard.com/badge/github.com/vbatts/tar-split)](https://goreportcard.com/report/github.com/vbatts/tar-split)
 Pristinely disassembling a tar archive, and stashing needed raw bytes and offsets to reassemble a validating original archive.
@ -25,6 +26,23 @@ go get github.com/vbatts/tar-split/cmd/tar-split
 For cli usage, see its [README.md](cmd/tar-split/README.md).
 For the library see the [docs](#docs)
+## Demo
+### Basic disassembly and assembly
+This demonstrates the `tar-split` command and how to assemble a tar archive from the `tar-data.json.gz`.
+![basic cmd demo thumbnail](https://i.ytimg.com/vi/vh5wyjIOBtc/2.jpg?time=1445027151805)
+[youtube video of basic command demo](https://youtu.be/vh5wyjIOBtc)
+### Docker layer preservation
+This demonstrates the tar-split integration for docker-1.8, providing consistent tar archives for the image layer content.
+![docker tar-split demo](https://i.ytimg.com/vi_webp/vh5wyjIOBtc/default.webp)
+[youtube video of docker layer checksums](https://youtu.be/tV_Dia8E8xw)
 ## Caveat
 Eventually this should detect TARs that this is not possible with.
@ -49,7 +67,7 @@ Do not break the API of stdlib `archive/tar` in our fork (ideally find an upstre
 ## Std Version
-The version of golang stdlib `archive/tar` is from go1.4.1, and their master branch around [a9dddb53f](https://github.com/golang/go/tree/a9dddb53f).
+The version of golang stdlib `archive/tar` is from go1.6
 It is minimally extended to expose the raw bytes of the TAR, rather than just the marshalled headers and file stream.
--- a/archive/tar/common.go
+++ b/archive/tar/common.go
@ -327,3 +327,14 @@ func toASCII(s string) string {
 	}
 	return buf.String()
 }
+// isHeaderOnlyType checks if the given type flag is of the type that has no
+// data section even if a size is specified.
+func isHeaderOnlyType(flag byte) bool {
+	switch flag {
+	case TypeLink, TypeSymlink, TypeChar, TypeBlock, TypeDir, TypeFifo:
+		return true
+	default:
+		return false
+	}
+}
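Reviewer note: the new `isHeaderOnlyType` helper is what lets the reader ignore a non-zero size on entries that carry no payload. A minimal sketch of that idea follows; `payloadSize` and the lowercase constants are hypothetical names for illustration (the flag values are the standard tar type flags), not code from this change.

```go
package main

import "fmt"

// Standard tar type flags, redeclared here so the sketch is self-contained.
const (
	typeReg     = '0'
	typeLink    = '1'
	typeSymlink = '2'
	typeChar    = '3'
	typeBlock   = '4'
	typeDir     = '5'
	typeFifo    = '6'
)

// isHeaderOnly mirrors the switch in isHeaderOnlyType from the hunk above.
func isHeaderOnly(flag byte) bool {
	switch flag {
	case typeLink, typeSymlink, typeChar, typeBlock, typeDir, typeFifo:
		return true
	default:
		return false
	}
}

// payloadSize is a hypothetical helper: it returns how many data bytes follow
// the header, treating header-only entries as empty whatever size they declare.
func payloadSize(flag byte, declared int64) int64 {
	if isHeaderOnly(flag) {
		return 0
	}
	return declared
}

func main() {
	fmt.Println("regular file:", payloadSize(typeReg, 1024))
	fmt.Println("directory:   ", payloadSize(typeDir, 1024))
}
```

In the change itself the same decision appears in `readHeader`, where `nb` is forced to 0 for these types before the padding is computed.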
--- a/archive/tar/example_test.go
+++ b/archive/tar/example_test.go
@ -26,7 +26,7 @@ func Example() {
 	}{
 		{"readme.txt", "This archive contains some text files."},
 		{"gopher.txt", "Gopher names:\nGeorge\nGeoffrey\nGonzo"},
-		{"todo.txt", "Get animal handling licence."},
+		{"todo.txt", "Get animal handling license."},
 	}
 	for _, file := range files {
 		hdr := &tar.Header{
@ -76,5 +76,5 @@ func Example() {
 	// Geoffrey
 	// Gonzo
 	// Contents of todo.txt:
-	// Get animal handling licence.
+	// Get animal handling license.
 }
--- a/archive/tar/reader.go
+++ b/archive/tar/reader.go
@ -12,6 +12,7 @@ import (
 	"errors"
 	"io"
 	"io/ioutil"
 	"math"
 	"os"
 	"strconv"
 	"strings"
@ -39,6 +40,10 @@ type Reader struct {
 	rawBytes      *bytes.Buffer // last raw bits
 }
+type parser struct {
+	err error // Last error seen
+}
 // RawBytes accesses the raw bytes of the archive, apart from the file payload itself.
 // This includes the header and padding.
 //
@ -70,12 +75,36 @@ type regFileReader struct {
 	nb int64     // number of unread bytes for current file entry
 }
-// A sparseFileReader is a numBytesReader for reading sparse file data from a tar archive.
+// A sparseFileReader is a numBytesReader for reading sparse file data from a
+// tar archive.
 type sparseFileReader struct {
-	rfr *regFileReader // reads the sparse-encoded file data
-	sp  []sparseEntry  // the sparse map for the file
-	pos int64          // keeps track of file position
-	tot int64          // total size of the file
+	rfr   numBytesReader // Reads the sparse-encoded file data
+	sp    []sparseEntry  // The sparse map for the file
+	pos   int64          // Keeps track of file position
+	total int64          // Total size of the file
 }
+// A sparseEntry holds a single entry in a sparse file's sparse map.
+//
+// Sparse files are represented using a series of sparseEntrys.
+// Despite the name, a sparseEntry represents an actual data fragment that
+// references data found in the underlying archive stream. All regions not
+// covered by a sparseEntry are logically filled with zeros.
+//
+// For example, if the underlying raw file contains the 8-byte data:
+//	var compactData = "abcdefgh"
+//
+// And the sparse map has the following entries:
+//	var sp = []sparseEntry{
+//		{offset: 2,  numBytes: 5}, // Data fragment for [2..7]
+//		{offset: 18, numBytes: 3}, // Data fragment for [18..21]
+//	}
+//
+// Then the content of the resulting sparse file with a "real" size of 25 is:
+//	var sparseData = "\x00"*2 + "abcde" + "\x00"*11 + "fgh" + "\x00"*4
+type sparseEntry struct {
+	offset   int64 // Starting position of the fragment
+	numBytes int64 // Length of the fragment
+}
 // Keywords for GNU sparse files in a PAX extended header
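Reviewer note: the `sparseEntry` doc comment describes how compact data plus a sparse map yields the logical file. The sketch below works through that same example in memory. `expandSparse` is a hypothetical helper written for this note; the reader in this change streams the data instead of materialising it.

```go
package main

import "fmt"

// sparseEntry matches the shape of the struct in the hunk above.
type sparseEntry struct {
	offset   int64 // Starting position of the fragment
	numBytes int64 // Length of the fragment
}

// expandSparse (hypothetical) rebuilds the logical file of size total from
// the compact data and the sparse map. Regions not covered by an entry stay
// zero-filled.
func expandSparse(compact []byte, sp []sparseEntry, total int64) ([]byte, error) {
	out := make([]byte, total) // make zero-fills the holes for us
	var read int64             // bytes of compact data consumed so far
	for _, e := range sp {
		if e.offset < 0 || e.numBytes < 0 ||
			e.offset+e.numBytes > total ||
			read+e.numBytes > int64(len(compact)) {
			return nil, fmt.Errorf("invalid sparse entry %+v", e)
		}
		copy(out[e.offset:e.offset+e.numBytes], compact[read:read+e.numBytes])
		read += e.numBytes
	}
	return out, nil
}

func main() {
	sp := []sparseEntry{
		{offset: 2, numBytes: 5},
		{offset: 18, numBytes: 3},
	}
	out, err := expandSparse([]byte("abcdefgh"), sp, 25)
	fmt.Printf("%q %v\n", out, err)
}
```

The sketch ignores integer overflow in `offset+numBytes`; real validation of untrusted maps would need to guard against that as well.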
@ -109,7 +138,6 @@ func NewReader(r io.Reader) *Reader { return &Reader{r: r} }
 //
 // io.EOF is returned at the end of the input.
 func (tr *Reader) Next() (*Header, error) {
-	var hdr *Header
 	if tr.RawAccounting {
 		if tr.rawBytes == nil {
 			tr.rawBytes = bytes.NewBuffer(nil)
@ -117,98 +145,88 @@ func (tr *Reader) Next() (*Header, error) {
 			tr.rawBytes.Reset()
 		}
 	}
-	if tr.err == nil {
-		tr.skipUnread()
-	}
+
 	if tr.err != nil {
-		return hdr, tr.err
+		return nil, tr.err
 	}
-	hdr = tr.readHeader()
-	if hdr == nil {
-		return hdr, tr.err
-	}
-	// Check for PAX/GNU header.
-	switch hdr.Typeflag {
-	case TypeXHeader:
-		//  PAX extended header
-		headers, err := parsePAX(tr)
-		if err != nil {
-			return nil, err
-		}
-		// We actually read the whole file,
-		// but this skips alignment padding
-		tr.skipUnread()
+
+	var hdr *Header
+	var extHdrs map[string]string
+
+	// Externally, Next iterates through the tar archive as if it is a series of
+	// files. Internally, the tar format often uses fake "files" to add meta
+	// data that describes the next file. These meta data "files" should not
+	// normally be visible to the outside. As such, this loop iterates through
+	// one or more "header files" until it finds a "normal file".
+loop:
+	for {
+		tr.err = tr.skipUnread()
 		if tr.err != nil {
 			return nil, tr.err
 		}
 		hdr = tr.readHeader()
-		if hdr == nil {
+		if tr.err != nil {
 			return nil, tr.err
 		}
-		mergePAX(hdr, headers)
-		// Check for a PAX format sparse file
-		sp, err := tr.checkForGNUSparsePAXHeaders(hdr, headers)
-		if err != nil {
-			tr.err = err
-			return nil, err
-		}
-		if sp != nil {
-			// Current file is a PAX format GNU sparse file.
-			// Set the current file reader to a sparse file reader.
-			tr.curr = &sparseFileReader{rfr: tr.curr.(*regFileReader), sp: sp, tot: hdr.Size}
-		}
-		return hdr, nil
-	case TypeGNULongName:
-		// We have a GNU long name header. Its contents are the real file name.
-		realname, err := ioutil.ReadAll(tr)
-		if err != nil {
-			return nil, err
-		}
-		var buf []byte
-		if tr.RawAccounting {
-			if _, err = tr.rawBytes.Write(realname); err != nil {
-				return nil, err
-			}
-			buf = make([]byte, tr.rawBytes.Len())
-			copy(buf[:], tr.RawBytes())
-		}
-		hdr, err := tr.Next()
-		// since the above call to Next() resets the buffer, we need to throw the bytes over
-		if tr.RawAccounting {
-			buf = append(buf, tr.RawBytes()...)
-			if _, err = tr.rawBytes.Write(buf); err != nil {
-				return nil, err
-			}
-		}
-		hdr.Name = cString(realname)
-		return hdr, err
-	case TypeGNULongLink:
-		// We have a GNU long link header.
-		realname, err := ioutil.ReadAll(tr)
-		if err != nil {
-			return nil, err
-		}
-		var buf []byte
-		if tr.RawAccounting {
-			if _, err = tr.rawBytes.Write(realname); err != nil {
-				return nil, err
-			}
-			buf = make([]byte, tr.rawBytes.Len())
-			copy(buf[:], tr.RawBytes())
-		}
-		hdr, err := tr.Next()
-		// since the above call to Next() resets the buffer, we need to throw the bytes over
-		if tr.RawAccounting {
-			buf = append(buf, tr.RawBytes()...)
-			if _, err = tr.rawBytes.Write(buf); err != nil {
-				return nil, err
-			}
-		}
-		hdr.Linkname = cString(realname)
-		return hdr, err
-	}
-	return hdr, tr.err
+		// Check for PAX/GNU special headers and files.
+		switch hdr.Typeflag {
+		case TypeXHeader:
+			extHdrs, tr.err = parsePAX(tr)
+			if tr.err != nil {
+				return nil, tr.err
+			}
+			continue loop // This is a meta header affecting the next header
+		case TypeGNULongName, TypeGNULongLink:
+			var realname []byte
+			realname, tr.err = ioutil.ReadAll(tr)
+			if tr.err != nil {
+				return nil, tr.err
+			}
+
+			if tr.RawAccounting {
+				if _, tr.err = tr.rawBytes.Write(realname); tr.err != nil {
+					return nil, tr.err
+				}
+			}
+
+			// Convert GNU extensions to use PAX headers.
+			if extHdrs == nil {
+				extHdrs = make(map[string]string)
+			}
+			var p parser
+			switch hdr.Typeflag {
+			case TypeGNULongName:
+				extHdrs[paxPath] = p.parseString(realname)
+			case TypeGNULongLink:
+				extHdrs[paxLinkpath] = p.parseString(realname)
+			}
+			if p.err != nil {
+				tr.err = p.err
+				return nil, tr.err
+			}
+			continue loop // This is a meta header affecting the next header
+		default:
+			mergePAX(hdr, extHdrs)
+			// Check for a PAX format sparse file
+			sp, err := tr.checkForGNUSparsePAXHeaders(hdr, extHdrs)
+			if err != nil {
+				tr.err = err
+				return nil, err
+			}
+			if sp != nil {
+				// Current file is a PAX format GNU sparse file.
+				// Set the current file reader to a sparse file reader.
+				tr.curr, tr.err = newSparseFileReader(tr.curr, sp, hdr.Size)
+				if tr.err != nil {
+					return nil, tr.err
+				}
+			}
+			break loop // This is a file, so stop
+		}
+	}
+	return hdr, nil
 }
 // checkForGNUSparsePAXHeaders checks the PAX headers for GNU sparse headers. If they are found, then
@ -385,6 +403,7 @@ func parsePAX(r io.Reader) (map[string]string, error) {
 			return nil, err
 		}
 	}
+	sbuf := string(buf)
 	// For GNU PAX sparse format 0.0 support.
 	// This function transforms the sparse format 0.0 headers into sparse format 0.1 headers.
@ -393,35 +412,17 @@ func parsePAX(r io.Reader) (map[string]string, error) {
 	headers := make(map[string]string)
 	// Each record is constructed as
 	//     "%d %s=%s\n", length, keyword, value
-	for len(buf) > 0 {
-		// or the header was empty to start with.
-		var sp int
-		// The size field ends at the first space.
-		sp = bytes.IndexByte(buf, ' ')
-		if sp == -1 {
+	for len(sbuf) > 0 {
+		key, value, residual, err := parsePAXRecord(sbuf)
+		if err != nil {
 			return nil, ErrHeader
 		}
-		// Parse the first token as a decimal integer.
-		n, err := strconv.ParseInt(string(buf[:sp]), 10, 0)
-		if err != nil || n < 5 || int64(len(buf)) < n {
-			return nil, ErrHeader
-		}
-		// Extract everything between the decimal and the n -1 on the
-		// beginning to eat the ' ', -1 on the end to skip the newline.
-		var record []byte
-		record, buf = buf[sp+1:n-1], buf[n:]
-		// The first equals is guaranteed to mark the end of the key.
-		// Everything else is value.
-		eq := bytes.IndexByte(record, '=')
-		if eq == -1 {
-			return nil, ErrHeader
-		}
-		key, value := record[:eq], record[eq+1:]
+		sbuf = residual
 		keyStr := string(key)
 		if keyStr == paxGNUSparseOffset || keyStr == paxGNUSparseNumBytes {
 			// GNU sparse format 0.0 special key. Write to sparseMap instead of using the headers map.
-			sparseMap.Write(value)
+			sparseMap.WriteString(value)
 			sparseMap.Write([]byte{','})
 		} else {
 			// Normal key. Set the value in the headers map.
@ -436,9 +437,42 @@ func parsePAX(r io.Reader) (map[string]string, error) {
 	return headers, nil
 }
+// parsePAXRecord parses the input PAX record string into a key-value pair.
+// If parsing is successful, it will slice off the currently read record and
+// return the remainder as r.
+//
+// A PAX record is of the following form:
+//	"%d %s=%s\n" % (size, key, value)
+func parsePAXRecord(s string) (k, v, r string, err error) {
+	// The size field ends at the first space.
+	sp := strings.IndexByte(s, ' ')
+	if sp == -1 {
+		return "", "", s, ErrHeader
+	}
+	// Parse the first token as a decimal integer.
+	n, perr := strconv.ParseInt(s[:sp], 10, 0) // Intentionally parse as native int
+	if perr != nil || n < 5 || int64(len(s)) < n {
+		return "", "", s, ErrHeader
+	}
+	// Extract everything between the space and the final newline.
+	rec, nl, rem := s[sp+1:n-1], s[n-1:n], s[n:]
+	if nl != "\n" {
+		return "", "", s, ErrHeader
+	}
+	// The first equals separates the key from the value.
+	eq := strings.IndexByte(rec, '=')
+	if eq == -1 {
+		return "", "", s, ErrHeader
+	}
+	return rec[:eq], rec[eq+1:], rem, nil
+}
-// cString parses bytes as a NUL-terminated C-style string.
+// parseString parses bytes as a NUL-terminated C-style string.
 // If a NUL byte is not found then the whole slice is returned as a string.
-func cString(b []byte) string {
+func (*parser) parseString(b []byte) string {
 	n := 0
 	for n < len(b) && b[n] != 0 {
 		n++
@ -446,19 +480,51 @@ func cString(b []byte) string {
 	return string(b[0:n])
 }
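Reviewer note: `parsePAXRecord`, added earlier in this file's diff, consumes records of the form `"%d %s=%s\n"`, where the leading decimal counts the whole record including itself. The standalone sketch below follows the same steps on a two-record string. `parseRecord` and `errHeader` are stand-in names, and the extra check that the length prefix leaves room for a payload is an assumption added here, not part of the diff.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// errHeader stands in for the package's ErrHeader.
var errHeader = errors.New("invalid tar header")

// parseRecord splits one PAX record off the front of s and returns the key,
// the value, and the unparsed remainder.
func parseRecord(s string) (k, v, rest string, err error) {
	sp := strings.IndexByte(s, ' ') // the size field ends at the first space
	if sp == -1 {
		return "", "", s, errHeader
	}
	n, perr := strconv.ParseInt(s[:sp], 10, 0)
	if perr != nil || n < 5 || int64(len(s)) < n {
		return "", "", s, errHeader
	}
	// Extra guard (an assumption of this sketch): a prefix such as "00005"
	// would otherwise make the slice bounds below cross.
	if int64(sp+1) > n-1 {
		return "", "", s, errHeader
	}
	rec, nl, rem := s[sp+1:n-1], s[n-1:n], s[n:]
	if nl != "\n" {
		return "", "", s, errHeader
	}
	eq := strings.IndexByte(rec, '=') // first '=' separates key from value
	if eq == -1 {
		return "", "", s, errHeader
	}
	return rec[:eq], rec[eq+1:], rem, nil
}

func main() {
	s := "12 path=a/b\n20 mtime=1350244992\n"
	for len(s) > 0 {
		k, v, rest, err := parseRecord(s)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s => %s\n", k, v)
		s = rest
	}
}
```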
-func (tr *Reader) octal(b []byte) int64 {
-	// Check for binary format first.
+// parseNumeric parses the input as being encoded in either base-256 or octal.
+// This function may return negative numbers.
+// If parsing fails or an integer overflow occurs, err will be set.
+func (p *parser) parseNumeric(b []byte) int64 {
+	// Check for base-256 (binary) format first.
+	// If the first bit is set, then all following bits constitute a two's
+	// complement encoded number in big-endian byte order.
 	if len(b) > 0 && b[0]&0x80 != 0 {
-		var x int64
-		for i, c := range b {
-			if i == 0 {
-				c &= 0x7f // ignore signal bit in first byte
-			}
-			x = x<<8 | int64(c)
-		}
-		return x
+		// Handling negative numbers relies on the following identity:
+		//	-a-1 == ^a
+		//
+		// If the number is negative, we use an inversion mask to invert the
+		// data bytes and treat the value as an unsigned number.
+		var inv byte // 0x00 if positive or zero, 0xff if negative
+		if b[0]&0x40 != 0 {
+			inv = 0xff
+		}
+
+		var x uint64
+		for i, c := range b {
+			c ^= inv // Inverts c only if inv is 0xff, otherwise does nothing
+			if i == 0 {
+				c &= 0x7f // Ignore signal bit in first byte
+			}
+			if (x >> 56) > 0 {
+				p.err = ErrHeader // Integer overflow
+				return 0
+			}
+			x = x<<8 | uint64(c)
+		}
+		if (x >> 63) > 0 {
+			p.err = ErrHeader // Integer overflow
+			return 0
+		}
+		if inv == 0xff {
+			return ^int64(x)
+		}
+		return int64(x)
 	}
+	// Normal case is base-8 (octal) format.
+	return p.parseOctal(b)
+}
+func (p *parser) parseOctal(b []byte) int64 {
 	// Because unused fields are filled with NULs, we need
 	// to skip leading NULs. Fields may also be padded with
 	// spaces or NULs.
@ -469,27 +535,55 @@ func (tr *Reader) octal(b []byte) int64 {
 	if len(b) == 0 {
 		return 0
 	}
-	x, err := strconv.ParseUint(cString(b), 8, 64)
+	x, perr := strconv.ParseUint(p.parseString(b), 8, 64)
-	if err != nil {
+	if perr != nil {
-		tr.err = err
+		p.err = ErrHeader
 	}
 	return int64(x)
 }
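Reviewer note: the base-256 branch of `parseNumeric` is the subtle part of this hunk, so here is a standalone sketch of the same inversion-mask technique with a few sample inputs. `parseBase256` is a hypothetical name, and it reports failure through a boolean rather than through the `parser.err` field used in the diff.

```go
package main

import "fmt"

// parseBase256 decodes a big-endian two's complement number whose first byte
// has the high bit set as a format marker. It returns false if the marker is
// missing or the value does not fit in an int64.
func parseBase256(b []byte) (int64, bool) {
	if len(b) == 0 || b[0]&0x80 == 0 {
		return 0, false // not base-256 encoded
	}
	// For negative numbers, rely on the identity -a-1 == ^a: invert every
	// byte, accumulate as unsigned, then invert the final result.
	var inv byte // 0x00 if positive or zero, 0xff if negative
	if b[0]&0x40 != 0 {
		inv = 0xff
	}
	var x uint64
	for i, c := range b {
		c ^= inv
		if i == 0 {
			c &= 0x7f // drop the format marker bit
		}
		if x>>56 > 0 {
			return 0, false // shifting by 8 would overflow
		}
		x = x<<8 | uint64(c)
	}
	if x>>63 > 0 {
		return 0, false // does not fit in int64
	}
	if inv == 0xff {
		return ^int64(x), true
	}
	return int64(x), true
}

func main() {
	inputs := [][]byte{
		{0x80, 0, 0, 0, 0, 0, 0, 0x01},
		{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe},
		{0x80, 0x01, 0, 0, 0, 0, 0, 0, 0, 0},
	}
	for _, in := range inputs {
		v, ok := parseBase256(in)
		fmt.Printf("% x -> %d (ok=%v)\n", in, v, ok)
	}
}
```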
-// skipUnread skips any unread bytes in the existing file entry, as well as any alignment padding.
-func (tr *Reader) skipUnread() {
-	nr := tr.numBytes() + tr.pad // number of bytes to skip
+// skipUnread skips any unread bytes in the existing file entry, as well as any
+// alignment padding. It returns io.ErrUnexpectedEOF if any io.EOF is
+// encountered in the data portion; it is okay to hit io.EOF in the padding.
+//
+// Note that this function still works properly even when sparse files are being
+// used since numBytes returns the bytes remaining in the underlying io.Reader.
+func (tr *Reader) skipUnread() error {
+	dataSkip := tr.numBytes()      // Number of data bytes to skip
+	totalSkip := dataSkip + tr.pad // Total number of bytes to skip
 	tr.curr, tr.pad = nil, 0
 	if tr.RawAccounting {
-		_, tr.err = io.CopyN(tr.rawBytes, tr.r, nr)
-		return
+		_, tr.err = io.CopyN(tr.rawBytes, tr.r, totalSkip)
+		return tr.err
 	}
-	if sr, ok := tr.r.(io.Seeker); ok {
-		if _, err := sr.Seek(nr, os.SEEK_CUR); err == nil {
-			return
+	// If possible, Seek to the last byte before the end of the data section.
+	// Do this because Seek is often lazy about reporting errors; this will mask
+	// the fact that the tar stream may be truncated. We can rely on the
+	// io.CopyN done shortly afterwards to trigger any IO errors.
+	var seekSkipped int64 // Number of bytes skipped via Seek
+	if sr, ok := tr.r.(io.Seeker); ok && dataSkip > 1 {
+		// Not all io.Seeker can actually Seek. For example, os.Stdin implements
+		// io.Seeker, but calling Seek always returns an error and performs
+		// no action. Thus, we try an innocent seek to the current position
+		// to see if Seek is really supported.
+		pos1, err := sr.Seek(0, os.SEEK_CUR)
+		if err == nil {
+			// Seek seems supported, so perform the real Seek.
+			pos2, err := sr.Seek(dataSkip-1, os.SEEK_CUR)
+			if err != nil {
+				tr.err = err
+				return tr.err
+			}
+			seekSkipped = pos2 - pos1
 		}
 	}
-	_, tr.err = io.CopyN(ioutil.Discard, tr.r, nr)
+
+	var copySkipped int64 // Number of bytes skipped via CopyN
+	copySkipped, tr.err = io.CopyN(ioutil.Discard, tr.r, totalSkip-seekSkipped)
+	if tr.err == io.EOF && seekSkipped+copySkipped < dataSkip {
+		tr.err = io.ErrUnexpectedEOF
+	}
+	return tr.err
 }
 func (tr *Reader) verifyChecksum(header []byte) bool {
@ -497,23 +591,32 @@ func (tr *Reader) verifyChecksum(header []byte) bool {
 		return false
 	}
-	given := tr.octal(header[148:156])
+	var p parser
+	given := p.parseOctal(header[148:156])
 	unsigned, signed := checksum(header)
-	return given == unsigned || given == signed
+	return p.err == nil && (given == unsigned || given == signed)
 }
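Reviewer note: `verifyChecksum` compares the stored value against both results of `checksum(header)`, whose body is not part of this diff. The sketch below shows how a tar header checksum is conventionally computed, on the assumption that the package's helper follows the usual rule: sum all 512 header bytes, count the 8-byte checksum field at offset 148 as spaces, and keep both an unsigned and a signed total because some historical tar writers summed signed chars.

```go
package main

import "fmt"

const blockSize = 512

// checksum returns the unsigned and signed byte sums of a tar header block,
// treating the checksum field itself (bytes 148..155) as ASCII spaces.
func checksum(header []byte) (unsigned, signed int64) {
	for i, c := range header {
		if 148 <= i && i < 156 {
			c = ' ' // the field is defined to be blank while summing
		}
		unsigned += int64(c)
		signed += int64(int8(c))
	}
	return unsigned, signed
}

func main() {
	header := make([]byte, blockSize)
	copy(header, "hello.txt")
	header[200] = 0xe9 // a high byte makes the two sums differ
	u, s := checksum(header)
	fmt.Println("unsigned:", u, "signed:", s)
}
```

Accepting either sum, as `verifyChecksum` does, keeps archives from both kinds of writers readable.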
+// readHeader reads the next block header and assumes that the underlying reader
+// is already aligned to a block boundary.
+//
+// The err will be set to io.EOF only when one of the following occurs:
+//	* Exactly 0 bytes are read and EOF is hit.
+//	* Exactly 1 block of zeros is read and EOF is hit.
+//	* At least 2 blocks of zeros are read.
 func (tr *Reader) readHeader() *Header {
 	header := tr.hdrBuff[:]
 	copy(header, zeroBlock)
-	if _, tr.err = io.ReadFull(tr.r, header); tr.err != nil {
+	if n, err := io.ReadFull(tr.r, header); err != nil {
+		tr.err = err
 		// because it could read some of the block, but reach EOF first
 		if tr.err == io.EOF && tr.RawAccounting {
-			if _, tr.err = tr.rawBytes.Write(header); tr.err != nil {
+			if _, err := tr.rawBytes.Write(header[:n]); err != nil {
-				return nil
+				tr.err = err
 			}
 		}
-		return nil
+		return nil // io.EOF is okay here
 	}
 	if tr.RawAccounting {
 		if _, tr.err = tr.rawBytes.Write(header); tr.err != nil {
@ -523,14 +626,15 @@ func (tr *Reader) readHeader() *Header {
 	// Two blocks of zero bytes marks the end of the archive.
 	if bytes.Equal(header, zeroBlock[0:blockSize]) {
-		if _, tr.err = io.ReadFull(tr.r, header); tr.err != nil {
+		if n, err := io.ReadFull(tr.r, header); err != nil {
+			tr.err = err
 			// because it could read some of the block, but reach EOF first
 			if tr.err == io.EOF && tr.RawAccounting {
-				if _, tr.err = tr.rawBytes.Write(header); tr.err != nil {
+				if _, err := tr.rawBytes.Write(header[:n]); err != nil {
-					return nil
+					tr.err = err
 				}
 			}
-			return nil
+			return nil // io.EOF is okay here
 		}
 		if tr.RawAccounting {
 			if _, tr.err = tr.rawBytes.Write(header); tr.err != nil {
@ -551,22 +655,19 @@ func (tr *Reader) readHeader() *Header {
 	}
 	// Unpack
+	var p parser
 	hdr := new(Header)
 	s := slicer(header)
-	hdr.Name = cString(s.next(100))
-	hdr.Mode = tr.octal(s.next(8))
-	hdr.Uid = int(tr.octal(s.next(8)))
-	hdr.Gid = int(tr.octal(s.next(8)))
-	hdr.Size = tr.octal(s.next(12))
-	if hdr.Size < 0 {
-		tr.err = ErrHeader
-		return nil
-	}
-	hdr.ModTime = time.Unix(tr.octal(s.next(12)), 0)
+	hdr.Name = p.parseString(s.next(100))
+	hdr.Mode = p.parseNumeric(s.next(8))
+	hdr.Uid = int(p.parseNumeric(s.next(8)))
+	hdr.Gid = int(p.parseNumeric(s.next(8)))
+	hdr.Size = p.parseNumeric(s.next(12))
+	hdr.ModTime = time.Unix(p.parseNumeric(s.next(12)), 0)
 	s.next(8) // chksum
 	hdr.Typeflag = s.next(1)[0]
-	hdr.Linkname = cString(s.next(100))
+	hdr.Linkname = p.parseString(s.next(100))
 	// The remainder of the header depends on the value of magic.
 	// The original (v7) version of tar had no explicit magic field,
@ -586,70 +687,76 @@ func (tr *Reader) readHeader() *Header {
 	switch format {
 	case "posix", "gnu", "star":
-		hdr.Uname = cString(s.next(32))
+		hdr.Uname = p.parseString(s.next(32))
-		hdr.Gname = cString(s.next(32))
+		hdr.Gname = p.parseString(s.next(32))
 		devmajor := s.next(8)
 		devminor := s.next(8)
 		if hdr.Typeflag == TypeChar || hdr.Typeflag == TypeBlock {
-			hdr.Devmajor = tr.octal(devmajor)
+			hdr.Devmajor = p.parseNumeric(devmajor)
-			hdr.Devminor = tr.octal(devminor)
+			hdr.Devminor = p.parseNumeric(devminor)
 		}
 		var prefix string
 		switch format {
 		case "posix", "gnu":
-			prefix = cString(s.next(155))
+			prefix = p.parseString(s.next(155))
 		case "star":
-			prefix = cString(s.next(131))
+			prefix = p.parseString(s.next(131))
-			hdr.AccessTime = time.Unix(tr.octal(s.next(12)), 0)
+			hdr.AccessTime = time.Unix(p.parseNumeric(s.next(12)), 0)
-			hdr.ChangeTime = time.Unix(tr.octal(s.next(12)), 0)
+			hdr.ChangeTime = time.Unix(p.parseNumeric(s.next(12)), 0)
 		}
 		if len(prefix) > 0 {
 			hdr.Name = prefix + "/" + hdr.Name
 		}
 	}
-	if tr.err != nil {
+	if p.err != nil {
+		tr.err = p.err
 		return nil
 	}
-	// Maximum value of hdr.Size is 64 GB (12 octal digits),
-	// so there's no risk of int64 overflowing.
-	nb := int64(hdr.Size)
-	tr.pad = -nb & (blockSize - 1) // blockSize is a power of two
+	nb := hdr.Size
+	if isHeaderOnlyType(hdr.Typeflag) {
+		nb = 0
+	}
+	if nb < 0 {
+		tr.err = ErrHeader
+		return nil
+	}
 	// Set the current file reader.
+	tr.pad = -nb & (blockSize - 1) // blockSize is a power of two
 	tr.curr = &regFileReader{r: tr.r, nb: nb}
 	// Check for old GNU sparse format entry.
 	if hdr.Typeflag == TypeGNUSparse {
 		// Get the real size of the file.
-		hdr.Size = tr.octal(header[483:495])
+		hdr.Size = p.parseNumeric(header[483:495])
+		if p.err != nil {
+			tr.err = p.err
+			return nil
+		}
 		// Read the sparse map.
 		sp := tr.readOldGNUSparseMap(header)
 		if tr.err != nil {
 			return nil
 		}
 		// Current file is a GNU sparse file. Update the current file reader.
-		tr.curr = &sparseFileReader{rfr: tr.curr.(*regFileReader), sp: sp, tot: hdr.Size}
+		tr.curr, tr.err = newSparseFileReader(tr.curr, sp, hdr.Size)
+		if tr.err != nil {
+			return nil
+		}
 	}
 	return hdr
 }
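Reviewer note: `readHeader` computes the alignment padding as `-nb & (blockSize - 1)`, which works only because `blockSize` is a power of two. A small sketch of that arithmetic follows; `padding` is a hypothetical wrapper name.

```go
package main

import "fmt"

const blockSize = 512 // tar block size; must be a power of two for the mask trick

// padding returns how many bytes are needed to round nb up to the next
// multiple of blockSize. For a power-of-two block size, -nb & (blockSize-1)
// equals (blockSize - nb%blockSize) % blockSize without any division.
func padding(nb int64) int64 {
	return -nb & (blockSize - 1)
}

func main() {
	for _, nb := range []int64{0, 1, 512, 513, 1000} {
		fmt.Printf("size %4d -> pad %3d -> total %4d\n", nb, padding(nb), nb+padding(nb))
	}
}
```

This is also why the negative-size check has to come before the padding is computed: the mask would yield a plausible-looking pad for a negative `nb`.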
-// A sparseEntry holds a single entry in a sparse file's sparse map.
-// A sparse entry indicates the offset and size in a sparse file of a
-// block of data.
-type sparseEntry struct {
-	offset   int64
-	numBytes int64
-}
 // readOldGNUSparseMap reads the sparse map as stored in the old GNU sparse format.
 // The sparse map is stored in the tar header if it's small enough. If it's larger than four entries,
 // then one or more extension headers are used to store the rest of the sparse map.
 func (tr *Reader) readOldGNUSparseMap(header []byte) []sparseEntry {
+	var p parser
 	isExtended := header[oldGNUSparseMainHeaderIsExtendedOffset] != 0
 	spCap := oldGNUSparseMainHeaderNumEntries
 	if isExtended {
@ -660,10 +767,10 @@ func (tr *Reader) readOldGNUSparseMap(header []byte) []sparseEntry {
 	// Read the four entries from the main tar header
 	for i := 0; i < oldGNUSparseMainHeaderNumEntries; i++ {
-		offset := tr.octal(s.next(oldGNUSparseOffsetSize))
+		offset := p.parseNumeric(s.next(oldGNUSparseOffsetSize))
-		numBytes := tr.octal(s.next(oldGNUSparseNumBytesSize))
+		numBytes := p.parseNumeric(s.next(oldGNUSparseNumBytesSize))
-		if tr.err != nil {
+		if p.err != nil {
-			tr.err = ErrHeader
+			tr.err = p.err
 			return nil
 		}
 		if offset == 0 && numBytes == 0 {
@ -687,10 +794,10 @@ func (tr *Reader) readOldGNUSparseMap(header []byte) []sparseEntry {
 		isExtended = sparseHeader[oldGNUSparseExtendedHeaderIsExtendedOffset] != 0
 		s = slicer(sparseHeader)
 		for i := 0; i < oldGNUSparseExtendedHeaderNumEntries; i++ {
-			offset := tr.octal(s.next(oldGNUSparseOffsetSize))
+			offset := p.parseNumeric(s.next(oldGNUSparseOffsetSize))
-			numBytes := tr.octal(s.next(oldGNUSparseNumBytesSize))
+			numBytes := p.parseNumeric(s.next(oldGNUSparseNumBytesSize))
-			if tr.err != nil {
+			if p.err != nil {
-				tr.err = ErrHeader
+				tr.err = p.err
 				return nil
 			}
 			if offset == 0 && numBytes == 0 {
@ -702,134 +809,111 @@ func (tr *Reader) readOldGNUSparseMap(header []byte) []sparseEntry {
 	return sp
 }
-// readGNUSparseMap1x0 reads the sparse map as stored in GNU's PAX sparse format version 1.0.
-// The sparse map is stored just before the file data and padded out to the nearest block boundary.
+// readGNUSparseMap1x0 reads the sparse map as stored in GNU's PAX sparse format
+// version 1.0. The format of the sparse map consists of a series of
+// newline-terminated numeric fields. The first field is the number of entries
+// and is always present. Following this are the entries, consisting of two
+// fields (offset, numBytes). This function must stop reading at the end
+// boundary of the block containing the last newline.
+//
+// Note that the GNU manual says that numeric values should be encoded in octal
+// format. However, the GNU tar utility itself outputs these values in decimal.
+// As such, this library treats values as being encoded in decimal.
 func readGNUSparseMap1x0(r io.Reader) ([]sparseEntry, error) {
-	buf := make([]byte, 2*blockSize)
-	sparseHeader := buf[:blockSize]
-	// readDecimal is a helper function to read a decimal integer from the sparse map
-	// while making sure to read from the file in blocks of size blockSize
-	readDecimal := func() (int64, error) {
-		// Look for newline
-		nl := bytes.IndexByte(sparseHeader, '\n')
-		if nl == -1 {
-			if len(sparseHeader) >= blockSize {
-				// This is an error
-				return 0, ErrHeader
-			}
-			oldLen := len(sparseHeader)
-			newLen := oldLen + blockSize
-			if cap(sparseHeader) < newLen {
-				// There's more header, but we need to make room for the next block
-				copy(buf, sparseHeader)
-				sparseHeader = buf[:newLen]
-			} else {
-				// There's more header, and we can just reslice
-				sparseHeader = sparseHeader[:newLen]
-			}
-			// Now that sparseHeader is large enough, read next block
-			if _, err := io.ReadFull(r, sparseHeader[oldLen:newLen]); err != nil {
-				return 0, err
-			}
-			// leaving this function for io.Reader makes it more testable
-			if tr, ok := r.(*Reader); ok && tr.RawAccounting {
-				if _, err := tr.rawBytes.Write(sparseHeader[oldLen:newLen]); err != nil {
-					return 0, err
-				}
-			}
-			// Look for a newline in the new data
-			nl = bytes.IndexByte(sparseHeader[oldLen:newLen], '\n')
-			if nl == -1 {
-				// This is an error
-				return 0, ErrHeader
-			}
-			nl += oldLen // We want the position from the beginning
-		}
-		// Now that we've found a newline, read a number
-		n, err := strconv.ParseInt(string(sparseHeader[:nl]), 10, 0)
-		if err != nil {
-			return 0, ErrHeader
-		}
-		// Update sparseHeader to consume this number
-		sparseHeader = sparseHeader[nl+1:]
-		return n, nil
-	}
-	// Read the first block
-	if _, err := io.ReadFull(r, sparseHeader); err != nil {
-		return nil, err
-	}
-	// leaving this function for io.Reader makes it more testable
-	if tr, ok := r.(*Reader); ok && tr.RawAccounting {
-		if _, err := tr.rawBytes.Write(sparseHeader); err != nil {
-			return nil, err
-		}
-	}
-	// The first line contains the number of entries
-	numEntries, err := readDecimal()
-	if err != nil {
-		return nil, err
-	}
+	var cntNewline int64
+	var buf bytes.Buffer
+	var blk = make([]byte, blockSize)
+	// feedTokens copies data in numBlock chunks from r into buf until there are
+	// at least cnt newlines in buf. It will not read more blocks than needed.
+	var feedTokens = func(cnt int64) error {
+		for cntNewline < cnt {
+			if _, err := io.ReadFull(r, blk); err != nil {
+				if err == io.EOF {
+					err = io.ErrUnexpectedEOF
+				}
+				return err
+			}
+			buf.Write(blk)
+			for _, c := range blk {
+				if c == '\n' {
+					cntNewline++
+				}
+			}
+		}
+		return nil
+	}
+	// nextToken gets the next token delimited by a newline. This assumes that
+	// at least one newline exists in the buffer.
+	var nextToken = func() string {
+		cntNewline--
+		tok, _ := buf.ReadString('\n')
+		return tok[:len(tok)-1] // Cut off newline
+	}
+	// Parse for the number of entries.
+	// Use integer overflow resistant math to check this.
+	if err := feedTokens(1); err != nil {
+		return nil, err
+	}
+	numEntries, err := strconv.ParseInt(nextToken(), 10, 0) // Intentionally parse as native int
+	if err != nil || numEntries < 0 || int(2*numEntries) < int(numEntries) {
+		return nil, ErrHeader
+	}
+	// Parse for all member entries.
+	// numEntries is trusted after this since a potential attacker must have
+	// committed resources proportional to what this library used.
+	if err := feedTokens(2 * numEntries); err != nil {
+		return nil, err
+	}
 	// Read all the entries
 	sp := make([]sparseEntry, 0, numEntries)
 	for i := int64(0); i < numEntries; i++ {
-		// Read the offset
-		offset, err := readDecimal()
+		offset, err := strconv.ParseInt(nextToken(), 10, 64)
 		if err != nil {
-			return nil, err
+			return nil, ErrHeader
 		}
-		// Read numBytes
-		numBytes, err := readDecimal()
+		numBytes, err := strconv.ParseInt(nextToken(), 10, 64)
 		if err != nil {
-			return nil, err
+			return nil, ErrHeader
 		}
 		sp = append(sp, sparseEntry{offset: offset, numBytes: numBytes})
 	}
 	return sp, nil
 }
-// readGNUSparseMap0x1 reads the sparse map as stored in GNU's PAX sparse format version 0.1.
-// The sparse map is stored in the PAX headers.
-func readGNUSparseMap0x1(headers map[string]string) ([]sparseEntry, error) {
-	// Get number of entries
-	numEntriesStr, ok := headers[paxGNUSparseNumBlocks]
-	if !ok {
-		return nil, ErrHeader
-	}
-	numEntries, err := strconv.ParseInt(numEntriesStr, 10, 0)
-	if err != nil {
+// readGNUSparseMap0x1 reads the sparse map as stored in GNU's PAX sparse format
+// version 0.1. The sparse map is stored in the PAX headers.
+func readGNUSparseMap0x1(extHdrs map[string]string) ([]sparseEntry, error) {
+	// Get number of entries.
+	// Use integer overflow resistant math to check this.
+	numEntriesStr := extHdrs[paxGNUSparseNumBlocks]
+	numEntries, err := strconv.ParseInt(numEntriesStr, 10, 0) // Intentionally parse as native int
+	if err != nil || numEntries < 0 || int(2*numEntries) < int(numEntries) {
 		return nil, ErrHeader
 	}
-	sparseMap := strings.Split(headers[paxGNUSparseMap], ",")
-	// There should be two numbers in sparseMap for each entry
+	// There should be two numbers in sparseMap for each entry.
+	sparseMap := strings.Split(extHdrs[paxGNUSparseMap], ",")
 	if int64(len(sparseMap)) != 2*numEntries {
 		return nil, ErrHeader
 	}
-	// Loop through the entries in the sparse map
+	// Loop through the entries in the sparse map.
+	// numEntries is trusted now.
 	sp := make([]sparseEntry, 0, numEntries)
 	for i := int64(0); i < numEntries; i++ {
-		offset, err := strconv.ParseInt(sparseMap[2*i], 10, 0)
+		offset, err := strconv.ParseInt(sparseMap[2*i], 10, 64)
 		if err != nil {
 			return nil, ErrHeader
 		}
-		numBytes, err := strconv.ParseInt(sparseMap[2*i+1], 10, 0)
+		numBytes, err := strconv.ParseInt(sparseMap[2*i+1], 10, 64)
 		if err != nil {
 			return nil, ErrHeader
 		}
 		sp = append(sp, sparseEntry{offset: offset, numBytes: numBytes})
 	}
 	return sp, nil
 }
@ -846,10 +930,18 @@ func (tr *Reader) numBytes() int64 {
 // Read reads from the current entry in the tar archive.
 // It returns 0, io.EOF when it reaches the end of that entry,
 // until Next is called to advance to the next entry.
+//
+// Calling Read on special types like TypeLink, TypeSymLink, TypeChar,
+// TypeBlock, TypeDir, and TypeFifo returns 0, io.EOF regardless of what
+// the Header.Size claims.
 func (tr *Reader) Read(b []byte) (n int, err error) {
+	if tr.err != nil {
+		return 0, tr.err
+	}
 	if tr.curr == nil {
 		return 0, io.EOF
 	}
 	n, err = tr.curr.Read(b)
 	if err != nil && err != io.EOF {
 		tr.err = err
@ -879,9 +971,33 @@ func (rfr *regFileReader) numBytes() int64 {
 	return rfr.nb
 }
-// readHole reads a sparse file hole ending at offset toOffset
-func (sfr *sparseFileReader) readHole(b []byte, toOffset int64) int {
-	n64 := toOffset - sfr.pos
+// newSparseFileReader creates a new sparseFileReader, but validates all of the
+// sparse entries before doing so.
+func newSparseFileReader(rfr numBytesReader, sp []sparseEntry, total int64) (*sparseFileReader, error) {
+	if total < 0 {
+		return nil, ErrHeader // Total size cannot be negative
+	}
+	// Validate all sparse entries. These are the same checks as performed by
+	// the BSD tar utility.
+	for i, s := range sp {
+		switch {
+		case s.offset < 0 || s.numBytes < 0:
+			return nil, ErrHeader // Negative values are never okay
+		case s.offset > math.MaxInt64-s.numBytes:
+			return nil, ErrHeader // Integer overflow with large length
+		case s.offset+s.numBytes > total:
+			return nil, ErrHeader // Region extends beyond the "real" size
+		case i > 0 && sp[i-1].offset+sp[i-1].numBytes > s.offset:
+			return nil, ErrHeader // Regions can't overlap and must be in order
+		}
+	}
+	return &sparseFileReader{rfr: rfr, sp: sp, total: total}, nil
+}
+// readHole reads a sparse hole ending at endOffset.
+func (sfr *sparseFileReader) readHole(b []byte, endOffset int64) int {
+	n64 := endOffset - sfr.pos
 	if n64 > int64(len(b)) {
 		n64 = int64(len(b))
 	}
@ -895,49 +1011,54 @@ func (sfr *sparseFileReader) readHole(b []byte, toOffset int64) int {
 // Read reads the sparse file data in expanded form.
 func (sfr *sparseFileReader) Read(b []byte) (n int, err error) {
-	if len(sfr.sp) == 0 {
-		// No more data fragments to read from.
-		if sfr.pos < sfr.tot {
-			// We're in the last hole
-			n = sfr.readHole(b, sfr.tot)
-			return
-		}
-		// Otherwise, we're at the end of the file
-		return 0, io.EOF
-	}
-	if sfr.tot < sfr.sp[0].offset {
-		return 0, io.ErrUnexpectedEOF
-	}
-	if sfr.pos < sfr.sp[0].offset {
-		// We're in a hole
-		n = sfr.readHole(b, sfr.sp[0].offset)
-		return
-	}
-	// We're not in a hole, so we'll read from the next data fragment
-	posInFragment := sfr.pos - sfr.sp[0].offset
-	bytesLeft := sfr.sp[0].numBytes - posInFragment
+	// Skip past all empty fragments.
+	for len(sfr.sp) > 0 && sfr.sp[0].numBytes == 0 {
+		sfr.sp = sfr.sp[1:]
+	}
+	// If there are no more fragments, then it is possible that there
+	// is one last sparse hole.
+	if len(sfr.sp) == 0 {
+		// This behavior matches the BSD tar utility.
+		// However, GNU tar stops returning data even if sfr.total is unmet.
+		if sfr.pos < sfr.total {
+			return sfr.readHole(b, sfr.total), nil
+		}
+		return 0, io.EOF
+	}
+	// In front of a data fragment, so read a hole.
+	if sfr.pos < sfr.sp[0].offset {
+		return sfr.readHole(b, sfr.sp[0].offset), nil
+	}
+	// In a data fragment, so read from it.
+	// This math is overflow free since we verify that offset and numBytes can
+	// be safely added when creating the sparseFileReader.
+	endPos := sfr.sp[0].offset + sfr.sp[0].numBytes // End offset of fragment
+	bytesLeft := endPos - sfr.pos                   // Bytes left in fragment
 	if int64(len(b)) > bytesLeft {
-		b = b[0:bytesLeft]
+		b = b[:bytesLeft]
 	}
 	n, err = sfr.rfr.Read(b)
 	sfr.pos += int64(n)
-	if int64(n) == bytesLeft {
-		// We're done with this fragment
-		sfr.sp = sfr.sp[1:]
-	}
-	if err == io.EOF && sfr.pos < sfr.tot {
-		// We reached the end of the last fragment's data, but there's a final hole
-		err = nil
-	}
-	return
+	if err == io.EOF {
+		if sfr.pos < endPos {
+			err = io.ErrUnexpectedEOF // There was supposed to be more data
+		} else if sfr.pos < sfr.total {
+			err = nil // There is still an implicit sparse hole at the end
+		}
+	}
+	if sfr.pos == endPos {
+		sfr.sp = sfr.sp[1:] // We are done with this fragment, so pop it
+	}
+	return n, err
 }
 // numBytes returns the number of bytes left to read in the sparse file's
 // sparse-encoded data in the tar archive.
 func (sfr *sparseFileReader) numBytes() int64 {
-	return sfr.rfr.nb
+	return sfr.rfr.numBytes()
 }
--- a/archive/tar/reader_test.go
+++ b/archive/tar/reader_test.go
--- a/archive/tar/tar_test.go
+++ b/archive/tar/tar_test.go
@ -94,13 +94,12 @@ func TestRoundTrip(t *testing.T) {
 	var b bytes.Buffer
 	tw := NewWriter(&b)
 	hdr := &Header{
-		Name:    "file.txt",
-		Uid:     1 << 21, // too big for 8 octal digits
-		Size:    int64(len(data)),
-		ModTime: time.Now(),
+		Name: "file.txt",
+		Uid:  1 << 21, // too big for 8 octal digits
+		Size: int64(len(data)),
+		// https://github.com/golang/go/commit/0e3355903d2ebcf5ee9e76096f51ac9a116a9dbb#diff-d7bf2a98d7b57b6ff754ca406f1b7581R105
+		ModTime: time.Now().AddDate(0, 0, 0).Round(1 * time.Second),
 	}
-	// tar only supports second precision.
-	hdr.ModTime = hdr.ModTime.Add(-time.Duration(hdr.ModTime.Nanosecond()) * time.Nanosecond)
 	if err := tw.WriteHeader(hdr); err != nil {
 		t.Fatalf("tw.WriteHeader: %v", err)
 	}
--- a/archive/tar/testdata/gnu-multi-hdrs.tar
+++ b/archive/tar/testdata/gnu-multi-hdrs.tar
--- a/archive/tar/testdata/hdr-only.tar
+++ b/archive/tar/testdata/hdr-only.tar
--- a/archive/tar/testdata/issue12435.tar
+++ b/archive/tar/testdata/issue12435.tar
--- a/archive/tar/testdata/neg-size.tar
+++ b/archive/tar/testdata/neg-size.tar
--- a/archive/tar/testdata/pax-multi-hdrs.tar
+++ b/archive/tar/testdata/pax-multi-hdrs.tar
--- a/archive/tar/testdata/pax-path-hdr.tar
+++ b/archive/tar/testdata/pax-path-hdr.tar
--- a/archive/tar/testdata/ustar-file-reg.tar
+++ b/archive/tar/testdata/ustar-file-reg.tar
--- a/archive/tar/writer.go
+++ b/archive/tar/writer.go
@ -12,8 +12,8 @@ import (
 	"errors"
 	"fmt"
 	"io"
-	"os"
 	"path"
+	"sort"
 	"strconv"
 	"strings"
 	"time"
@ -23,7 +23,6 @@ var (
 	ErrWriteTooLong    = errors.New("archive/tar: write too long")
 	ErrFieldTooLong    = errors.New("archive/tar: header field too long")
 	ErrWriteAfterClose = errors.New("archive/tar: write after close")
-	errNameTooLong     = errors.New("archive/tar: name too long")
 	errInvalidHeader   = errors.New("archive/tar: header field too long or contains invalid values")
 )
@ -43,6 +42,10 @@ type Writer struct {
 	paxHdrBuff [blockSize]byte // buffer to use in writeHeader when writing a pax header
 }
+type formatter struct {
+	err error // Last error seen
+}
 // NewWriter creates a new Writer writing to w.
 func NewWriter(w io.Writer) *Writer { return &Writer{w: w} }
@ -69,17 +72,9 @@ func (tw *Writer) Flush() error {
 }
 // Write s into b, terminating it with a NUL if there is room.
-// If the value is too long for the field and allowPax is true add a paxheader record instead
-func (tw *Writer) cString(b []byte, s string, allowPax bool, paxKeyword string, paxHeaders map[string]string) {
-	needsPaxHeader := allowPax && len(s) > len(b) || !isASCII(s)
-	if needsPaxHeader {
-		paxHeaders[paxKeyword] = s
-		return
-	}
+func (f *formatter) formatString(b []byte, s string) {
 	if len(s) > len(b) {
-		if tw.err == nil {
-			tw.err = ErrFieldTooLong
-		}
+		f.err = ErrFieldTooLong
 		return
 	}
 	ascii := toASCII(s)
@ -90,40 +85,40 @@ func (tw *Writer) cString(b []byte, s string, allowPax bool, paxKeyword string,
 }
 // Encode x as an octal ASCII string and write it into b with leading zeros.
-func (tw *Writer) octal(b []byte, x int64) {
+func (f *formatter) formatOctal(b []byte, x int64) {
 	s := strconv.FormatInt(x, 8)
 	// leading zeros, but leave room for a NUL.
 	for len(s)+1 < len(b) {
 		s = "0" + s
 	}
-	tw.cString(b, s, false, paxNone, nil)
+	f.formatString(b, s)
 }
-// Write x into b, either as octal or as binary (GNUtar/star extension).
-// If the value is too long for the field and writingPax is enabled both for the field and the add a paxheader record instead
-func (tw *Writer) numeric(b []byte, x int64, allowPax bool, paxKeyword string, paxHeaders map[string]string) {
-	// Try octal first.
-	s := strconv.FormatInt(x, 8)
-	if len(s) < len(b) {
-		tw.octal(b, x)
-		return
-	}
-	// If it is too long for octal, and pax is preferred, use a pax header
-	if allowPax && tw.preferPax {
-		tw.octal(b, 0)
-		s := strconv.FormatInt(x, 10)
-		paxHeaders[paxKeyword] = s
-		return
-	}
-	// Too big: use binary (big-endian).
-	tw.usedBinary = true
-	for i := len(b) - 1; x > 0 && i >= 0; i-- {
-		b[i] = byte(x)
-		x >>= 8
-	}
-	b[0] |= 0x80 // highest bit indicates binary format
+// fitsInBase256 reports whether x can be encoded into n bytes using base-256
+// encoding. Unlike octal encoding, base-256 encoding does not require that the
+// string ends with a NUL character. Thus, all n bytes are available for output.
+//
+// If operating in binary mode, this assumes strict GNU binary mode; which means
+// that the first byte can only be either 0x80 or 0xff. Thus, the first byte is
+// equivalent to the sign bit in two's complement form.
+func fitsInBase256(n int, x int64) bool {
+	var binBits = uint(n-1) * 8
+	return n >= 9 || (x >= -1<<binBits && x < 1<<binBits)
+}
+// Write x into b, as binary (GNUtar/star extension).
+func (f *formatter) formatNumeric(b []byte, x int64) {
+	if fitsInBase256(len(b), x) {
+		for i := len(b) - 1; i >= 0; i-- {
+			b[i] = byte(x)
+			x >>= 8
+		}
+		b[0] |= 0x80 // Highest bit indicates binary format
+		return
+	}
+	f.formatOctal(b, 0) // Last resort, just write zero
+	f.err = ErrFieldTooLong
 }
 var (
@ -162,6 +157,7 @@ func (tw *Writer) writeHeader(hdr *Header, allowPax bool) error {
 	// subsecond time resolution, but for now let's just capture
 	// too long fields or non ascii characters
+	var f formatter
 	var header []byte
 	// We need to select which scratch buffer to use carefully,
@ -176,10 +172,40 @@ func (tw *Writer) writeHeader(hdr *Header, allowPax bool) error {
 	copy(header, zeroBlock)
 	s := slicer(header)
+	// Wrappers around formatter that automatically sets paxHeaders if the
+	// argument extends beyond the capacity of the input byte slice.
+	var formatString = func(b []byte, s string, paxKeyword string) {
+		needsPaxHeader := paxKeyword != paxNone && len(s) > len(b) || !isASCII(s)
+		if needsPaxHeader {
+			paxHeaders[paxKeyword] = s
+			return
+		}
+		f.formatString(b, s)
+	}
+	var formatNumeric = func(b []byte, x int64, paxKeyword string) {
+		// Try octal first.
+		s := strconv.FormatInt(x, 8)
+		if len(s) < len(b) {
+			f.formatOctal(b, x)
+			return
+		}
+		// If it is too long for octal, and PAX is preferred, use a PAX header.
+		if paxKeyword != paxNone && tw.preferPax {
+			f.formatOctal(b, 0)
+			s := strconv.FormatInt(x, 10)
+			paxHeaders[paxKeyword] = s
+			return
+		}
+		tw.usedBinary = true
+		f.formatNumeric(b, x)
+	}
 	// keep a reference to the filename to allow to overwrite it later if we detect that we can use ustar longnames instead of pax
 	pathHeaderBytes := s.next(fileNameSize)
-	tw.cString(pathHeaderBytes, hdr.Name, true, paxPath, paxHeaders)
+	formatString(pathHeaderBytes, hdr.Name, paxPath)
 	// Handle out of range ModTime carefully.
 	var modTime int64
@ -187,25 +213,25 @@ func (tw *Writer) writeHeader(hdr *Header, allowPax bool) error {
 		modTime = hdr.ModTime.Unix()
 	}
-	tw.octal(s.next(8), hdr.Mode)                                   // 100:108
+	f.formatOctal(s.next(8), hdr.Mode)               // 100:108
-	tw.numeric(s.next(8), int64(hdr.Uid), true, paxUid, paxHeaders) // 108:116
+	formatNumeric(s.next(8), int64(hdr.Uid), paxUid) // 108:116
-	tw.numeric(s.next(8), int64(hdr.Gid), true, paxGid, paxHeaders) // 116:124
+	formatNumeric(s.next(8), int64(hdr.Gid), paxGid) // 116:124
-	tw.numeric(s.next(12), hdr.Size, true, paxSize, paxHeaders)     // 124:136
+	formatNumeric(s.next(12), hdr.Size, paxSize)     // 124:136
-	tw.numeric(s.next(12), modTime, false, paxNone, nil)            // 136:148 --- consider using pax for finer granularity
+	formatNumeric(s.next(12), modTime, paxNone)      // 136:148 --- consider using pax for finer granularity
-	s.next(8)                                                       // chksum (148:156)
+	s.next(8)                                        // chksum (148:156)
-	s.next(1)[0] = hdr.Typeflag                                     // 156:157
+	s.next(1)[0] = hdr.Typeflag                      // 156:157
-	tw.cString(s.next(100), hdr.Linkname, true, paxLinkpath, paxHeaders)
+	formatString(s.next(100), hdr.Linkname, paxLinkpath)
-	copy(s.next(8), []byte("ustar\x0000"))                        // 257:265
+	copy(s.next(8), []byte("ustar\x0000"))          // 257:265
-	tw.cString(s.next(32), hdr.Uname, true, paxUname, paxHeaders) // 265:297
+	formatString(s.next(32), hdr.Uname, paxUname)   // 265:297
-	tw.cString(s.next(32), hdr.Gname, true, paxGname, paxHeaders) // 297:329
+	formatString(s.next(32), hdr.Gname, paxGname)   // 297:329
-	tw.numeric(s.next(8), hdr.Devmajor, false, paxNone, nil)      // 329:337
+	formatNumeric(s.next(8), hdr.Devmajor, paxNone) // 329:337
-	tw.numeric(s.next(8), hdr.Devminor, false, paxNone, nil)      // 337:345
+	formatNumeric(s.next(8), hdr.Devminor, paxNone) // 337:345
 	// keep a reference to the prefix to allow to overwrite it later if we detect that we can use ustar longnames instead of pax
 	prefixHeaderBytes := s.next(155)
-	tw.cString(prefixHeaderBytes, "", false, paxNone, nil) // 345:500  prefix
+	formatString(prefixHeaderBytes, "", paxNone) // 345:500  prefix
 	// Use the GNU magic instead of POSIX magic if we used any GNU extensions.
 	if tw.usedBinary {
@ -215,37 +241,26 @@ func (tw *Writer) writeHeader(hdr *Header, allowPax bool) error {
 	_, paxPathUsed := paxHeaders[paxPath]
 	// try to use a ustar header when only the name is too long
 	if !tw.preferPax && len(paxHeaders) == 1 && paxPathUsed {
-		suffix := hdr.Name
-		prefix := ""
-		if len(hdr.Name) > fileNameSize && isASCII(hdr.Name) {
-			var err error
-			prefix, suffix, err = tw.splitUSTARLongName(hdr.Name)
-			if err == nil {
-				// ok we can use a ustar long name instead of pax, now correct the fields
-				// remove the path field from the pax header. this will suppress the pax header
-				delete(paxHeaders, paxPath)
-				// update the path fields
-				tw.cString(pathHeaderBytes, suffix, false, paxNone, nil)
-				tw.cString(prefixHeaderBytes, prefix, false, paxNone, nil)
-				// Use the ustar magic if we used ustar long names.
-				if len(prefix) > 0 && !tw.usedBinary {
-					copy(header[257:265], []byte("ustar\x00"))
-				}
-			}
+		prefix, suffix, ok := splitUSTARPath(hdr.Name)
+		if ok {
+			// Since we can encode in USTAR format, disable PAX header.
+			delete(paxHeaders, paxPath)
+			// Update the path fields
+			formatString(pathHeaderBytes, suffix, paxNone)
+			formatString(prefixHeaderBytes, prefix, paxNone)
 		}
 	}
 	// The chksum field is terminated by a NUL and a space.
 	// This is different from the other octal fields.
 	chksum, _ := checksum(header)
-	tw.octal(header[148:155], chksum)
+	f.formatOctal(header[148:155], chksum) // Never fails
 	header[155] = ' '
-	if tw.err != nil {
-		// problem with header; probably integer too big for a field.
+	// Check if there were any formatting errors.
+	if f.err != nil {
+		tw.err = f.err
 		return tw.err
 	}
@ -270,28 +285,25 @@ func (tw *Writer) writeHeader(hdr *Header, allowPax bool) error {
 	return tw.err
 }
-// writeUSTARLongName splits a USTAR long name hdr.Name.
-// name must be < 256 characters. errNameTooLong is returned
-// if hdr.Name can't be split. The splitting heuristic
-// is compatible with gnu tar.
-func (tw *Writer) splitUSTARLongName(name string) (prefix, suffix string, err error) {
+// splitUSTARPath splits a path according to USTAR prefix and suffix rules.
+// If the path is not splittable, then it will return ("", "", false).
+func splitUSTARPath(name string) (prefix, suffix string, ok bool) {
 	length := len(name)
-	if length > fileNamePrefixSize+1 {
+	if length <= fileNameSize || !isASCII(name) {
+		return "", "", false
+	} else if length > fileNamePrefixSize+1 {
 		length = fileNamePrefixSize + 1
 	} else if name[length-1] == '/' {
 		length--
 	}
 	i := strings.LastIndex(name[:length], "/")
-	// nlen contains the resulting length in the name field.
-	// plen contains the resulting length in the prefix field.
-	nlen := len(name) - i - 1
-	plen := i
+	nlen := len(name) - i - 1 // nlen is length of suffix
+	plen := i                 // plen is length of prefix
 	if i <= 0 || nlen > fileNameSize || nlen == 0 || plen > fileNamePrefixSize {
-		err = errNameTooLong
-		return
+		return "", "", false
 	}
-	prefix, suffix = name[:i], name[i+1:]
-	return
+	return name[:i], name[i+1:], true
 }
 // writePaxHeader writes an extended pax header to the
@ -304,11 +316,11 @@ func (tw *Writer) writePAXHeader(hdr *Header, paxHeaders map[string]string) erro
 	// succeed, and seems harmless enough.
 	ext.ModTime = hdr.ModTime
 	// The spec asks that we namespace our pseudo files
-	// with the current pid.
-	pid := os.Getpid()
+	// with the current pid.  However, this results in differing outputs
+	// for identical inputs.  As such, the constant 0 is now used instead.
+	// golang.org/issue/12358
 	dir, file := path.Split(hdr.Name)
-	fullName := path.Join(dir,
-		fmt.Sprintf("PaxHeaders.%d", pid), file)
+	fullName := path.Join(dir, "PaxHeaders.0", file)
 	ascii := toASCII(fullName)
 	if len(ascii) > 100 {
@ -318,8 +330,15 @@ func (tw *Writer) writePAXHeader(hdr *Header, paxHeaders map[string]string) erro
 	// Construct the body
 	var buf bytes.Buffer
-	for k, v := range paxHeaders {
-		fmt.Fprint(&buf, paxHeader(k+"="+v))
+	// Keys are sorted before writing to body to allow deterministic output.
+	var keys []string
+	for k := range paxHeaders {
+		keys = append(keys, k)
+	}
+	sort.Strings(keys)
+	for _, k := range keys {
+		fmt.Fprint(&buf, formatPAXRecord(k, paxHeaders[k]))
 	}
 	ext.Size = int64(len(buf.Bytes()))
@ -335,17 +354,18 @@ func (tw *Writer) writePAXHeader(hdr *Header, paxHeaders map[string]string) erro
 	return nil
 }
-// paxHeader formats a single pax record, prefixing it with the appropriate length
-func paxHeader(msg string) string {
-	const padding = 2 // Extra padding for space and newline
-	size := len(msg) + padding
+// formatPAXRecord formats a single PAX record, prefixing it with the
+// appropriate length.
+func formatPAXRecord(k, v string) string {
+	const padding = 3 // Extra padding for ' ', '=', and '\n'
+	size := len(k) + len(v) + padding
 	size += len(strconv.Itoa(size))
-	record := fmt.Sprintf("%d %s\n", size, msg)
+	record := fmt.Sprintf("%d %s=%s\n", size, k, v)
+	// Final adjustment if adding size field increased the record size.
 	if len(record) != size {
-		// Final adjustment if adding size increased
-		// the number of digits in size
 		size = len(record)
-		record = fmt.Sprintf("%d %s\n", size, msg)
+		record = fmt.Sprintf("%d %s=%s\n", size, k, v)
 	}
 	return record
 }
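Review note: the length prefix of a PAX record counts its own digits, which is why formatPAXRecord above needs the second pass. The sketch below copies the function body from the hunk into a standalone program so the carry case can be observed directly; only `main` is added for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatPAXRecord is reproduced from the diff above. A record has the form
// "<size> <key>=<value>\n", where size is the length of the whole record,
// including the digits of size itself.
func formatPAXRecord(k, v string) string {
	const padding = 3 // Extra padding for ' ', '=', and '\n'
	size := len(k) + len(v) + padding
	size += len(strconv.Itoa(size))
	record := fmt.Sprintf("%d %s=%s\n", size, k, v)

	// Adding the size digits can push the total across a power of ten
	// (for example 9 -> 10), which makes the prefix one digit longer
	// than first estimated. Recompute once with the real length.
	if len(record) != size {
		size = len(record)
		record = fmt.Sprintf("%d %s=%s\n", size, k, v)
	}
	return record
}

func main() {
	// No carry: the first estimate is already correct.
	fmt.Printf("%q\n", formatPAXRecord("foo", "ba"))
	// Carry: the estimate is 10, but the record is 11 bytes long.
	fmt.Printf("%q\n", formatPAXRecord("foo", "bar"))
}
```

Both inputs appear in the test vectors that TestFormatPAXRecord adds in writer_test.go, with "9 foo=ba\n" and "11 foo=bar\n" as the expected records.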
--- a/archive/tar/writer_test.go
+++ b/archive/tar/writer_test.go
@ -9,8 +9,10 @@ import (
 	"fmt"
 	"io"
 	"io/ioutil"
+	"math"
 	"os"
 	"reflect"
+	"sort"
 	"strings"
 	"testing"
 	"testing/iotest"
@ -291,7 +293,7 @@ func TestPax(t *testing.T) {
 		t.Fatal(err)
 	}
 	// Simple test to make sure PAX extensions are in effect
-	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.")) {
+	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.0")) {
 		t.Fatal("Expected at least one PAX header to be written.")
 	}
 	// Test that we can get a long name back out of the archive.
@ -330,7 +332,7 @@ func TestPaxSymlink(t *testing.T) {
 		t.Fatal(err)
 	}
 	// Simple test to make sure PAX extensions are in effect
-	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.")) {
+	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.0")) {
 		t.Fatal("Expected at least one PAX header to be written.")
 	}
 	// Test that we can get a long name back out of the archive.
@ -380,7 +382,7 @@ func TestPaxNonAscii(t *testing.T) {
 		t.Fatal(err)
 	}
 	// Simple test to make sure PAX extensions are in effect
-	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.")) {
+	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.0")) {
 		t.Fatal("Expected at least one PAX header to be written.")
 	}
 	// Test that we can get a long name back out of the archive.
@ -439,21 +441,49 @@ func TestPaxXattrs(t *testing.T) {
 	}
 }
-func TestPAXHeader(t *testing.T) {
-	medName := strings.Repeat("CD", 50)
-	longName := strings.Repeat("AB", 100)
-	paxTests := [][2]string{
-		{paxPath + "=/etc/hosts", "19 path=/etc/hosts\n"},
-		{"a=b", "6 a=b\n"},          // Single digit length
-		{"a=names", "11 a=names\n"}, // Test case involving carries
-		{paxPath + "=" + longName, fmt.Sprintf("210 path=%s\n", longName)},
-		{paxPath + "=" + medName, fmt.Sprintf("110 path=%s\n", medName)}}
-	for _, test := range paxTests {
-		key, expected := test[0], test[1]
-		if result := paxHeader(key); result != expected {
-			t.Fatalf("paxHeader: got %s, expected %s", result, expected)
-		}
+func TestPaxHeadersSorted(t *testing.T) {
+	fileinfo, err := os.Stat("testdata/small.txt")
+	if err != nil {
+		t.Fatal(err)
+	}
+	hdr, err := FileInfoHeader(fileinfo, "")
+	if err != nil {
+		t.Fatalf("os.Stat: %v", err)
+	}
+	contents := strings.Repeat(" ", int(hdr.Size))
+	hdr.Xattrs = map[string]string{
+		"foo": "foo",
+		"bar": "bar",
+		"baz": "baz",
+		"qux": "qux",
+	}
+	var buf bytes.Buffer
+	writer := NewWriter(&buf)
+	if err := writer.WriteHeader(hdr); err != nil {
+		t.Fatal(err)
+	}
+	if _, err = writer.Write([]byte(contents)); err != nil {
+		t.Fatal(err)
+	}
+	if err := writer.Close(); err != nil {
+		t.Fatal(err)
+	}
+	// Simple test to make sure PAX extensions are in effect
+	if !bytes.Contains(buf.Bytes(), []byte("PaxHeaders.0")) {
+		t.Fatal("Expected at least one PAX header to be written.")
+	}
+	// xattr bar should always appear before others
+	indices := []int{
+		bytes.Index(buf.Bytes(), []byte("bar=bar")),
+		bytes.Index(buf.Bytes(), []byte("baz=baz")),
+		bytes.Index(buf.Bytes(), []byte("foo=foo")),
+		bytes.Index(buf.Bytes(), []byte("qux=qux")),
+	}
+	if !sort.IntsAreSorted(indices) {
+		t.Fatal("PAX headers are not sorted")
 	}
 }
@ -544,3 +574,149 @@ func TestWriteAfterClose(t *testing.T) {
 		t.Fatalf("Write: got %v; want ErrWriteAfterClose", err)
 	}
 }
 func TestSplitUSTARPath(t *testing.T) {
 	var sr = strings.Repeat
 	var vectors = []struct {
 		input  string // Input path
 		prefix string // Expected output prefix
 		suffix string // Expected output suffix
 		ok     bool   // Split success?
 	}{
 		{"", "", "", false},
 		{"abc", "", "", false},
 		{"用戶名", "", "", false},
 		{sr("a", fileNameSize), "", "", false},
 		{sr("a", fileNameSize) + "/", "", "", false},
 		{sr("a", fileNameSize) + "/a", sr("a", fileNameSize), "a", true},
 		{sr("a", fileNamePrefixSize) + "/", "", "", false},
 		{sr("a", fileNamePrefixSize) + "/a", sr("a", fileNamePrefixSize), "a", true},
 		{sr("a", fileNameSize+1), "", "", false},
 		{sr("/", fileNameSize+1), sr("/", fileNameSize-1), "/", true},
 		{sr("a", fileNamePrefixSize) + "/" + sr("b", fileNameSize),
 			sr("a", fileNamePrefixSize), sr("b", fileNameSize), true},
 		{sr("a", fileNamePrefixSize) + "//" + sr("b", fileNameSize), "", "", false},
 		{sr("a/", fileNameSize), sr("a/", 77) + "a", sr("a/", 22), true},
 	}
 	for _, v := range vectors {
 		prefix, suffix, ok := splitUSTARPath(v.input)
 		if prefix != v.prefix || suffix != v.suffix || ok != v.ok {
 			t.Errorf("splitUSTARPath(%q):\ngot  (%q, %q, %v)\nwant (%q, %q, %v)",
 				v.input, prefix, suffix, ok, v.prefix, v.suffix, v.ok)
 		}
 	}
 }
 func TestFormatPAXRecord(t *testing.T) {
 	var medName = strings.Repeat("CD", 50)
 	var longName = strings.Repeat("AB", 100)
 	var vectors = []struct {
 		inputKey string
 		inputVal string
 		output   string
 	}{
 		{"k", "v", "6 k=v\n"},
 		{"path", "/etc/hosts", "19 path=/etc/hosts\n"},
 		{"path", longName, "210 path=" + longName + "\n"},
 		{"path", medName, "110 path=" + medName + "\n"},
 		{"foo", "ba", "9 foo=ba\n"},
 		{"foo", "bar", "11 foo=bar\n"},
 		{"foo", "b=\nar=\n==\x00", "18 foo=b=\nar=\n==\x00\n"},
 		{"foo", "hello9 foo=ba\nworld", "27 foo=hello9 foo=ba\nworld\n"},
 		{"☺☻☹", "日a本b語ç", "27 ☺☻☹=日a本b語ç\n"},
 		{"\x00hello", "\x00world", "17 \x00hello=\x00world\n"},
 	}
 	for _, v := range vectors {
 		output := formatPAXRecord(v.inputKey, v.inputVal)
 		if output != v.output {
 			t.Errorf("formatPAXRecord(%q, %q): got %q, want %q",
 				v.inputKey, v.inputVal, output, v.output)
 		}
 	}
 }
 func TestFitsInBase256(t *testing.T) {
 	var vectors = []struct {
 		input int64
 		width int
 		ok    bool
 	}{
 		{+1, 8, true},
 		{0, 8, true},
 		{-1, 8, true},
 		{1 << 56, 8, false},
 		{(1 << 56) - 1, 8, true},
 		{-1 << 56, 8, true},
 		{(-1 << 56) - 1, 8, false},
 		{121654, 8, true},
 		{-9849849, 8, true},
 		{math.MaxInt64, 9, true},
 		{0, 9, true},
 		{math.MinInt64, 9, true},
 		{math.MaxInt64, 12, true},
 		{0, 12, true},
 		{math.MinInt64, 12, true},
 	}
 	for _, v := range vectors {
 		ok := fitsInBase256(v.width, v.input)
 		if ok != v.ok {
 			t.Errorf("checkNumeric(%d, %d): got %v, want %v", v.input, v.width, ok, v.ok)
 		}
 	}
 }
 func TestFormatNumeric(t *testing.T) {
 	var vectors = []struct {
 		input  int64
 		output string
 		ok     bool
 	}{
 		// Test base-256 (binary) encoded values.
 		{-1, "\xff", true},
 		{-1, "\xff\xff", true},
 		{-1, "\xff\xff\xff", true},
 		{(1 << 0), "0", false},
 		{(1 << 8) - 1, "\x80\xff", true},
 		{(1 << 8), "0\x00", false},
 		{(1 << 16) - 1, "\x80\xff\xff", true},
 		{(1 << 16), "00\x00", false},
 		{-1 * (1 << 0), "\xff", true},
 		{-1*(1<<0) - 1, "0", false},
 		{-1 * (1 << 8), "\xff\x00", true},
 		{-1*(1<<8) - 1, "0\x00", false},
 		{-1 * (1 << 16), "\xff\x00\x00", true},
 		{-1*(1<<16) - 1, "00\x00", false},
 		{537795476381659745, "0000000\x00", false},
 		{537795476381659745, "\x80\x00\x00\x00\x07\x76\xa2\x22\xeb\x8a\x72\x61", true},
 		{-615126028225187231, "0000000\x00", false},
 		{-615126028225187231, "\xff\xff\xff\xff\xf7\x76\xa2\x22\xeb\x8a\x72\x61", true},
 		{math.MaxInt64, "0000000\x00", false},
 		{math.MaxInt64, "\x80\x00\x00\x00\x7f\xff\xff\xff\xff\xff\xff\xff", true},
 		{math.MinInt64, "0000000\x00", false},
 		{math.MinInt64, "\xff\xff\xff\xff\x80\x00\x00\x00\x00\x00\x00\x00", true},
 		{math.MaxInt64, "\x80\x7f\xff\xff\xff\xff\xff\xff\xff", true},
 		{math.MinInt64, "\xff\x80\x00\x00\x00\x00\x00\x00\x00", true},
 	}
 	for _, v := range vectors {
 		var f formatter
 		output := make([]byte, len(v.output))
 		f.formatNumeric(output, v.input)
 		ok := (f.err == nil)
 		if ok != v.ok {
 			if v.ok {
 				t.Errorf("formatNumeric(%d): got formatting failure, want success", v.input)
 			} else {
 				t.Errorf("formatNumeric(%d): got formatting success, want failure", v.input)
 			}
 		}
 		if string(output) != v.output {
 			t.Errorf("formatNumeric(%d): got %q, want %q", v.input, output, v.output)
 		}
 	}
 }
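Review note: the vectors in TestFitsInBase256 and TestFormatNumeric are easier to read next to a small standalone version of the encoder. This is a sketch under stated assumptions: `fitsInBase256` follows the writer.go hunk, while `encodeBase256` is a hypothetical helper that reports failure with a bool instead of recording `ErrFieldTooLong` on a formatter and writing an octal zero.

```go
package main

import "fmt"

// fitsInBase256 follows the diff: in strict GNU binary mode the first byte
// acts as the sign byte (0x80 or 0xff), leaving n-1 bytes for the magnitude.
func fitsInBase256(n int, x int64) bool {
	binBits := uint(n-1) * 8
	return n >= 9 || (x >= -1<<binBits && x < 1<<binBits)
}

// encodeBase256 is a hypothetical helper: it writes x into b as big-endian
// two's complement and sets the high bit of the first byte to mark the
// field as binary. It returns false, leaving b untouched, if x does not fit.
func encodeBase256(b []byte, x int64) bool {
	if !fitsInBase256(len(b), x) {
		return false
	}
	for i := len(b) - 1; i >= 0; i-- {
		b[i] = byte(x)
		x >>= 8 // Arithmetic shift keeps the sign for negative values
	}
	b[0] |= 0x80 // Highest bit indicates binary format
	return true
}

func main() {
	b := make([]byte, 2)
	fmt.Println(encodeBase256(b, 255), fmt.Sprintf("%q", b))

	c := make([]byte, 3)
	fmt.Println(encodeBase256(c, -1), fmt.Sprintf("%q", c))

	// 256 needs 9 magnitude bits, but a 2-byte field only offers 8.
	fmt.Println(encodeBase256(make([]byte, 2), 256))
}
```

Fields of 9 bytes or more always fit an int64, which matches the width 9 and width 12 rows in the test table.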
--- a/cmd/tar-split/asm.go
+++ b/cmd/tar-split/asm.go
@ -6,7 +6,7 @@ import (
 	"os"
 	"github.com/Sirupsen/logrus"
-	"github.com/codegangsta/cli"
+	"github.com/urfave/cli"
 	"github.com/vbatts/tar-split/tar/asm"
 	"github.com/vbatts/tar-split/tar/storage"
 )
--- a/cmd/tar-split/checksize.go
+++ b/cmd/tar-split/checksize.go
@ -10,7 +10,7 @@ import (
 	"os"
 	"github.com/Sirupsen/logrus"
-	"github.com/codegangsta/cli"
+	"github.com/urfave/cli"
 	"github.com/vbatts/tar-split/tar/asm"
 	"github.com/vbatts/tar-split/tar/storage"
 )
--- a/cmd/tar-split/disasm.go
+++ b/cmd/tar-split/disasm.go
@ -3,10 +3,11 @@ package main
 import (
 	"compress/gzip"
 	"io"
+	"io/ioutil"
 	"os"
 	"github.com/Sirupsen/logrus"
-	"github.com/codegangsta/cli"
+	"github.com/urfave/cli"
 	"github.com/vbatts/tar-split/tar/asm"
 	"github.com/vbatts/tar-split/tar/storage"
 )
@ -48,7 +49,13 @@ func CommandDisasm(c *cli.Context) {
 	if err != nil {
 		logrus.Fatal(err)
 	}
-	i, err := io.Copy(os.Stdout, its)
+	var out io.Writer
+	if c.Bool("no-stdout") {
+		out = ioutil.Discard
+	} else {
+		out = os.Stdout
+	}
+	i, err := io.Copy(out, its)
 	if err != nil {
 		logrus.Fatal(err)
 	}
--- a/cmd/tar-split/main.go
+++ b/cmd/tar-split/main.go
@ -4,7 +4,7 @@ import (
 	"os"
 	"github.com/Sirupsen/logrus"
-	"github.com/codegangsta/cli"
+	"github.com/urfave/cli"
 	"github.com/vbatts/tar-split/version"
 )
@ -42,6 +42,10 @@ func main() {
 					Value: "tar-data.json.gz",
 					Usage: "output of disassembled tar stream",
 				},
+				cli.BoolFlag{
+					Name:  "no-stdout",
+					Usage: "do not throughput the stream to STDOUT",
+				},
 			},
 		},
 		{
--- a/tar/asm/assemble.go
+++ b/tar/asm/assemble.go
@ -3,8 +3,10 @@ package asm
 import (
 	"bytes"
 	"fmt"
+	"hash"
 	"hash/crc64"
 	"io"
+	"sync"
 	"github.com/vbatts/tar-split/tar/storage"
 )
@ -23,45 +25,106 @@ func NewOutputTarStream(fg storage.FileGetter, up storage.Unpacker) io.ReadClose
 	}
 	pr, pw := io.Pipe()
 	go func() {
-		for {
-			entry, err := up.Next()
-			if err != nil {
-				pw.CloseWithError(err)
-				return
-			}
-			switch entry.Type {
-			case storage.SegmentType:
-				if _, err := pw.Write(entry.Payload); err != nil {
-					pw.CloseWithError(err)
-					return
-				}
-			case storage.FileType:
-				if entry.Size == 0 {
-					continue
-				}
-				fh, err := fg.Get(entry.GetName())
-				if err != nil {
-					pw.CloseWithError(err)
-					return
-				}
-				c := crc64.New(storage.CRCTable)
-				tRdr := io.TeeReader(fh, c)
-				if _, err := io.Copy(pw, tRdr); err != nil {
-					fh.Close()
-					pw.CloseWithError(err)
-					return
-				}
-				if !bytes.Equal(c.Sum(nil), entry.Payload) {
-					// I would rather this be a comparable ErrInvalidChecksum or such,
-					// but since it's coming through the PipeReader, the context of
-					// _which_ file would be lost...
-					fh.Close()
-					pw.CloseWithError(fmt.Errorf("file integrity checksum failed for %q", entry.GetName()))
-					return
-				}
-				fh.Close()
-			}
+		err := WriteOutputTarStream(fg, up, pw)
+		if err != nil {
+			pw.CloseWithError(err)
+		} else {
+			pw.Close()
 		}
 	}()
 	return pr
 }
 // WriteOutputTarStream writes assembled tar archive to a writer.
 func WriteOutputTarStream(fg storage.FileGetter, up storage.Unpacker, w io.Writer) error {
 	// ... Since these are interfaces, this is possible, so let's not have a nil pointer
 	if fg == nil || up == nil {
 		return nil
 	}
 	var copyBuffer []byte
 	var crcHash hash.Hash
 	var crcSum []byte
 	var multiWriter io.Writer
 	for {
 		entry, err := up.Next()
 		if err != nil {
 			if err == io.EOF {
 				return nil
 			}
 			return err
 		}
 		switch entry.Type {
 		case storage.SegmentType:
 			if _, err := w.Write(entry.Payload); err != nil {
 				return err
 			}
 		case storage.FileType:
 			if entry.Size == 0 {
 				continue
 			}
 			fh, err := fg.Get(entry.GetName())
 			if err != nil {
 				return err
 			}
 			if crcHash == nil {
 				crcHash = crc64.New(storage.CRCTable)
 				crcSum = make([]byte, 8)
 				multiWriter = io.MultiWriter(w, crcHash)
 				copyBuffer = byteBufferPool.Get().([]byte)
 				defer byteBufferPool.Put(copyBuffer)
 			} else {
 				crcHash.Reset()
 			}
 			if _, err := copyWithBuffer(multiWriter, fh, copyBuffer); err != nil {
 				fh.Close()
 				return err
 			}
 			if !bytes.Equal(crcHash.Sum(crcSum[:0]), entry.Payload) {
 				// I would rather this be a comparable ErrInvalidChecksum or such,
 				// but since it's coming through the PipeReader, the context of
 				// _which_ file would be lost...
 				fh.Close()
 				return fmt.Errorf("file integrity checksum failed for %q", entry.GetName())
 			}
 			fh.Close()
 		}
 	}
 }
 var byteBufferPool = &sync.Pool{
 	New: func() interface{} {
 		return make([]byte, 32*1024)
 	},
 }
 // copyWithBuffer is taken from stdlib io.Copy implementation
 // https://github.com/golang/go/blob/go1.5.1/src/io/io.go#L367
 func copyWithBuffer(dst io.Writer, src io.Reader, buf []byte) (written int64, err error) {
 	for {
 		nr, er := src.Read(buf)
 		if nr > 0 {
 			nw, ew := dst.Write(buf[0:nr])
 			if nw > 0 {
 				written += int64(nw)
 			}
 			if ew != nil {
 				err = ew
 				break
 			}
 			if nr != nw {
 				err = io.ErrShortWrite
 				break
 			}
 		}
 		if er == io.EOF {
 			break
 		}
 		if er != nil {
 			err = er
 			break
 		}
 	}
 	return written, err
 }
--- a/tar/asm/assemble_test.go
+++ b/tar/asm/assemble_test.go
@ -130,17 +130,20 @@ func TestTarStreamMangledGetterPutter(t *testing.T) {
 	}
 }
+var testCases = []struct {
+	path            string
+	expectedSHA1Sum string
+	expectedSize    int64
+}{
+	{"./testdata/t.tar.gz", "1eb237ff69bca6e22789ecb05b45d35ca307adbd", 10240},
+	{"./testdata/longlink.tar.gz", "d9f6babe107b7247953dff6b5b5ae31a3a880add", 20480},
+	{"./testdata/fatlonglink.tar.gz", "8537f03f89aeef537382f8b0bb065d93e03b0be8", 26234880},
+	{"./testdata/iso-8859.tar.gz", "ddafa51cb03c74ec117ab366ee2240d13bba1ec3", 10240},
+	{"./testdata/extranils.tar.gz", "e187b4b3e739deaccc257342f4940f34403dc588", 10648},
+	{"./testdata/notenoughnils.tar.gz", "72f93f41efd95290baa5c174c234f5d4c22ce601", 512},
+}
 func TestTarStream(t *testing.T) {
-	testCases := []struct {
-		path            string
-		expectedSHA1Sum string
-		expectedSize    int64
-	}{
-		{"./testdata/t.tar.gz", "1eb237ff69bca6e22789ecb05b45d35ca307adbd", 10240},
-		{"./testdata/longlink.tar.gz", "d9f6babe107b7247953dff6b5b5ae31a3a880add", 20480},
-		{"./testdata/fatlonglink.tar.gz", "8537f03f89aeef537382f8b0bb065d93e03b0be8", 26234880},
-		{"./testdata/iso-8859.tar.gz", "ddafa51cb03c74ec117ab366ee2240d13bba1ec3", 10240},
-	}
 	for _, tc := range testCases {
 		fh, err := os.Open(tc.path)
@ -167,10 +170,7 @@ func TestTarStream(t *testing.T) {
 		// get a sum of the stream after it has passed through to ensure it's the same.
 		h0 := sha1.New()
-		tRdr0 := io.TeeReader(tarStream, h0)
-		// read it all to the bit bucket
-		i, err := io.Copy(ioutil.Discard, tRdr0)
+		i, err := io.Copy(h0, tarStream)
 		if err != nil {
 			t.Fatal(err)
 		}
@ -205,3 +205,52 @@ func TestTarStream(t *testing.T) {
 		}
 	}
 }
 func BenchmarkAsm(b *testing.B) {
 	for i := 0; i < b.N; i++ {
 		for _, tc := range testCases {
 			func() {
 				fh, err := os.Open(tc.path)
 				if err != nil {
 					b.Fatal(err)
 				}
 				defer fh.Close()
 				gzRdr, err := gzip.NewReader(fh)
 				if err != nil {
 					b.Fatal(err)
 				}
 				defer gzRdr.Close()
 				// Setup where we'll store the metadata
 				w := bytes.NewBuffer([]byte{})
 				sp := storage.NewJSONPacker(w)
 				fgp := storage.NewBufferFileGetPutter()
 				// wrap the disassembly stream
 				tarStream, err := NewInputTarStream(gzRdr, sp, fgp)
 				if err != nil {
 					b.Fatal(err)
 				}
 				// read it all to the bit bucket
 				i1, err := io.Copy(ioutil.Discard, tarStream)
 				if err != nil {
 					b.Fatal(err)
 				}
 				r := bytes.NewBuffer(w.Bytes())
 				sup := storage.NewJSONUnpacker(r)
 				// and reuse the fgp that we Put the payloads to.
 				rc := NewOutputTarStream(fgp, sup)
 				i2, err := io.Copy(ioutil.Discard, rc)
 				if err != nil {
 					b.Fatal(err)
 				}
 				if i1 != i2 {
 					b.Errorf("%s: input(%d) and ouput(%d) byte count didn't match", tc.path, i1, i2)
 				}
 			}()
 		}
 	}
 }
--- a/tar/asm/testdata/extranils.tar.gz
+++ b/tar/asm/testdata/extranils.tar.gz
--- a/tar/asm/testdata/notenoughnils.tar.gz
+++ b/tar/asm/testdata/notenoughnils.tar.gz
--- a/tar/storage/packer.go
+++ b/tar/storage/packer.go
@ -1,7 +1,6 @@
 package storage
 import (
-	"bufio"
 	"encoding/json"
 	"errors"
 	"io"
@ -33,31 +32,15 @@ type PackUnpacker interface {
 */
 type jsonUnpacker struct {
-	r     io.Reader
-	b     *bufio.Reader
-	isEOF bool
-	seen  seenNames
+	seen seenNames
+	dec  *json.Decoder
 }
 func (jup *jsonUnpacker) Next() (*Entry, error) {
 	var e Entry
-	if jup.isEOF {
-		// since ReadBytes() will return read bytes AND an EOF, we handle it this
-		// round-a-bout way so we can Unmarshal the tail with relevant errors, but
-		// still get an io.EOF when the stream is ended.
-		return nil, io.EOF
-	}
-	line, err := jup.b.ReadBytes('\n')
-	if err != nil && err != io.EOF {
+	err := jup.dec.Decode(&e)
+	if err != nil {
 		return nil, err
-	} else if err == io.EOF {
-		jup.isEOF = true
-	}
-	err = json.Unmarshal(line, &e)
-	if err != nil && jup.isEOF {
-		// if the remainder actually _wasn't_ a remaining json structure, then just EOF
-		return nil, io.EOF
 	}
 	// check for dup name
@ -78,8 +61,7 @@ func (jup *jsonUnpacker) Next() (*Entry, error) {
 // Each Entry read are expected to be delimited by new line.
 func NewJSONUnpacker(r io.Reader) Unpacker {
 	return &jsonUnpacker{
-		r:    r,
-		b:    bufio.NewReader(r),
+		dec:  json.NewDecoder(r),
 		seen: seenNames{},
 	}
 }
--- a/tar/storage/packer_test.go
+++ b/tar/storage/packer_test.go
@ -4,6 +4,8 @@ import (
 	"bytes"
 	"compress/gzip"
 	"io"
+	"io/ioutil"
+	"os"
 	"testing"
 )
@ -159,5 +161,58 @@ func TestGzip(t *testing.T) {
 	if len(entries) != len(e) {
 		t.Errorf("expected %d entries, got %d", len(e), len(entries))
 	}
-
+}
 func BenchmarkGetPut(b *testing.B) {
 	e := []Entry{
 		Entry{
 			Type:    SegmentType,
 			Payload: []byte("how"),
 		},
 		Entry{
 			Type:    SegmentType,
 			Payload: []byte("y'all"),
 		},
 		Entry{
 			Type:    FileType,
 			Name:    "./hurr.txt",
 			Payload: []byte("deadbeef"),
 		},
 		Entry{
 			Type:    SegmentType,
 			Payload: []byte("doin"),
 		},
 	}
 	b.RunParallel(func(pb *testing.PB) {
 		for pb.Next() {
 			func() {
 				fh, err := ioutil.TempFile("", "tar-split.")
 				if err != nil {
 					b.Fatal(err)
 				}
 				defer os.Remove(fh.Name())
 				defer fh.Close()
 				jp := NewJSONPacker(fh)
 				for i := range e {
 					if _, err := jp.AddEntry(e[i]); err != nil {
 						b.Fatal(err)
 					}
 				}
 				fh.Sync()
 				up := NewJSONUnpacker(fh)
 				for {
 					_, err := up.Next()
 					if err != nil {
 						if err == io.EOF {
 							break
 						}
 						b.Fatal(err)
 					}
 				}
 			}()
 		}
 	})
 }
--- a/tar_benchmark_test.go
+++ b/tar_benchmark_test.go
@ -0,0 +1,84 @@
 package tartest
 import (
 	"io"
 	"io/ioutil"
 	"os"
 	"testing"
 	upTar "archive/tar"
 	ourTar "github.com/vbatts/tar-split/archive/tar"
 )
 var testfile = "./archive/tar/testdata/sparse-formats.tar"
 func BenchmarkUpstreamTar(b *testing.B) {
 	for n := 0; n < b.N; n++ {
 		fh, err := os.Open(testfile)
 		if err != nil {
 			b.Fatal(err)
 		}
 		tr := upTar.NewReader(fh)
 		for {
 			_, err := tr.Next()
 			if err != nil {
 				if err == io.EOF {
 					break
 				}
 				fh.Close()
 				b.Fatal(err)
 			}
 			io.Copy(ioutil.Discard, tr)
 		}
 		fh.Close()
 	}
 }
 func BenchmarkOurTarNoAccounting(b *testing.B) {
 	for n := 0; n < b.N; n++ {
 		fh, err := os.Open(testfile)
 		if err != nil {
 			b.Fatal(err)
 		}
 		tr := ourTar.NewReader(fh)
 		tr.RawAccounting = false // this is default, but explicit here
 		for {
 			_, err := tr.Next()
 			if err != nil {
 				if err == io.EOF {
 					break
 				}
 				fh.Close()
 				b.Fatal(err)
 			}
 			io.Copy(ioutil.Discard, tr)
 		}
 		fh.Close()
 	}
 }
 func BenchmarkOurTarYesAccounting(b *testing.B) {
 	for n := 0; n < b.N; n++ {
 		fh, err := os.Open(testfile)
 		if err != nil {
 			b.Fatal(err)
 		}
 		tr := ourTar.NewReader(fh)
 		tr.RawAccounting = true // This enables mechanics for collecting raw bytes
 		for {
 			_ = tr.RawBytes()
 			_, err := tr.Next()
 			_ = tr.RawBytes()
 			if err != nil {
 				if err == io.EOF {
 					break
 				}
 				fh.Close()
 				b.Fatal(err)
 			}
 			io.Copy(ioutil.Discard, tr)
 			_ = tr.RawBytes()
 		}
 		fh.Close()
 	}
 }
--- a/version/version.go
+++ b/version/version.go
@ -1,7 +1,7 @@
 package version
 // AUTO-GENEREATED. DO NOT EDIT
-// 2015-08-14 09:56:50.742727493 -0400 EDT
+// 2016-09-26 19:53:30.825879 -0400 EDT
 // VERSION is the generated version from /home/vbatts/src/vb/tar-split/version
-var VERSION = "v0.9.6-1-gc76e420"
+var VERSION = "v0.10.1-4-gf280282"
Author	SHA1	Message	Date
Vincent Batts	b9127a1393	Merge pull request #38 from vbatts/travis travis: test more go versions	2017-03-14 11:24:38 -04:00
Vincent Batts	c6dd42815a	archive/tar: monotonic clock adjustment commit 0e3355903d2ebcf5ee9e76096f51ac9a116a9dbb upstream Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2017-03-14 11:04:10 -04:00
Vincent Batts	245403c324	travis: test more go versions Thanks to @tianon, for pointing to `5e3ef60b0d/lib/travis/build/config.rb (L54-L70)` Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2017-03-14 08:38:13 -04:00
Vincent Batts	7560005f21	README: adding a golang report card Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2017-03-13 18:28:54 -04:00
Vincent Batts	bd4c5d64c3	main: switch import paths to urfave Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-09-27 02:54:18 +00:00
Vincent Batts	d3f1b54304	version: bump to v0.10.1 Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-09-26 19:53:52 -04:00
Vincent Batts	f28028292a	Merge branch 'master' of github.com:vbatts/tar-split	2016-09-26 19:52:55 -04:00
Vincent Batts	416fa5dcfe	Merge pull request #36 from dmcgowan/fix-extra-nil-accounting archive/tar: fix writing too many raw bytes	2016-09-26 18:31:47 -04:00
Derek McGowan	6b59e6942e	archive/tar: fix writing too many raw bytes When an EOF is read, only the part of the header buffer which was read should be accounted for. Signed-off-by: Derek McGowan <derek@mcgstyle.net>	2016-09-26 14:01:48 -07:00
Vincent Batts	7410961e75	tar/asm: failing test for lack of EOF nils Reported-by: Derek McGowan <derek@mcgstyle.net> Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-09-26 13:39:03 -07:00
Vincent Batts	eb3808673d	version: bump to v0.10.0 Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-09-23 11:01:58 -04:00
Vincent Batts	ae8540dc47	Merge pull request #34 from dmcgowan/fix-panic-issue-33 Fix panic in Next	2016-09-23 09:41:12 -04:00
Derek McGowan	e527e70d25	Fix panic in Next readHeader should never return nil with a tr.err also nil. To correct this, ensure tr.err never gets reset to nil followed by a nil return.	2016-09-22 17:38:18 -07:00
Vincent Batts	6810cedb21	benchmark: add a comparison of 'archive/tar' Since this project has forked logic of upstream 'archive/tar', this does a brief comparison including the RawBytes usage. ```bash $ go test -run="XXX" -bench=. testing: warning: no tests to run BenchmarkUpstreamTar-4 2000 700809 ns/op BenchmarkOurTarNoAccounting-4 2000 692055 ns/op BenchmarkOurTarYesAccounting-4 2000 723184 ns/op PASS ok vb/tar-split 4.461s ``` From this, the difference is negligible. Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-07-26 09:50:08 -04:00
Vincent Batts	28bc4c32f9	Merge pull request #32 from vbatts/fix-travis travis: update golang versions	2016-06-26 15:00:37 -04:00
Vincent Batts	beaeceb06f	travis: update golang versions This is not saying that tar-split no longer works on go1.3 or go1.4, but rather that the headache of `go vet` having a version dependent ability to install it, makes it a headache in travis. Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-06-26 14:56:04 -04:00
Vincent Batts	54e3a92a60	Merge branch 'master' of github.com:vbatts/tar-split	2016-06-26 14:43:38 -04:00
Vincent Batts	354fd6cf34	cmd: add a `disasm --no-stdout` flag Since sometimes you just need to > /dev/null Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-06-26 10:15:12 -04:00
Vincent Batts	226f7c7490	README: update `archive/tar` version reference Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-03-30 16:38:51 -04:00
Vincent Batts	e2a62d6b0d	README.md: fix thumbnail Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-02-29 11:40:38 -05:00
Vincent Batts	24fe0a94fe	version: bump to v0.9.13 Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-02-15 09:44:28 -05:00
Vincent Batts	862ccd05bc	Merge pull request #31 from vbatts/tar-go1.6 Tar go1.6	2016-02-15 09:41:56 -05:00
Vincent Batts	c32966b9e8	archive/tar: go1.3 and go1.4 compatibility Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-02-15 09:38:46 -05:00
Joe Tsai	10db8408f6	archive/tar: document how Reader.Read handles header-only files Commit dd5e14a7511465d20c6e95bf54c9b8f999abbbf6 ensured that no data could be read for header-only files regardless of what the Header.Size said. We should document this fact in Reader.Read. Updates #13647 Change-Id: I4df9a2892bc66b49e0279693d08454bf696cfa31 Reviewed-on: https://go-review.googlesource.com/17913 Reviewed-by: Russ Cox <rsc@golang.org>	2016-02-03 07:01:09 -05:00
Joe Tsai	962540fec3	archive/tar: spell license correctly in example Change-Id: Ice85d161f026a991953bd63ecc6ec80f8d06dfbd Reviewed-on: https://go-review.googlesource.com/17901 Run-TryBot: Joe Tsai <joetsai@digital-static.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-02-03 07:01:09 -05:00
Joe Tsai	a04b4ddba4	archive/tar: properly parse GNU base-256 encoding Motivation: * Previous implementation did not detect integer overflow when parsing a base-256 encoded field. * Previous implementation did not treat the integer as a two's complement value as specified by GNU. The relevant GNU specification says: <<< GNU format uses two's-complement base-256 notation to store values that do not fit into standard ustar range. >>> Fixes #12435 Change-Id: I4639bcffac8d12e1cb040b76bd05c9d7bc6c23a8 Reviewed-on: https://go-review.googlesource.com/17424 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-03 07:01:09 -05:00
Joe Tsai	ce5aac17f9	archive/tar: properly format GNU base-256 encoding Motivation: * Previous implementation silently failed when an integer overflow occurred. Now, we report an ErrFieldTooLong. * Previous implementation did not encode in two's complement format and was unable to encode negative numbers. The relevant GNU specification says: <<< GNU format uses two's-complement base-256 notation to store values that do not fit into standard ustar range. >>> Fixes #12436 Change-Id: I09c20602eabf8ae3a7e0db35b79440a64bfaf807 Reviewed-on: https://go-review.googlesource.com/17425 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-03 06:58:30 -05:00
Joe Tsai	be9ac88117	archive/tar: convert Reader.Next to be loop based Motivation for change: * Recursive logic is hard to follow, since it tends to apply things in reverse. On the other hand, the tar formats tend to describe meta headers as affecting the next entry. * Recursion also applies changes in the wrong order. Two test files are attached that use multiple headers. The previous Go behavior differs from what GNU and BSD tar do. Change-Id: Ic1557256fc1363c5cb26570e5d0b9f65a9e57341 Reviewed-on: https://go-review.googlesource.com/14624 Run-TryBot: Joe Tsai <joetsai@digital-static.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-02-03 06:58:30 -05:00
Joe Tsai	64935a5f0f	archive/tar: move parse/format methods to standalone receiver Motivations for this change: * It allows these functions to be used outside of Reader/Writer. * It allows these functions to be more easily unit tested. Change-Id: Iebe2b70bdb8744371c9ffa87c24316cbbf025b59 Reviewed-on: https://go-review.googlesource.com/15113 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Joe Tsai <joetsai@digital-static.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-02-02 14:32:27 -05:00
Joe Tsai	b598ba3ee7	archive/tar: fix issues with readGNUSparseMap1x0 Motivations: * Use of strconv.ParseInt does not properly treat integers as 64bit, preventing this function from working properly on 32bit machines. * Use of io.ReadFull does not properly detect truncated streams when the file suddenly ends on a block boundary. * The function blindly trusts user input for numEntries and allocates memory accordingly. * The function does not validate that numEntries is not negative, allowing a malicious sparse file to cause a panic during make. In general, this function was overly complicated for what it was accomplishing and it was hard to reason that it was free from bounds errors. Instead, it has been rewritten and relies on bytes.Buffer.ReadString to do the main work. So long as invariants about the number of '\n' in the buffer are maintained, it is much easier to see why this approach is correct. Change-Id: Ibb12c4126c26e0ea460ea063cd17af68e3cf609e Reviewed-on: https://go-review.googlesource.com/15174 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 14:17:35 -05:00
Joe Tsai	7500c932c7	archive/tar: properly handle header-only "files" in Reader Certain special type-flags, specifically 1, 2, 3, 4, 5, 6, do not have a data section. Thus, regardless of what the size field says, we should not attempt to read any data for these special types. The relevant PAX and USTAR specification says: <<< If the typeflag field is set to specify a file to be of type 1 (a link) or 2 (a symbolic link), the size field shall be specified as zero. If the typeflag field is set to specify a file of type 5 (directory), the size field shall be interpreted as described under the definition of that record type. No data logical records are stored for types 1, 2, or 5. If the typeflag field is set to 3 (character special file), 4 (block special file), or 6 (FIFO), the meaning of the size field is unspecified by this volume of POSIX.1-2008, and no data logical records shall be stored on the medium. Additionally, for type 6, the size field shall be ignored when reading. If the typeflag field is set to any other value, the number of logical records written following the header shall be (size+511)/512, ignoring any fraction in the result of the division. >>> Contrary to the specification, we do not assert that the size field is zero for type 1 and 2 since we liberally accept non-conforming formats. Change-Id: I666b601597cb9d7a50caa081813d90ca9cfc52ed Reviewed-on: https://go-review.googlesource.com/16614 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 14:10:38 -05:00
Matt Layher	2424f4e367	archive/tar: make output deterministic Replaces PID in PaxHeaders with 0. Sorts PAX header keys before writing them to the archive. Fixes #12358 Change-Id: If239f89c85f1c9d9895a253fb06a47ad44960124 Reviewed-on: https://go-review.googlesource.com/13975 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Joe Tsai <joetsai@digital-static.net>	2016-02-02 14:10:11 -05:00
Joe Tsai	bffda594f7	archive/tar: detect truncated files Motivation: * Reader.skipUnread never reports io.ErrUnexpectedEOF. This is strange given that io.ErrUnexpectedEOF is given through Reader.Read if the user manually reads the file. * Reader.skipUnread fails to detect truncated files since io.Seeker is lazy about reporting errors. Thus, the behavior of Reader differs whether the input io.Reader also satisfies io.Seeker or not. To solve this, we seek to one before the end of the data section and always rely on at least one call to io.CopyN. If the tr.r satisfies io.Seeker, this is guarunteed to never read more than blockSize. Fixes #12557 Change-Id: I0ddddfc6bed0d74465cb7e7a02b26f1de7a7a279 Reviewed-on: https://go-review.googlesource.com/15175 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 14:09:30 -05:00
Joe Tsai	cf83c95de8	archive/tar: fix numeric overflow issues in readGNUSparseMap0x1 Motivation: * The logic to verify the numEntries can overflow and incorrectly pass, allowing a malicious file to allocate arbitrary memory. * The use of strconv.ParseInt does not set the integer precision to 64bit, causing this code to work incorrectly on 32bit machines. Change-Id: I1b1571a750a84f2dde97cc329ed04fe2342aaa60 Reviewed-on: https://go-review.googlesource.com/15173 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 14:09:04 -05:00
Joe Tsai	cb423795eb	archive/tar: add missing error checks to Reader.Next A recursive call to Reader.Next did not check the error before trying to use the result, leading to a nil pointer panic. This specific CL addresses the immediate issue, which is the panic, but does not solve the root issue, which is due to an integer overflow in the base-256 parser. Updates #12435 Change-Id: Ia908671f0f411a409a35e24f2ebf740d46734072 Reviewed-on: https://go-review.googlesource.com/15437 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 14:08:38 -05:00
Joe Tsai	4ad443d166	archive/tar: expand abilities of TestReader Motivation: * There are an increasing number of "one-off" corrupt files added to make sure that package does not succeed or crash on them. Instead, allow for the test to specify the error that is expected to occur (if any). * Also, fold in the logic to check the MD5 checksum into this function. The following tests are being removed: * TestIncrementalRead: Done by TestReader by using io.CopyBuffer with a buffer of 8. This achieves the same behavior as this test. * TestSparseEndToEnd: Since TestReader checks the MD5 checksums if the input corpus provides them, then this is redundant. * TestSparseIncrementalRead: Redundant for the same reasons that TestIncrementalRead is now redundant * TestNegativeHdrSize: Added to TestReader corpus * TestIssue10968: Added to TestReader corpus * TestIssue11169: Added to TestReader corpus With this change, code coverage did not change: 85.3% Change-Id: I8550d48657d4dbb8f47dfc3dc280758ef73b47ec Reviewed-on: https://go-review.googlesource.com/15176 Reviewed-by: Andrew Gerrand <adg@golang.org>	2016-02-02 14:06:30 -05:00
Joe Tsai	f0fc67b3a8	archive/tar: make Reader.Read errors persistent If the stream is in an inconsistent state, it does not make sense that Reader.Read can be called and possibly succeed. Change-Id: I9d1c5a1300b2c2b45232188aa7999e350809dcf2 Reviewed-on: https://go-review.googlesource.com/15177 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2016-02-02 14:06:30 -05:00
Joe Tsai	af15385a0d	archive/tar: fix bugs with sparseFileReader The sparseFileReader is prone to two different forms of denial-of-service attacks: * A malicious tar file can cause an infinite loop * A malicious tar file can cause arbitrary panics This results because of poor error checking/handling, which this CL fixes. While we are at it, add a plethora of unit tests to test for possible malicious inputs. Change-Id: I2f9446539d189f3c1738a1608b0ad4859c1be929 Reviewed-on: https://go-review.googlesource.com/15115 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Andrew Gerrand <adg@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 14:06:30 -05:00
Joe Tsai	440ba9e519	archive/tar: remove dead code with USTAR path splitting Convert splitUSTARPath to return a bool rather than an error since the caller never ever uses the error other than to check if it is nil. Thus, we can remove errNameTooLong as well. Also, fold the checking of the length <= fileNameSize and whether the string is ASCII into the split function itself. Lastly, remove logic to set the MAGIC since that's already done on L200. Thus, setting the magic is redundant. There is no overall logic change. Updates #12638 Change-Id: I26b6992578199abad723c2a2af7f4fc078af9c17 Reviewed-on: https://go-review.googlesource.com/14723 Reviewed-by: David Symonds <dsymonds@golang.org> Run-TryBot: David Symonds <dsymonds@golang.org>	2016-02-02 14:06:30 -05:00
Vincent Batts	b87f81631a	version: mark 0.9.12	2016-01-31 01:39:10 -05:00
Vincent Batts	d50e5c9283	LICENSE: update LICENSE to BSD 3-clause Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-12-03 15:45:57 -05:00
Vincent Batts	0de4e9db0c	Merge pull request #27 from vbatts/bench_asm tar/asm: basic benchmark on disasm/asm of testdata	2015-12-02 14:09:21 -06:00
Vincent Batts	1501fe6002	Merge pull request #22 from tonistiigi/stream-opt Optimize tar stream generation	2015-12-02 14:09:08 -06:00
Vincent Batts	19b7e22058	tar/asm: basic benchmark on disasm/asm of testdata ``` PASS BenchmarkAsm-4 5 238968475 ns/op 66841059 B/op 2449 allocs/op ok _/home/vbatts/src/vb/tar-split/tar/asm 2.267s ``` Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-12-02 14:36:02 -05:00
Vincent Batts	026e78012b	Merge pull request #26 from vbatts/better_discard_in_test tar/asm: remove unneeded Tee	2015-12-02 12:00:26 -06:00
Vincent Batts	2efe34695a	tar/asm: remove unneeded Tee Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-12-02 12:56:52 -05:00
Tonis Tiigi	23b6435e6b	Optimize tar stream generation - New writeTo method allows to avoid creating extra pipe. - Copy with a pooled buffer instead of allocating new buffer for each file. - Avoid extra object allocations inside the loop. Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>	2015-12-01 14:08:53 -08:00
Vincent Batts	93666d5824	Merge pull request #25 from vbatts/bench tar/storage: adding Getter Putter benchmark	2015-12-01 14:37:10 -06:00
Vincent Batts	11281e8c09	tar/storage: adding Getter Putter benchmark Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-12-01 15:31:48 -05:00
Vincent Batts	fc1e47e71d	Merge pull request #24 from vbatts/drop_go1.2 travis: drop go1.2	2015-12-01 14:31:13 -06:00
Vincent Batts	d80c6b3bb1	travis: drop go1.2 seems overly reasonable to support go1.3 and greater. :-) Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-12-01 15:26:30 -05:00
Tonis Tiigi	8b20f9161d	Optimize JSON decoding This allows to avoid extra allocations on `ReadBytes` and decoding buffers. Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>	2015-11-30 09:52:44 -08:00
Vincent Batts	bece0c7009	demo: docker layer checksums Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-10-16 17:05:18 -04:00
Vincent Batts	7ea74e1c31	demo: basic command Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2015-10-16 16:41:09 -04:00