I intend not to make changes to this `archive/tar` unless they come from
upstream or are directly related to its usage by this project...
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
A pointer to the pool may be useful, but I am holding off on that until
I have memory-use benchmarks that show the benefit.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
io.Copy usually allocates a 32kB buffer, and due to the large
number of files processed by tar-split, this shows up in Go profiles
as a very large alloc_space total.
It doesn't seem to actually be a measurable problem in any way,
but we can allocate the buffer only once per tar-split creation,
at no additional cost to existing allocations, so let's do so,
and remove the distraction.
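A minimal sketch of the idea, not the code from this change: allocate one
buffer up front and hand it to every copy through io.CopyBuffer. The
`copier` type and the `copyString` helper are hypothetical names used only
for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// copier is a hypothetical helper that owns a single reusable copy buffer.
type copier struct {
	buf []byte
}

// newCopier allocates the 32 KiB buffer once, at construction time.
func newCopier() *copier {
	return &copier{buf: make([]byte, 32*1024)}
}

// copy streams src into dst through the shared buffer instead of letting
// io.Copy allocate a fresh one on every call.
func (c *copier) copy(dst io.Writer, src io.Reader) (int64, error) {
	return io.CopyBuffer(dst, src, c.buf)
}

// copyString is a hypothetical demonstration helper. The anonymous struct
// wrappers hide WriterTo/ReaderFrom, because io.CopyBuffer skips the
// buffer entirely when either side implements one of them.
func copyString(c *copier, s string) (string, error) {
	var dst bytes.Buffer
	src := struct{ io.Reader }{strings.NewReader(s)}
	if _, err := c.copy(struct{ io.Writer }{&dst}, src); err != nil {
		return "", err
	}
	return dst.String(), nil
}

func main() {
	c := newCopier()
	for _, payload := range []string{"first file", "second file"} {
		out, err := copyString(c, payload)
		if err != nil {
			panic(err)
		}
		fmt.Printf("copied %d bytes: %q\n", len(out), out)
	}
}
```

Both copies above share the one buffer held by `c`, so the per-file
allocation disappears from the profile.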
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
`github.com/urfave/cli` behaves differently under go1.12 and go1.15
when the dependency is not present as vendored source. With the source
vendored, this now builds fine with go1.12.
There are users of tar-split as a package. The hope is that adding this
vendored source does not affect their ability to depend on tar-split
itself.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
The Go implementation of gzip is the only one known to produce compressed
layers with the expected digest hashes.
This change allows compressed tar layer files to be produced, which is
useful for exporting layers from non-Go tools.
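As a rough illustration only, and not the code added by this change, this
is how a byte stream can be gzip-compressed with Go's compress/gzip and
then digested. The helper names, the default compression level and the
stand-in payload are all assumptions.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
)

// gzipBytes compresses data with Go's gzip writer at its default level.
func gzipBytes(data []byte) ([]byte, error) {
	var out bytes.Buffer
	zw := gzip.NewWriter(&out)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// gunzipBytes reverses gzipBytes.
func gunzipBytes(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// compressedDigest is a hypothetical helper: it returns the sha256 digest
// of the gzip-compressed form of s.
func compressedDigest(s string) (string, error) {
	compressed, err := gzipBytes([]byte(s))
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", sha256.Sum256(compressed)), nil
}

// roundTrip is a hypothetical helper that compresses and decompresses s.
func roundTrip(s string) (string, error) {
	compressed, err := gzipBytes([]byte(s))
	if err != nil {
		return "", err
	}
	plain, err := gunzipBytes(compressed)
	return string(plain), err
}

func main() {
	// Stand-in payload; a real caller would stream the tar layer instead.
	d, err := compressedDigest("stand-in for tar layer bytes")
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}
```

The digest is taken over the compressed bytes, which is why the choice of
gzip implementation matters for reproducing an expected hash.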
Now that Go 1.11 is out, Go 1.9 and older versions are no longer
supported. Moreover, since the archive/tar is from go-1.11, it uses
some features from newer Go versions (strings.Builder and sync.Map)
that are not supported by anything older than Go 1.10.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a port of commits adding RawHeader() to go-1.11 archive/tar.
In addition:
* simplify the rawBytes.Write() code in readHeader()
* ignore errors from rawBytes.Write(), as (at least for go-1.11)
  it never returns an error, only panics (if the buffer grew too large)
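The reasoning in the second bullet can be illustrated with a short
sketch, assuming rawBytes is a bytes.Buffer (the text above does not say
so outright). bytes.Buffer.Write is documented to always return a nil
error and to panic with ErrTooLarge if the buffer cannot grow. The
`recordRaw` and `collect` helpers are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// recordRaw appends one raw block to the accounting buffer. The error
// from Write is deliberately dropped: for bytes.Buffer it is always nil.
func recordRaw(raw *bytes.Buffer, block []byte) {
	raw.Write(block)
}

// collect is a hypothetical helper that records several blocks and
// returns everything accumulated so far.
func collect(blocks ...string) string {
	var raw bytes.Buffer
	for _, b := range blocks {
		recordRaw(&raw, []byte(b))
	}
	return raw.String()
}

func main() {
	fmt.Printf("%q\n", collect("header", "|", "padding"))
}
```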
Also, remove the internal/testenv from tar_tar.go to enable go test.
As working symlink detection is non-trivial on Windows, just skip
the test on that platform.
In addition to `go test`, I did some minimal manual testing, and
it seems this code creates tar-data.json.gz which is identical
to the one made by the old version.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
To ensure we don't have regressions in our padding fix, add a test case
that attempts to crash the test by creating 20GB of random junk padding.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Previously, we would read the entire padding in a given archive into
memory in order to store it in the packer. This would cause memory
exhaustion if a malicious archive was crafted with very large amounts of
padding. Since a given SegmentType is reconstructed losslessly, we can
simply chunk up any padding into large segments to avoid this problem.
Use a reasonable default of 1MiB to avoid changing the tar-split.json of
existing archives that are not malformed.
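A minimal sketch of the chunking approach, not the patch itself: read the
padding through a fixed-size buffer and emit one segment per chunk, so
memory use stays bounded by the chunk size. `chunkPadding`, `chunkSizes`
and the `emit` callback are hypothetical names; only the 1MiB default
comes from the text above.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// paddingChunkSize is the 1MiB default named in the commit message.
const paddingChunkSize = 1024 * 1024

// chunkPadding reads r to EOF and calls emit once per chunk of at most
// chunkSize bytes, never holding more than one chunk buffer at a time.
func chunkPadding(r io.Reader, chunkSize int, emit func(payload []byte) error) error {
	if chunkSize <= 0 {
		return errors.New("chunk size must be positive")
	}
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			// Copy the bytes out, since buf is reused on the next iteration.
			payload := make([]byte, n)
			copy(payload, buf[:n])
			if emitErr := emit(payload); emitErr != nil {
				return emitErr
			}
		}
		// io.EOF: nothing left. io.ErrUnexpectedEOF: a short final chunk.
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// chunkSizes is a hypothetical demonstration helper: it feeds total zero
// bytes through chunkPadding and reports the size of each emitted chunk.
func chunkSizes(total, chunkSize int) ([]int, error) {
	var sizes []int
	r := strings.NewReader(strings.Repeat("\x00", total))
	err := chunkPadding(r, chunkSize, func(p []byte) error {
		sizes = append(sizes, len(p))
		return nil
	})
	return sizes, err
}

func main() {
	sizes, err := chunkSizes(2*paddingChunkSize+5, paddingChunkSize)
	if err != nil {
		panic(err)
	}
	fmt.Println(sizes) // prints [1048576 1048576 5]
}
```

Padding shorter than one chunk still yields a single segment, which is
why archives that are not malformed keep their existing tar-split.json.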
Fixes: CVE-2017-14992
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This fixes a new go-vet(1) error which has surfaced in Go HEAD.
$ go vet ./...
go build github.com/vbatts/tar-split: no non-test Go files in
/home/travis/gopath/src/github.com/vbatts/tar-split
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Since this project carries a fork of upstream 'archive/tar', this adds
a brief benchmark comparison, including the RawBytes usage.
```bash
$ go test -run="XXX" -bench=.
testing: warning: no tests to run
BenchmarkUpstreamTar-4 2000 700809 ns/op
BenchmarkOurTarNoAccounting-4 2000 692055 ns/op
BenchmarkOurTarYesAccounting-4 2000 723184 ns/op
PASS
ok vb/tar-split 4.461s
```
From this, the difference is negligible: with accounting enabled, our
fork is about 3% slower than upstream (723184 vs. 700809 ns/op).
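For orientation, a benchmark of this shape can be sketched against
upstream archive/tar alone. This is not the benchmark added here; the
fork's variants are not shown, and `buildArchive` and `readAll` are
hypothetical helpers working on a small in-memory archive.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"testing"
)

// buildArchive creates an in-memory tar archive with the given number of
// small regular files.
func buildArchive(files int) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	body := []byte("hello, tar-split\n")
	for i := 0; i < files; i++ {
		hdr := &tar.Header{
			Name: fmt.Sprintf("file-%d.txt", i),
			Mode: 0644,
			Size: int64(len(body)),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readAll walks every entry, discards its payload, and returns the
// number of entries seen.
func readAll(archive []byte) (int, error) {
	tr := tar.NewReader(bytes.NewReader(archive))
	entries := 0
	for {
		_, err := tr.Next()
		if err == io.EOF {
			return entries, nil
		}
		if err != nil {
			return entries, err
		}
		if _, err := io.Copy(io.Discard, tr); err != nil {
			return entries, err
		}
		entries++
	}
}

func main() {
	archive, err := buildArchive(100)
	if err != nil {
		panic(err)
	}
	// testing.Benchmark lets the loop run outside of `go test`.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if _, err := readAll(archive); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Println("BenchmarkUpstreamTar", res)
}
```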
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>