Commit Graph

74 Commits

Author SHA1 Message Date
Miloslav Trmač cd197d3076 Correctly handle Read returning (0, nil)
It's not an EOF indication.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2023-07-22 02:35:45 +02:00
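
The point of the commit above follows from the `io.Reader` contract: a call may return `0, nil`, which means nothing happened, and only `io.EOF` marks the end of the stream. The sketch below is not the tar-split code; `stutterReader`, `readAll`, and `collect` are hypothetical names made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// stutterReader is a hypothetical reader for this illustration only.
// Every other call returns (0, nil), which the io.Reader contract permits.
type stutterReader struct {
	data  []byte
	stall bool
}

func (r *stutterReader) Read(p []byte) (int, error) {
	r.stall = !r.stall
	if r.stall {
		return 0, nil // no progress, but not EOF
	}
	if len(r.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, r.data)
	r.data = r.data[n:]
	return n, nil
}

// readAll drains r and treats only io.EOF as the end of the stream.
func readAll(r io.Reader) ([]byte, error) {
	var out []byte
	buf := make([]byte, 4)
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...)
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		// n == 0 with a nil error means "nothing happened": keep reading.
	}
}

// collect is a small helper that runs readAll over the stuttering reader.
func collect(s string) string {
	b, _ := readAll(&stutterReader{data: []byte(s)})
	return string(b)
}

func main() {
	fmt.Println(collect("padding bytes"))
}
```

A loop that stopped at the first `n == 0` would return an empty result here, because the very first `Read` call makes no progress.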
Vincent Batts b6372414e5
tar/asm: don't add a padding entry if it has no bytes
Fixes #65

If the number of bytes read is 0, then don't even create the entry for that
padding.
This sounds like the solution for the issue opened, but I haven't found
a reproducer for this issue yet. :-\

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2023-07-21 09:02:43 -04:00
Vincent Batts cad1f451fd
tar/asm: troubleshooting padding EOF issue
Reference #65

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2023-07-21 09:02:29 -04:00
guoguangwu 919f9abf38 chore: remove refs to deprecated io/ioutil
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-07-20 23:00:46 +08:00
Vincent Batts e4450847fb
tar/storage: remove TODOs about changing the encoding, since that ship has sailed
This function is used widely and it's JSON. And it was not written in
such a way as to have an exchangeable codec, per se.
So, maybe I'll just drop the idea of using https://github.com/ugorji/go

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2023-03-26 14:10:16 -04:00
Vincent Batts 2b88967591
*.go: `gofmt -s -w`
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2023-03-25 21:05:25 -04:00
Vincent Batts 516158dbfb
*.go: linting project specific code
The pointer to the pool may be useful, but I am holding off on that until I get
benchmarks of memory use to show the benefit.

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2023-03-25 20:45:23 -04:00
Vincent Batts 70fb294a9b
tar/asm: go vet fixes
on go1.19.7

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2023-03-25 20:38:36 -04:00
Miloslav Trmač 8d76363085 Avoid a 32 kB file allocation on every bitBucketFilePutter.Put
io.Copy usually allocates a 32kB buffer, and due to the large
number of files processed by tar-split, this shows up in Go profiles
as a very large alloc_space total.

It doesn't seem to actually be a measurable problem in any way,
but we can allocate the buffer only once per tar-split creation,
at no additional cost to existing allocations, so let's do so,
and remove the distraction.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2021-08-21 03:24:39 +02:00
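
A minimal sketch of the idea described above, assuming a putter that only needs to consume its input: allocate one buffer when the putter is created and pass it to `io.CopyBuffer` on every call. The names `discardPutter`, `nopWriter`, and `putLen` are hypothetical and are not the tar-split identifiers.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// nopWriter throws away everything written to it.
type nopWriter struct{}

func (nopWriter) Write(p []byte) (int, error) { return len(p), nil }

// discardPutter is a hypothetical stand-in for a putter that consumes payloads.
// It is not safe for concurrent use, because every Put shares one buffer.
type discardPutter struct {
	buf []byte // allocated once, reused by every Put
}

func newDiscardPutter() *discardPutter {
	return &discardPutter{buf: make([]byte, 32*1024)}
}

// Put consumes r and reports how many bytes it read.
func (p *discardPutter) Put(r io.Reader) (int64, error) {
	// io.CopyBuffer skips the buffer when the source has WriteTo or the
	// destination has ReadFrom. Wrapping r hides any WriteTo method, and
	// nopWriter has no ReadFrom, so the copy goes through p.buf.
	return io.CopyBuffer(nopWriter{}, struct{ io.Reader }{r}, p.buf)
}

// putLen is a helper for the demonstration below.
func putLen(p *discardPutter, s string) int64 {
	n, _ := p.Put(strings.NewReader(s))
	return n
}

func main() {
	p := newDiscardPutter()
	fmt.Println(putLen(p, "first file"), putLen(p, "second file"))
}
```

The trade-off is that the putter can no longer be shared between goroutines without extra synchronization.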
Aleksa Sarai 99430a8454
tar: asm: add an excess padding test case
To ensure we don't have regressions in our padding fix, add a test case
that attempts to crash the test by creating 20GB of random junk padding.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-11-08 02:35:01 +11:00
Aleksa Sarai 3d9db48dbe
tar: asm: store padding in chunks to avoid memory exhaustion
Previously, we would read the entire padding in a given archive into
memory in order to store it in the packer. This would cause memory
exhaustion if a malicious archive was crafted with very large amounts of
padding. Since a given SegmentType is reconstructed losslessly, we can
simply chunk up any padding into large segments to avoid this problem.
Use a reasonable default of 1MiB to avoid changing the tar-split.json of
existing archives that are not malformed.

Fixes: CVE-2017-14992
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-11-08 02:34:56 +11:00
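
The chunking approach from the commit above can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `chunkPadding`, `chunkSizes`, and the `emit` callback are hypothetical names, and only the 1 MiB default comes from the commit message.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// defaultChunk is the 1 MiB default named in the commit message.
const defaultChunk = 1 << 20

// chunkPadding reads r to the end and hands the data to emit in pieces of
// at most size bytes, so memory use is bounded by size, not by the input.
func chunkPadding(r io.Reader, size int, emit func([]byte) error) error {
	buf := make([]byte, size)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			// Copy the bytes out, because buf is reused on the next pass.
			if e := emit(append([]byte(nil), buf[:n]...)); e != nil {
				return e
			}
		}
		switch err {
		case nil:
			continue // a full chunk was read, there may be more
		case io.EOF, io.ErrUnexpectedEOF:
			return nil // clean end, possibly after a short final chunk
		default:
			return err
		}
	}
}

// chunkSizes reports the sizes of the segments produced for total bytes.
func chunkSizes(total, size int) []int {
	var sizes []int
	_ = chunkPadding(bytes.NewReader(make([]byte, total)), size, func(b []byte) error {
		sizes = append(sizes, len(b))
		return nil
	})
	return sizes
}

func main() {
	fmt.Println(chunkSizes(3*defaultChunk+5, defaultChunk))
}
```

Because no segment is emitted when zero bytes are read, padding that fits in one chunk still produces a single segment, and empty padding produces none.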
Vincent Batts 7410961e75 tar/asm: failing test for lack of EOF nils
Reported-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-09-26 13:39:03 -07:00
Vincent Batts 0de4e9db0c Merge pull request #27 from vbatts/bench_asm
tar/asm: basic benchmark on disasm/asm of testdata
2015-12-02 14:09:21 -06:00
Vincent Batts 1501fe6002 Merge pull request #22 from tonistiigi/stream-opt
Optimize tar stream generation
2015-12-02 14:09:08 -06:00
Vincent Batts 19b7e22058 tar/asm: basic benchmark on disasm/asm of testdata
```
PASS
BenchmarkAsm-4         5         238968475 ns/op        66841059 B/op       2449 allocs/op
ok      _/home/vbatts/src/vb/tar-split/tar/asm  2.267s
```

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-02 14:36:02 -05:00
Vincent Batts 2efe34695a tar/asm: remove unneeded Tee
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-02 12:56:52 -05:00
Tonis Tiigi 23b6435e6b Optimize tar stream generation
- A new writeTo method avoids creating an extra pipe.
- Copy with a pooled buffer instead of allocating a new buffer for each file.
- Avoid extra object allocations inside the loop.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2015-12-01 14:08:53 -08:00
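
As a rough illustration of the "pooled buffer" item above, the sketch below uses `sync.Pool` together with `io.CopyBuffer`. It assumes a 32 KiB buffer size; `bufPool`, `copyPooled`, and `copyString` are hypothetical names, and the real change may be structured differently.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// bufPool hands out reusable 32 KiB buffers. Storing a pointer to the slice
// avoids an extra allocation each time a buffer is put back.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 32*1024)
		return &b
	},
}

// copyPooled copies src to dst using a buffer borrowed from bufPool.
func copyPooled(dst io.Writer, src io.Reader) (int64, error) {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)
	// Wrapping both sides hides WriterTo/ReaderFrom, so the copy really
	// goes through the pooled buffer.
	return io.CopyBuffer(struct{ io.Writer }{dst}, struct{ io.Reader }{src}, *bp)
}

// copyString is a helper for the demonstration below.
func copyString(s string) (string, error) {
	var out bytes.Buffer
	_, err := copyPooled(&out, strings.NewReader(s))
	return out.String(), err
}

func main() {
	for _, name := range []string{"a.txt", "b.txt"} {
		got, err := copyString(name)
		fmt.Println(got, err)
	}
}
```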
Vincent Batts 11281e8c09 tar/storage: adding Getter Putter benchmark
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-12-01 15:31:48 -05:00
Tonis Tiigi 8b20f9161d Optimize JSON decoding
This avoids extra allocations on `ReadBytes` and
decoding buffers.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2015-11-30 09:52:44 -08:00
Vincent Batts 10250c25e0 tar/asm: remove useless test
The iso-8859-1 archive is already tested round trip, and this test did
not do anything really.
2015-09-25 14:35:12 -04:00
Vincent Batts 7e38cefd4b common: remove in favor of stdlib `unicode/utf8` 2015-09-25 14:33:24 -04:00
Vincent Batts 8a361ef0d8 tar/storage: Sprintf is unnecessary
Using fmt.Sprintf() rather than string() for this []byte conversion is
overkill and does not provide any further safety.

https://gist.github.com/vbatts/ab17181086aed558dd3a
2015-09-24 09:51:58 -04:00
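
A small sketch of the equivalence this commit relies on: for a `[]byte`, formatting with `%s` and a plain `string()` conversion produce the same string, and the conversion skips the formatting machinery. The function names are made up for illustration.

```go
package main

import "fmt"

// viaSprintf formats the bytes through the fmt package.
func viaSprintf(b []byte) string { return fmt.Sprintf("%s", b) }

// viaConversion converts the bytes directly.
func viaConversion(b []byte) string { return string(b) }

func main() {
	b := []byte("hello.txt")
	fmt.Println(viaSprintf(b) == viaConversion(b))
}
```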
Vincent Batts cde639172f tar/asm: work with non-utf8 entry names 2015-09-23 15:27:33 -04:00
Vincent Batts 032efafc29 tar/storage: work with raw (invalid utf8) names
When the entry name is not UTF-8, for example ISO-8859-1, then store the
raw bytes.
To accommodate this, we will have getters and setters for the entry's
name now. Since this most heavily affects the json marshalling, we'll
double check the sanity of the name before storing it in the JSONPacker.
2015-09-23 15:27:33 -04:00
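
The reason raw bytes need separate handling is that `encoding/json` replaces invalid UTF-8 in strings with U+FFFD, so a non-UTF-8 name stored as a plain string would not round-trip. The sketch below shows one way to get the behaviour described above, with getters and setters over two fields. The type `entry`, its fields, and its JSON tags are assumptions made for illustration, not the library's documented API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// entry is a hypothetical record that keeps valid UTF-8 names as text and
// anything else as raw bytes (which encoding/json stores as base64).
type entry struct {
	Name    string `json:"name,omitempty"`
	NameRaw []byte `json:"name_raw,omitempty"`
}

// setName checks the name before deciding how to store it.
func (e *entry) setName(name string) {
	if utf8.ValidString(name) {
		e.Name, e.NameRaw = name, nil
		return
	}
	e.Name, e.NameRaw = "", []byte(name)
}

// getName returns the name regardless of how it was stored.
func (e *entry) getName() string {
	if len(e.NameRaw) > 0 {
		return string(e.NameRaw)
	}
	return e.Name
}

// roundTrip marshals and unmarshals an entry and returns the decoded name.
func roundTrip(name string) (string, error) {
	var in, out entry
	in.setName(name)
	b, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	if err := json.Unmarshal(b, &out); err != nil {
		return "", err
	}
	return out.getName(), nil
}

func main() {
	got, err := roundTrip("caf\xe9") // ISO-8859-1 bytes, not valid UTF-8
	fmt.Printf("%q %v\n", got, err)
}
```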
Vincent Batts 39d06b9dc4 tar/common: get index of first invalid utf-8 char 2015-09-23 15:27:15 -04:00
Vincent Batts 2865353200 common: add a UTF-8 check helper 2015-09-23 15:27:13 -04:00
Vincent Batts c76e42010e tar/asm: additional GNU LongLink testcase
Adding a minimal test case for GNU @LongLink.
Tested that it fails on v0.9.5, but now passes on v0.9.6 and master.
2015-08-14 07:55:18 -04:00
Vincent Batts 8f81a50860 Merge pull request #10 from LK4D4/fix_pipe_close
asm: Remove unreachable code
2015-08-13 15:36:42 -04:00
Vincent Batts e72b4959f9 Merge pull request #9 from LK4D4/fix_json_tags
storage: Fix syntax of json tags
2015-08-13 15:35:20 -04:00
Alexander Morozov 45399711c2 tar/storage: Replace TeeReader with MultiWriter
It uses slightly less memory and is more understandable.
Benchmark results:

benchmark             old ns/op     new ns/op     delta
BenchmarkPutter-4     57272         52375         -8.55%

benchmark             old allocs     new allocs     delta
BenchmarkPutter-4     21             19             -9.52%

benchmark             old bytes     new bytes     delta
BenchmarkPutter-4     19416         13336         -31.31%

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-13 11:43:31 -07:00
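
For readers unfamiliar with the two shapes being compared, here is a hedged sketch of each: checksumming on the read side with `io.TeeReader`, and on the write side with `io.MultiWriter`. It assumes a CRC64 checksum is computed while the payload is copied. The function names are hypothetical, and the sketch makes no performance claim beyond what the benchmark above reports.

```go
package main

import (
	"bytes"
	"fmt"
	"hash/crc64"
	"io"
	"strings"
)

var table = crc64.MakeTable(crc64.ISO)

// sumWithTee checksums the data as it is read from src.
func sumWithTee(dst io.Writer, src io.Reader) (uint64, error) {
	h := crc64.New(table)
	_, err := io.Copy(dst, io.TeeReader(src, h))
	return h.Sum64(), err
}

// sumWithMulti checksums the data as it is written, fanning each write
// out to both dst and the hash.
func sumWithMulti(dst io.Writer, src io.Reader) (uint64, error) {
	h := crc64.New(table)
	_, err := io.Copy(io.MultiWriter(dst, h), src)
	return h.Sum64(), err
}

// sameResult reports whether both variants copy s intact and agree on the sum.
func sameResult(s string) bool {
	var a, b bytes.Buffer
	x, err1 := sumWithTee(&a, strings.NewReader(s))
	y, err2 := sumWithMulti(&b, strings.NewReader(s))
	return err1 == nil && err2 == nil && x == y && a.String() == s && b.String() == s
}

func main() {
	fmt.Println(sameResult("some file payload"))
}
```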
Alexander Morozov ea73dc6f6f tar/storage: Benchmark for bufferFileGetPutter.Put
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-13 11:42:14 -07:00
Alexander Morozov 93c0a320a8 asm: Remove unreachable code
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-12 22:45:39 -07:00
Alexander Morozov b1783bc86d storage: Fix syntax of json tags
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-12 22:41:28 -07:00
Alexander Morozov e6df23162e Remove redundant TeeReader
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-08-12 16:46:04 -07:00
Vincent Batts df8572a1eb tar/asm: check length before adding an entry 2015-08-11 15:57:20 -04:00
Vincent Batts 51b0481d4a tar/asm: adding a failing test due to GNU LongLink 2015-08-11 15:57:20 -04:00
Jonathan Boulle caf6a872c9 tar/storage: switch to map[string]struct{} for set
Using an empty struct is more idiomatic/efficient for representing a
set-like container.
2015-07-22 15:32:49 -04:00
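
A short sketch of the set idiom mentioned above. The empty struct occupies no storage, so the map only carries its keys. The names `set`, `add`, and `countUnique` are hypothetical.

```go
package main

import "fmt"

// set is a string set backed by a map with zero-size values.
type set map[string]struct{}

// add inserts k and reports whether it was newly added.
func (s set) add(k string) bool {
	if _, ok := s[k]; ok {
		return false
	}
	s[k] = struct{}{}
	return true
}

// countUnique counts the distinct paths in the input.
func countUnique(paths []string) int {
	seen := set{}
	for _, p := range paths {
		seen.add(p)
	}
	return len(seen)
}

func main() {
	fmt.Println(countUnique([]string{"etc/hosts", "etc/passwd", "etc/hosts"}))
}
```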
Jonathan Boulle 002d19f0b0 *: clean up assorted spelling/grammar issues
Various minor fixes noticed while walking through the code
2015-07-22 15:32:49 -04:00
Vincent Batts e0e9886972 tar/asm: return instead of break
5ddec2ae4a (commitcomment-12290378)

Reported-by: Tibor Vass <tibor@docker.com>
2015-07-22 11:32:18 -04:00
Vincent Batts c2c2dde4cb tar/storage: use `filepath` instead of `path` 2015-07-22 10:27:53 -04:00
Vincent Batts 6d59e7bc76 tar/asm: clean up return on errors
This closure needs to return on error so that the error message is
bubbled up to the reader.
2015-07-21 12:10:09 -04:00
Vincent Batts c74af0bae7 tar/asm: test was flipped 2015-07-20 17:26:16 -04:00
Vincent Batts 04172717de tar/asm: test for failure when mangling 2015-07-20 16:46:22 -04:00
Vincent Batts e33913bf75 tar/asm: don't defer file closing
This `for {}` can read many files. Deferring the file handle close can
cause an EMFILE (too many open files).
2015-07-15 13:43:48 -04:00
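
To illustrate the problem described above: a `defer f.Close()` inside a loop does not run until the enclosing function returns, so every file opened by the loop stays open at once. The sketch below closes each file before moving to the next. `totalSize` and `demo` are hypothetical helpers, and the temporary files exist only for the demonstration.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// totalSize reads every file and returns the combined byte count.
func totalSize(paths []string) (int64, error) {
	var total int64
	for _, p := range paths {
		f, err := os.Open(p)
		if err != nil {
			return total, err
		}
		n, copyErr := io.Copy(io.Discard, f)
		// Close now, not with defer: a deferred Close would only run when
		// totalSize returns, keeping one descriptor open per file.
		closeErr := f.Close()
		if copyErr != nil {
			return total, copyErr
		}
		if closeErr != nil {
			return total, closeErr
		}
		total += n
	}
	return total, nil
}

// demo writes three 4-byte files to a temporary directory and sizes them.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "close-in-loop")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	var paths []string
	for i := 0; i < 3; i++ {
		p := filepath.Join(dir, fmt.Sprintf("f%d", i))
		if err := os.WriteFile(p, []byte("abcd"), 0o600); err != nil {
			return 0, err
		}
		paths = append(paths, p)
	}
	return totalSize(paths)
}

func main() {
	fmt.Println(demo())
}
```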
Vincent Batts 86ada47639 tar/asm: handle nil tar Header
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2015-06-23 12:23:36 -04:00
Vincent Batts ae13eaae94 tar/asm: remove unneeded goroutine
Reported-by: Derek McGowan <derek@mcgstyle.net>
2015-06-21 14:14:37 -04:00
Vincent Batts 46840c585a *: golint and docs 2015-03-09 14:11:11 -04:00
Vincent Batts f7b9a6caee tar/asm: comments 2015-03-09 13:56:45 -04:00
Vincent Batts 4ab9185a57 tar/asm: package docs 2015-03-09 13:54:06 -04:00
Vincent Batts d8ebf3c0a7 tar: mv the Getter to tar/storage 2015-03-09 13:20:26 -04:00