This allows reading the metadata contained in tar-split
without expensively recreating the whole tar stream
including full contents.
We have two use cases for this:
- In a situation where tar-split is distributed along with
a separate metadata stream, ensuring that the two are
exactly consistent
- Reading the tar headers allows making a ~cheap check
of consistency of on-disk layers, just checking that the
files exist in expected sizes, without reading the full
contents.
This can be implemented outside of this repo, but it's
not ideal:
- The function necessarily hard-codes some assumptions
about how tar-split determines the boundaries of
SegmentType/FileType entries (or, indeed, whether it
uses FileType entries at all). That's best maintained
directly beside the code that creates this.
- The ExpectedPadding() value is not currently exported,
so the consumer would have to heuristically guess where
the padding ends.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Fixes#65
if the read bytes is 0, then don't even create the entry for that
padding.
This sounds like the solution for the issue opened, but I haven't found
a reproducer for this issue yet. :-\
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
this function is used widely and it's JSON. And it was not written in
such a way as to have exchangable codec.. per se
So, maybe I'll just kick out the idea of using https://github.com/ugorji/go
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
I intend to not make changes to this `archive/tar` that aren't from
upstream, or are not directly related to the usage by this project...
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
the pointer to the pool may be useful, but holding on that until I get
benchmarks of memory use to show the benefit.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
io.Copy usually allocates a 32kB buffer, and due to the large
number of files processed by tar-split, this shows up in Go profiles
as a very large alloc_space total.
It doesn't seem to actually be a measurable problem in any way,
but we can allocate the buffer only once per tar-split creation,
at no additional cost to existing allocations, so let's do so,
and remove the distraction.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>