pkg/tarsum: review amendments

(separate commit to preserve github conversation)

Signed-off-by: Vincent Batts <vbatts@redhat.com>
Vincent Batts 2014-11-12 09:25:46 -05:00 committed by Vincent Batts
parent 7c1b9831df
commit 0597513d59


@@ -14,8 +14,10 @@ methods, and the versioning of this calculation.
## Introduction ## Introduction
The transportation of file systems, regarding docker, is done with tar(1) The transportation of file systems, regarding docker, is done with tar(1)
archives. Types of transpiration include distribution to and from a registry archives. There are a variety of tar serialization formats [2], and a key
endpoint, saving and loading through commands or docker daemon APIs, concern here is ensuring a repeatable checksum given a set of inputs from a
generic tar archive. Types of transportation include distribution to and from a
registry endpoint, saving and loading through commands or docker daemon APIs,
transferring the build context from client to docker daemon, and committing the transferring the build context from client to docker daemon, and committing the
file system of a container to become an image. file system of a container to become an image.
@@ -40,7 +42,7 @@ versions.
## Concept ## Concept
The checksum mechanism must ensure the integrity and confidentiality of the The checksum mechanism must ensure the integrity and assurance of the
file system payload. file system payload.
@@ -62,11 +64,11 @@ A checksum mechanism must define the following operations and attributes:
The calculated sum output is a text string. The elements included in the output The calculated sum output is a text string. The elements included in the output
of the calculated sum comprise the information needed for validation of the sum of the calculated sum comprise the information needed for validation of the sum
(TarSum version and block cipher used) and the expected checksum in hexadecimal (TarSum version and hashing cipher used) and the expected checksum in hexadecimal
form. form.
There are two delimiters used: There are two delimiters used:
* '+' separates TarSum version from block cipher * '+' separates TarSum version from hashing cipher
* ':' separates calculation mechanics from expected hash * ':' separates calculation mechanics from expected hash
Example: Example:
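
As a rough illustration of the two delimiters listed in this hunk, the sketch below splits a TarSum string into its parts. The function name `parse_tarsum`, the `TarSumLabel` type, and the sample label are hypothetical and chosen only for this example; they are not taken from the pkg/tarsum source.

```python
import string
from typing import NamedTuple


class TarSumLabel(NamedTuple):
    """Hypothetical container for the three parts of a TarSum string."""
    version: str
    cipher: str
    hex_digest: str


def parse_tarsum(label: str) -> TarSumLabel:
    """Split '<version>+<cipher>:<hex digest>' on the two delimiters."""
    # ':' separates the calculation mechanics from the expected hash.
    mechanics, colon, hex_digest = label.partition(":")
    if not colon or not hex_digest:
        raise ValueError("missing ':' delimiter or empty digest")
    # '+' separates the TarSum version from the hashing cipher.
    version, plus, cipher = mechanics.partition("+")
    if not plus or not version or not cipher:
        raise ValueError("missing '+' delimiter, version, or cipher")
    # The expected checksum is given in hexadecimal form.
    if any(c not in string.hexdigits for c in hex_digest):
        raise ValueError("digest is not hexadecimal")
    return TarSumLabel(version, cipher, hex_digest)


# Illustrative label only: the version string and digest are made up.
sample = "tarsum.v1+sha256:" + "ab" * 32
parsed = parse_tarsum(sample)
print(parsed.version, parsed.cipher, len(parsed.hex_digest))
```

Keeping the version and cipher in the string itself means a validator can pick the matching calculation without any out-of-band information.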
@@ -114,11 +116,11 @@ calculation are subject to change without notice.
## Ciphers ## Ciphers
The official default and standard block cipher used in the calculation mechanic The official default and standard hashing cipher used in the calculation mechanic
is "sha256". This refers to SHA256 hash algorithm as defined in FIPS 180-4. is "sha256". This refers to SHA256 hash algorithm as defined in FIPS 180-4.
Though the algorithm itself is not exclusively bound to this single block Though the algorithm itself is not exclusively bound to this single hashing
cipher, and support for alternate block ciphers was later added [1]. Presently cipher, and support for alternate hashing ciphers was later added [1]. Presently
use of this is for isolated use-cases and future-proofing the TarSum checksum use of this is for isolated use-cases and future-proofing the TarSum checksum
format. format.
@@ -128,7 +130,7 @@ format.
As mentioned earlier, the calculation is such that it takes into consideration As mentioned earlier, the calculation is such that it takes into consideration
the life and cycle of the tar archive. In that the tar archive is not an the life and cycle of the tar archive. In that the tar archive is not an
immutable, permanent artifact. Otherwise options like relying on a known block immutable, permanent artifact. Otherwise options like relying on a known hashing
cipher checksum of the archive itself would be reliable enough. Since the tar cipher checksum of the archive itself would be reliable enough. Since the tar
archive is used as a transportation medium, and is thrown away after its archive is used as a transportation medium, and is thrown away after its
contents are extracted. Therefore, for consistent validation items such as contents are extracted. Therefore, for consistent validation items such as
@@ -200,10 +202,12 @@ body.
#### Final Checksum #### Final Checksum
Using an initialize hash of the associated hash cipher, if there is additional Begin with a fresh or initial state of the associated hash cipher. If there is
payload to include in the TarSum calculation for the archive, it is written additional payload to include in the TarSum calculation for the archive, it is
first. Then each checksum from the ordered list of files sums is written to the written first. Then each checksum from the ordered list of file sums is written
hash. The resulting digest is formatted per the Elements of TarSum checksum, to the hash.
The resulting digest is formatted per the Elements of TarSum checksum,
including the TarSum version, the associated hash cipher and the hexadecimal including the TarSum version, the associated hash cipher and the hexadecimal
encoded checksum digest. encoded checksum digest.
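
The Final Checksum steps in this hunk can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the pkg/tarsum implementation: the function name `final_tarsum` is hypothetical, each file sum is assumed to be a hex string written to the hash as ASCII bytes, and the caller is assumed to have already ordered the list of file sums.

```python
import hashlib
from typing import Iterable


def final_tarsum(version: str, cipher: str, file_sums: Iterable[str],
                 extra_payload: bytes = b"") -> str:
    """Combine ordered per-file sums into one TarSum string (sketch)."""
    # Begin with a fresh state of the associated hash.
    h = hashlib.new(cipher)
    # Any additional payload is written first.
    if extra_payload:
        h.update(extra_payload)
    # Then each checksum from the ordered list of file sums is written.
    # Assumption: sums are hex strings, fed to the hash as ASCII bytes.
    for file_sum in file_sums:
        h.update(file_sum.encode("ascii"))
    # Format as '<version>+<cipher>:<hex digest>'.
    return f"{version}+{cipher}:{h.hexdigest()}"


# Illustrative inputs only: these per-file sums are made up.
ordered_sums = ["aa" * 32, "bb" * 32]
print(final_tarsum("tarsum.v1", "sha256", ordered_sums))
```

Because the sums are written in sequence, the same set of file sums in a different order yields a different final digest, which is why the ordering of the list matters.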
@@ -213,13 +217,16 @@ encoded checksum digest.
The initial version of TarSum has undergone one update that could invalidate The initial version of TarSum has undergone one update that could invalidate
handcrafted tar archives. The tar archive format supports appending of files handcrafted tar archives. The tar archive format supports appending of files
with same names as prior files in the archive. The latter file will clobber the with same names as prior files in the archive. The latter file will clobber the
prior file of the same path. Due to this the algorithm now accounts for prior file of the same path. Due to this the algorithm now accounts for files
with matching paths, and orders the list of file sums accordingly [3].
## Footnotes ## Footnotes
* [0] Versioning https://github.com/docker/docker/commit/747f89cd327db9d50251b17797c4d825162226d0 * [0] Versioning https://github.com/docker/docker/commit/747f89cd327db9d50251b17797c4d825162226d0
* [1] Alternate ciphers https://github.com/docker/docker/commit/4e9925d780665149b8bc940d5ba242ada1973c4e * [1] Alternate ciphers https://github.com/docker/docker/commit/4e9925d780665149b8bc940d5ba242ada1973c4e
* [2] Tar http://en.wikipedia.org/wiki/Tar_%28computing%29
* [3] Name collision https://github.com/docker/docker/commit/c5e6362c53cbbc09ddbabd5a7323e04438b57d31
## Acknowledgements ## Acknowledgements