2016-11-04 00:02:34 +00:00
|
|
|
package content
|
|
|
|
|
|
|
|
import (
|
2016-11-16 03:46:24 +00:00
|
|
|
"log"
|
2016-11-04 00:02:34 +00:00
|
|
|
"os"
|
|
|
|
"path/filepath"
|
|
|
|
|
2016-11-16 03:46:24 +00:00
|
|
|
"github.com/nightlyone/lockfile"
|
2017-01-09 23:10:52 +00:00
|
|
|
"github.com/opencontainers/go-digest"
|
2016-11-04 00:02:34 +00:00
|
|
|
"github.com/pkg/errors"
|
|
|
|
)
|
|
|
|
|
2017-01-24 01:39:44 +00:00
|
|
|
// Writer represents a write transaction against the blob store.
|
|
|
|
type Writer struct {
|
|
|
|
cs *Store
|
2016-11-04 00:02:34 +00:00
|
|
|
fp *os.File // opened data file
|
2016-11-16 03:46:24 +00:00
|
|
|
lock lockfile.Lockfile
|
|
|
|
path string // path to writer dir
|
2017-01-24 01:39:44 +00:00
|
|
|
ref string // ref key
|
2016-11-04 00:02:34 +00:00
|
|
|
offset int64
|
|
|
|
digester digest.Digester
|
|
|
|
}
|
|
|
|
|
2017-01-24 01:39:44 +00:00
|
|
|
func (cw *Writer) Ref() string {
|
|
|
|
return cw.ref
|
|
|
|
}
|
|
|
|
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
// Size returns the current size written.
|
|
|
|
//
|
|
|
|
// Cannot be called concurrently with `Write`. If you need need concurrent
|
|
|
|
// status, query it with `Store.Stat`.
|
|
|
|
func (cw *Writer) Size() int64 {
|
|
|
|
return cw.offset
|
|
|
|
}
|
|
|
|
|
|
|
|
// Digest returns the current digest of the content, up to the current write.
|
|
|
|
//
|
|
|
|
// Cannot be called concurrently with `Write`.
|
|
|
|
func (cw *Writer) Digest() digest.Digest {
|
|
|
|
return cw.digester.Digest()
|
|
|
|
}
|
|
|
|
|
2016-11-04 00:02:34 +00:00
|
|
|
// Write p to the transaction.
|
|
|
|
//
|
|
|
|
// Note that writes are unbuffered to the backing file. When writing, it is
|
2016-12-16 01:31:19 +00:00
|
|
|
// recommended to wrap in a bufio.Writer or, preferably, use io.CopyBuffer.
|
2017-01-24 01:39:44 +00:00
|
|
|
func (cw *Writer) Write(p []byte) (n int, err error) {
|
2016-11-04 00:02:34 +00:00
|
|
|
n, err = cw.fp.Write(p)
|
|
|
|
cw.digester.Hash().Write(p[:n])
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
cw.offset += int64(len(p))
|
2016-11-04 00:02:34 +00:00
|
|
|
return n, err
|
|
|
|
}
|
|
|
|
|
2017-01-24 01:39:44 +00:00
|
|
|
func (cw *Writer) Commit(size int64, expected digest.Digest) error {
|
2016-11-04 00:02:34 +00:00
|
|
|
if err := cw.fp.Sync(); err != nil {
|
|
|
|
return errors.Wrap(err, "sync failed")
|
|
|
|
}
|
|
|
|
|
|
|
|
fi, err := cw.fp.Stat()
|
|
|
|
if err != nil {
|
|
|
|
return errors.Wrap(err, "stat on ingest file failed")
|
|
|
|
}
|
|
|
|
|
|
|
|
// change to readonly, more important for read, but provides _some_
|
|
|
|
// protection from this point on. We use the existing perms with a mask
|
|
|
|
// only allowing reads honoring the umask on creation.
|
|
|
|
//
|
|
|
|
// This removes write and exec, only allowing read per the creation umask.
|
|
|
|
if err := cw.fp.Chmod((fi.Mode() & os.ModePerm) &^ 0333); err != nil {
|
|
|
|
return errors.Wrap(err, "failed to change ingest file permissions")
|
|
|
|
}
|
|
|
|
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
if size > 0 && size != fi.Size() {
|
2016-11-04 00:02:34 +00:00
|
|
|
return errors.Errorf("failed size validation: %v != %v", fi.Size(), size)
|
|
|
|
}
|
|
|
|
|
2016-11-16 03:46:24 +00:00
|
|
|
if err := cw.fp.Close(); err != nil {
|
|
|
|
return errors.Wrap(err, "failed closing ingest")
|
|
|
|
}
|
|
|
|
|
2016-11-04 00:02:34 +00:00
|
|
|
dgst := cw.digester.Digest()
|
2016-12-02 05:37:58 +00:00
|
|
|
if expected != "" && expected != dgst {
|
2016-11-04 00:02:34 +00:00
|
|
|
return errors.Errorf("unexpected digest: %v != %v", dgst, expected)
|
|
|
|
}
|
|
|
|
|
|
|
|
var (
|
|
|
|
ingest = filepath.Join(cw.path, "data")
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
target = cw.cs.blobPath(dgst)
|
2016-11-04 00:02:34 +00:00
|
|
|
)
|
|
|
|
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
// make sure parent directories of blob exist
|
|
|
|
if err := os.MkdirAll(filepath.Dir(target), 0755); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
|
2016-11-04 00:02:34 +00:00
|
|
|
// clean up!!
|
|
|
|
defer os.RemoveAll(cw.path)
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
|
2016-11-04 00:02:34 +00:00
|
|
|
if err := os.Rename(ingest, target); err != nil {
|
|
|
|
if os.IsExist(err) {
|
|
|
|
// collision with the target file!
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
|
2016-11-16 03:46:24 +00:00
|
|
|
unlock(cw.lock)
|
|
|
|
cw.fp = nil
|
2016-11-04 00:02:34 +00:00
|
|
|
return nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// Close the writer, flushing any unwritten data and leaving the progress in
|
|
|
|
// tact.
|
|
|
|
//
|
|
|
|
// If one needs to resume the transaction, a new writer can be obtained from
|
|
|
|
// `ContentStore.Resume` using the same key. The write can then be continued
|
|
|
|
// from it was left off.
|
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-26 22:08:56 +00:00
|
|
|
//
|
|
|
|
// To abandon a transaction completely, first call close then `Store.Remove` to
|
|
|
|
// clean up the associated resources.
|
2017-01-24 01:39:44 +00:00
|
|
|
func (cw *Writer) Close() (err error) {
|
2016-11-16 03:46:24 +00:00
|
|
|
if err := unlock(cw.lock); err != nil {
|
|
|
|
log.Printf("unlock failed: %v", err)
|
|
|
|
}
|
|
|
|
|
|
|
|
if cw.fp != nil {
|
|
|
|
cw.fp.Sync()
|
|
|
|
return cw.fp.Close()
|
|
|
|
}
|
|
|
|
|
|
|
|
return nil
|
2016-11-04 00:02:34 +00:00
|
|
|
}
|