Commit graph

51 commits

Author SHA1 Message Date
Stephen J Day
0a29b59e14 Webhook notification support in registry webapp
Endpoints are now created at application startup time, using notification
configuration. The instances are then added to a Broadcaster instance, which
becomes the main event sink for the application. At request time, an event
bridge is configured to listen to repository method calls. The actor and source
of the eventBridge are created from the request context and application,
respectively. The result is notifications are dispatched with calls to the
context's Repository instance and are queued to each endpoint via the
broadcaster.

This commit also adds the concept of a RequestID and App.InstanceID. The
request id uniquely identifies each request and the InstanceID uniquely
identifies a run of the registry. These identifiers can be used in the future
to correlate log messages with generated events to support rich debugging.

The fields of the app were slightly reorganized for clarity and a few horrid
util functions have been removed.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-02-03 13:32:37 -08:00
Stephen J Day
e5de2594ad Remove decorator package
After implementing notifications end to end, it was found that decorating
repositories was more straightforward than previously thought. It's unfortunate
to can this package, but it led to the techniques employed in
storage/notifications/listeners.go. The ultimate result turned out much better.
2015-02-03 13:30:20 -08:00
Stephen J Day
9f0c8d6616 Implement notification endpoint webhook dispatch
This changeset implements webhook notification endpoints for dispatching
registry events. Repository instances can be decorated by a listener that
converts calls into context-aware events, using a bridge. Events generated in
the bridge are written to a sink. Implementations of sink include a broadcast
and endpoint sink which can be used to configure event dispatch. Endpoints
represent a webhook notification target, with queueing and retries built in.
They can be added to a Broadcaster, which is a simple sink that writes a block
of events to several sinks, to provide a complete dispatch mechanism.

The main caveat to the current approach is that all unsent notifications are
in memory. Best effort is made to ensure that notifications are not dropped, to
the point where queues may back up on faulty endpoints. If the endpoint is
fixed, the events will be retried and all messages will go through.

Internally, this functionality is all made up of Sink objects. The queuing
functionality is implemented with an eventQueue sink and retries are
implemented with retryingSink. Replacing the in-memory queuing with something
persistent should be as simple as replacing the broadcaster with a remote queue
and setting up the sinks as local workers listening to that remote queue.
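
The sink composition described above can be sketched roughly as follows. This is
a simplified illustration, not the package's code: the Event fields, the Sink
method signature, the retry count and the flakySink test double are all
assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical stand-in for the registry notification event.
type Event struct {
	Action string
	Target string
}

// Sink accepts blocks of events. The signature is assumed for illustration.
type Sink interface {
	Write(events ...Event) error
}

// broadcaster writes each block of events to every sink it holds and
// reports the first error it encountered, if any.
type broadcaster struct {
	sinks []Sink
}

func (b *broadcaster) Write(events ...Event) error {
	var firstErr error
	for _, s := range b.sinks {
		if err := s.Write(events...); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

// retryingSink wraps another sink and retries a failed write up to
// attempts times before giving up.
type retryingSink struct {
	Sink
	attempts int
}

func (r *retryingSink) Write(events ...Event) error {
	var err error
	for i := 0; i < r.attempts; i++ {
		if err = r.Sink.Write(events...); err == nil {
			return nil
		}
	}
	return err
}

// flakySink simulates an endpoint that fails a fixed number of writes
// before accepting events.
type flakySink struct {
	failures int
	received []Event
}

func (f *flakySink) Write(events ...Event) error {
	if f.failures > 0 {
		f.failures--
		return errors.New("endpoint unavailable")
	}
	f.received = append(f.received, events...)
	return nil
}

func main() {
	healthy := &flakySink{}
	faulty := &flakySink{failures: 2}
	b := &broadcaster{sinks: []Sink{
		healthy,
		&retryingSink{Sink: faulty, attempts: 3},
	}}

	err := b.Write(Event{Action: "push", Target: "library/ubuntu"})
	fmt.Println(err, len(healthy.received), len(faulty.received))
}
```

Because every stage satisfies the same interface, a queue or a remote transport
could be slotted in without the caller changing.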

Metrics are kept for each endpoint and exported via expvar. This may not be a
permanent approach but should provide enough information for troubleshooting
notification problems.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-02-03 13:30:20 -08:00
Stephen J Day
14fb80d6c3 Add payload and signatures method to SignedManifest
To provide easier access to digestible content, the payload has been made
accessible on the signed manifest type. This hides the specifics of the
interaction with libtrust with the caveat that signatures may be parsed twice.

We'll have to have a future look at the interface for manifest as we may be
making problematic architectural decisions. We'll visit this after the initial
release.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-02-03 13:30:20 -08:00
Stephen J Day
af7eb42793 Event notification message definition
This commit defines the message format used to notify external parties of
activity within a registry instance. The event includes information about which
action was taken on which registry object, including what user created the
action and which instance generated the event.

Message instances can be sent throughout an application or transmitted
externally. An envelope format along with a custom media type is defined along
with tests to detect changes to the wire format.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-02-03 13:30:20 -08:00
Olivier Gambier
092dadde6d Merge pull request #121 from stevvooe/address-layer-upload-errors
Address server errors received during layer upload
2015-02-03 11:48:34 -08:00
Stephen Day
d91e4bc34d Merge pull request #130 from stevvooe/path-names-escaped
Prefix non-name path components
2015-02-03 11:24:45 -08:00
Stephen J Day
43b36970f5 Prefix non-name path components
To address the possibility of confusing registry name components with
repository paths, path components that abut user provided repository names are
escaped with a prefixed underscore. This works because repository name
components are not allowed to start with underscores. The requirements on
backend driver path names have been relaxed greatly to support this use case.
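
A minimal sketch of the escaping idea, assuming a hypothetical path layout: the
root directory, the component names and the helper functions below are invented
for the example and do not reflect the actual path mapper.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// validName reports whether every component of a repository name is
// non-empty and does not start with an underscore. Hypothetical check.
func validName(name string) bool {
	for _, c := range strings.Split(name, "/") {
		if c == "" || strings.HasPrefix(c, "_") {
			return false
		}
	}
	return true
}

// repoPath joins a root, a user-provided repository name and a
// registry-owned component. The component is prefixed with an underscore
// so it can never collide with a repository name component.
func repoPath(root, name, component string) (string, error) {
	if !validName(name) {
		return "", fmt.Errorf("invalid repository name %q", name)
	}
	return path.Join(root, name, "_"+component), nil
}

func main() {
	// A repository whose last name component is "layers" stays distinct
	// from the registry's own "_layers" directory.
	p, err := repoPath("/repositories", "library/layers", "layers")
	fmt.Println(p, err)

	_, err = repoPath("/repositories", "library/_layers", "layers")
	fmt.Println(err)
}
```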

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-02-02 14:47:24 -08:00
Stephen J Day
0270bec916 Handle empty blob files more appropriately
Several API tests were added to ensure correct acceptance of zero-size and
empty tar files. This led to several changes in the storage backend around the
guarantees of remote file reading, which backs the layer and layer upload type.

In support of these changes, zero-length and empty checks have been added to
the digest package. These provide a sanity check against upstream tarsum
changes. The fileReader has been modified to be more robust when reading and
seeking on zero-length or non-existent files. The file no longer needs to exist
for the reader to be created. Seeks can now move beyond the end of the file,
causing reads to issue an io.EOF. This eliminates errors during certain race
conditions for reading files which should be detected by stat calls. As a part
of this, a few error types were factored out and the read buffer size was
increased to something more reasonable.
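
The seek and read behavior described above can be illustrated with a small
in-memory reader. The sketchReader type is an assumption made for the example;
the real fileReader works against a storage driver rather than a byte slice.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// sketchReader is a hypothetical reader over a byte slice. A nil slice
// stands in for a zero-length or non-existent remote file.
type sketchReader struct {
	data   []byte
	offset int64
}

// Seek moves the read position. Positions beyond the end of the data are
// allowed; only negative positions are rejected.
func (r *sketchReader) Seek(offset int64, whence int) (int64, error) {
	var next int64
	switch whence {
	case io.SeekStart:
		next = offset
	case io.SeekCurrent:
		next = r.offset + offset
	case io.SeekEnd:
		next = int64(len(r.data)) + offset
	default:
		return 0, errors.New("invalid whence")
	}
	if next < 0 {
		return 0, errors.New("negative position")
	}
	r.offset = next
	return next, nil
}

// Read returns io.EOF whenever the position is at or past the end,
// which covers empty files and seeks beyond the end.
func (r *sketchReader) Read(p []byte) (int, error) {
	if r.offset >= int64(len(r.data)) {
		return 0, io.EOF
	}
	n := copy(p, r.data[r.offset:])
	r.offset += int64(n)
	return n, nil
}

func main() {
	r := &sketchReader{} // no backing data at all
	pos, err := r.Seek(100, io.SeekStart)
	fmt.Println(pos, err)

	n, err := r.Read(make([]byte, 8))
	fmt.Println(n, err == io.EOF)
}
```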

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-02-02 13:01:49 -08:00
Brian Bland
fb71af75c8 Updates goamz dependency from crowdmob->AdRoll
Also includes goamz PR #331 for s3 v4 auth + IAM role support
2015-02-02 11:03:20 -08:00
Stephen J Day
f926a93778 Report layer upload as unavailable when data missing
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-29 20:45:19 -08:00
Stephen J Day
3911880491 Implement registry decorator toolkit
This change provides a toolkit for intercepting registry calls, such as
`ManifestService.Get` and `LayerUpload.Finish`, with the goal of easily
supporting interesting callbacks and listeners. The package proxies
returned objects through the decorate function before creation, allowing one to
carefully choose injection points.

Use cases range from notification systems all the way to cache integration.
While such a tool isn't strictly necessary, it reduces the amount of code
required to accomplish such tasks, deferring the tricky aspects to the
decorator package.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-21 10:46:20 -08:00
Stephen J Day
ea5b999fc0 Refactor storage API to be registry oriented
In support of making the storage API ready for supporting notifications and
mirroring, we've begun the process of paring down the storage model. The
process started by creating a central Registry interface. From there, the
common name argument on the LayerService and ManifestService was factored into
a Repository interface. The rest of the changes directly follow from this.

An interface wishlist was added, suggesting a direction to take the registry
package that should support the distribution project's future goals. As these
objects move out of the storage package and we implement a Registry backed by
the http client, these design choices will start getting validation.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-16 18:33:21 -08:00
Brian Bland
bd9f3702f7 DelegateLayerHandler now uses http method in url generation
2015-01-15 18:15:26 -08:00
Stephen J Day
83d62628fc Refactor storage to use new backend layout
This change refactors the storage backend to use the new path layout. To
facilitate this, manifest storage has been separated into a revision store and
tag store, supported by a more general blob store. The blob store is a hybrid
object, effectively providing both small object access, keyed by content
address, as well as methods that can be used to manage and traverse links to
underlying blobs. This covers common operations used in the revision store and
tag store, such as linking and traversal. The blob store can also be updated to
better support layer reading but this refactoring has been left for another
day.

The revision store and tag store support the manifest store's compound view of
data. These underlying stores provide facilities for richer access models, such
as content-addressable access and a richer tagging model. The highlight of this
change is the ability to sign a manifest from different hosts and have the
registry merge and serve those signatures as part of the manifest package.

Various other items, such as the delegate layer handler, were updated to more
directly use the blob store or other mechanism to fit with the changes.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-15 10:32:18 -08:00
Stephen J Day
3277d9fc74 Redesign path layout for backend storage
Several requirements for storing registry data have been compiled and the
backend layout has been refactored to comply. Specifically, we now store most
data as blobs that are linked from repositories. All data access is traversed
through repositories. Manifest updates are no longer destructive and support
references by digest or tag. Signatures for manifests are now stored externally
to the manifest payload to allow merging of signatures posted at different
time.

The design is detailed in the documentation for pathMapper.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-14 11:34:47 -08:00
Stephen J Day
ba6b774aea Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
prevents easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.

Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.

The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. As a corollary,
the concept of LayerUploadState has been completely removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.

Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommended that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.

Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-09 14:50:39 -08:00
Stephen J Day
219bd48c24 Add path mapper definitions for upload locations
This change updates the path mapper to be able to specify upload management
locations. This includes a startedat file, which contains the RFC3339 formatted
start time of the upload and the actual data file.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-09 14:49:07 -08:00
Stephen J Day
09522d8535 Implement a remote file writer for use with StorageDriver
This changeset implements a fileWriter type that can be used to manage writes
to remote files in a StorageDriver. Basically, it manages a local seek position
for a remote path. An efficient use of this implementation will write data in
large blocks.
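
The idea of a local seek position over a remote path might look like the sketch
below. The remoteStore type is an in-memory stand-in invented for the example;
it is not the StorageDriver interface, and sketchWriter is not the actual
fileWriter.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// remoteStore is a hypothetical in-memory stand-in for a storage backend.
type remoteStore struct {
	files map[string][]byte
}

// writeAt stores p at the given offset of path, growing the file with
// zero bytes if the offset lies beyond the current end.
func (s *remoteStore) writeAt(path string, offset int64, p []byte) int {
	buf := s.files[path]
	end := offset + int64(len(p))
	if int64(len(buf)) < end {
		grown := make([]byte, end)
		copy(grown, buf)
		buf = grown
	}
	copy(buf[offset:], p)
	s.files[path] = buf
	return len(p)
}

// sketchWriter tracks a local offset for a remote path. Each Write is one
// call to the backend, which is why large blocks are the efficient case.
type sketchWriter struct {
	store  *remoteStore
	path   string
	offset int64
}

func (w *sketchWriter) Write(p []byte) (int, error) {
	n := w.store.writeAt(w.path, w.offset, p)
	w.offset += int64(n)
	return n, nil
}

func (w *sketchWriter) Seek(offset int64, whence int) (int64, error) {
	var next int64
	switch whence {
	case io.SeekStart:
		next = offset
	case io.SeekCurrent:
		next = w.offset + offset
	case io.SeekEnd:
		next = int64(len(w.store.files[w.path])) + offset
	default:
		return 0, errors.New("invalid whence")
	}
	if next < 0 {
		return 0, errors.New("negative position")
	}
	w.offset = next
	return next, nil
}

func main() {
	store := &remoteStore{files: map[string][]byte{}}
	w := &sketchWriter{store: store, path: "upload/data"}

	w.Write([]byte("hello world"))
	w.Seek(6, io.SeekStart) // only the local position changes
	w.Write([]byte("there"))

	fmt.Println(string(store.files["upload/data"]))
}
```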

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-09 14:49:06 -08:00
Brian Bland
f22ad79d36 Factors out resolveBlobPath, renames expires -> expiry
2015-01-08 17:56:45 -08:00
Brian Bland
abb901e4ab Adds options map for storagedriver URLFor() method
2015-01-08 17:10:32 -08:00
Brian Bland
17915e1b01 Adds support for content redirects for layer downloads
Includes a delegate implementation which redirects to the URL generated
by the storagedriver, and a cloudfront implementation.
Satisfies proposal #49
2015-01-08 17:01:28 -08:00
Stephen Day
fdea60af05 Merge pull request #24 from stevvooe/breakup-common
Breakup common package
2015-01-06 10:08:10 -08:00
Stephen J Day
adaa2246e7 Move testutil package to top-level
Since the common package no longer exists, the testutil package is being moved
up to the root. Ideally, we don't have large omnibus packages, like testutil,
but we can fix that in another refactoring round.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-05 16:53:13 -08:00
Stephen J Day
8be20212f1 Move tarsum utilities out of common package
In preparation for removing the common package, the tarsum utilities are being
moved to the more relevant digest package. This functionality will probably go
away in the future, but it's maintained here for the time being.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-05 16:04:30 -08:00
Brian Bland
ea6c082e85 Minor cleanup/testing for HMAC upload tokens
Changes configuration variable, lowercases private interface methods,
adds token sanity tests.
2015-01-05 14:37:56 -08:00
Brian Bland
07ba5db168 Serializes upload state to an HMAC token for subsequent requests
To support clustered registry, upload UUIDs must be recognizable by
registries that did not issue the UUID. By creating an HMAC verifiable
upload state token, registries can validate upload requests that other
instances authorized. The tokenProvider interface could also use a redis
store or other system for token handling in the future.
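
A minimal sketch of an HMAC-verifiable state token, assuming a JSON payload, a
SHA-256 MAC and URL-safe base64 encoding. The uploadState fields and the token
layout are illustrative choices, not necessarily those of the tokenProvider
implementation.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// uploadState is a hypothetical set of fields a registry might need to
// resume an upload that another instance started.
type uploadState struct {
	Name   string
	UUID   string
	Offset int64
}

// packState serializes the state and prepends an HMAC so that any
// instance sharing the secret can verify it.
func packState(secret []byte, s uploadState) (string, error) {
	payload, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	token := append(mac.Sum(nil), payload...)
	return base64.URLEncoding.EncodeToString(token), nil
}

// unpackState verifies the HMAC before trusting the payload.
func unpackState(secret []byte, token string) (uploadState, error) {
	var s uploadState
	raw, err := base64.URLEncoding.DecodeString(token)
	if err != nil {
		return s, err
	}
	if len(raw) < sha256.Size {
		return s, errors.New("token too short")
	}
	sig, payload := raw[:sha256.Size], raw[sha256.Size:]
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	if !hmac.Equal(sig, mac.Sum(nil)) {
		return s, errors.New("invalid token signature")
	}
	err = json.Unmarshal(payload, &s)
	return s, err
}

func main() {
	secret := []byte("shared-cluster-secret")
	token, err := packState(secret, uploadState{Name: "library/ubuntu", UUID: "example-uuid", Offset: 1024})
	if err != nil {
		panic(err)
	}

	// A second instance with the same secret accepts the token.
	state, err := unpackState(secret, token)
	fmt.Println(state, err)

	// An instance with a different secret rejects it.
	_, err = unpackState([]byte("other-secret"), token)
	fmt.Println(err)
}
```

Since the state travels with the request, no shared session store is needed as
long as every instance is configured with the same secret.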
2015-01-05 14:27:05 -08:00
Stephen J Day
f1f610c6cd Decouple manifest signing and verification
It was probably ill-advised to couple manifest signing and verification to
their respective types. This changeset simply changes them from methods to
functions. These might not even be in this package in the future.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-02 15:46:47 -08:00
Stephen J Day
a4024b2f90 Move manifest to discrete package
Because manifests and their signatures are a discrete component of the
registry, we are moving the definitions into a separate package. This causes us
to lose some test coverage, but we can fill this in shortly. No changes have
been made to the external interfaces, but they are likely to come.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-02 13:23:11 -08:00
Olivier Gambier
67ca9d10cf Move from docker-registry to distribution
2014-12-23 17:13:02 -08:00
Stephen J Day
a4f42b8eea Relax requirement for size argument during blob upload
During client implementation, it was found that requiring the size argument
made client implementation more complex. The original benefit of the size
argument was to provide an additional check alongside of tarsum to validate
incoming data. For the purposes of the registry, it has been determined that
tarsum should be enough to validate incoming content.

At this time, the size check is optional but we may consider removing it
completely.
2014-12-12 19:08:50 -08:00
Stephen J Day
33b2b80a8c Remove errant log message
2014-12-09 14:19:07 -08:00
Stephen J Day
49d13f9a08 Move manifest store errors to where they happen
2014-12-09 13:40:44 -08:00
Stephen J Day
c71089c653 Implement Tags method on ManifestService
2014-12-09 13:37:21 -08:00
Stephen J Day
1a75fccb43 Address PathNotFoundError in (*manifestStore).Exists
Exists was returning an error when encountering a PathNotFoundError when it
should just return false without an error.
2014-12-05 14:34:54 -08:00
Stephen J Day
70ab06b864 Update storage package to use StorageDriver.Stat
This change updates the backend storage package that consumes StorageDriver to
use the new Stat call, over CurrentSize. It also makes minor updates for using
WriteStream and ReadStream.
2014-12-04 20:55:59 -08:00
Stephen J Day
e6e0219065 Avoid manifest verification errors by using Raw
Because json.Marshal does compaction on returned results, applications must
directly use SignedManifest.Raw when the marshaled value is required.
Otherwise, the returned manifest will fail signature checks.
2014-12-01 17:10:33 -08:00
Stephen J Day
8c7bec72b1 Cleanup image verification error handling
This diff removes a few early outs that caused errors to be unreported and
catches a missed error case for signature verification from libtrust. More work
needs to be done around ensuring consistent error handling but this is enough
to make the API work correctly.
2014-12-01 16:13:01 -08:00
Stephen J Day
b73a6c1998 Use json.MarshalIndent for raw manifest json
This provides compatibility with what is in docker core, ensuring that image
manifests generated here have the same formatting. We'll need to automate this
somehow.
2014-12-01 16:11:27 -08:00
Stephen J Day
98f5f30e75 Create copy of buffer for SignedManifest.Raw
Without this copy, the buffer may be re-used in the json package, causing
missing or corrupted content for the long-lived SignedManifest object. By
creating a new buffer, owned by the SignedManifest object, the content remains
stable.
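
The aliasing problem and the fix can be shown in a few lines. The signedManifest
type and constructor below are simplified stand-ins for the example, and the
buffer reuse is simulated by overwriting the input slice.

```go
package main

import "fmt"

// signedManifest is a hypothetical long-lived object holding raw bytes.
type signedManifest struct {
	Raw []byte
}

// newSignedManifest copies buf into a slice owned by the manifest so that
// later reuse of buf cannot change the manifest's content.
func newSignedManifest(buf []byte) signedManifest {
	raw := make([]byte, len(buf))
	copy(raw, buf)
	return signedManifest{Raw: raw}
}

func main() {
	buf := []byte(`{"name":"a"}`)

	aliased := signedManifest{Raw: buf} // shares the caller's buffer
	copied := newSignedManifest(buf)    // owns its own buffer

	// Simulate the decoder reusing its buffer for the next document.
	copy(buf, `{"name":"b"}`)

	fmt.Println(string(aliased.Raw))
	fmt.Println(string(copied.Raw))
}
```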
2014-12-01 15:57:05 -08:00
Stephen J Day
6fead90736 Rich error reporting for manifest push
To provide rich error reporting during manifest pushes, the storage layer's
verifyManifest stage has been modified to provide the necessary granularity.
Along with this comes a partial shift to explicit error types, which
represents a small move in a larger refactoring of error handling. Signature
methods from libtrust have been added to the various Manifest types to clean up
the verification code.

A primitive deletion implementation for manifests has been added. It only
deletes the manifest file and doesn't attempt to add some of the richer
features requested, such as layer cleanup.
2014-11-26 12:57:14 -08:00
Stephen J Day
68944ea9cf Clean up layer storage layout
Previously, discussions were still ongoing about different storage layouts that
could support various access models. This changeset removes a layer of
indirection that was in place due to earlier designs. Effectively, this both
associates a layer with a named repository and ensures that content cannot be
accessed across repositories. It also moves to rely on tarsum as a true
content-addressable identifier, removing a layer of indirection during blob
resolution.
2014-11-25 09:57:43 -08:00
Stephen J Day
4decfaa82e Initial implementation of image manifest storage
This change implements the first pass at image manifest storage on top of the
storagedriver. While very similar to LayerService, it's much simpler because
pushing and pulling images is less complex.

Various components are still missing, such as detailed error reporting on
missing layers during verification, but the base functionality is present.
2014-11-24 13:05:27 -08:00
Stephen J Day
eaadb82e1e Move Manifest type into storage package
This changeset moves the Manifest type into the storage package to make the type
accessible to client and registry without import cycles. The structure of the
manifest was also changed to accurately reflect the stages of the signing
process. A straw man Manifest.Sign method has been added to start testing this
concept out but will probably be accompanied by the more important
SignedManifest.Verify method as the security model develops.

This is probably the start of a concerted effort to consolidate types across
the client and server portions of the code base but we may want to see how such
a handy type, like the Manifest and SignedManifest, would work in docker core.
2014-11-21 19:37:44 -08:00
Stephen J Day
4bbabc6e36 Implement path spec for manifest storage
2014-11-21 19:15:35 -08:00
Stephen J Day
3f479b62b4 Refactor layerReader into fileReader
This change separates out the remote file reader functionality from layer
representation data. More importantly, issues with seeking have been fixed and
thoroughly tested.
2014-11-21 15:24:14 -08:00
Stephen J Day
c0fe9d72d1 Various adjustments to digest package for govet/golint
2014-11-19 14:59:05 -08:00
Stephen J Day
1a508d67d9 Move storage package to use Digest type
Mostly, we've made superficial changes to the storage package to start using
the Digest type. Many of the exported interface methods have been changed to
reflect this in addition to changes in the way layer uploads will be initiated.

Further work here is necessary but will come with a separate PR.
2014-11-19 14:39:32 -08:00
Stephen J Day
2637e29e18 Initial implementation of registry LayerService
This change contains the initial implementation of the LayerService to power
layer push and pulls on the storagedriver. The interfaces presented in this
package will be used by the http application to drive most features around
efficient pulls and resumable pushes.

The file storage/layer.go defines the interface interactions. LayerService is
the root type and supports methods to access Layer and LayerUpload objects.
Pull operations are supported with LayerService.Fetch and push operations are
supported with LayerService.Upload and LayerService.Resume. Reads and writes of
layers are split between Layer and LayerUpload, respectively.

LayerService is implemented internally with the layerStore object, which takes
a storagedriver.StorageDriver and a pathMapper instance.

LayerUploadState is currently exported and will likely continue to be as the
interaction between it and layerUploadStore is better understood. Likely, the
layerUploadStore lifecycle and implementation will be deferred to the
application.

Image pushes and pulls will be implemented in a similar manner without the
discrete, persistent upload.

Much of this change is in place to get something running and working. Caveats
of this change include the following:

1. Layer upload state storage is implemented on the local filesystem, separate
   from the storage driver. This must be replaced with using the proper backend
   and other state storage. This can be removed when we implement resumable
   hashing and tarsum calculations to avoid backend roundtrips.
2. Error handling is rather bespoke at this time. The http API implementation
   should really dictate the error return structure for the future, so we
   intend to refactor this heavily to support these errors. We'd also like to
   collect production data to understand how failures happen in the system as
   a whole before moving to a particular edict around error handling.
3. The layerUploadStore, which manages layer upload storage and state is not
   currently exported. This will likely end up being split, with the file
   management portion being pointed at the storagedriver and the state storage
   elsewhere.
4. Access Control provisions are nearly completely missing from this change.
   There are details around how layerindex lookup works that are related with
   access controls. As the auth portions of the new API take shape, these
   provisions will become more clear.

Please see TODOs for details and individual recommendations.
2014-11-17 17:54:07 -08:00
Brian Bland
88795e0a14 Lots of various golint fixes
Changes some names to match go conventions
Comments all exported methods
Removes dot imports
2014-11-17 15:46:06 -08:00