44 commits

Author SHA1 Message Date
Joseph Schorr
5f99448adc Add a chunk cleanup queue for async GC of empty chunks
Instead of having the Swift storage engine try to delete the empty chunk(s) synchronously, we simply queue them and have a worker come along after 30s to delete the empty chunks. This has a few key benefits: it is async (doesn't slow down the push code), it helps deal with Swift's eventual consistency (fewer retries necessary), and it is generic for other storage engines if/when they need this as well.
2016-11-15 15:07:41 -05:00
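The deferred cleanup described in this commit can be sketched roughly as follows. This is a minimal in-memory illustration, not the code from the commit: `DelayedCleanupQueue`, `run_cleanup_once`, and the `delete_chunk` callback are hypothetical names, and a real deployment would presumably use a persistent queue rather than a heap in process memory.

```python
import heapq
import time


class DelayedCleanupQueue:
    """Hypothetical in-memory stand-in for a chunk cleanup queue."""

    def __init__(self, delay_seconds=30, clock=time.monotonic):
        self._delay = delay_seconds
        self._clock = clock  # injectable so the delay can be simulated
        self._heap = []      # entries are (ready_at, sequence, chunk_path)
        self._seq = 0

    def put(self, chunk_path):
        # Called from the push path: record the chunk and return immediately.
        ready_at = self._clock() + self._delay
        heapq.heappush(self._heap, (ready_at, self._seq, chunk_path))
        self._seq += 1

    def pop_ready(self):
        # Called by the worker: return every chunk whose delay has elapsed.
        ready = []
        now = self._clock()
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


def run_cleanup_once(queue, delete_chunk):
    """Delete all ready chunks; re-queue any whose deletion raises IOError."""
    deleted = []
    for path in queue.pop_ready():
        try:
            delete_chunk(path)
            deleted.append(path)
        except IOError:
            queue.put(path)  # try again after another full delay
    return deleted
```

Because `put` only records the path, the push code never waits on the storage backend; the delay gives an eventually consistent store time to settle before the delete is attempted.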
Joseph Schorr
ce0e3e0e8d Add missing parameter on RADOS storage
Fixes Python error that results from the missing parameter
2016-10-31 12:48:05 -04:00
Joseph Schorr
bfe2646a50 Make sure we don't generate chunk sizes larger than 5 GB.
Amazon S3 does not allow for chunk sizes larger than 5 GB; we currently don't handle that case at all, which is why large uploads are failing. This change ensures that if a storage engine specifies a *maximum* chunk size, we write multiple chunks no larger than that size.
2016-10-25 13:57:49 -04:00
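As an illustration of the idea in this commit, the sketch below splits an upload of known length into pieces that never exceed a maximum chunk size. The function name `split_into_chunks` and its signature are assumptions made for this example; the actual change in the storage engine may be structured differently.

```python
def split_into_chunks(total_length, max_chunk_size):
    """Return (offset, length) pairs covering total_length bytes,
    with every length no larger than max_chunk_size."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")

    chunks = []
    offset = 0
    while offset < total_length:
        # The final chunk may be shorter than the maximum.
        length = min(max_chunk_size, total_length - offset)
        chunks.append((offset, length))
        offset += length
    return chunks
```

With a 5 GB limit, a 12 GB upload would be written as two 5 GB chunks followed by one 2 GB chunk.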
Joseph Schorr
38415065e6 Add S3 retry to all possible operations around the multipart upload
Fixes #1933
2016-10-04 21:54:23 +03:00
Joseph Schorr
41cffe33f0 Fix borkened storage call
2016-09-02 14:29:53 -04:00
Joseph Schorr
233f55829e Add multipart upload retry to chunk uploads as well
2016-09-02 12:00:18 -04:00
Joseph Schorr
0bc90ea45b Add retry attempts for internal error on multipart upload
Fixes #1740
2016-08-18 12:04:36 -04:00
Joseph Schorr
14b93f72ff Make S3 access key and secret key optional, enabling IAM.
If not specified, then boto will fall back to reading the credentials from IAM if on an EC2 machine. This should be safe as the validator will still ensure the credentials work if not specified.

Fixes #1707
2016-08-11 17:17:36 -04:00
Joseph Schorr
cbf7c2bf44 Add better logging to blob uploads
Fixes #1635
2016-07-20 17:53:43 -04:00
Joseph Schorr
713ba3abaf Further updates to the Prometheus client code
2016-07-01 14:16:51 -04:00
Jake Moshenko
668a8edc50 Refactor prometheus integration
Move prometheus to SaaS and make it a plugin
Move static callers to use metrics_queue plugin
Change local-docker to support different quay clone dirnames
Change prom_aggregator to use logrus
2016-07-01 14:16:50 -04:00
Matt Jibson
3d9acf2fff Use prometheus as a metric backend
This entails writing a metric aggregation program since each worker has its
own memory, and thus own metrics because of python gunicorn. The python
client is a simple wrapper that makes web requests to it.
2016-07-01 14:16:50 -04:00
Joseph Schorr
eab6af2b87 Add mocked unit tests for cloud storage engine
2016-03-23 12:13:54 -04:00
Joseph Schorr
b440564df1 Fix client side chunk paths
Fixes #1306
2016-03-22 17:34:50 -04:00
Jimmy Zelinskie
2b07b6d8a9 allow HEAD on ACI images
Fixes #911.
2016-02-12 16:28:44 -05:00
Jake Moshenko
909e7d45b7 Add a test for swift path computation
2016-01-15 15:35:04 -05:00
Jake Moshenko
0b1951a4a4 Remove list directory from storage driver
2016-01-15 15:35:04 -05:00
Silas Sewell
2dcc1f13a6 Handle IOErrors in v2 uploads
2015-12-14 11:58:24 -05:00
Silas Sewell
76fd744453 Log stream_write_to_fp ioerrors
2015-12-07 16:26:48 -05:00
Joseph Schorr
ee0eb80c8f Fix blob content types
Fixes #990
2015-12-04 16:13:58 -05:00
Joseph Schorr
f38e7f5b25 Make it explicit that the hostname is a hostname, and not a URL
2015-12-04 15:40:33 -05:00
Matt Jibson
26f1d77a69 Merge pull request #889 from mjibson/s3-sigv4-host
Allow setting of boto's S3 host for SIGv4
2015-11-18 17:36:15 -05:00
Matt Jibson
b3c2388618 Allow setting of boto's S3 host for SIGv4
The problem only happens when a user has configured the new AWS Frankfurt
region for their S3 backend. It is the only region to require the new
v4 signature. All other regions support both v2 and v4. I'm not sure
which version is used by default on US Standard.

We could attempt to figure out where the bucket is hosted based on its
DNS resolution and auto-populate the host field that way. But I think
the amount of effort to have that work correctly outweighs its benefit
for such a simple solution.

fixes #863
fixes #764
2015-11-18 17:19:33 -05:00
Jimmy Zelinskie
9ddad4a1a9 client-side join chunks for GCS
Boto does not implement GCS's custom multipart API and so we're left to
join them client-side until it does.
2015-10-02 14:57:39 -04:00
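A client-side join of this kind can be pictured as reading each chunk in order and writing it to a single destination. The sketch below is only an illustration under that assumption: `join_chunks` is a hypothetical helper operating on generic file-like objects, not the method added in this commit, and it does not talk to GCS or boto.

```python
import io


def join_chunks(chunk_streams, destination, buffer_size=64 * 1024):
    """Copy each chunk stream, in order, into destination.

    Returns the total number of bytes written.
    """
    written = 0
    for stream in chunk_streams:
        while True:
            buf = stream.read(buffer_size)
            if not buf:
                break  # this chunk is exhausted; move on to the next one
            destination.write(buf)
            written += len(buf)
    return written


if __name__ == "__main__":
    # In-memory streams stand in for chunks fetched from storage.
    out = io.BytesIO()
    join_chunks([io.BytesIO(b"layer-"), io.BytesIO(b"data")], out)
    print(out.getvalue())
```

Reading in fixed-size buffers keeps memory use bounded regardless of chunk size, at the cost of pulling every byte through the client rather than copying server-side.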
Jimmy Zelinskie
6ed5087a3c add client side chunk join method
2015-10-01 12:28:56 -04:00
Jimmy Zelinskie
abe43a0e07 override upload_chunk_complete for RadosGW
RadosGW doesn't support server-side copy of keys into multipart, so we
have to always join it on the local side.
2015-09-30 17:46:59 -04:00
Jimmy Zelinskie
c5aa3ca4f0 make registry v2 tests pass for GCS
Fixes #509.
2015-09-28 15:42:48 -04:00
Jake Moshenko
26cea9a07c Merge remote-tracking branch 'upstream/master' into python-registry-v2
2015-09-17 16:16:27 -04:00
Joseph Schorr
cccb1651f5 Fixes for direct cloud storage copying
2015-09-08 16:55:47 -04:00
Jake Moshenko
210ed7cf02 Merge remote-tracking branch 'upstream/master' into python-registry-v2
2015-09-04 16:32:01 -04:00
Jake Moshenko
8269d4ac90 Checkpoint implementing PATCH according to Docker
2015-09-03 16:26:02 -04:00
Matt Jibson
a821ad2b01 Return an error on failed S3 uploads
The previous change to this file didn't raise the error up to stream_write,
and so the complete_upload function still ran because the loop was only
broken. It errored because the data was already canceled. This is better
than what we had before, which was to silently fail but report success
(even internally to ourselves!) on bad image upload.

This means we discovered a bug where a user could have failed during image
upload, but quay would write that image to the repository, potentially
writing broken images to S3.
2015-09-01 15:53:32 -04:00
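The difference between breaking out of the upload loop and raising the error can be sketched as below. `stream_write` and `complete_upload` are named in the commit message; `FakeMultipartUpload`, `upload_part`, and `cancel_upload` are hypothetical stand-ins written for this example and do not represent boto's API or the actual Quay code.

```python
class FakeMultipartUpload:
    """Hypothetical stand-in for a multipart upload handle."""

    def __init__(self, fail_on_part=None):
        self.fail_on_part = fail_on_part
        self.parts = []
        self.completed = False
        self.cancelled = False

    def upload_part(self, number, data):
        if number == self.fail_on_part:
            raise IOError("part %d failed" % number)
        self.parts.append(data)

    def complete_upload(self):
        self.completed = True

    def cancel_upload(self):
        self.cancelled = True


def stream_write(upload, parts):
    """Upload every part, then complete; cancel and re-raise on failure."""
    for number, data in enumerate(parts, start=1):
        try:
            upload.upload_part(number, data)
        except IOError:
            upload.cancel_upload()
            # Re-raising (rather than `break`) leaves the function here,
            # so complete_upload() below is never reached after a failure.
            raise
    upload.complete_upload()
```

With a plain `break` in place of `raise`, control would fall through to `complete_upload()` on an upload that had already been cancelled, which is the situation the commit message describes.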
Joseph Schorr
724b1607d7 Add automatic storage replication
Adds a worker to automatically replicate data between storages and update the database accordingly
2015-09-01 14:53:32 -04:00
Matt Jibson
ab25542bd7 Measure multipart uploads
see #304
2015-08-31 13:48:52 -04:00
Matt Jibson
9aedfc8d2c Cancel failed multipart uploads
2015-08-31 03:07:44 -04:00
Jake Moshenko
398202e6fc Implement some new methods on the storage engines.
2015-08-27 11:29:19 -04:00
Joseph Schorr
53e5fc6265 Have the config setup tool automatically prepare the S3 or GCS storage with CORS config
2015-01-16 16:10:40 -05:00
Joseph Schorr
e8ad01cb41 Lots of small NPE and other exception fixes
2014-09-15 11:27:33 -04:00
Jake Moshenko
8b3a3178b0 Finish the build logs archiver, add handlers for cloud and local that handle gzip encoded archived content.
2014-09-11 15:33:10 -04:00
Jake Moshenko
548f855f71 Use the pure python io module to avoid some interaction between gunicorn, wsgi, and bufferedreader that prevents gunicorn from properly sending the files.
2014-09-09 22:28:25 -04:00
Jake Moshenko
c9e1648781 Small fixes to bugs in the streaming handler for use with magic and radosgw.
2014-09-09 18:30:14 -04:00
Jake Moshenko
756e8ec848 Send the content type through to the cloud engines.
2014-09-09 16:52:53 -04:00
Jake Moshenko
29d40db5ea Add a new RadosGW storage engine. Allow engines to distinguish not only between those that can support direct uploads and downloads, but those that support doing it through the browser. Rename resumeable->resumable.
2014-09-09 15:54:03 -04:00
Jake Moshenko
29f1b048a3 Add support for Google Cloud Storage.
2014-08-12 02:06:44 -04:00
Renamed from storage/s3.py