Commit graph

306 commits

Author SHA1 Message Date
Joseph Schorr
5b3212ea0e Change security notification code to use the new stream diff reporters
This ensures that even if security scanner pagination sends Old and New layer IDs on different pages, they will properly be handled across the entire notification.

Fixes https://www.pivotaltracker.com/story/show/136133657
2016-12-20 12:50:19 -05:00
Joseph Schorr
405eca074c Security scanner flow changes and auto-retry
Changes the security scanner code to raise exceptions now for non-successful operations. One of the new exceptions raised is MissingParentLayerException, which, when raised, will cause the security worker to perform a full rescan of all parent images for the current layer, before trying once more to scan the current layer. This should allow the system to be "self-healing" in the case where the security scanner engine somehow loses or corrupts a parent layer.
2016-12-16 15:38:09 -05:00
Joseph Schorr
15041ac5ed Add a fake security scanner class for easier testing
The FakeSecurityScanner mocks out all calls that Quay is expected to make to the security scanner API, and returns faked data that can be adjusted by the calling test case
2016-12-14 17:11:45 -05:00
Charlton Austin
9e25fde3a0 Fixing api usage. 2016-12-07 12:53:07 -05:00
Jimmy Zelinskie
3a7119d499 Merge pull request #2209 from coreos-inc/clair-notification-read
Clair notification read and queue fixes
2016-12-05 19:36:59 -05:00
Joseph Schorr
9f0ce7c634 Have the security worker remove failed notifications from Clair 2016-12-05 19:08:52 -05:00
Jake Moshenko
c263772703 Do not extend processing immediately after taking queue item. 2016-12-05 18:12:14 -05:00
Jake Moshenko
709edd7eb6 Reduce the update period on queue worker metrics. 2016-12-05 18:12:14 -05:00
Quentin Machu
b990a27d50 Increase limit in securitynotificationworker
With https://github.com/coreos/clair/pull/278 and https://github.com/coreos/clair/pull/279, performance of this API call has increased. It has been observed that querying 100 or 1000 layers page doesn't noticeably change the execution time. Therefore, doing significantly less calls will reduce the overall processing time for each notification.
2016-12-04 13:39:34 +01:00
Charlton Austin
7b3d8e3977 Merge pull request #2183 from charltonaustin/metrics_for_unscanned_images
Adding in some metrics around clair sec scan.
2016-12-02 11:50:29 -05:00
Charlton Austin
edd9dcd7f6 Adding in some metrics around clair sec scan. 2016-12-01 16:50:02 -05:00
Joseph Schorr
e6ee538e15 Fix full database test script to not fail randomly
- Switches database schema creation to alembic, which solves the MySQL issue (and makes sure we test migrations as well)
- Adds a few time.sleep(1) to work around MySQL's second-precision issue when adding items to queues and then immediately retrieving them
- Disables the storage proxy tests when running against non-SQLite databases, as it causes failures with the multiple process and multiple transactions
- Changes initdb to support only populating the database, as well as fixing a few small items around the test data when working with non-SQLite data
2016-11-30 18:24:08 -05:00
Joseph Schorr
e29cb34336 Fix Set calls to gauges
Fixes #2150

The proper function is `Set` (not `set`), which was causing these gauges to not report to Prometheus
2016-11-21 15:27:17 -05:00
Joseph Schorr
5f99448adc Add a chunk cleanup queue for async GC of empty chunks
Instead of having the Swift storage engine try to delete the empty chunk(s) synchronously, we simply queue them and have a worker come along after 30s to delete the empty chunks. This has a few key benefits: it is async (doesn't slow down the push code), helps deal with Swift's eventual consistency (less retries necessary) and is generic for other storage engines if/when they need this as well
2016-11-15 15:07:41 -05:00
Jimmy Zelinskie
8b9f9478a4 pylint formatting 2016-10-28 17:12:46 -04:00
Jimmy Zelinskie
a30b358709 add staggered worker startup
Fixes #787
2016-10-28 17:12:39 -04:00
Jimmy Zelinskie
2bd1e76267 workers.queuecleanup: s/week/day cleanup frequency 2016-10-20 13:47:07 -04:00
Jimmy Zelinskie
20ef43d5fb workers.queuecleanup: remove direct peewee usage 2016-10-20 13:46:00 -04:00
Joseph Schorr
30af8aef1a Add a worker for reporting global stats to Prometheus
Fixes #1789
2016-09-12 16:19:19 -04:00
josephschorr
5c64646629 Merge pull request #1778 from coreos-inc/redlock
Fix locking via RedLock
2016-08-29 16:12:01 -04:00
Joseph Schorr
aa7c87d765 Fix locking via RedLock
Fixes #1777
2016-08-29 16:06:26 -04:00
Joseph Schorr
08a3b70b56 Extend processing before processing security notifications
Makes sure queue items don't expire during processing

Fixes #1776
2016-08-29 13:08:38 -04:00
Jake Moshenko
a113f548db Accidentally forgot a line in the gc worker. 2016-08-02 10:44:53 -04:00
Jake Moshenko
05e2773fa7 Get rid of remaining slow query for garbage collection. 2016-08-01 18:22:38 -04:00
Joseph Schorr
b8d2570725 Don't raise an error on duplicate placements
This can happen if two pushes are racing on the same storage.
2016-07-19 16:44:05 -04:00
Joseph Schorr
5cd793331e Fix storage replication for CAS and add tests 2016-07-12 13:46:06 -04:00
Joseph Schorr
3b994431eb Auto expire the build status and logs in redis 2016-06-20 13:53:13 -04:00
Jake Moshenko
a1cf12e460 Add a sitemap.txt for popular public repos
and reference it from the robots.txt
2016-06-17 14:34:20 -04:00
Joseph Schorr
8887f09ba8 Use the instance service key for registry JWT signing 2016-06-07 11:58:10 -04:00
Joseph Schorr
dd0dd39bf0 Fix the queue cleanup worker to delete the items that have expired, not unexpired 2016-06-03 22:14:14 -04:00
Joseph Schorr
5746b42c69 Add a cleanup worker for the queue item table
Fixes #784
2016-06-02 15:00:44 -04:00
josephschorr
ec492bb683 Merge pull request #1323 from coreos-inc/secworkerreturn
Move security notification work into its own method to allow for retu…
2016-06-02 13:59:25 -04:00
Jake Moshenko
9221a515de Use the registry API for security scanning
when the storage engine doesn't support direct download url
2016-05-04 18:04:06 -04:00
Joseph Schorr
73fa593d02 Various small fixes in prep for QE release 2016-05-04 15:20:27 -04:00
Jimmy Zelinskie
f842545b3e rename config values to remove "Quay" (#1431) 2016-05-03 13:11:21 -04:00
Evan Cordell
489752a0b7 Only refresh current instance service key 2016-04-29 14:10:33 -04:00
Evan Cordell
a6f6a114c2 service key worker to refresh automatic keys 2016-04-29 14:10:33 -04:00
Jimmy Zelinskie
128b0cd38c logrotateworker: archive every 24 hours 2016-04-18 13:02:30 -04:00
Jimmy Zelinskie
ef65822410 logrotateworker: perf optimizations
This removes our needless transaction, only calculates the cutoff date
once, removes the logs generator, and uses a tested optimal
MIN_LOGS_PER_ROTATION.
2016-04-15 16:51:17 -04:00
Jimmy Zelinskie
3d190b786f userfiles: make handler optional 2016-04-15 13:56:07 -04:00
Jimmy Zelinskie
c7c52e6c74 logrotateworker: save to storage via userfiles 2016-04-14 13:29:29 -04:00
Joseph Schorr
d62ec22fc9 Move security notification work into its own method to allow for return values
Fixes #1302
Fixes #1304
2016-03-31 14:08:33 -04:00
Joseph Schorr
dc8f9713f8 Change logs worker to use a global lock in the inner loop and move storage out of the transaction 2016-03-24 14:09:48 -04:00
Joseph Schorr
aa5587c93c Fixes and added tests for the security notification worker
Fixes #1301

- Ensures that the worker uses pagination properly
- Ensures that the worker handles failure as expected
- Moves marking the notification as read to after the worker processes it
- Increases the number of layers requested to 100
2016-03-18 20:28:06 -04:00
Quentin Machu
5b7d6b0638 Merge pull request #1275 from Quentin-M/min_id_once
Compute min_id only once during securityworker's lifetime
2016-03-04 14:02:47 -05:00
Quentin Machu
54153c9b80 Compute min_id only once during securityworker's lifetime 2016-03-04 14:02:28 -05:00
Jimmy Zelinskie
b5d904f373 Merge pull request #1218 from jzelinskie/logrotate5ever
vastly simplify log rotation
2016-03-04 13:48:21 -05:00
Quentin Machu
888f976e8d Use a feature flag to toggle security notifications 2016-03-01 15:54:18 -05:00
Joseph Schorr
f498e92d58 Implement against new Clair paginated notification system 2016-02-25 15:58:42 -05:00
Joseph Schorr
c0374d71c9 Refactor the security worker and API calls and add a bunch of tests 2016-02-25 12:29:41 -05:00
Quentin Machu
e5da33578c Adapt security worker for Clair v1.0 (except notifications) 2016-02-19 17:44:14 -05:00
Quentin Machu
f62a05f6d7 various securityworker fixes 2016-02-09 21:25:07 -05:00
Quentin Machu
1d2b31a581 Mark layers that Clair can't extract as failed 2016-02-09 18:24:35 -05:00
Jimmy Zelinskie
ee705fe7a9 vastly simplify log rotation 2016-02-09 18:20:14 -05:00
Quentin Machu
13c10ba7b1 Double the securityworker indexing interval 2016-02-09 14:49:10 -05:00
Joseph Schorr
ab166c4448 Delete the image diff feature
Fixes #1077
2015-12-23 13:08:01 -05:00
Jimmy Zelinskie
f439ad7804 Merge pull request #618 from jzelinskie/logsworker
add a log rotation worker
2015-12-16 17:25:50 -05:00
Jimmy Zelinskie
e1f955a3f6 add a log rotation worker
Fixes #609.
2015-12-16 17:22:28 -05:00
Joseph Schorr
c888a8b3be Make GC timeout configurable 2015-12-16 15:45:02 -05:00
Jake Moshenko
2f626f2691 Unify the database connection lifecycle across all workers 2015-12-04 15:51:53 -05:00
Joseph Schorr
544fa40a5f Add a base class for a global worker that locks via Redis 2015-11-24 16:18:45 -05:00
Silas Sewell
1162814734 securityworker: mark children we can't analyze
This allows us to differentiate between images that are queued and those we
can't analyze in constant time.
2015-11-19 11:22:15 -05:00
Quentin Machu
88e85cded0 Fix security worker (again?) 2015-11-18 19:45:09 -05:00
Quentin Machu
7e9faa6c54 Add missing import 2015-11-18 17:39:27 -05:00
Quentin Machu
605ed1fc77 Refactor security worker 2015-11-18 14:38:32 -05:00
Jake Moshenko
0459c3bc54 Merge remote-tracking branch 'upstream/master' into python-registry-v2 2015-11-16 14:22:54 -05:00
Joseph Schorr
6412e145dd Fix key error 2015-11-13 13:16:33 -05:00
Jimmy Zelinskie
09ce33e0dc fix case where query broke on empty list 2015-11-13 12:35:18 -05:00
Joseph Schorr
927a0b639c Add check for empty locations list 2015-11-13 12:23:02 -05:00
Joseph Schorr
030c69d7d2 Further merge fixes 2015-11-12 22:00:28 -05:00
Joseph Schorr
7816b0c657 Merge master into vulnerability-tool 2015-11-12 21:52:47 -05:00
Joseph Schorr
25b8b7590f Fix all the things! 2015-11-12 20:55:41 -05:00
Jimmy Zelinskie
37ce84f6af tiny fixes to securityworker 2015-11-12 17:18:04 -05:00
Jimmy Zelinskie
f6a34c5d06 refactor securityworker
Fixes #772.
2015-11-12 16:03:10 -05:00
Jake Moshenko
ab340e20ea Merge remote-tracking branch 'upstream/master' into python-registry-v2 2015-11-11 16:41:40 -05:00
Joseph Schorr
ca7d736db2 Only send vulnerability events if the minimum priority is gte to that specified
Fixes #770
2015-11-10 16:05:55 -05:00
Jimmy Zelinskie
8e2868737b rename secscan_endpoint and move db close to API 2015-11-10 15:22:31 -05:00
Jimmy Zelinskie
da31714fb5 specify securityworker skip message 2015-11-10 15:22:30 -05:00
Jimmy Zelinskie
52962b3732 close db connections when calling out to clair 2015-11-10 15:22:30 -05:00
Jimmy Zelinskie
d651ea4b48 initial security notification worker 2015-11-10 15:22:30 -05:00
Quentin Machu
16c364a90c Rename secscan_endpoint where required, fix index and indentation 2015-11-09 15:18:42 -05:00
Joseph Schorr
2d2662f53f Fix deleting repos and images under MySQL
MySQL doesn't handle constraints at the end of transactions, so deleting images currently fails. This removes the constraint and just leaves parent_id as an int
2015-11-09 14:42:05 -05:00
Quentin Machu
7dbe15e339 Remove checksum from Clair's worker and adjust line length 2015-11-09 14:31:24 -05:00
Joseph Schorr
b408cfd2cc Ready for demo 2015-11-09 12:51:05 -05:00
Joseph Schorr
7fa4fe08e7 Fix worker 2015-11-09 12:50:39 -05:00
Joseph Schorr
407eaae137 WIP: Towards sec demo 2015-11-09 12:50:39 -05:00
Quentin Machu
37118423a5 Add support for Quay's vulnerability tool 2015-11-09 12:49:19 -05:00
Jake Moshenko
c2fcf8bead Merge remote-tracking branch 'upstream/phase4-11-07-2015' into python-registry-v2 2015-11-06 18:18:29 -05:00
Quentin Machu
af4511455f Remove .distinct() from these queries 2015-11-06 15:22:18 -05:00
Quentin Machu
3677947521 Add support for Quay's vulnerability tool 2015-11-06 15:22:18 -05:00
Quentin Machu
1b41200e49 Fix PostgresSQL compatibility and parent omittance securityworker 2015-11-06 15:22:18 -05:00
Quentin Machu
f59e35cc81 Add support for Quay's vulnerability tool 2015-11-06 15:22:18 -05:00
Jake Moshenko
9da64f3aba Stop writing to deprecated columns for image data. 2015-10-24 14:45:15 -04:00
Jake Moshenko
e7a6176594 Merge remote-tracking branch 'upstream/v2-phase4' into python-registry-v2 2015-10-22 16:59:28 -04:00
Jake Moshenko
ce94931540 Stop writing to deprecated columns for image data. 2015-10-22 12:14:39 -04:00
josephschorr
8e7b20a0d7 Merge pull request #675 from coreos-inc/distinctgc
Reduce GC work time and make sure to use distinct query
2015-10-21 12:01:26 -04:00
Silas Sewell
fd96f7c1e3 Merge pull request #667 from coreos-inc/error-georeplication-local-storage
workers.storagereplication: error on LocalStorage
2015-10-20 20:29:24 -04:00
Silas Sewell
03f5fe6143 workers.storagereplication: error on LocalStorage
Ensure we don't start when LocalStorage is in the config.

Fixes #502
2015-10-20 19:04:31 -04:00
Joseph Schorr
4e5c8a9281 Reduce GC work time and make sure to use distinct query 2015-10-20 18:13:29 -04:00
Joseph Schorr
5941f3937c Enable async GC for all
Fixes #569
2015-10-19 14:22:41 -04:00