216 commits

Author SHA1 Message Date
Jake Moshenko
c4b637521c Remove Matt Jibson's public key 2015-11-23 18:18:42 -05:00
Matt Jibson
2325328bbd Update mjibson ssh key 2015-11-06 15:34:52 -05:00
Jimmy Zelinskie
e973289397 Revert "Revert "Merge pull request #682 from jzelinskie/revertrevert""
This reverts commit 278bc736e3.
2015-10-23 15:26:33 -04:00
Jimmy Zelinskie
278bc736e3 Revert "Merge pull request #682 from jzelinskie/revertrevert"
This reverts commit 627ad25c9c, reversing
changes made to 31c392fecc.
2015-10-22 16:02:07 -04:00
Jimmy Zelinskie
46b2f10d7f check for VPC subnet ID before using builder VPC
This means you can use legacy networking machines by simply changing the
instance type and removing the specified 'EC2_VPC_SUBNET_ID' from the
executor config.
2015-10-22 14:50:54 -04:00
Jimmy Zelinskie
39cfe77d42 Revert "Merge pull request #557 from coreos-inc/revert-migration"
This reverts commit c4f938898a, reversing
changes made to 7ad2522dbe.
2015-10-21 15:29:57 -04:00
Joseph Schorr
0f37e66cc8 Better error handling for the build manager
Fixes #604
2015-10-13 11:40:07 -04:00
Matt Jibson
87cc3289a0 Remove transaction from metric reporting 2015-10-06 01:28:43 -04:00
Joseph Schorr
752d05dedb Add exception logging to the build manager
Fixes #547
2015-09-30 15:49:35 -04:00
Joseph Schorr
2d3092b826 Make build system resistant to Redis being broken
Fixes #549
2015-09-30 15:15:10 -04:00
Silas Sewell
9000169b53 Revert "Merge pull request #491 from jakedt/migratebackp2"
This reverts commit 7ad2522dbe, reversing
changes made to a0b191ffa1.
2015-09-28 16:09:22 -04:00
josephschorr
7ad2522dbe Merge pull request #491 from jakedt/migratebackp2
Migrate image data back phase 2
2015-09-26 15:11:46 -04:00
Matt Jibson
bba1557437 Monitor queue adds and EC2 node starts
fixes #157
see #304
2015-09-18 16:21:16 -04:00
Jake Moshenko
8baacd2741 Migrate old data to new locations, read only new. 2015-09-17 15:47:13 -04:00
Jimmy Zelinskie
cb6b6c4091 buildman: add silas keys to builders 2015-09-09 16:53:19 -04:00
Jimmy Zelinskie
0365831015 add barakmich, quentin, mjibson keys to builders
Fixes coreos-inc/quay-policies#38
2015-08-27 11:42:53 -04:00
Jimmy Zelinskie
239f76d39f Merge pull request #368 from coreos-inc/buildarchive
Allow builds to be started with an external archive URL
2015-08-17 17:09:14 -04:00
Joseph Schorr
f092c00621 Allow builds to be started with an external archive URL
Fixes #114
2015-08-17 17:01:49 -04:00
Matt Jibson
cfb6e884f2 Refactor metric collection
This change adds a generic queue onto which metrics can be pushed. A
separate module removes metrics from the queue and adds them to Cloudwatch.
Since these are now separate ideas, we can easily change the consumer from
Cloudwatch to anything else.

This change maintains near feature parity (the only change is there is now
just one queue instead of two - not a big deal).
2015-08-12 12:15:52 -04:00
Jake Moshenko
18100be481 Refactor the util directory to use subpackages. 2015-08-03 16:04:19 -04:00
Jimmy Zelinskie
7dbcbe4706 Merge pull request #234 from coreos-inc/morespace
Increase the HD size on the build nodes
2015-07-27 15:35:45 -04:00
Jake Moshenko
3efaa255e8 Accidental refactor, split out legacy.py into separate submodules and update all call sites. 2015-07-17 11:56:15 -04:00
Joseph Schorr
04cc471585 Increase the HD size on the build nodes
Fixes #228
2015-07-14 15:20:17 +03:00
Joseph Schorr
d842881608 Don't None the build_status, as it might still be used later 2015-07-14 12:49:03 +03:00
Joseph Schorr
e06435fee4 Record phase information and make better error messages on pull failure 2015-06-30 18:04:44 +03:00
Joseph Schorr
6655c7f745 Add exception handling that doesn't log the read-timeout exception
Note: This is a *hack* and needs to be replaced with proper code ASAP
2015-06-25 23:35:29 -04:00
Joseph Schorr
6e6610f31a Switch to a 30s maximum timeout 2015-06-25 23:08:49 -04:00
Joseph Schorr
bead839abd Make sure build components timeout if the initial connection fails 2015-06-25 22:13:01 -04:00
Joseph Schorr
ecebc06343 Update comment now that restarter is abstracted 2015-06-25 21:53:42 -04:00
Joseph Schorr
9f5f71398c Abstract out the concept of a restart function 2015-06-25 21:40:50 -04:00
Joseph Schorr
52fa9aad5b Fix etcd watching
Etcd can miss events on watches if they are occurring fast enough, so if we can get an exception indicating that we've missed an index, we reset the state of our local tracking structures by re-reading the *full* list and starting a new watch at HEAD
2015-06-25 21:22:39 -04:00
Jimmy Zelinskie
1195e3ec7c buildman: rm coroutine decorator from subscribers
Python isn't able to figure out that these are generators and properly
handle them.
2015-06-24 17:38:29 -04:00
josephschorr
2ade08468d Merge pull request #168 from coreos-inc/etcdindex
Fix ephemeral build manager to ask for watches in index order with no gaps
2015-06-23 17:12:18 -04:00
Joseph Schorr
b4c39e8ec0 Fix ephemeral build manager to ask for watches in index order with no gaps 2015-06-23 17:11:46 -04:00
Jimmy Zelinskie
18aa7b6c1e buildcomponent: use consistent trollius imports 2015-06-23 17:03:26 -04:00
Jimmy Zelinskie
197f3b9b85 buildman: fix ER failing to heartbeat 2015-06-22 18:12:20 -04:00
Jimmy Zelinskie
82287926ab Merge pull request #140 from coreos-inc/eventinfo
Add more build information to the events and have better messaging
2015-06-17 16:49:59 -04:00
Joseph Schorr
c2dc1c9b75 Handle case where etcd key is already removed on job complete 2015-06-17 15:02:58 -04:00
Jimmy Zelinskie
177b96e965 builder: add missing 'yield from' coroutine 2015-06-17 14:16:27 -04:00
Jimmy Zelinskie
59aba93514 builder: update heartbeat timestamp on log message 2015-06-17 14:16:27 -04:00
Joseph Schorr
9b974f6b80 Add more build information to the events and have better messaging
Fixes #79
2015-06-16 23:16:36 -04:00
Jake Moshenko
c435f5c127 Add a comment about why we are taking a lock when terminating a builder machine. 2015-06-10 16:19:51 -04:00
Jake Moshenko
f767fc4d03 Track whether builders ever came online in etcd. Mark builds which never successfully heartbeated as incomplete. 2015-06-10 16:19:51 -04:00
Jake Moshenko
79f1181a63 Switch build-scheduled to an official build phase. 2015-06-10 16:19:51 -04:00
Jake Moshenko
884fedd229 Improve the log messages in the buildman. 2015-06-10 16:19:51 -04:00
Jake Moshenko
d31e25d5cd Allow the individual build manager types to specify how long the queue should wait before retrying a job that fails to schedule. 2015-06-10 16:19:50 -04:00
Jimmy Zelinskie
b7303665a2 Merge pull request #111 from coreos-inc/incompletefix
Requeue build jobs after the work check timeout + some additional padding.
2015-06-09 20:44:40 -04:00
Joseph Schorr
24ce0decd9 Requeue build jobs after the work check timeout + some additional padding. This ensures that if a build somehow gets wedged, other builds can continue to be picked up. 2015-06-09 20:43:48 -04:00
Joseph Schorr
f82831bff6 Log the etcd exception so we can debug this issue 2015-06-09 20:33:55 -04:00
Jimmy Zelinskie
7f4dd7d42f triggers: backwards compatible schema for metadata 2015-06-02 16:05:17 -04:00