236 commits

Author SHA1 Message Date
Joseph Schorr
52fa9aad5b Fix etcd watching
Etcd can miss events on watches if they occur fast enough, so if we get an exception indicating that we've missed an index, we reset the state of our local tracking structures by re-reading the *full* list and starting a new watch at HEAD.
2015-06-25 21:22:39 -04:00
Jimmy Zelinskie
1195e3ec7c buildman: rm coroutine decorator from subscribers
Python isn't able to figure out that these are generators and properly
handle them.
2015-06-24 17:38:29 -04:00
josephschorr
2ade08468d Merge pull request #168 from coreos-inc/etcdindex
Fix ephemeral build manager to ask for watches in index order with no gaps
2015-06-23 17:12:18 -04:00
Joseph Schorr
b4c39e8ec0 Fix ephemeral build manager to ask for watches in index order with no gaps 2015-06-23 17:11:46 -04:00
Jimmy Zelinskie
18aa7b6c1e buildcomponent: use consistent trollius imports 2015-06-23 17:03:26 -04:00
Jimmy Zelinskie
197f3b9b85 buildman: fix ER failing to heartbeat 2015-06-22 18:12:20 -04:00
Jimmy Zelinskie
82287926ab Merge pull request #140 from coreos-inc/eventinfo
Add more build information to the events and have better messaging
2015-06-17 16:49:59 -04:00
Joseph Schorr
c2dc1c9b75 Handle case where etcd key is already removed on job complete 2015-06-17 15:02:58 -04:00
Jimmy Zelinskie
177b96e965 builder: add missing 'yield from' coroutine 2015-06-17 14:16:27 -04:00
Jimmy Zelinskie
59aba93514 builder: update heartbeat timestamp on log message 2015-06-17 14:16:27 -04:00
Joseph Schorr
9b974f6b80 Add more build information to the events and have better messaging
Fixes #79
2015-06-16 23:16:36 -04:00
Jake Moshenko
c435f5c127 Add a comment about why we are taking a lock when terminating a builder machine. 2015-06-10 16:19:51 -04:00
Jake Moshenko
f767fc4d03 Track whether builders ever came online in etcd. Mark builds which never successfully heartbeated as incomplete. 2015-06-10 16:19:51 -04:00
Jake Moshenko
79f1181a63 Switch build-scheduled to an official build phase. 2015-06-10 16:19:51 -04:00
Jake Moshenko
884fedd229 Improve the log messages in the buildman. 2015-06-10 16:19:51 -04:00
Jake Moshenko
d31e25d5cd Allow the individual build manager types to specify how long the queue should wait before retrying a job that fails to schedule. 2015-06-10 16:19:50 -04:00
Jimmy Zelinskie
b7303665a2 Merge pull request #111 from coreos-inc/incompletefix
Requeue build jobs after the work check timeout + some additional padding.
2015-06-09 20:44:40 -04:00
Joseph Schorr
24ce0decd9 Requeue build jobs after the work check timeout + some additional padding. This ensures that if a build somehow gets wedged, other builds can continue to be picked up. 2015-06-09 20:43:48 -04:00
Joseph Schorr
f82831bff6 Log the etcd exception so we can debug this issue 2015-06-09 20:33:55 -04:00
Jimmy Zelinskie
7f4dd7d42f triggers: backwards compatible schema for metadata 2015-06-02 16:05:17 -04:00
Jimmy Zelinskie
e01bdd4ab0 triggers: metadata.commit_sha -> metadata.commit
This resolves an issue where the custom-git trigger's public facing
schema was not the same as the internal metadata schema. Instead of
breaking users, we rework the internal metadata schema to be the same as
the custom-git JSON schema. This commit also updates everything that
used `metadata.commit_sha` including the test database.
2015-06-02 15:32:28 -04:00
Joseph Schorr
5589bfc6d5 - Have the heartbeat fail to update if the worker has timed out
- Add additional build component logging for tracking down problems in the future
2015-05-22 15:24:14 -04:00
Jimmy Zelinskie
db05db6295 cloudconfig: flatten logentries container 2015-05-20 16:34:16 -04:00
Joseph Schorr
598fc6ec46 Add the error code to the worker error logged to redis 2015-05-18 15:01:48 -04:00
Joseph Schorr
91b464d0de Switch build manager to always just WARN on boto 2015-05-18 12:34:26 -04:00
Jimmy Zelinskie
86f400fdf5 buildman: fix btrfs mounting in worker cloudconfig 2015-05-13 17:40:35 -04:00
Jimmy Zelinskie
6a5cecebc5 buildman: create and mount btrfs volume for docker
There are numerous issues with overlayfs that actually aren't present with
btrfs. Btrfs seems to have long-running issues, but our builders are
ephemeral. Example issue: https://github.com/docker/docker/issues/10180
2015-05-12 17:42:34 -04:00
Jimmy Zelinskie
9f31bdd571 buildman: add new io.quay.builder.gitfailure error 2015-05-11 15:25:22 -04:00
Jimmy Zelinskie
15fdae6688 buildman: show base error for buildpack failures
Whereas before these were reserved only for S3 errors, users need these
specifics to debug custom-git configurations.
2015-05-11 14:18:48 -04:00
Joseph Schorr
31260d50f5 Rename the new images method to a slightly better name 2015-04-24 16:37:37 -04:00
Joseph Schorr
e70343d849 Faster cache lookup by removing a join with the ImagePlacementTable, removing the extra loop to add the locations and filtering the images looked up by the base image 2015-04-24 16:22:19 -04:00
Jimmy Zelinskie
02498d72ba almost all PR discussion fixes 2015-04-21 18:04:25 -04:00
Jimmy Zelinskie
ba2cb08904 Merge branch 'master' into git 2015-04-16 17:38:35 -04:00
Jake Moshenko
b10fd4ff22 Tell the journal on the builders to listen on the proper socket. 2015-03-27 16:31:35 -04:00
Jake Moshenko
6eead7c860 Add logentries reporting to the ephemeral builders. 2015-03-27 15:28:08 -04:00
Jake Moshenko
0349f3f1a3 Handle the case where YAML config returns a list not a tuple. 2015-03-26 14:53:56 -04:00
Jimmy Zelinskie
cd1b003ca6 buildcomponent: handle builds without resource_key 2015-03-23 15:46:23 -04:00
Jimmy Zelinskie
d29c8d60c7 trigger: pass trigger into manual_start & handle_trigger_request 2015-03-23 12:14:47 -04:00
Jimmy Zelinskie
b851986cf5 add git_url to metadata, add git to buildargs 2015-03-19 18:09:27 -04:00
Jimmy Zelinskie
b35f6ed25c buildman: add git_key buildconfig parameter 2015-03-16 13:18:18 -04:00
Jimmy Zelinskie
4c8814866c buildman: add git_url to build_config 2015-03-13 14:58:05 -04:00
Jimmy Zelinskie
8589871f43 buildman: rm unused imports 2015-03-09 13:04:16 -04:00
Jake Moshenko
5c68e52fce Really really fix the exception handling. 2015-02-27 17:33:46 -05:00
Jake Moshenko
cf5bc6f0be Properly catch multiple exceptions. 2015-02-27 17:32:10 -05:00
Jake Moshenko
857c3e2959 Start catching etcd key errors as well. 2015-02-27 17:10:15 -05:00
Joseph Schorr
d973f9df45 Reenable metrics until we know they are the problem 2015-02-25 16:00:46 -05:00
Joseph Schorr
bdb84f1c20 Merge branch 'master' of github.com:coreos-inc/quay 2015-02-25 16:00:17 -05:00
Joseph Schorr
4551b3a957 Remove the boto timeout set (doesn't work anyway) and add some better logging to the scheduler 2015-02-25 16:00:14 -05:00
Jimmy Zelinskie
090a198afc temporarily comment out metrics 2015-02-25 15:29:35 -05:00
Jimmy Zelinskie
db79ad2dde unused import 2015-02-25 15:26:36 -05:00
Joseph Schorr
5dd78f76c7 Add additional logging, timeouts, and exception checks 2015-02-25 15:15:22 -05:00
Jimmy Zelinskie
328de0201f Merge branch 'master' of github.com:coreos-inc/quay 2015-02-25 13:56:05 -05:00
Jimmy Zelinskie
346d6b933a buildman: initialize queuemetrics asynchronously 2015-02-25 13:55:18 -05:00
Joseph Schorr
2eaec092f0 Handle the case where we cannot write the tags on the build nodes 2015-02-25 13:47:36 -05:00
Joseph Schorr
390f8df4ad Make sure the build manager dies on an unhandled schedule exception 2015-02-25 12:19:21 -05:00
Joseph Schorr
afe7e14254 Add better exception handling and logging to the ephemeral build manager 2015-02-25 12:09:14 -05:00
Joseph Schorr
b7901d2adb Add trigger metadata (which includes the SHA) and the built image_id to the event data 2015-02-24 15:13:51 -05:00
Jimmy Zelinskie
47f8cb77c4 Merge pull request #11 from coreos-inc/nimbus
CloudWatch for build job status
2015-02-18 17:17:28 -05:00
Jimmy Zelinskie
9ab3554226 buildreporter: does not execute in a coroutine! 2015-02-18 17:11:45 -05:00
Jimmy Zelinskie
0d38e0b00b metrics: use config['name'] to get metric conf 2015-02-18 16:05:36 -05:00
Jimmy Zelinskie
f53dea46b7 buildman: address PR #11 comments 2015-02-18 14:13:36 -05:00
Joseph Schorr
524705b88c Get dashboard working and upgrade bootstrap. Note: the bootstrap fixes will be coming in the followup CL 2015-02-17 19:15:54 -05:00
Jimmy Zelinskie
5790d7d8cc buildman: build_metrics call correct method 2015-02-17 17:03:12 -05:00
Jimmy Zelinskie
1a71925125 buildreporter: remove unused logging 2015-02-17 17:02:37 -05:00
Jimmy Zelinskie
85edb651e2 buildserver: remove pylint comments 2015-02-17 15:32:25 -05:00
Jimmy Zelinskie
d70c95e42e buildreporter: move reporting into server callback 2015-02-17 15:31:53 -05:00
Jimmy Zelinskie
25fc999d50 buildreporter: handle app=None 2015-02-17 15:30:09 -05:00
Jimmy Zelinskie
b8d9ef0fe9 buildman: remove old create_task for queue metrics 2015-02-17 14:18:32 -05:00
Jimmy Zelinskie
935db5c766 buildman: clarify queue metrics from job state metrics 2015-02-17 12:23:08 -05:00
Jimmy Zelinskie
ffb897dfe6 buildman: add job status logging to managers 2015-02-17 12:22:23 -05:00
Jimmy Zelinskie
ca0d2b1721 buildreporter: getattr method 2015-02-17 12:21:22 -05:00
Jimmy Zelinskie
0a00453024 buildreporter: rm pylint comments 2015-02-17 12:20:46 -05:00
Jimmy Zelinskie
0e7418ffce buildman: add BuildMetrics and BuildReporter 2015-02-17 10:56:09 -05:00
Joseph Schorr
fbdbc21eb1 Merge branch 'master' into quark 2015-02-13 16:24:53 -05:00
Jimmy Zelinskie
6a3d269574 buildman: update metrics task 2015-02-13 11:25:29 -05:00
Joseph Schorr
ae8bb5fc13 Add preparing build node status item and change the build status colors to be variations on a blue color 2015-02-12 16:38:43 -05:00
Joseph Schorr
f84d1bad45 Handle internal errors in a better fashion: If a build would be marked as internal error, only do so if there are retries remaining. Otherwise, we mark it as failed (since it won't be rebuilt anyway) 2015-02-12 16:19:44 -05:00
Joseph Schorr
f107b50a46 Merge branch 'master' into ackbar 2015-02-12 12:04:45 -05:00
Joseph Schorr
f796c281d5 Remove support for v0.2 2015-02-11 17:12:53 -05:00
Joseph Schorr
e1a15464a1 Fix typo, add some logging and fix command comparison 2015-02-11 16:02:36 -05:00
Joseph Schorr
893ae46dec Add an ImageTree class and change to searching *all applicable* branches when looking for the best cache tag. 2015-02-10 21:46:58 -05:00
Joseph Schorr
98b4f62ef7 Switch to using a squashed image for the build workers 2015-02-10 15:43:01 -05:00
Joseph Schorr
045614c6c8 Merge branch 'master' into ackbar 2015-02-09 17:16:42 -05:00
Joseph Schorr
6b9464c999 Add support for 0.3 (the new builder version) 2015-02-09 16:59:21 -05:00
Joseph Schorr
9f1ec9d47d Fix loading of partial caching under a tag 2015-02-09 16:29:15 -05:00
Joseph Schorr
b0e315c332 Fix issues in cache comment comparison 2015-02-09 15:48:36 -05:00
Joseph Schorr
9b0e43514b Fix typos 2015-02-09 14:53:18 -05:00
Joseph Schorr
384d0eba6f Fix cache command argument 2015-02-09 14:12:24 -05:00
Joseph Schorr
6cb1212da6 Add logging 2015-02-09 13:54:14 -05:00
Joseph Schorr
4310f47dee Some code cleanup in the cached tag determination code 2015-02-09 12:16:43 -05:00
Joseph Schorr
0065ac8503 Add back in the cache checking code and remove the old 0.1 build pack code 2015-02-09 12:13:40 -05:00
Joseph Schorr
48949627e0 Merge master in delta 2015-02-09 12:07:43 -05:00
Joseph Schorr
9dfe523615 Merge master changes 2015-02-05 13:11:16 -05:00
Jimmy Zelinskie
c7c5377285 Add my key back to the ephemeral builder machines. 2015-02-05 12:51:02 -05:00
Joseph Schorr
5fedd74399 Remove Jake's key 2015-02-04 21:31:26 -05:00
Jake Moshenko
a952d0b1ce Merge branch 'master' of github.com:coreos-inc/quay 2015-02-04 11:59:27 -05:00
Jake Moshenko
5b8d65991e Update the space on the builder nodes because it's cheap. 2015-02-04 11:58:58 -05:00
Joseph Schorr
9ffb53cd47 Add support for v2 of the build worker, which performs the Dockerfile parsing on its own. Note that this version is backwards compatible with v1-beta of the build worker, so it should be pushed first. Also note that this version is temporary until such time as we get the caching branches merged. 2015-02-03 21:05:18 -05:00
Joseph Schorr
a1938593a9 Better handling of retries on build errors 2015-02-03 16:29:47 -05:00
Joseph Schorr
3bf5e93f06 Remove log statement 2015-02-03 16:06:23 -05:00