Commit graph

37 commits

Author SHA1 Message Date
Joseph Schorr
b4c39e8ec0 Fix ephemeral build manager to ask for watches in index order with no gaps 2015-06-23 17:11:46 -04:00
Joseph Schorr
c2dc1c9b75 Handle case where etcd key is already removed on job complete 2015-06-17 15:02:58 -04:00
Jake Moshenko
c435f5c127 Add a comment about why we are taking a lock when terminating a builder machine. 2015-06-10 16:19:51 -04:00
Jake Moshenko
f767fc4d03 Track whether builders ever came online in etcd. Mark builds which never successfully heartbeated as incomplete. 2015-06-10 16:19:51 -04:00
Jake Moshenko
884fedd229 Improve the log messages in the buildman. 2015-06-10 16:19:51 -04:00
Jake Moshenko
d31e25d5cd Allow the individual build manager types to specify how long the queue should wait before retring a job that fails to schedule. 2015-06-10 16:19:50 -04:00
Joseph Schorr
f82831bff6 Log the etcd exception so we can debug this issue 2015-06-09 20:33:55 -04:00
Jake Moshenko
0349f3f1a3 Handle the case where YAML config returns a list not a tuple. 2015-03-26 14:53:56 -04:00
Jimmy Zelinskie
8589871f43 buildman: rm unused imports 2015-03-09 13:04:16 -04:00
Jake Moshenko
5c68e52fce Really really fix the exception handling. 2015-02-27 17:33:46 -05:00
Jake Moshenko
cf5bc6f0be Properly catch multiple exceptions. 2015-02-27 17:32:10 -05:00
Jake Moshenko
857c3e2959 Start catching etcd key errors as well. 2015-02-27 17:10:15 -05:00
Joseph Schorr
4551b3a957 Remove the boto timeout set (doesn't work anyway) and add some better logging to the scheduler 2015-02-25 16:00:14 -05:00
Joseph Schorr
5dd78f76c7 Add additional logging, timeouts, and exception checks 2015-02-25 15:15:22 -05:00
Joseph Schorr
afe7e14254 Add better exception handling and logging to the ephemeral build manager 2015-02-25 12:09:14 -05:00
Joseph Schorr
524705b88c Get dashboard working and upgrade bootstrap. Note: the bootstrap fixes will be coming in the followup CL 2015-02-17 19:15:54 -05:00
Jake Moshenko
a4b0c8698d Allow the key prefixes in etcd to be configurable. 2015-02-02 12:00:19 -05:00
Jake Moshenko
ef0806bd9d Make the logs for the build manager more bearable. 2015-01-26 15:27:39 -05:00
Jake Moshenko
86852da4ba Catch exceptions when ELB times out a connection to etcd. 2015-01-23 11:29:38 -05:00
Jake Moshenko
265aeabf60 We need to tell the etcd client which protocol to use. 2015-01-22 16:59:04 -05:00
Jake Moshenko
f2471a86f6 Fix the python requirements. Add the ability to map in etcd client certs and ca. 2015-01-22 10:53:23 -05:00
Jake Moshenko
fc757fecad Tag the EC2 instances with the build uuid. 2015-01-05 15:35:14 -05:00
Jake Moshenko
8037962716 Change the severity of a log message which is actually expected in the happy case. 2015-01-05 14:44:54 -05:00
Jake Moshenko
f58b09a064 Remove the loop argument from the call to build_component_ready. 2015-01-05 13:08:25 -05:00
Jake Moshenko
320ae63ccd Handle the case where there are no realms registered. 2015-01-05 12:23:54 -05:00
Jake Moshenko
b33ee1a474 Register existing builders to watch their expirations. 2015-01-05 11:21:36 -05:00
Jake Moshenko
a9839021af When the etcd key tracking realms is first created the action is create, not set. 2014-12-31 11:46:02 -05:00
Jake Moshenko
cc70225043 Generalize the ephemeral build managers so that any manager may manage a builder spawned by any other manager. 2014-12-31 11:33:56 -05:00
Jake Moshenko
3ce64b4a7f We must yield from stop_builder. 2014-12-23 16:12:10 -05:00
Jake Moshenko
2ed9b3d243 Disable the etcd timeout on watch calls to prevent them from disconnecting the client. 2014-12-23 14:54:34 -05:00
Jake Moshenko
4e22e22ba1 We have to serialize our build data before sending it to etc. 2014-12-23 14:09:04 -05:00
Jake Moshenko
709e571b78 Handle read timeouts from etcd when watching a key. 2014-12-23 12:13:49 -05:00
Jake Moshenko
055a6b0c37 Add a total maximum time that a machine is allowed to stick around before we terminate it more forcefully. 2014-12-23 11:18:10 -05:00
Jake Moshenko
34bf92673b Add support for adjusting etcd ttl on job_heartbeat. Switch the heartbeat method to a coroutine. 2014-12-22 17:24:44 -05:00
Jake Moshenko
2b6c2a2a50 Improve tests for the ephemeral build manager. 2014-12-22 16:22:07 -05:00
Jake Moshenko
12ee8e0fc0 Switch a few of the buildman methods to coroutines in order to support network calls in methods. Add a test for the ephemeral build manager. 2014-12-22 12:14:16 -05:00
Jake Moshenko
2d7e844753 First implementation of ephemeral build lifecycle manager. 2014-12-16 13:41:30 -05:00