Commit graph

223 commits

Author SHA1 Message Date
Joseph Schorr
a6fa08c19c Change returns to trollius returns 2015-01-29 18:21:32 -05:00
Joseph Schorr
0e5f6dc17d Fix typo in timed out 2015-01-29 18:13:31 -05:00
Joseph Schorr
60eae43ae4 Add the date time to the log entries 2015-01-29 18:05:05 -05:00
Joseph Schorr
ce3f8b438c Fix pull credentials bug, fix job details parse bug and add some better logging 2015-01-29 18:01:42 -05:00
Joseph Schorr
7ee00b83cb Switch to using a CloseForLongOperation around the sleep 2015-01-29 14:50:07 -05:00
Joseph Schorr
cf35da30bc Make sure to not hold DB connections open in the new build manager 2015-01-29 14:40:24 -05:00
Jake Moshenko
2e86417329 Allow the buildman server to die if an uncaught exception terminates the scheduler process. 2015-01-29 10:56:57 -05:00
Jake Moshenko
c308794063 Fix the enterprise manager to use the new coroutine based interface. 2015-01-29 10:56:18 -05:00
Joseph Schorr
d359c849cd Add the build worker and job count information to the charts 2015-01-28 17:12:33 -05:00
Jake Moshenko
0ddfd07749 Use the tiny registry-build-worker image. Bind mount in the root certificates so that Quay SSL certificates can be validated. 2015-01-27 14:12:47 -05:00
Jake Moshenko
ef0806bd9d Make the logs for the build manager more bearable. 2015-01-26 15:27:39 -05:00
Joseph Schorr
be6701b310 Have the builder not start and stop, over and over, if not enabled 2015-01-26 14:13:55 -05:00
Jake Moshenko
86852da4ba Catch exceptions when ELB times out a connection to etcd. 2015-01-23 11:29:38 -05:00
Jake Moshenko
725808a4f8 Make the logs from the build manager more useful. 2015-01-23 11:29:15 -05:00
Jake Moshenko
265aeabf60 We need to tell the etcd client which protocol to use. 2015-01-22 16:59:04 -05:00
Jake Moshenko
f2471a86f6 Fix the python requirements. Add the ability to map in etcd client certs and ca. 2015-01-22 10:53:23 -05:00
Jake Moshenko
fc757fecad Tag the EC2 instances with the build uuid. 2015-01-05 15:35:14 -05:00
Jake Moshenko
dd7664328c Make the build manager ports configurable. 2015-01-05 15:09:03 -05:00
Jake Moshenko
8037962716 Change the severity of a log message which is actually expected in the happy case. 2015-01-05 14:44:54 -05:00
Jake Moshenko
f58b09a064 Remove the loop argument from the call to build_component_ready. 2015-01-05 13:08:25 -05:00
Jake Moshenko
320ae63ccd Handle the case where there are no realms registered. 2015-01-05 12:23:54 -05:00
Jake Moshenko
b33ee1a474 Register existing builders to watch their expirations. 2015-01-05 11:21:36 -05:00
Jake Moshenko
a9839021af When the etcd key tracking realms is first created the action is create, not set. 2014-12-31 11:46:02 -05:00
Jake Moshenko
cc70225043 Generalize the ephemeral build managers so that any manager may manage a builder spawned by any other manager. 2014-12-31 11:33:56 -05:00
Jake Moshenko
ccb19571d6 Try lowering the sleep on the shutdown timeout to avoid the service dispatch timeout built into systemd. 2014-12-23 17:42:47 -05:00
Jake Moshenko
ec87e37d8c EC2 terminate_instances does not take a force flag. 2014-12-23 17:17:53 -05:00
Jake Moshenko
1005c29b6b Fix the shutdown command for when the builder terminates itself. 2014-12-23 17:08:16 -05:00
Jake Moshenko
cece94e1da We want to terminate instances, not stop them. 2014-12-23 16:20:42 -05:00
Jake Moshenko
3ce64b4a7f We must yield from stop_builder. 2014-12-23 16:12:10 -05:00
Jake Moshenko
ef70432b11 We need to call build_finished async. 2014-12-23 16:04:10 -05:00
Jake Moshenko
8e16fbf59b The root device on CoreOS is /dev/xvda. 2014-12-23 15:41:58 -05:00
Jake Moshenko
2f2a88825d Try using SSD for root volumes. 2014-12-23 15:35:21 -05:00
Jake Moshenko
723fb27671 Calls to the ec2 service must be async, and responses must be wrapped as well. 2014-12-23 14:54:58 -05:00
Jake Moshenko
2ed9b3d243 Disable the etcd timeout on watch calls to prevent them from disconnecting the client. 2014-12-23 14:54:34 -05:00
Jake Moshenko
b2d7fad667 Fix a typo with the automatic node shutdown fallback in the ephemeral nodes. 2014-12-23 14:09:24 -05:00
Jake Moshenko
4e22e22ba1 We have to serialize our build data before sending it to etcd. 2014-12-23 14:09:04 -05:00
Jake Moshenko
709e571b78 Handle read timeouts from etcd when watching a key. 2014-12-23 12:13:49 -05:00
Jake Moshenko
055a6b0c37 Add a total maximum time that a machine is allowed to stick around before we terminate it more forcefully. 2014-12-23 11:18:10 -05:00
Jake Moshenko
aac7feb20b Refresh the build_job from the database before we write updates. 2014-12-23 11:17:23 -05:00
Jake Moshenko
34bf92673b Add support for adjusting etcd ttl on job_heartbeat. Switch the heartbeat method to a coroutine. 2014-12-22 17:24:44 -05:00
Jake Moshenko
2b6c2a2a50 Improve tests for the ephemeral build manager. 2014-12-22 16:22:07 -05:00
Jake Moshenko
e53b6b0e21 Merge remote-tracking branch 'origin/master' into ephemeral 2014-12-22 12:14:59 -05:00
Jake Moshenko
12ee8e0fc0 Switch a few of the buildman methods to coroutines in order to support network calls in methods. Add a test for the ephemeral build manager. 2014-12-22 12:14:16 -05:00
Jake Moshenko
a280bbcb6d Add tag metadata to the instances. 2014-12-16 15:17:39 -05:00
Jake Moshenko
1d68594dc2 Extract instance ids from the instance objects returned by boto. 2014-12-16 15:10:50 -05:00
Jake Moshenko
2d7e844753 First implementation of ephemeral build lifecycle manager. 2014-12-16 13:41:30 -05:00
Jimmy Zelinskie
33f12c58ba Add active worker count to buildmanager logs. 2014-12-16 13:37:40 -05:00
Jimmy Zelinskie
37079315d2 use os.path.join when locating ssl certs 2014-12-16 13:19:35 -05:00
Joseph Schorr
00299ca60f We need to make sure to use the *full* command 2014-12-11 18:17:15 +02:00
Joseph Schorr
6601e83285 When speaking to version 0.2-beta of the build worker, properly look up the cached commands and see if we have a matching image/tag in the repository 2014-12-11 18:03:40 +02:00
Jake Moshenko
fd825b82cd Update the buildman with the database job config post merge with nomenclature. 2014-12-08 14:43:32 -05:00
Joseph Schorr
5cb36fe053 Have the build manager sleep if the requested manager is external 2014-12-01 14:41:46 -05:00
Joseph Schorr
4f5bf8185a Add version checking to the python side 2014-12-01 12:11:23 -05:00
Jimmy Zelinskie
09cc4ba4c1 LOGGER -> logger.
While logger may be a global variable, it is not constant. Let the
linters complain!
2014-11-30 17:48:38 -05:00
Joseph Schorr
a8473db87f Make sure the realm is connected before heartbeat checks start. 2014-11-26 17:02:49 -05:00
Joseph Schorr
d91829dc3c Only start the build manager if building is enabled 2014-11-26 11:28:29 -05:00
Joseph Schorr
9d675b51ed - Change SSL to only be enabled via an environment variable. Nginx will be terminating SSL for the ER.
- Add the missing dependencies to the requirements.txt
- Change the builder ports to non-standard locations
- Add the /b1/socket and /b1/controller endpoints in nginx, to map to the build manager
- Have the build manager start automatically.
2014-11-25 18:08:18 -05:00
Joseph Schorr
04fc6d82a5 Add support for SSL if the certificate is found in the config directory 2014-11-25 16:36:21 -05:00
Joseph Schorr
660a640de6 Better organize the source file structure of the build manager and change it to choose a lifecycle manager based on the config 2014-11-25 16:14:44 -05:00
Joseph Schorr
b8e873b00b Add support to the build system for tracking if/when the build manager crashes and make sure builds are restarted within a few minutes 2014-11-21 14:27:06 -05:00
Jimmy Zelinskie
872c135205 Make ping method static.
Without being static or taking a self parameter, the method causes the
worker to receive a runtime WAMP error when it attempts to ping during
a health check. This marks the worker unhealthy every single time a
health check is attempted.
2014-11-20 16:19:02 -05:00
Jimmy Zelinskie
290c8abeb5 Make empty token more readable in logs.
Enterprises use "" for tokens. This was confusing to read in the logs
without making things more clear by adding quotes around the value.
2014-11-20 15:22:34 -05:00
Jimmy Zelinskie
d0763862b1 Simple code review changes.
I sneakily also added local-test.sh and renamed run-local to
local-run.sh.
2014-11-20 14:36:22 -05:00
Jimmy Zelinskie
0763f0d999 Initialize BaseComponent members in constructor 2014-11-19 13:17:53 -05:00
Joseph Schorr
a9fd516dad Disable WAMP debug 2014-11-18 16:35:03 -05:00
Joseph Schorr
63f2e7794f Various small fixes 2014-11-18 16:34:09 -05:00
Jimmy Zelinskie
6df6f28edf Lint BuildManager 2014-11-18 15:45:56 -05:00
Joseph Schorr
043a30ee96 Add a heartbeat to the build status, so we know if a manager crashed 2014-11-14 15:31:02 -05:00
Joseph Schorr
01dc10b8fc Remove server hostname hack 2014-11-14 15:05:49 -05:00
Joseph Schorr
cfc6b196a4 - Extract the build component statuses into an enum
- Add a ping method so the workers can verify the state of the controller
- Fix a bug with current_step and 0 values
- Rename the build status var to phase, to make it more distinct from the controller status
2014-11-14 14:53:35 -05:00
Joseph Schorr
4322b5f81c Get the new build system working for enterprise 2014-11-13 19:41:17 -05:00
Joseph Schorr
f93c0a46e8 WIP: Get everything working except logging and job completion 2014-11-12 14:03:07 -05:00
Joseph Schorr
eacf3f01d2 WIP: Start implementation of the build manager/controller. This code is not yet working completely. 2014-11-11 18:23:15 -05:00