Joseph Schorr
bdb84f1c20
Merge branch 'master' of github.com:coreos-inc/quay
2015-02-25 16:00:17 -05:00
Joseph Schorr
4551b3a957
Remove the boto timeout set (doesn't work anyway) and add some better logging to the scheduler
2015-02-25 16:00:14 -05:00
Jimmy Zelinskie
090a198afc
temporarily comment out metrics
2015-02-25 15:29:35 -05:00
Jimmy Zelinskie
db79ad2dde
unused import
2015-02-25 15:26:36 -05:00
Joseph Schorr
5dd78f76c7
Add additional logging, timeouts, and exception checks
2015-02-25 15:15:22 -05:00
Jimmy Zelinskie
328de0201f
Merge branch 'master' of github.com:coreos-inc/quay
2015-02-25 13:56:05 -05:00
Jimmy Zelinskie
346d6b933a
buildman: initialize queuemetrics asynchronously
2015-02-25 13:55:18 -05:00
Joseph Schorr
2eaec092f0
Handle the case where we cannot write the tags on the build nodes
2015-02-25 13:47:36 -05:00
Joseph Schorr
390f8df4ad
Make sure the build manager dies on an unhandled schedule exception
2015-02-25 12:19:21 -05:00
Joseph Schorr
afe7e14254
Add better exception handling and logging to the ephemeral build manager
2015-02-25 12:09:14 -05:00
Joseph Schorr
b7901d2adb
Add trigger metadata (which includes the SHA) and the built image_id to the event data
2015-02-24 15:13:51 -05:00
Jimmy Zelinskie
47f8cb77c4
Merge pull request #11 from coreos-inc/nimbus
...
CloudWatch for build job status
2015-02-18 17:17:28 -05:00
Jimmy Zelinskie
9ab3554226
buildreporter: does not execute in a coroutine!
2015-02-18 17:11:45 -05:00
Jimmy Zelinskie
0d38e0b00b
metrics: use config['name'] to get metric conf
2015-02-18 16:05:36 -05:00
Jimmy Zelinskie
f53dea46b7
buildman: address PR #11 comments
2015-02-18 14:13:36 -05:00
Joseph Schorr
524705b88c
Get dashboard working and upgrade bootstrap. Note: the bootstrap fixes will be coming in the followup CL
2015-02-17 19:15:54 -05:00
Jimmy Zelinskie
5790d7d8cc
buildman: build_metrics call correct method
2015-02-17 17:03:12 -05:00
Jimmy Zelinskie
1a71925125
buildreporter: remove unused logging
2015-02-17 17:02:37 -05:00
Jimmy Zelinskie
85edb651e2
buildserver: remove pylint comments
2015-02-17 15:32:25 -05:00
Jimmy Zelinskie
d70c95e42e
buildreporter: move reporting into server callback
2015-02-17 15:31:53 -05:00
Jimmy Zelinskie
25fc999d50
buildreporter: handle app=None
2015-02-17 15:30:09 -05:00
Jimmy Zelinskie
b8d9ef0fe9
buildman: remove old create_task for queue metrics
2015-02-17 14:18:32 -05:00
Jimmy Zelinskie
935db5c766
buildman: clarify queue metrics from job state metrics
2015-02-17 12:23:08 -05:00
Jimmy Zelinskie
ffb897dfe6
buildman: add job status logging to managers
2015-02-17 12:22:23 -05:00
Jimmy Zelinskie
ca0d2b1721
buildreporter: getattr method
2015-02-17 12:21:22 -05:00
Jimmy Zelinskie
0a00453024
buildreporter: rm pylint comments
2015-02-17 12:20:46 -05:00
Jimmy Zelinskie
0e7418ffce
buildman: add BuildMetrics and BuildReporter
2015-02-17 10:56:09 -05:00
Joseph Schorr
fbdbc21eb1
Merge branch 'master' into quark
2015-02-13 16:24:53 -05:00
Jimmy Zelinskie
6a3d269574
buildman: update metrics task
2015-02-13 11:25:29 -05:00
Joseph Schorr
ae8bb5fc13
Add preparing build node status item and change the build status colors to be variations on a blue color
2015-02-12 16:38:43 -05:00
Joseph Schorr
f84d1bad45
Handle internal errors in a better fashion: If a build would be marked as internal error, only do so if there are retries remaining. Otherwise, we mark it as failed (since it won't be rebuilt anyway)
2015-02-12 16:19:44 -05:00
Joseph Schorr
f107b50a46
Merge branch 'master' into ackbar
2015-02-12 12:04:45 -05:00
Joseph Schorr
f796c281d5
Remove support for v0.2
2015-02-11 17:12:53 -05:00
Joseph Schorr
e1a15464a1
Fix typo, add some logging and fix command comparison
2015-02-11 16:02:36 -05:00
Joseph Schorr
893ae46dec
Add an ImageTree class and change to searching *all applicable* branches when looking for the best cache tag.
2015-02-10 21:46:58 -05:00
Joseph Schorr
98b4f62ef7
Switch to using a squashed image for the build workers
2015-02-10 15:43:01 -05:00
Joseph Schorr
045614c6c8
Merge branch 'master' into ackbar
2015-02-09 17:16:42 -05:00
Joseph Schorr
6b9464c999
Add support for 0.3 (the new builder version)
2015-02-09 16:59:21 -05:00
Joseph Schorr
9f1ec9d47d
Fix loading of partial caching under a tag
2015-02-09 16:29:15 -05:00
Joseph Schorr
b0e315c332
Fix issues in cache comment comparison
2015-02-09 15:48:36 -05:00
Joseph Schorr
9b0e43514b
Fix typos
2015-02-09 14:53:18 -05:00
Joseph Schorr
384d0eba6f
Fix cache command argument
2015-02-09 14:12:24 -05:00
Joseph Schorr
6cb1212da6
Add logging
2015-02-09 13:54:14 -05:00
Joseph Schorr
4310f47dee
Some code cleanup in the cached tag determination code
2015-02-09 12:16:43 -05:00
Joseph Schorr
0065ac8503
Add back in the cache checking code and remove the old 0.1 build pack code
2015-02-09 12:13:40 -05:00
Joseph Schorr
48949627e0
Merge master in delta
2015-02-09 12:07:43 -05:00
Joseph Schorr
9dfe523615
Merge master changes
2015-02-05 13:11:16 -05:00
Jimmy Zelinskie
c7c5377285
Add my key back to the ephemeral builder machines.
2015-02-05 12:51:02 -05:00
Joseph Schorr
5fedd74399
Remove Jake's key
2015-02-04 21:31:26 -05:00
Jake Moshenko
a952d0b1ce
Merge branch 'master' of github.com:coreos-inc/quay
2015-02-04 11:59:27 -05:00
Jake Moshenko
5b8d65991e
Update the space on the builder nodes because it's cheap.
2015-02-04 11:58:58 -05:00
Joseph Schorr
9ffb53cd47
Add support for v2 of the build worker, which performs the Dockerfile parsing on its own. Note that this version is backwards compatible with v1-beta of the build worker, so it should be pushed first. Also note that this version is temporary until such time as we get the caching branches merged.
2015-02-03 21:05:18 -05:00
Joseph Schorr
a1938593a9
Better handling of retries on build errors
2015-02-03 16:29:47 -05:00
Joseph Schorr
3bf5e93f06
Remove log statement
2015-02-03 16:06:23 -05:00
Joseph Schorr
d709e0f64a
Fix the new notifications code to work
2015-02-03 13:08:38 -05:00
Joseph Schorr
07e85324e9
- Add build notifications back in
...
- Fix spelling mistake
- Add the sha output as part of the build script
2015-02-03 13:01:42 -05:00
Joseph Schorr
361fb33574
- Add a small build script
...
- Take in the build worker branch name from config
- Add additional logging (to be removed after we figure out the problem)
2015-02-03 12:48:41 -05:00
Jake Moshenko
2215ec6669
Associate a public IP with the network interfaces on our VPC instances.
2015-02-02 15:28:40 -05:00
Jake Moshenko
db8493f254
update the executor template to use VPC instances.
2015-02-02 14:55:34 -05:00
Jake Moshenko
3687419ab3
Change a typo to an enum
2015-02-02 12:24:32 -05:00
Jake Moshenko
a4b0c8698d
Allow the key prefixes in etcd to be configurable.
2015-02-02 12:00:19 -05:00
Joseph Schorr
0875d3dce1
Merge branch 'master' of https://github.com/coreos-inc/quay
2015-01-29 18:40:49 -05:00
Joseph Schorr
3872d29de9
Add a transaction around the extend_processing call
2015-01-29 18:40:41 -05:00
Jake Moshenko
fb533a1f4c
Merge branch 'master' of github.com:coreos-inc/quay
2015-01-29 18:40:24 -05:00
Jake Moshenko
8e85ff63f1
Add everyone's ssh keys to the ephemeral build workers.
2015-01-29 18:40:17 -05:00
Jake Moshenko
63d23a04c0
Make the loop pause when we run out of builder capacity.
2015-01-29 18:40:01 -05:00
Joseph Schorr
838bfe23b1
Remove retries update in the extend processing call and make sure it is under a transaction
2015-01-29 18:33:17 -05:00
Joseph Schorr
a6fa08c19c
Change returns to trollius returns
2015-01-29 18:21:32 -05:00
Joseph Schorr
0e5f6dc17d
Fix typo in timed out
2015-01-29 18:13:31 -05:00
Joseph Schorr
60eae43ae4
Add the date time to the log entries
2015-01-29 18:05:05 -05:00
Joseph Schorr
ce3f8b438c
Fix pull credentials bug, fix job details parse bug and add some better logging
2015-01-29 18:01:42 -05:00
Joseph Schorr
7ee00b83cb
Switch to using a CloseForLongOperation around the sleep
2015-01-29 14:50:07 -05:00
Joseph Schorr
cf35da30bc
Make sure to not hold DB connections open in the new build manager
2015-01-29 14:40:24 -05:00
Jake Moshenko
2e86417329
Allow the buildman server to die if an uncaught exception terminates the scheduler process.
2015-01-29 10:56:57 -05:00
Jake Moshenko
c308794063
Fix the enterprise manager to use the new coroutine based interface.
2015-01-29 10:56:18 -05:00
Joseph Schorr
d359c849cd
Add the build worker and job count information to the charts
2015-01-28 17:12:33 -05:00
Jake Moshenko
0ddfd07749
Use the tiny registry-build-worker image. Bind mount in the root certificates so that Quay SSL certificates can be validated.
2015-01-27 14:12:47 -05:00
Jake Moshenko
ef0806bd9d
Make the logs for the build manager more bearable.
2015-01-26 15:27:39 -05:00
Joseph Schorr
be6701b310
Have the builder not start and stop, over and over, if not enabled
2015-01-26 14:13:55 -05:00
Jake Moshenko
86852da4ba
Catch exceptions when ELB times out a connection to etcd.
2015-01-23 11:29:38 -05:00
Jake Moshenko
725808a4f8
Make the logs from the build manager more useful.
2015-01-23 11:29:15 -05:00
Jake Moshenko
265aeabf60
We need to tell the etcd client which protocol to use.
2015-01-22 16:59:04 -05:00
Jake Moshenko
f2471a86f6
Fix the python requirements. Add the ability to map in etcd client certs and ca.
2015-01-22 10:53:23 -05:00
Jake Moshenko
fc757fecad
Tag the EC2 instances with the build uuid.
2015-01-05 15:35:14 -05:00
Jake Moshenko
dd7664328c
Make the build manager ports configurable.
2015-01-05 15:09:03 -05:00
Jake Moshenko
8037962716
Change the severity of a log message which is actually expected in the happy case.
2015-01-05 14:44:54 -05:00
Jake Moshenko
f58b09a064
Remove the loop argument from the call to build_component_ready.
2015-01-05 13:08:25 -05:00
Jake Moshenko
320ae63ccd
Handle the case where there are no realms registered.
2015-01-05 12:23:54 -05:00
Jake Moshenko
b33ee1a474
Register existing builders to watch their expirations.
2015-01-05 11:21:36 -05:00
Jake Moshenko
a9839021af
When the etcd key tracking realms is first created the action is create, not set.
2014-12-31 11:46:02 -05:00
Jake Moshenko
cc70225043
Generalize the ephemeral build managers so that any manager may manage a builder spawned by any other manager.
2014-12-31 11:33:56 -05:00
Jake Moshenko
ccb19571d6
Try lowering the sleep on the shutdown timeout to avoid the service dispatch timeout built into systemd.
2014-12-23 17:42:47 -05:00
Jake Moshenko
ec87e37d8c
EC2 terminate_instances does not take a force flag.
2014-12-23 17:17:53 -05:00
Jake Moshenko
1005c29b6b
Fix the shutdown command for when the builder terminates itself.
2014-12-23 17:08:16 -05:00
Jake Moshenko
cece94e1da
We want to terminate instances, not stop them.
2014-12-23 16:20:42 -05:00
Jake Moshenko
3ce64b4a7f
We must yield from stop_builder.
2014-12-23 16:12:10 -05:00
Jake Moshenko
ef70432b11
We need to call build_finished async.
2014-12-23 16:04:10 -05:00
Jake Moshenko
8e16fbf59b
The root device on CoreOS is /dev/xvda.
2014-12-23 15:41:58 -05:00
Jake Moshenko
2f2a88825d
Try using SSD for root volumes.
2014-12-23 15:35:21 -05:00
Jake Moshenko
723fb27671
Calls to the ec2 service must be async, and responses must be wrapped as well.
2014-12-23 14:54:58 -05:00
Jake Moshenko
2ed9b3d243
Disable the etcd timeout on watch calls to prevent them from disconnecting the client.
2014-12-23 14:54:34 -05:00
Jake Moshenko
b2d7fad667
Fix a typo with the automatic node shutdown fallback in the ephemeral nodes.
2014-12-23 14:09:24 -05:00
Jake Moshenko
4e22e22ba1
We have to serialize our build data before sending it to etcd.
2014-12-23 14:09:04 -05:00
Jake Moshenko
709e571b78
Handle read timeouts from etcd when watching a key.
2014-12-23 12:13:49 -05:00
Jake Moshenko
055a6b0c37
Add a total maximum time that a machine is allowed to stick around before we terminate it more forcefully.
2014-12-23 11:18:10 -05:00
Jake Moshenko
aac7feb20b
Refresh the build_job from the database before we write updates.
2014-12-23 11:17:23 -05:00
Jake Moshenko
34bf92673b
Add support for adjusting etcd ttl on job_heartbeat. Switch the heartbeat method to a coroutine.
2014-12-22 17:24:44 -05:00
Jake Moshenko
2b6c2a2a50
Improve tests for the ephemeral build manager.
2014-12-22 16:22:07 -05:00
Jake Moshenko
e53b6b0e21
Merge remote-tracking branch 'origin/master' into ephemeral
2014-12-22 12:14:59 -05:00
Jake Moshenko
12ee8e0fc0
Switch a few of the buildman methods to coroutines in order to support network calls in methods. Add a test for the ephemeral build manager.
2014-12-22 12:14:16 -05:00
Jake Moshenko
a280bbcb6d
Add tag metadata to the instances.
2014-12-16 15:17:39 -05:00
Jake Moshenko
1d68594dc2
Extract instance ids from the instance objects returned by boto.
2014-12-16 15:10:50 -05:00
Jake Moshenko
2d7e844753
First implementation of ephemeral build lifecycle manager.
2014-12-16 13:41:30 -05:00
Jimmy Zelinskie
33f12c58ba
Add active worker count to buildmanager logs.
2014-12-16 13:37:40 -05:00
Jimmy Zelinskie
37079315d2
use os.path.join when locating ssl certs
2014-12-16 13:19:35 -05:00
Joseph Schorr
00299ca60f
We need to make sure to use the *full* command
2014-12-11 18:17:15 +02:00
Joseph Schorr
6601e83285
When speaking to version 0.2-beta of the build worker, properly look up the cached commands and see if we have a matching image/tag in the repository
2014-12-11 18:03:40 +02:00
Jake Moshenko
fd825b82cd
Update the buildman with the database job config post merge with nomenclature.
2014-12-08 14:43:32 -05:00
Joseph Schorr
5cb36fe053
Have the build manager sleep if the requested manager is external
2014-12-01 14:41:46 -05:00
Joseph Schorr
4f5bf8185a
Add version checking to the python side
2014-12-01 12:11:23 -05:00
Jimmy Zelinskie
09cc4ba4c1
LOGGER -> logger.
...
While logger may be a global variable, it is not constant. Let the
linters complain!
2014-11-30 17:48:38 -05:00
Joseph Schorr
a8473db87f
Make sure the realm is connected before heartbeat checks start.
2014-11-26 17:02:49 -05:00
Joseph Schorr
d91829dc3c
Only start the build manager if building is enabled
2014-11-26 11:28:29 -05:00
Joseph Schorr
9d675b51ed
- Change SSL to only be enabled via an environment variable. Nginx will be terminating SSL for the ER.
...
- Add the missing dependencies to the requirements.txt
- Change the builder ports to non-standard locations
- Add the /b1/socket and /b1/controller endpoints in nginx, to map to the build manager
- Have the build manager start automatically.
2014-11-25 18:08:18 -05:00
Joseph Schorr
04fc6d82a5
Add support for SSL if the certificate is found in the config directory
2014-11-25 16:36:21 -05:00
Joseph Schorr
660a640de6
Better organize the source file structure of the build manager and change it to choose a lifecycle manager based on the config
2014-11-25 16:14:44 -05:00
Joseph Schorr
b8e873b00b
Add support to the build system for tracking if/when the build manager crashes and make sure builds are restarted within a few minutes
2014-11-21 14:27:06 -05:00
Jimmy Zelinskie
872c135205
Make ping method static.
...
Without being static or passing a self parameter, the worker will
receive a runtime WAMP error when they attempt to ping during a
health check; this marks them unhealthy every single time you
attempt a health check.
2014-11-20 16:19:02 -05:00
Jimmy Zelinskie
290c8abeb5
Make empty token more readable in logs.
...
Enterprises use "" for tokens. This was confusing to read in the logs
without making things more clear by adding quotes around the value.
2014-11-20 15:22:34 -05:00
Jimmy Zelinskie
d0763862b1
Simple code review changes.
...
I sneakily also added local-test.sh and renamed run-local to
local-run.sh.
2014-11-20 14:36:22 -05:00
Jimmy Zelinskie
0763f0d999
Initialize BaseComponent members in constructor
2014-11-19 13:17:53 -05:00
Joseph Schorr
a9fd516dad
Disable WAMP debug
2014-11-18 16:35:03 -05:00
Joseph Schorr
63f2e7794f
Various small fixes
2014-11-18 16:34:09 -05:00
Jimmy Zelinskie
6df6f28edf
Lint BuildManager
2014-11-18 15:45:56 -05:00
Joseph Schorr
043a30ee96
Add a heartbeat to the build status, so we know if a manager crashed
2014-11-14 15:31:02 -05:00
Joseph Schorr
01dc10b8fc
Remove server hostname hack
2014-11-14 15:05:49 -05:00
Joseph Schorr
cfc6b196a4
- Extract the build component statuses into an enum
...
- Add a ping method so the workers can verify the state of the controller
- Fix a bug with current_step and 0 values
- Rename the build status var to phase, to make it more distinct from the controller status
2014-11-14 14:53:35 -05:00
Joseph Schorr
4322b5f81c
Get the new build system working for enterprise
2014-11-13 19:41:17 -05:00
Joseph Schorr
f93c0a46e8
WIP: Get everything working except logging and job completion
2014-11-12 14:03:07 -05:00
Joseph Schorr
eacf3f01d2
WIP: Start implementation of the build manager/controller. This code is not yet working completely.
2014-11-11 18:23:15 -05:00