Joseph Schorr
cac5f44d15
Improvements to health checks
...
- Adds a warning endpoint for warning-only checks
- Changes the default for the disk space check to 1%, instead of 10%
- Removes instance services from the overall health check endpoint
2019-02-04 13:17:59 -05:00
Joseph Schorr
a94f657cb7
Add health check for node disk space
...
If a node runs out of disk space, nginx can no longer swap, and this can cause issues with large pushes
Fixes https://jira.coreos.com/browse/QUAY-1047
2018-09-05 17:57:22 -04:00
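Note: the disk space check described above can be sketched roughly as follows. This is an illustration only, not the code from the commit. The function name `check_disk_space`, the `min_free_fraction` parameter, and the injectable `usage_fn` are hypothetical. The 1% default mirrors the threshold mentioned in the later "Improvements to health checks" commit.

```python
import shutil


def check_disk_space(path="/", min_free_fraction=0.01, usage_fn=shutil.disk_usage):
    """Return (healthy, reason) for the filesystem containing `path`.

    Hypothetical helper: the node is reported unhealthy when the free
    fraction of the disk drops below `min_free_fraction` (1% by default).
    `usage_fn` is injectable so the check can be exercised without a real disk.
    """
    usage = usage_fn(path)
    if usage.total <= 0:
        return False, "Could not determine disk size for %s" % path

    free_fraction = usage.free / usage.total
    if free_fraction < min_free_fraction:
        reason = "Only %.2f%% of disk space free on %s" % (free_fraction * 100, path)
        return False, reason

    return True, None
```

Calling `check_disk_space("/")` on a live node returns a `(healthy, reason)` pair. Passing `min_free_fraction=0.10` reproduces the stricter 10% threshold.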
Sam Chow
dbce986af6
Set up reroutes when complete, fix gunicorn check
2018-08-31 15:17:48 -04:00
Joseph Schorr
bbdf9e074c
Add metrics for tracking when instance key renewal succeeds and fails, as well as when instance key *lookup* fails
2018-02-02 11:14:42 -05:00
Joseph Schorr
c1cc52f58b
Add a health check for the instance key
...
If the key expires or disappears, the node will now go unhealthy, taking it out of service and preventing downtime
2018-02-02 11:14:00 -05:00
Joseph Schorr
e91b83e1be
Add instance health checks for all gunicorn workers
...
Fixes https://jira.coreos.com/browse/QS-121
2018-01-16 11:29:40 -05:00
Joseph Schorr
4ad3682b9c
Make health check failures report their reasons
...
Note that we add a new block with expanded service info, to avoid breaking compatibility with existing callers of the health endpoint
2017-07-19 16:17:02 +03:00
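Note: one way to report failure reasons without breaking existing callers is to leave the original boolean map untouched and add a second, richer block beside it. The sketch below assumes a hypothetical `build_health_response` helper and hypothetical key names (`services`, `services_expanded`, `status`, `failure`). It is not taken from the commit itself.

```python
def build_health_response(check_results):
    """Build a health payload from {service_name: (ok, reason)}.

    Hypothetical sketch: `services` keeps the plain name -> bool shape that
    older callers expect, while `services_expanded` adds the failure reason
    for any check that did not pass.
    """
    services = {name: ok for name, (ok, _reason) in check_results.items()}

    services_expanded = {}
    for name, (ok, reason) in check_results.items():
        entry = {"status": ok}
        if not ok:
            entry["failure"] = reason
        services_expanded[name] = entry

    return {"services": services, "services_expanded": services_expanded}
```

A caller that reads only `services` sees the same booleans as before. A newer caller can read `services_expanded` to learn why a check failed.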
Joseph Schorr
e44a503bd0
Add status check for auth endpoint
2017-07-19 16:17:02 +03:00
Joseph Schorr
7b1dfbb256
yapf
2017-07-11 13:48:55 +03:00
Joseph Schorr
4853634c2f
Switch health to use a data interface
2017-07-11 13:48:25 +03:00
Joseph Schorr
310eded8e6
Add a configuration flag for external TLS termination
...
This is necessary to ensure that we use the correct scheme when conducting health checks, setting cookies, etc.
Fixes #1865
2016-09-22 18:28:57 -04:00
Joseph Schorr
974ab6c42c
Add missing arg to validate call and add logging
2016-08-03 11:13:27 -04:00
Joseph Schorr
c30b8dd1ad
Add storage validation to the status endpoint
...
Fixes #1659
2016-08-01 13:02:26 -04:00
Joseph Schorr
c518874ded
I hate Redis!
...
- Remove redis check from our health endpoint in prod entirely
- Have the redis check have a maximum timeout of 1 second
2015-10-22 14:24:42 -04:00
Jake Moshenko
3efaa255e8
Accidental refactor, split out legacy.py into separate submodules and update all call sites.
2015-07-17 11:56:15 -04:00
Joseph Schorr
b74b7de197
Clean up the health checking code and move the endpoints to /health/instance and /health/endtoend.
2015-01-20 16:53:05 -05:00