Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(172)

Issue 928043005: Monitoring proxy for time series data (Closed)

Created:
5 years, 10 months ago by Sergey Berezin
Modified:
5 years, 7 months ago
CC:
chromium-reviews
Base URL:
https://chromium.googlesource.com/infra/infra.git@master
Target Ref:
refs/heads/master
Project:
infra
Visibility:
Public.

Description

Patch Set 1 #

Patch Set 2 : Cleaned up the data flow #

Patch Set 3 : Streamlining, cleanup, oauth2 in Docker #

Patch Set 4 : Added auth #

Patch Set 5 : Fixed httplib2 data mangling #

Patch Set 6 : Tried to fix httplib2 data corruption. Cleanup. #

Patch Set 7 : Proxy works for real #

Patch Set 8 : Writing tests (WIP) #

Patch Set 9 : Finished tests. Ready for review. #

Patch Set 10 : Updated nat_setup.sh #

Patch Set 11 : Ignore symlink to components (to fix tests) #

Patch Set 12 : Factored out test module. #

Patch Set 13 : Branch coverage weirdness #

Patch Set 14 : Added /loadtest endpoint #

Patch Set 15 : Working load test infrastructure #

Patch Set 16 : Auto-scale managed VMs #

Patch Set 17 : Switched to task queues #

Patch Set 18 : Updated tests; more fine-tuning parameters. #

Patch Set 19 : Updated #VMs with new quota #

Patch Set 20 : More fine-tuning #

Total comments: 96

Patch Set 21 : Back to 4-core VMs with more parallelism. #

Total comments: 42

Patch Set 22 : Switch to jinja2 in admin UI #

Patch Set 23 : Responded to comments #

Patch Set 24 : No more task queues #

Patch Set 25 : dispatch bug #

Patch Set 26 : Fine-tuning for performance #

Patch Set 27 : Another VM tweak #

Patch Set 28 : Removed loadtest code from the app #

Patch Set 29 : Rebase #

Total comments: 40

Patch Set 30 : Responded to comments #

Total comments: 7
Unified diffs Side-by-side diffs Delta from patch set Stats (+1138 lines, --5 lines) Patch
A + appengine/chrome_infra_mon_proxy/.expect_tests.cfg View 1 2 3 4 5 6 7 8 9 10 0 chunks +-1 lines, --1 lines 0 comments Download
A + appengine/chrome_infra_mon_proxy/.expect_tests_pretest.py View 1 2 3 4 5 6 7 0 chunks +-1 lines, --1 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/Dockerfile View 1 2 3 4 1 chunk +14 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/OWNERS View 1 1 chunk +4 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/admin_handler.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +139 lines, -0 lines 2 comments Download
A appengine/chrome_infra_mon_proxy/app.yaml View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +38 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/common.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 1 chunk +59 lines, -0 lines 0 comments Download
A + appengine/chrome_infra_mon_proxy/components View 1 2 3 0 chunks +-1 lines, --1 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/css/bootstrap.min.css View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 1 chunk +1 line, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/dispatch.yaml View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 1 chunk +11 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/docker/Dockerfile View 1 2 3 4 1 chunk +18 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/main.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +103 lines, -0 lines 2 comments Download
A appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +83 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/scripts/nat_startup.sh View 1 chunk +8 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/templates/admin.html View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 1 chunk +15 lines, -0 lines 0 comments Download
A + appengine/chrome_infra_mon_proxy/templates/base.html View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 0 chunks +-1 lines, --1 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/templates/main.html View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +29 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/templates/set_credentials.html View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 1 chunk +43 lines, -0 lines 0 comments Download
A + appengine/chrome_infra_mon_proxy/test/__init__.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 21 23 24 25 26 27 1 chunk +1 line, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/test/admin_handler_test.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +136 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/test/common_test.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 1 chunk +36 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/test/main_test.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 1 chunk +69 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/test/vm_module_test.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +146 lines, -0 lines 0 comments Download
A + appengine/chrome_infra_mon_proxy/testing_utils View 1 2 3 4 5 6 7 0 chunks +-1 lines, --1 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/vm1.yaml View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 chunk +31 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/vm2.yaml View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 chunk +31 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/vm3.yaml View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 chunk +31 lines, -0 lines 0 comments Download
A appengine/chrome_infra_mon_proxy/vm_module.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 chunk +97 lines, -0 lines 3 comments Download

Messages

Total messages: 40 (5 generated)
Sergey Berezin
This is NOT ready for review. I'm uploading this CL to warm you up for ...
5 years, 10 months ago (2015-02-18 03:10:39 UTC) #1
Sergey Berezin
FYI, I uploaded a new version. Still no tests, and I'm still fighting an httplib2 ...
5 years, 9 months ago (2015-03-05 03:11:54 UTC) #2
pgervais
On 2015/03/05 03:11:54, Sergey Berezin wrote: > FYI, I uploaded a new version. Still no ...
5 years, 9 months ago (2015-03-05 17:33:43 UTC) #3
Sergey Berezin
On 2015/03/05 17:33:43, pgervais wrote: > On 2015/03/05 03:11:54, Sergey Berezin wrote: > > FYI, ...
5 years, 9 months ago (2015-03-05 18:17:43 UTC) #4
Sergey Berezin
Added auth module, and now authentication works all the way through the pipeline (using the ...
5 years, 9 months ago (2015-03-06 03:04:36 UTC) #5
Sergey Berezin
+pgervais FYI Fixed the mangling of data in httplib2 (had to set content-type to application/octet-stream). ...
5 years, 9 months ago (2015-03-10 00:47:28 UTC) #6
Sergey Berezin
PTAL: this is ready for review, complete with tests.
5 years, 8 months ago (2015-04-01 00:55:22 UTC) #7
Sergey Berezin
Hmm, weird. Locally, tests show 100% coverage and pass cleanly. On the bot, I'm missing ...
5 years, 8 months ago (2015-04-01 01:26:39 UTC) #8
pgervais
We also hit some coverage flakiness in the past few days (all tests pass, coverage ...
5 years, 8 months ago (2015-04-01 16:36:13 UTC) #9
Sergey Berezin
The tests now pass. Please start reviewing - I'd appreciate comments, even if it's only ...
5 years, 8 months ago (2015-04-02 19:26:38 UTC) #10
Sergey Berezin
Small changes to update VM counts.
5 years, 8 months ago (2015-04-10 23:10:15 UTC) #12
Sergey Berezin
Ran another load test up to ~650 qps today. Got GAE task queues to buckle... ...
5 years, 8 months ago (2015-04-11 01:54:08 UTC) #13
ghost stip (do not use)
this code was pleasant to review, despite being a large change. thank you for writing ...
5 years, 8 months ago (2015-04-14 00:36:44 UTC) #14
Sergey Berezin (google)
Heads up: after another round of loadtests, I decided to switch back to synchronous requests, ...
5 years, 8 months ago (2015-04-14 00:38:02 UTC) #15
Sergey Berezin (google)
On 2015/04/14 00:36:44, stip wrote: > this code was pleasant to review, despite being a ...
5 years, 8 months ago (2015-04-14 00:41:17 UTC) #16
agable
https://codereview.chromium.org/928043005/diff/400001/appengine/chrome_infra_mon_proxy/admin_handler.py File appengine/chrome_infra_mon_proxy/admin_handler.py (right): https://codereview.chromium.org/928043005/diff/400001/appengine/chrome_infra_mon_proxy/admin_handler.py#newcode32 appengine/chrome_infra_mon_proxy/admin_handler.py:32: self.response.write(open('templates/admin.html').read()) Would be way better to use a real ...
5 years, 8 months ago (2015-04-14 21:46:25 UTC) #17
Sergey Berezin (google)
Switched to jinja2 in admin UI, responded to comments in admin_handler. Thanks for reviews! I'll ...
5 years, 8 months ago (2015-04-15 01:38:10 UTC) #19
Sergey Berezin (google)
Uploaded the new admin UI version: https://chrome-infra-mon-proxy.appspot.com/admin/ (you should all have access)
5 years, 8 months ago (2015-04-15 16:16:29 UTC) #20
agable
https://codereview.chromium.org/928043005/diff/400001/appengine/chrome_infra_mon_proxy/admin_handler.py File appengine/chrome_infra_mon_proxy/admin_handler.py (right): https://codereview.chromium.org/928043005/diff/400001/appengine/chrome_infra_mon_proxy/admin_handler.py#newcode85 appengine/chrome_infra_mon_proxy/admin_handler.py:85: updated_fields = [] > Note, that in the future, ...
5 years, 8 months ago (2015-04-15 17:14:54 UTC) #21
Sergey Berezin (google)
Responded to all the comments, and did some refactoring - the code definitely improved - ...
5 years, 8 months ago (2015-04-16 04:39:09 UTC) #22
Sergey Berezin (google)
Small adjustments in *.yaml files for performance: these seem to stabilize the app, and it ...
5 years, 8 months ago (2015-04-20 23:04:06 UTC) #23
Sergey Berezin (google)
Another small adjustment: VMs are now 1-core with target CPU usage at 80%. This allows ...
5 years, 8 months ago (2015-04-22 01:29:17 UTC) #24
Sergey Berezin (google)
The loadtest moved a separate app: http://chrome-infra-loadtest.appspot.com The code is temporarily checked in here: https://chromium.googlesource.com/infra/experimental/+/master/appengine/chrome_infra_loadtest/ ...
5 years, 8 months ago (2015-04-22 03:19:09 UTC) #25
Sergey Berezin (google)
OK, PTAL, it's ready to land. This is the final working version of the proxy, ...
5 years, 8 months ago (2015-04-22 17:25:26 UTC) #26
agable
https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/admin_handler.py File appengine/chrome_infra_mon_proxy/admin_handler.py (right): https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/admin_handler.py#newcode49 appengine/chrome_infra_mon_proxy/admin_handler.py:49: def setParams(params, data): This should be a staticmethod inside ...
5 years, 8 months ago (2015-04-23 19:29:38 UTC) #27
Sergey Berezin (google)
PTAL. The app is now running the latest version by default: https://chrome-infra-mon-proxy.appspot.com/ https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/admin_handler.py File appengine/chrome_infra_mon_proxy/admin_handler.py ...
5 years, 8 months ago (2015-04-24 19:09:53 UTC) #28
agable
LGTM! https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh File appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh (right): https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh#newcode7 appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh:7: REGION="us-central1" On 2015/04/24 at 19:09:52, sergeyberezin wrote: > ...
5 years, 8 months ago (2015-04-27 20:51:07 UTC) #29
Sergey Berezin
https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh File appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh (right): https://codereview.chromium.org/928043005/diff/560001/appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh#newcode7 appengine/chrome_infra_mon_proxy/scripts/nat_setup.sh:7: REGION="us-central1" On 2015/04/27 20:51:07, agable wrote: > On 2015/04/24 ...
5 years, 8 months ago (2015-04-27 21:45:51 UTC) #31
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/928043005/580001
5 years, 8 months ago (2015-04-27 21:56:12 UTC) #33
commit-bot: I haz the power
Committed patchset #30 (id:580001) as https://chromium.googlesource.com/infra/infra/+/73d6305244bd455777957602e1aa1d9bb7634a27
5 years, 8 months ago (2015-04-27 21:56:36 UTC) #34
Sergey Berezin
On 2015/04/27 21:56:36, I haz the power (commit-bot) wrote: > Committed patchset #30 (id:580001) as ...
5 years, 8 months ago (2015-04-27 21:57:37 UTC) #35
Vadim Sh.
Stumbled on this CL in Snippets. https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_mon_proxy/admin_handler.py File appengine/chrome_infra_mon_proxy/admin_handler.py (right): https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_mon_proxy/admin_handler.py#newcode128 appengine/chrome_infra_mon_proxy/admin_handler.py:128: def post(self, command): ...
5 years, 7 months ago (2015-04-28 18:10:43 UTC) #37
Sergey Berezin
https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_mon_proxy/admin_handler.py File appengine/chrome_infra_mon_proxy/admin_handler.py (right): https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_mon_proxy/admin_handler.py#newcode128 appengine/chrome_infra_mon_proxy/admin_handler.py:128: def post(self, command): Awesome! Thanks, Vadim, you just answered ...
5 years, 7 months ago (2015-04-28 19:35:27 UTC) #38
pgervais
On 2015/04/28 19:35:27, Sergey Berezin wrote: > https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_mon_proxy/admin_handler.py > File appengine/chrome_infra_mon_proxy/admin_handler.py (right): > > https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_mon_proxy/admin_handler.py#newcode128 ...
5 years, 7 months ago (2015-04-28 19:48:50 UTC) #39
Vadim Sh.
5 years, 7 months ago (2015-04-28 20:28:52 UTC) #40
Message was sent while issue was closed.
https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_...
File appengine/chrome_infra_mon_proxy/vm_module.py (right):

https://codereview.chromium.org/928043005/diff/580001/appengine/chrome_infra_...
appengine/chrome_infra_mon_proxy/vm_module.py:70: http_auth =
credentials.authorize(http_auth)
On 2015/04/28 19:35:27, Sergey Berezin wrote:
> Thanks, this is great info. I'll definitely think about this, and will fix it
in
> future CLs. It is not mission-critical at the moment though, so I'd say it can
> wait for a bit.
> 
> Can you suggest alternative libraries? Otherwise, I'll look for a way to store
> the token in the Datastore, and reconstruct SignedJwtAssertionCredentials with
a
> cached token. Refreshing a token at 500 qps without a race storm will indeed
be
> an interesting challenge. I'll think about it.

Swarming components have simple "get_access_token(...)" function that works with
service accounts and that uses memcache to store access token. It can be
slightly adjusted to use local memory as yet another caching layer (for existing
use cases memcache was more than enough, for long lived managed VM local cache
is justifiable).

We also have wrapper around URLfetch that knows how to use get_access_token and
that implements retries.

Roughly, the usage would be:

from components import auth
from components import net

service_account_key = auth.ServiceAccountKey(
  client_email=...,
  private_key=...,
  private_key_id=...)

response = net.request(
  url=url, 
  method='POST', 
  payload=self.request.body,
  headers={'Content-Type': 'application/x-protobuf'},
  scopes=data['scopes'],
  service_account_key=service_account_key)

(succeeds on  200 <= code <= 299)
(raises net.Error on 300 <= code < 500)
(retries >= 500 a bunch of times)

net.py:
https://github.com/luci/luci-py/blob/master/appengine/components/components/n...
service_account.py:
https://github.com/luci/luci-py/blob/master/appengine/components/components/a...

Powered by Google App Engine
This is Rietveld 408576698