|
|
Created:
7 years, 4 months ago by craigdh Modified:
7 years, 3 months ago CC:
chromium-reviews, craigdh+watch_chromium.org, bulach+watch_chromium.org, yfriedman+watch_chromium.org, ilevy-cc_chromium.org, klundberg+watch_chromium.org, frankf+watch_chromium.org Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
Description[android] Add frankf and craigdh to build/android/buildbot/OWNERS.
BUG=None
TEST=None
NOTRY=True
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=219288
Patch Set 1 #
Messages
Total messages: 18 (0 generated)
lgtm, thanks!
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/craigdh@chromium.org/23129027/1
Message was sent while issue was closed.
Change committed as 219288
Message was sent while issue was closed.
This folder was separated specifically to ensure infrastructure team had visibility and veto on changes that could negatively affect variety of production systems. The OWNERS list should be kept small and constrained. I would like to revert this.
Message was sent while issue was closed.
On 2013/09/06 04:08:41, Isaac wrote: > This folder was separated specifically to ensure infrastructure team had > visibility and veto on changes that could negatively affect variety of > production systems. The OWNERS list should be kept small and constrained. I > would like to revert this. *sigh*... I apologize for the long text, but seriously, we need to solve this... There's a lot of issues embroiled here, so let me try to split them as best as I can. - from London, I can see this totally *artificial* silo being created, separating "infrastructure" from "testing"... can we please stop being territorial? I'm in neither "infra" nor "testing". I'm here to help. and I'm 100% confident that craig and frank share the same goals... - data: we (myself, frank, craig) have had *many* patches that required "infra" OWNERS review and were stuck for _ages_... and after pings, the "infra" OWNERS just do a RS anyways... so why get in the way? what are We gaining? (big We, google, not "we" infra, not "we" testing, not "we" whatever other team)... "We" have data that the current OWNERS is NOT working in an efficient manner (if it was, we wouldn't have made this patch...). is there _any_ data to prove that adding these OWNERS will be hurtful? - hostility: again, "We" are here to help. words like "veto" are needlessly hostile... how about "work together to ensure the production systems will have enough capacity to handle all the applications needs"? can we add such a comment to this file? - s*it happens... I totally appreciate the issue with "content browser tests"... that should've been clearer communicated across the board.... but that's cool, s*it happens, We revert, We improve, We fix.... my understanding is that the "content browser tests" issue happened _with_ this OWNERS in place no? iirc, was https://codereview.chromium.org/13445005 Having said all that, here's my proposals: 1) keep and trust craig and frank. :) 2) remove craig, frank and myself... I am 100% sure this will increase the "infra" team load for absolute no reason, but that's totally cool by me.. ;) To be extremely clear: it's _your_ call, I have absolutely no feelings either towards keeping or kicking the three of us. :) I will totally understand and respect whichever way you want to go.
Message was sent while issue was closed.
On 2013/09/06 08:56:26, bulach wrote: > On 2013/09/06 04:08:41, Isaac wrote: > > This folder was separated specifically to ensure infrastructure team had > > visibility and veto on changes that could negatively affect variety of > > production systems. The OWNERS list should be kept small and constrained. I > > would like to revert this. > > *sigh*... I apologize for the long text, but seriously, we need to solve this... > > There's a lot of issues embroiled here, so let me try to split them as best as I > can. > > - from London, I can see this totally *artificial* silo being created, > separating "infrastructure" from "testing"... can we please stop being > territorial? I'm in neither "infra" nor "testing". I'm here to help. and I'm > 100% confident that craig and frank share the same goals... > > - data: we (myself, frank, craig) have had *many* patches that required "infra" > OWNERS review and were stuck for _ages_... and after pings, the "infra" OWNERS > just do a RS anyways... so why get in the way? what are We gaining? (big We, > google, not "we" infra, not "we" testing, not "we" whatever other team)... > "We" have data that the current OWNERS is NOT working in an efficient manner (if > it was, we wouldn't have made this patch...). is there _any_ data to prove that > adding these OWNERS will be hurtful? > > - hostility: again, "We" are here to help. words like "veto" are needlessly > hostile... how about "work together to ensure the production systems will have > enough capacity to handle all the applications needs"? can we add such a comment > to this file? > > - s*it happens... I totally appreciate the issue with "content browser tests"... > that should've been clearer communicated across the board.... but that's cool, > s*it happens, We revert, We improve, We fix.... my understanding is that the > "content browser tests" issue happened _with_ this OWNERS in place no? iirc, was > https://codereview.chromium.org/13445005 > > > Having said all that, here's my proposals: > 1) keep and trust craig and frank. :) > > 2) remove craig, frank and myself... I am 100% sure this will increase the > "infra" team load for absolute no reason, but that's totally cool by me.. ;) > > > To be extremely clear: it's _your_ call, I have absolutely no feelings either > towards keeping or kicking the three of us. :) I will totally understand and > respect whichever way you want to go. FYI, isaac decided in a separate chat to go for (2), so here it is: https://codereview.chromium.org/23814005/
Message was sent while issue was closed.
I missed this Friday morning. Where was the "discussion" here? Was cmp involved? All I see Marcus's points about why he should remain on the OWNER list. I agree with his points. With cmp and yfriedman being busy, what happens is all changes have to go through ilevy. The way things are currently is quite frustrating. One solution is add more infra people to the OWNERS list (aka me). That would at least balance the load a little bit. As someone on the infra team, I often find it difficult to make progress. I can imagine others finding this frustrating.
Message was sent while issue was closed.
On 2013/09/09 17:31:17, navabi wrote: > I missed this Friday morning. Where was the "discussion" here? Was cmp > involved? All I see Marcus's points about why he should remain on the OWNER > list. I agree with his points. > > With cmp and yfriedman being busy, what happens is all changes have to go > through ilevy. The way things are currently is quite frustrating. > > One solution is add more infra people to the OWNERS list (aka me). That would at > least balance the load a little bit. As someone on the infra team, I often find > it difficult to make progress. I can imagine others finding this frustrating. See more info on the revert: 23567017 and related cl 23814005
Message was sent while issue was closed.
On 2013/09/09 17:54:42, Isaac wrote: > On 2013/09/09 17:31:17, navabi wrote: > > I missed this Friday morning. Where was the "discussion" here? Was cmp > > involved? All I see Marcus's points about why he should remain on the OWNER > > list. I agree with his points. > > > > With cmp and yfriedman being busy, what happens is all changes have to go > > through ilevy. The way things are currently is quite frustrating. > > > > One solution is add more infra people to the OWNERS list (aka me). That would > at > > least balance the load a little bit. As someone on the infra team, I often > find > > it difficult to make progress. I can imagine others finding this frustrating. > > See more info on the revert: 23567017 and related cl 23814005 Just to accumulate all the discussion here, these were the points made in the cl's you referenced: 23567017: Removed frankf and craigdh with a comment regarding why we have yfriedman and bulach on the OWNERS list 23814005: Removed bulach with little explanation except a link to this CL (infinite mutual recursion) I still do not see any discussion about if/how we plan on dealing the difficulties Marcus expressed in this CL, which I have also experienced.
Message was sent while issue was closed.
On 2013/09/09 18:34:09, navabi wrote: > On 2013/09/09 17:54:42, Isaac wrote: > > On 2013/09/09 17:31:17, navabi wrote: > > > I missed this Friday morning. Where was the "discussion" here? Was cmp > > > involved? All I see Marcus's points about why he should remain on the OWNER > > > list. I agree with his points. > > > > > > With cmp and yfriedman being busy, what happens is all changes have to go > > > through ilevy. The way things are currently is quite frustrating. > > > > > > One solution is add more infra people to the OWNERS list (aka me). That > would > > at > > > least balance the load a little bit. As someone on the infra team, I often > > find > > > it difficult to make progress. I can imagine others finding this > frustrating. > > > > See more info on the revert: 23567017 and related cl 23814005 > > Just to accumulate all the discussion here, these were the points made in the > cl's you referenced: > 23567017: Removed frankf and craigdh with a comment regarding why we have > yfriedman and bulach on the OWNERS list > 23814005: Removed bulach with little explanation except a link to this CL > (infinite mutual recursion) > > I still do not see any discussion about if/how we plan on dealing the > difficulties Marcus expressed in this CL, which I have also experienced. I agree with Marcus and Armand. Other people cannot be blocked because someone's not following Chromium's reviewer/OWNERS responsibility. Example CLs: https://codereview.chromium.org/20545002/ (It took 1 month to land 6 lines!) https://codereview.chromium.org/23601004/ Being responsive for code reviews is even more crucial for infra.
Message was sent while issue was closed.
hey everyone, let me try to summarize here :) - I removed myself as OWNERS from build/android/buildbot. - I had a long discussions with ilevy@ over chat, and then I met cmp@ late last Friday. My understanding from their point of view, which I fully respect, is that the infra team need to have more control over things that affect infrastructure... I think the trade-off here is: - without such control, infra team ends up fighting more fires than needed, and having less time to actually be productive. - with this extra control, they will have an increased load reviewing this patches, but hopefully less fires to fight. I am in no position to judge this trade-offs, but again, I totally appreciate this point, and I hope the less fires will soon lead to more responsive and less "review latency"... ilevy, cmp, please correct me if I'm wrong.
Message was sent while issue was closed.
On 2013/09/09 19:53:51, bulach wrote: > hey everyone, > > let me try to summarize here :) > - I removed myself as OWNERS from build/android/buildbot. > - I had a long discussions with ilevy@ over chat, and then I met cmp@ late last > Friday. > > My understanding from their point of view, which I fully respect, is that the > infra team need to have more control over things that affect infrastructure... > > I think the trade-off here is: > - without such control, infra team ends up fighting more fires than needed, and > having less time to actually be productive. > - with this extra control, they will have an increased load reviewing this > patches, but hopefully less fires to fight. > > I am in no position to judge this trade-offs, but again, I totally appreciate > this point, and I hope the less fires will soon lead to more responsive and less > "review latency"... > > ilevy, cmp, please correct me if I'm wrong. Ah. The discussion took place offline. Without having been part of that discussion and while understanding the desire for having control to avoid fires, it is my opinion the current situation is at best not ideal and at worst completely unsustainable. Let's end this discussion here, as it seems like a discussion that needs to be had by those of us on the infra team.
Message was sent while issue was closed.
On 2013/09/09 20:14:02, navabi wrote: > On 2013/09/09 19:53:51, bulach wrote: > > hey everyone, > > > > let me try to summarize here :) > > - I removed myself as OWNERS from build/android/buildbot. > > - I had a long discussions with ilevy@ over chat, and then I met cmp@ late > last > > Friday. > > > > My understanding from their point of view, which I fully respect, is that the > > infra team need to have more control over things that affect infrastructure... > > > > I think the trade-off here is: > > - without such control, infra team ends up fighting more fires than needed, > and > > having less time to actually be productive. > > - with this extra control, they will have an increased load reviewing this > > patches, but hopefully less fires to fight. > > > > I am in no position to judge this trade-offs, but again, I totally appreciate > > this point, and I hope the less fires will soon lead to more responsive and > less > > "review latency"... > > > > ilevy, cmp, please correct me if I'm wrong. > > Ah. The discussion took place offline. Without having been part of that > discussion and while understanding the desire for having control to avoid fires, > it is my opinion the current situation is at best not ideal and at worst > completely unsustainable. > > Let's end this discussion here, as it seems like a discussion that needs to be > had by those of us on the infra team. I think the proper course of action is to remove set noparent. However, we can only do so if involved parties agree to a certain set of guidelines. When I put the stable test definitions in pylib, I thought we similarly established guidelines, namely that the infra team will always be CCed with a chance to respond on changes that have major implications for trybot capacity or trybot flakiness, or that add additional expectations about the buildbot environment. I have seen this and similar agreements fail repeatedly, most recently with the re-enabling of content browsertests (which caused major turmoil for all chromium developers) after infra explicitly reverted it earlier because of capacity issues. Similarly, this owners change was sent directly to Marcus, who while listed on owners, is not the main contributor of these scripts (I am, and with all due respect, the CL should have been sent to either me or chase). I reverted it in part because I considered it landing without proper review. Owners is a way to enforce this agreement. It is a heavy hammer -- and also not sustainable for those on the owners list, so I'd really like to change this. Frank and Craig have done a fantastic job improving the rest of the android scripts and I trust them to do the right thing here too
Message was sent while issue was closed.
On 2013/09/09 20:58:00, Isaac wrote: > On 2013/09/09 20:14:02, navabi wrote: > > On 2013/09/09 19:53:51, bulach wrote: > > > hey everyone, > > > > > > let me try to summarize here :) > > > - I removed myself as OWNERS from build/android/buildbot. > > > - I had a long discussions with ilevy@ over chat, and then I met cmp@ late > > last > > > Friday. > > > > > > My understanding from their point of view, which I fully respect, is that > the > > > infra team need to have more control over things that affect > infrastructure... > > > > > > I think the trade-off here is: > > > - without such control, infra team ends up fighting more fires than needed, > > and > > > having less time to actually be productive. > > > - with this extra control, they will have an increased load reviewing this > > > patches, but hopefully less fires to fight. > > > > > > I am in no position to judge this trade-offs, but again, I totally > appreciate > > > this point, and I hope the less fires will soon lead to more responsive and > > less > > > "review latency"... > > > > > > ilevy, cmp, please correct me if I'm wrong. > > > > Ah. The discussion took place offline. Without having been part of that > > discussion and while understanding the desire for having control to avoid > fires, > > it is my opinion the current situation is at best not ideal and at worst > > completely unsustainable. > > > > Let's end this discussion here, as it seems like a discussion that needs to be > > had by those of us on the infra team. > > I think the proper course of action is to remove set noparent. However, we can > only do so if involved parties agree to a certain set of guidelines. When I put > the stable test definitions in pylib, I thought we similarly established > guidelines, namely that the infra team will always be CCed with a chance to > respond on changes that have major implications for trybot capacity or trybot > flakiness, or that add additional expectations about the buildbot environment. I > have seen this and similar agreements fail repeatedly, most recently with the > re-enabling of content browsertests (which caused major turmoil for all chromium > developers) after infra explicitly reverted it earlier because of capacity > issues. Similarly, this owners change was sent directly to Marcus, who while > listed on owners, is not the main contributor of these scripts (I am, and with > all due respect, the CL should have been sent to either me or chase). I reverted > it in part because I considered it landing without proper review. > > Owners is a way to enforce this agreement. It is a heavy hammer -- and also not > sustainable for those on the owners list, so I'd really like to change this. > Frank and Craig have done a fantastic job improving the rest of the android > scripts and I trust them to do the right thing here too While I disagree with a some of your comments, I won't bring them up here. We should discuss this among the clank infra team. The main take away from this thread is that people are experiencing pain the way things are now, and in my opinion, we can not simply ignore this pain.
Message was sent while issue was closed.
On 2013/09/09 21:04:55, navabi wrote: > On 2013/09/09 20:58:00, Isaac wrote: > > On 2013/09/09 20:14:02, navabi wrote: > > > On 2013/09/09 19:53:51, bulach wrote: > > > > hey everyone, > > > > > > > > let me try to summarize here :) > > > > - I removed myself as OWNERS from build/android/buildbot. > > > > - I had a long discussions with ilevy@ over chat, and then I met cmp@ late > > > last > > > > Friday. > > > > > > > > My understanding from their point of view, which I fully respect, is that > > the > > > > infra team need to have more control over things that affect > > infrastructure... > > > > > > > > I think the trade-off here is: > > > > - without such control, infra team ends up fighting more fires than > needed, > > > and > > > > having less time to actually be productive. > > > > - with this extra control, they will have an increased load reviewing this > > > > patches, but hopefully less fires to fight. > > > > > > > > I am in no position to judge this trade-offs, but again, I totally > > appreciate > > > > this point, and I hope the less fires will soon lead to more responsive > and > > > less > > > > "review latency"... > > > > > > > > ilevy, cmp, please correct me if I'm wrong. > > > > > > Ah. The discussion took place offline. Without having been part of that > > > discussion and while understanding the desire for having control to avoid > > fires, > > > it is my opinion the current situation is at best not ideal and at worst > > > completely unsustainable. > > > > > > Let's end this discussion here, as it seems like a discussion that needs to > be > > > had by those of us on the infra team. > > > > I think the proper course of action is to remove set noparent. However, we > can > > only do so if involved parties agree to a certain set of guidelines. When I > put > > the stable test definitions in pylib, I thought we similarly established > > guidelines, namely that the infra team will always be CCed with a chance to > > respond on changes that have major implications for trybot capacity or trybot > > flakiness, or that add additional expectations about the buildbot environment. > I > > have seen this and similar agreements fail repeatedly, most recently with the > > re-enabling of content browsertests (which caused major turmoil for all > chromium > > developers) after infra explicitly reverted it earlier because of capacity > > issues. Similarly, this owners change was sent directly to Marcus, who while > > listed on owners, is not the main contributor of these scripts (I am, and with > > all due respect, the CL should have been sent to either me or chase). I > reverted > > it in part because I considered it landing without proper review. > > > > Owners is a way to enforce this agreement. It is a heavy hammer -- and also > not > > sustainable for those on the owners list, so I'd really like to change this. > > Frank and Craig have done a fantastic job improving the rest of the android > > scripts and I trust them to do the right thing here too > > While I disagree with a some of your comments, I won't bring them up here. We > should discuss this among the clank infra team. > The main take away from this thread is that people are experiencing pain the way > things are now, and in my opinion, we can not simply ignore this pain. this: ...Similarly, this owners change was sent directly to Marcus ...I reverted it in part because I considered it landing without proper review. is needlessly hostile. Words should be more carefully chosen, specially in a public forum where this thread will outlive any internal team boundaries and other offline discussions. My explanation above was sufficient, and plenty respectful to everyone involved. It also fully justified my decision to remove myself from this OWNERS, since I don't have visibility in all aspects that "infra" wants to control. Again, the issue is about "control", not about how "proper" my reviews were / are. I hope this is very clear.
Message was sent while issue was closed.
On 2013/09/10 09:20:38, bulach wrote: > this: > ...Similarly, this owners change was sent directly to Marcus > ...I reverted it in part because I considered it landing without proper review. > > is needlessly hostile. > > Words should be more carefully chosen, specially in a public forum where this > thread will outlive any internal team boundaries and other offline discussions. I apologize for my crass wording. I intended to convey that I felt adding owners merited a discussion involving those who originally pushed for the file to be small and locked down. I agree OWNERS has become a roadblock and I'm interested in working to resolve. I also see this was a bad channel for discussion -- I'm sorry about that too. > > My explanation above was sufficient, and plenty respectful to everyone involved. > It also fully justified my decision to remove myself from this OWNERS, since I > don't have visibility in all aspects that "infra" wants to control. > > Again, the issue is about "control", not about how "proper" my reviews were / > are. The question here is what types of changes here could actually destabilize the bots (vs. just causing redness and getting rolled back). I think the most sensitive file for android lives outside this directory (gtest_config.py); however changes to bb_run_bot do have the ability to create discrepancies between trybots and main waterfall. As far as visibility, usually it's just a matter of who monitors what. Sheriffs don't usually look at android trybots, and if these start failing it might be hours before someone notices and reverts the problem, which could back up CQ for the rest of the day. For the most part this folder isn't unique in it's ability to screw up the build, though. > > I hope this is very clear.
Message was sent while issue was closed.
On the one hand, it would be bad if because there are fewer checks in place that basic services many more people depend on like the Chrome CQ go offline as a result of a bad change. I've investigated a large number of those sorts of problems myself (not due to changes here but elsewhere) and they take orders of magnitude more time to debug and fix compared to the time it takes to review incoming patches. This probably best sums up what issues we were concerned about a year ago. On the other hand, a life without set noparent OWNERS may just be an okay price to pay to let more people contribute without being blocked. As we said, this folder isn't unique in its ability to screw up the build. That could happen with a strategically placed sleep call in a test harness or any number of other like innocent errors. That alone seems to be the most significant factor for why this folder is not worth any extra guards. I'm OK with us paying that price as long as We (the Google and Chrome we's) are prepared that the Chrome CQ and other systems will be offline for hours or a couple of days due to disruptive changes. There could be other more heinous problems, too, but I won't go into those. It wouldn't be accurate to sum up what's been going on here for over a year in a brief update. The reality is more complicated and has been driven from various shifting perspectives over that period of time. What is obvious is that the amount of interest in this area has risen considerably and the original intent no longer appears to be valid. Even more important is that it looks like the level of overall knowledge has increased a lot since the early days. We'll address the concerns and make sure people are able to make the changes they need to make to do their work. |