|
|
Chromium Code Reviews|
Created:
8 years, 6 months ago by mmenke Modified:
8 years, 5 months ago CC:
chromium-reviews, joi+watch-content_chromium.org, pam+watch_chromium.org, jochen+watch-content_chromium.org, darin-cc_chromium.org, cmp Base URL:
svn://svn.chromium.org/chrome/trunk/src/ Visibility:
Public. |
DescriptionMake sharding_supervisor.py run InProcessBrowserTest.Empty
before other tests, with a longer timeout. This is aimed
at fixing the issue where the first 4 tests on the XP
buildbots take much longer than other tests, often timing
out.
BUG=124260
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=144760
Patch Set 1 #Patch Set 2 : reorganize #Patch Set 3 : Oops #Patch Set 4 : #
Total comments: 4
Patch Set 5 : Response to comments #Patch Set 6 : Remove comment about log parsing, as requested #Patch Set 7 : Use --warmup (And sync) #
Total comments: 4
Patch Set 8 : Response to comments #
Messages
Total messages: 49 (0 generated)
This looks quite hacky. I don't think we should shard browser_test unless a bug with mixed-up output is fixed.
On 2012/06/01 14:48:23, Paweł Hajdan Jr. wrote: > This looks quite hacky. I don't think we should shard browser_test unless a bug > with mixed-up output is fixed. The problem is we're already sharding it, and it is causing tests unfortunate to be run first to time out on XP. I'm happy to solve this some other way, but at the moment, a number of my tests are timing out pretty consistently on the buildbots.
On 2012/06/01 14:50:16, Matt Menke wrote: > On 2012/06/01 14:48:23, Paweł Hajdan Jr. wrote: > > This looks quite hacky. I don't think we should shard browser_test unless a > bug > > with mixed-up output is fixed. > > The problem is we're already sharding it, and it is causing tests unfortunate to > be run first to time out on XP. I'm happy to solve this some other way, but at > the moment, a number of my tests are timing out pretty consistently on the > buildbots. Or at least they were when I started this CL (Admittedly, took a while to get around to finishing it). Are the browser tests no longer sharded?
On 2012/06/01 14:52:25, Matt Menke wrote: > On 2012/06/01 14:50:16, Matt Menke wrote: > > On 2012/06/01 14:48:23, Paweł Hajdan Jr. wrote: > > > This looks quite hacky. I don't think we should shard browser_test unless a > > bug > > > with mixed-up output is fixed. > > > > The problem is we're already sharding it, and it is causing tests unfortunate > to > > be run first to time out on XP. I'm happy to solve this some other way, but > at > > the moment, a number of my tests are timing out pretty consistently on the > > buildbots. > > Or at least they were when I started this CL (Admittedly, took a while to get > around to finishing it). Are the browser tests no longer sharded? Double-checked, and it looks like they are still being sharded and timing out on some runs: http://build.chromium.org/p/chromium/builders/XP%20Tests%20%28dbg%29%284%29/b...
My point is that: 1. This change is hacky (one place undoes change made in other place) 2. Sharding browser tests has more serious problems (mixed up output), and this should either be fixed (i.e. fix more serious problem first), or we should stop sharding browser tests, removing the need for this change.
On 2012/06/01 15:09:43, Paweł Hajdan Jr. wrote: > My point is that: > > 1. This change is hacky (one place undoes change made in other place) > 2. Sharding browser tests has more serious problems (mixed up output), and this > should either be fixed (i.e. fix more serious problem first), or we should stop > sharding browser tests, removing the need for this change. I agree with 1, though other than reverting the change that removed running the Empty test when sharding, or increasing the timeout (Which also seems like a really bad idea to me), I don't have any idea how to work around the underlying issue. While you may be correct on 2, I just want my tests to stop timing out. This has been going on for at least six weeks, so it doesn't look to me like this will happen on its own, and disabling it myself is something I'd consider a bit beyond my purview.
I am not sure I understand the "mixed output" bug thing. Sharding supervisor for browser tests allows us to continue running browser tests. Otherwise we'd have to remove it because it would make the bots too slow have productive development. Some other people argue that if we were not to run browser_tests, then our testing process would be completely useless... so... unless an alternative to sharding supervisor is found, or someone has time to fix the underlying issues we have with sharding supervisor, then we'll need to deal with the problems it has. This change LGTM in the sense that it patches an immediate problem we have. I'm sure there are better ways to do this, but I'm not sure they would be doable without refactoring the whole thing. (which we probably need to do, but I don't want this to fall on Matt just because he happens to care a tiny bit more than everyone else) On 2012/06/01 15:22:22, Matt Menke wrote: > On 2012/06/01 15:09:43, Paweł Hajdan Jr. wrote: > > My point is that: > > > > 1. This change is hacky (one place undoes change made in other place) > > 2. Sharding browser tests has more serious problems (mixed up output), and > this > > should either be fixed (i.e. fix more serious problem first), or we should > stop > > sharding browser tests, removing the need for this change. > > I agree with 1, though other than reverting the change that removed running the > Empty test when sharding, or increasing the timeout (Which also seems like a > really bad idea to me), I don't have any idea how to work around the underlying > issue. > > While you may be correct on 2, I just want my tests to stop timing out. This > has been going on for at least six weeks, so it doesn't look to me like this > will happen on its own, and disabling it myself is something I'd consider a bit > beyond my purview.
The mixed-up output I was referring to is https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/EwHAn-WtW... Also adding John, since what I'm suggesting here is effectively reverting https://chromiumcodereview.appspot.com/9662011/ . I believe this is the right thing to do. http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.cc File content/test/test_launcher.cc (right): http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.... content/test/test_launcher.cc:640: if (!should_shard && !command_line->HasSwitch(kGTestFilterFlag)) { I think the right place to fix it is here. Remove this "if", and it should be much prettier.
This CL is now just a partial revert of John's earlier CL (https://chromiumcodereview.appspot.com/9662011). http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.cc File content/test/test_launcher.cc (right): http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.... content/test/test_launcher.cc:640: if (!should_shard && !command_line->HasSwitch(kGTestFilterFlag)) { On 2012/06/04 10:28:16, Paweł Hajdan Jr. wrote: > I think the right place to fix it is here. Remove this "if", and it should be > much prettier. Done (Well...Mostly. I kept the "!command_line->HasSwitch(kGTestFilterFlag)" half of the conditional).
http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.cc File content/test/test_launcher.cc (right): http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.... content/test/test_launcher.cc:640: if (!should_shard && !command_line->HasSwitch(kGTestFilterFlag)) { On 2012/06/04 14:38:11, Matt Menke wrote: > On 2012/06/04 10:28:16, Paweł Hajdan Jr. wrote: > > I think the right place to fix it is here. Remove this "if", and it should be > > much prettier. > > Done (Well...Mostly. I kept the "!command_line->HasSwitch(kGTestFilterFlag)" > half of the conditional). Err...Wait... How are tests sharded across multiple test on the waterfall? By using --gtest_filter, too? If so, I'll either have to remove that check, too, or use "(should_shard || !command_line->HasSwitch(kGTestFilterFlag)"
http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.cc File content/test/test_launcher.cc (right): http://codereview.chromium.org/10441103/diff/8004/content/test/test_launcher.... content/test/test_launcher.cc:640: if (!should_shard && !command_line->HasSwitch(kGTestFilterFlag)) { On 2012/06/04 14:57:19, Matt Menke wrote: > On 2012/06/04 14:38:11, Matt Menke wrote: > > On 2012/06/04 10:28:16, Paweł Hajdan Jr. wrote: > > > I think the right place to fix it is here. Remove this "if", and it should > be > > > much prettier. > > > > Done (Well...Mostly. I kept the "!command_line->HasSwitch(kGTestFilterFlag)" > > half of the conditional). > > Err...Wait... How are tests sharded across multiple test on the waterfall? By > using --gtest_filter, too? If so, I'll either have to remove that check, too, > or use "(should_shard || !command_line->HasSwitch(kGTestFilterFlag)" Ah, |total_slaves|. Saw that before, but forgot about it, so should be fine as-is.
This means that each try run will take a few minutes more, because of the hundreds of time browser_tests will run the empty test. Can't we make sharding_supervisor special case 'browser_tests' and run the empty test serially first?
On 2012/06/04 15:39:13, John Abd-El-Malek wrote: > This means that each try run will take a few minutes more, because of the > hundreds of time browser_tests will run the empty test. Can't we make > sharding_supervisor special case 'browser_tests' and run the empty test serially > first? That's what patch set 4 did, but Paweł said to use this approach. I'm fine either way.
On 2012/06/04 15:39:13, John Abd-El-Malek wrote: > This means that each try run will take a few minutes more, because of the > hundreds of time browser_tests will run the empty test. Can't we make > sharding_supervisor special case 'browser_tests' and run the empty test serially > first? The problem with the above approach is having two pieces of login in different places that cancel out each other. I think it would be more elegant to teach test_launcher.cc to shard tests. Meanwhile, aren't possibly slower browser_tests better than randomly timing out tests? Note that running the empty test should be fast except for the first time. Something that might work better is making sharding supervisor force running of the empty test, but without canceling it out inside test launcher. Note that the problem with log parsing is going to occur anyway. I guess I'll need to fix the log parser in chromium-build-logs, so please also remove that comment about log parsing.
On 2012/06/04 16:37:48, Paweł Hajdan Jr. wrote: > On 2012/06/04 15:39:13, John Abd-El-Malek wrote: > > This means that each try run will take a few minutes more, because of the > > hundreds of time browser_tests will run the empty test. Can't we make > > sharding_supervisor special case 'browser_tests' and run the empty test > serially > > first? > > The problem with the above approach is having two pieces of login in different > places that cancel out each other. I think it would be more elegant to teach > test_launcher.cc to shard tests. > > Meanwhile, aren't possibly slower browser_tests better than randomly timing out > tests? Note that running the empty test should be fast except for the first > time. > > Something that might work better is making sharding supervisor force running of > the empty test, but without canceling it out inside test launcher. > > Note that the problem with log parsing is going to occur anyway. I guess I'll > need to fix the log parser in chromium-build-logs, so please also remove that > comment about log parsing. The one right above the "if (test_name == kEmptyTestName)" line? Done.
On 2012/06/04 16:41:49, Matt Menke wrote: > On 2012/06/04 16:37:48, Paweł Hajdan Jr. wrote: > > On 2012/06/04 15:39:13, John Abd-El-Malek wrote: > > > This means that each try run will take a few minutes more, because of the > > > hundreds of time browser_tests will run the empty test. Can't we make > > > sharding_supervisor special case 'browser_tests' and run the empty test > > serially > > > first? > > > > The problem with the above approach is having two pieces of login in different > > places that cancel out each other. I think it would be more elegant to teach > > test_launcher.cc to shard tests. I don't understand what you mean about "login". > > > > Meanwhile, aren't possibly slower browser_tests better than randomly timing > > out tests? Note that running the empty test should be fast except for the first > > time. As I mentioned, this wasn't slow, it used to take 2 minutes for the > 200 invocations of the empty test. Since then I have converted the ui tests to browser_tests, so this number will only be longer now. > > > > Something that might work better is making sharding supervisor force running > of > > the empty test, but without canceling it out inside test launcher. > > > > Note that the problem with log parsing is going to occur anyway. I guess I'll > > need to fix the log parser in chromium-build-logs, so please also remove that > > comment about log parsing. > > The one right above the "if (test_name == kEmptyTestName)" line? Done.
On 2012/06/04 15:40:51, Matt Menke wrote: > On 2012/06/04 15:39:13, John Abd-El-Malek wrote: > > This means that each try run will take a few minutes more, because of the > > hundreds of time browser_tests will run the empty test. Can't we make > > sharding_supervisor special case 'browser_tests' and run the empty test > serially > > first? > > That's what patch set 4 did, but Paweł said to use this approach. I'm fine > either way. oh, I didn't see that. patchset 4 lgtm.
LGTM In my previous comment, "login" should have been "logic", indeed it wouldn't make sense otherwise.
On 2012/06/04 17:00:09, Paweł Hajdan Jr. wrote: > LGTM > > In my previous comment, "login" should have been "logic", indeed it wouldn't > make sense otherwise. Great. Once we reach some sort of consensus on patch set 4 vs patch set 6, I'll land. I feel that if I landed either set now, it would be over someone's objections. I'm a bit concerned about this ending up in a stalemate. I favor 4, but I could happily live with either solution. While 4 might not be the most elegant solution, it does solve the problem without running the empty test a bunch of times. It's not the most elegant solution, but it does solve a problem that's going on now without slowing things down much. While moving the logic into test_launcher.cc may be a better long term solution, we have tests failing now, and patch set 4 fixes the issue. Patch set 6 also fixes the issue, with the downside being it slows test runs down by ~2 minutes (according to jam). Skimming over a couple random test runs of the slowest XP bots, the tests take about 30 minutes total. Including reruns, looks like there are 20 shards run, not including re-run shards (Which we won't be running the Empty test again on, in either case). Assuming 4 seconds per test, that's 20n = 80 seconds wasted. However, since we're sharding those tests across 4 instances, that should end up being closer to 80/4 = 20 seconds longer. We get 120/4 = 30 seconds, if the empty test were run on re-run shards, which may be where jam's two minutes comes from. So test runs may take about 1.1% longer (20/(30*60)). This doesn't seem like a huge penalty, though it does all add up. There may also be some advantage to set 4 not sharding the warmup time, or there may not.
On 2012/06/04 17:25:35, Matt Menke wrote: > On 2012/06/04 17:00:09, Paweł Hajdan Jr. wrote: > > LGTM > > > > In my previous comment, "login" should have been "logic", indeed it wouldn't > > make sense otherwise. > > Great. Once we reach some sort of consensus on patch set 4 vs patch set 6, I'll > land. I feel that if I landed either set now, it would be over someone's > objections. I'm a bit concerned about this ending up in a stalemate. I favor > 4, but I could happily live with either solution. > > While 4 might not be the most elegant solution, it does solve the problem > without running the empty test a bunch of times. It's not the most elegant > solution, but it does solve a problem that's going on now without slowing things > down much. While moving the logic into test_launcher.cc may be a better long > term solution, we have tests failing now, and patch set 4 fixes the issue. Changing how we do sharding is outside the scope of this, and has other dependencies as well (i.e. gtest seems to know about sharding). > > Patch set 6 also fixes the issue, with the downside being it slows test runs > down by ~2 minutes (according to jam). Skimming over a couple random test runs > of the slowest XP bots, the tests take about 30 minutes total. The overhead is lower on the _build bots_ because browser_tests is sharded across multiple machines. But on _try bots_, I had seen this empty test run 200 times. > Including > reruns, looks like there are 20 shards run, not including re-run shards (Which > we won't be running the Empty test again on, in either case). Assuming 4 > seconds per test, that's 20n = 80 seconds wasted. However, since we're sharding > those tests across 4 instances, that should end up being closer to 80/4 = 20 > seconds longer. We get 120/4 = 30 seconds, if the empty test were run on re-run > shards, which may be where jam's two minutes comes from. So test runs may take > about 1.1% longer (20/(30*60)). This doesn't seem like a huge penalty, though > it does all add up. There may also be some advantage to set 4 not sharding the > warmup time, or there may not.
On 2012/06/04 18:02:45, John Abd-El-Malek wrote: > On 2012/06/04 17:25:35, Matt Menke wrote: > > On 2012/06/04 17:00:09, Paweł Hajdan Jr. wrote: > > > LGTM > > > > > > In my previous comment, "login" should have been "logic", indeed it wouldn't > > > make sense otherwise. > > > > Great. Once we reach some sort of consensus on patch set 4 vs patch set 6, > I'll > > land. I feel that if I landed either set now, it would be over someone's > > objections. I'm a bit concerned about this ending up in a stalemate. I favor > > 4, but I could happily live with either solution. > > > > While 4 might not be the most elegant solution, it does solve the problem > > without running the empty test a bunch of times. It's not the most elegant > > solution, but it does solve a problem that's going on now without slowing > things > > down much. While moving the logic into test_launcher.cc may be a better long > > term solution, we have tests failing now, and patch set 4 fixes the issue. > > Changing how we do sharding is outside the scope of this, and has other > dependencies as well (i.e. gtest seems to know about sharding). > > > > > Patch set 6 also fixes the issue, with the downside being it slows test runs > > down by ~2 minutes (according to jam). Skimming over a couple random test > runs > > of the slowest XP bots, the tests take about 30 minutes total. > > The overhead is lower on the _build bots_ because browser_tests is sharded > across multiple machines. But on _try bots_, I had seen this empty test run 200 > times. Ahh... You're right. On my test run, it ran at least 82 times. About a second each, on average, but that is a lot of runs, and it's easy to see how that could get even more out of hand.
On Mon, Jun 4, 2012 at 11:23 AM, <mmenke@chromium.org> wrote: > On 2012/06/04 18:02:45, John Abd-El-Malek wrote: > >> On 2012/06/04 17:25:35, Matt Menke wrote: >> > On 2012/06/04 17:00:09, Paweł Hajdan Jr. wrote: >> > > LGTM >> > > >> > > In my previous comment, "login" should have been "logic", indeed it >> > wouldn't > >> > > make sense otherwise. >> > >> > Great. Once we reach some sort of consensus on patch set 4 vs patch >> set 6, >> I'll >> > land. I feel that if I landed either set now, it would be over >> someone's >> > objections. I'm a bit concerned about this ending up in a stalemate. I >> > favor > >> > 4, but I could happily live with either solution. >> > >> > While 4 might not be the most elegant solution, it does solve the >> problem >> > without running the empty test a bunch of times. It's not the most >> elegant >> > solution, but it does solve a problem that's going on now without >> slowing >> things >> > down much. While moving the logic into test_launcher.cc may be a better >> > long > >> > term solution, we have tests failing now, and patch set 4 fixes the >> issue. >> > > Changing how we do sharding is outside the scope of this, and has other >> dependencies as well (i.e. gtest seems to know about sharding). >> > > > >> > Patch set 6 also fixes the issue, with the downside being it slows test >> runs >> > down by ~2 minutes (according to jam). Skimming over a couple random >> test >> runs >> > of the slowest XP bots, the tests take about 30 minutes total. >> > > The overhead is lower on the _build bots_ because browser_tests is sharded >> across multiple machines. But on _try bots_, I had seen this empty test >> run >> > 200 > >> times. >> > > Ahh... You're right. On my test run, it ran at least 82 times. About a > second > each, on average, but that is a lot of runs, and it's easy to see how that > could > get even more out of hand. > yeah, the next generation of trybot hardware has 2x as many cores, so there'll be even more sharding :) > > http://codereview.chromium.**org/10441103/<http://codereview.chromium.org/104... >
I think that flaky test timeouts are a more serious problem than possibly wasting some time on running the empty test too many times. Can't we just make sharding supervisor force running the empty test, and then use a gtest_filter for the shards, so that they don't run the empty test? We could add a flag for the test_launcher.cc like --warmup to avoid duplicating the logic in sharding supervisor. --warmup would make test_launcher.cc run the empty test with bigger timeout and quit. WDYT?
On 2012/06/06 10:35:38, Paweł Hajdan Jr. wrote: > I think that flaky test timeouts are a more serious problem than possibly > wasting some time on running the empty test too many times. > > Can't we just make sharding supervisor force running the empty test, and then > use a gtest_filter for the shards, so that they don't run the empty test? We > could add a flag for the test_launcher.cc like --warmup to avoid duplicating the > logic in sharding supervisor. --warmup would make test_launcher.cc run the empty > test with bigger timeout and quit. > > WDYT? I don't understand why you're suggesting this, what's wrong with patchset 4? That solves the initial test problem. Wasting 2 minutes over every try run adds up. I think enough time has been spent on this and we should just commit patchset 4.
On 2012/06/06 16:56:33, John Abd-El-Malek wrote: > I don't understand why you're suggesting this, what's wrong with patchset 4? > That solves the initial test problem. Code review is not only about whether the patch solves the problem (most of the time it does). It's also about quality, and I have a strong opinion that patchset 4 degrades the quality of our test infrastructure, making it less maintainable. > Wasting 2 minutes over every try run adds up. Agreed. I have suggested possible solutions. > I think enough time has been spent on this and we should just commit patchset 4. I can't stop everyone from making hacky changes. If you really want to do it, feel free. Note that "let's just do this quick hack" shouldn't be a convincing argument. See http://dev.chromium.org/developers/committers-responsibility : "In short, do the right thing for the project, not the easiest thing to get code committed, and above all: use your best judgement.". I would be surprised if my suggested changes took more than 1-2 days to land.
On 2012/06/18 15:36:57, Paweł Hajdan Jr. wrote: > On 2012/06/06 16:56:33, John Abd-El-Malek wrote: > > I don't understand why you're suggesting this, what's wrong with patchset 4? > > That solves the initial test problem. > > Code review is not only about whether the patch solves the problem (most of the > time it does). > > It's also about quality, and I have a strong opinion that patchset 4 degrades > the quality of our test infrastructure, making it less maintainable. > > > Wasting 2 minutes over every try run adds up. > > Agreed. I have suggested possible solutions. > > > I think enough time has been spent on this and we should just commit patchset > 4. > > I can't stop everyone from making hacky changes. If you really want to do it, > feel free. Note that "let's just do this quick hack" shouldn't be a convincing > argument. See http://dev.chromium.org/developers/committers-responsibility : "In > short, do the right thing for the project, not the easiest thing to get code > committed, and above all: use your best judgement.". > > I would be surprised if my suggested changes took more than 1-2 days to land. Been meaning to get back to this, but just haven't had the time to work on areas this far afield of my purview. Paweł: I assume your suggestion was either add "--warmup" (Just to browser_tests? To all test binaries?) or add gtest_filter logic to the logic to the python files.
On 2012/06/18 15:36:57, Paweł Hajdan Jr. wrote: > On 2012/06/06 16:56:33, John Abd-El-Malek wrote: > > I don't understand why you're suggesting this, what's wrong with patchset 4? > > That solves the initial test problem. > > Code review is not only about whether the patch solves the problem (most of the > time it does). > > It's also about quality, and I have a strong opinion that patchset 4 degrades > the quality of our test infrastructure, making it less maintainable. > > > Wasting 2 minutes over every try run adds up. > > Agreed. I have suggested possible solutions. > > > I think enough time has been spent on this and we should just commit patchset > 4. > > I can't stop everyone from making hacky changes. If you really want to do it, > feel free. Note that "let's just do this quick hack" shouldn't be a convincing > argument. See http://dev.chromium.org/developers/committers-responsibility : "In > short, do the right thing for the project, not the easiest thing to get code > committed, and above all: use your best judgement.". > > I would be surprised if my suggested changes took more than 1-2 days to land. Just saying something is hacky doesn't make it so. Have you read all this thread? Can you explain exactly is hacky with patchset 4?
And to avoid another two week round-trip, here's why I don't think it's hacky. test_launcher.cc is run for every shard. It's not the right place to run this initial test because otherwise it would have to know which set of shards are running initially, and then to run the empty test in each of them. I don't know if the first part of this is possible. Even if it was, that means that the empty test would have to be run n times. So you might think your proposed solution is less hacky, but for these reasons I don't agree. The logic that Matt added is minimal, so it's not much of a duplication. I also think that running it in the sharding supervisor is the right place, again because it's the serial path that runs initially before sharding begins. We really want this empty test to just be run _once_. If you look at my reviews, you'll see that I never advocate for a hacky shortcut. I always advocate for doing the right thing. So just because you don't agree with what I'm arguing for, it's not reasonable to label it as a hack. And lastly, if you'd like to block changes from landing, then waiting two weeks to reply isn't reasonable. i.e. I understand you're busy with other stuff now, which is understandable/fine of course. But you can't chime in and say you don't like something and then step away for two weeks which leaves the original author waiting for you. On Mon, Jun 18, 2012 at 10:30 AM, <jam@chromium.org> wrote: > On 2012/06/18 15:36:57, Paweł Hajdan Jr. wrote: > >> On 2012/06/06 16:56:33, John Abd-El-Malek wrote: >> > I don't understand why you're suggesting this, what's wrong with >> patchset 4? >> > That solves the initial test problem. >> > > Code review is not only about whether the patch solves the problem (most >> of >> > the > >> time it does). >> > > It's also about quality, and I have a strong opinion that patchset 4 >> degrades >> the quality of our test infrastructure, making it less maintainable. >> > > > Wasting 2 minutes over every try run adds up. >> > > Agreed. I have suggested possible solutions. >> > > > I think enough time has been spent on this and we should just commit >> > patchset > >> 4. >> > > I can't stop everyone from making hacky changes. If you really want to do >> it, >> feel free. Note that "let's just do this quick hack" shouldn't be a >> convincing >> argument. See http://dev.chromium.org/**developers/committers-** >> responsibility<http://dev.chromium.org/developers/committers-responsibility>: >> > "In > >> short, do the right thing for the project, not the easiest thing to get >> code >> committed, and above all: use your best judgement.". >> > > I would be surprised if my suggested changes took more than 1-2 days to >> land. >> > > Just saying something is hacky doesn't make it so. Have you read all this > thread? > > Can you explain exactly is hacky with patchset 4? > > http://codereview.chromium.**org/10441103/<http://codereview.chromium.org/104... >
On 2012/06/18 16:46:03, Matt Menke wrote: > Paweł: I assume your suggestion was either add "--warmup" (Just to > browser_tests? To all test binaries?) Yeah, add --warmup (or equivalent) to browser_tests. test_launcher.cc to be more precise. Then make sharding supervisor use it to avoid duplication. > or add gtest_filter logic to the logic to the python files. I'm not sure what do you mean by that. John: note that browser_test doesn't always run through sharding supervisor, so it needs some logic to run the empty test. Sharding supervisor also needs it. Adding the --warmup switch should address my last concern. I agree that running the empty test more than once wastes time - the --warmup switch also fixes that in a way that looks better to me. Please ping me when I don't respond to a review within 1-2 days.
On 2012/06/19 16:29:25, Paweł Hajdan Jr. wrote: > On 2012/06/18 16:46:03, Matt Menke wrote: > > Paweł: I assume your suggestion was either add "--warmup" (Just to > > browser_tests? To all test binaries?) > > Yeah, add --warmup (or equivalent) to browser_tests. test_launcher.cc to be more > precise. Then make sharding supervisor use it to avoid duplication. > > > or add gtest_filter logic to the logic to the python files. > > I'm not sure what do you mean by that. You said: Can't we just make sharding supervisor force running the empty test, and then use a gtest_filter for the shards, so that they don't run the empty test? We could add a flag for the test_launcher.cc like --warmup to avoid duplicating the logic in sharding supervisor. --warmup would make test_launcher.cc run the empty test with bigger timeout and quit. The first sentence - "use a gtest_filter for the shards, so that they don't run the empty test" seems to be in direct contradiction to the rest of the paragraph.
On 2012/06/19 16:42:07, Matt Menke wrote: > The first sentence - "use a gtest_filter for the shards, so that they don't run > the empty test" seems to be in direct contradiction to the rest of the > paragraph. Ah OK - yeah that's also some option, we already skip the empty test when a filter is used. But I think it's not an optimal way, --warmup would be better.
On 2012/06/19 18:51:54, Paweł Hajdan Jr. wrote: > On 2012/06/19 16:42:07, Matt Menke wrote: > > The first sentence - "use a gtest_filter for the shards, so that they don't > run > > the empty test" seems to be in direct contradiction to the rest of the > > paragraph. > > Ah OK - yeah that's also some option, we already skip the empty test when a > filter is used. But I think it's not an optimal way, --warmup would be better. i continue to think we're wasting time here. you're advocating contradictory things when patchset 4 was fine. having sharding_supervisor know about passing in the name of the empty test vs a new flag seems seems like such a minor difference, and not a hack.
ping On Tue, Jun 19, 2012 at 12:19 PM, <jam@chromium.org> wrote: > On 2012/06/19 18:51:54, Paweł Hajdan Jr. wrote: > >> On 2012/06/19 16:42:07, Matt Menke wrote: >> > The first sentence - "use a gtest_filter for the shards, so that they >> don't >> run >> > the empty test" seems to be in direct contradiction to the rest of the >> > paragraph. >> > > Ah OK - yeah that's also some option, we already skip the empty test when >> a >> filter is used. But I think it's not an optimal way, --warmup would be >> better. >> > > i continue to think we're wasting time here. you're advocating > contradictory > things when patchset 4 was fine. having sharding_supervisor know about > passing > in the name of the empty test vs a new flag seems seems like such a minor > difference, and not a hack. > > http://codereview.chromium.**org/10441103/<http://codereview.chromium.org/104... >
My understanding is that Matt is working on --warmup switch. I think I've explained my reasoning. Feel free to commit if you don't agree with it.
On 2012/06/23 07:25:10, Paweł Hajdan Jr. wrote: > My understanding is that Matt is working on --warmup switch. > > I think I've explained my reasoning. Feel free to commit if you don't agree with > it. This is indeed the case. Sorry I haven't gotten to it sooner, just been swamped with higher priority issues.
Paweł: Is this more what you had in mind? Unfortunately, we can't really remove the old tests - running when not sharding and without a filter is needed by interactive_ui_tests, and we don't want to always run the empty test when sharding, or with a filter. Sorry it took me so long to doing such a trivial CL, just been aiming to finish up a couple things on my Q2 OKRs.
LGTM This is _exactly_ what I had in mind - thank you for doing the work. I'm going on vacation, so please do not wait for me (just in case). http://codereview.chromium.org/10441103/diff/45004/content/test/test_launcher.cc File content/test/test_launcher.cc (right): http://codereview.chromium.org/10441103/diff/45004/content/test/test_launcher... content/test/test_launcher.cc:644: if (warmup || (!should_shard && !command_line->HasSwitch(kGTestFilterFlag))) { nit: For consistency, please add bool has_filter = command_line->HasSwitch(kGTestFilterFlag); above, and make the condition if (warmup || (!should_shard && !has_filter)) http://codereview.chromium.org/10441103/diff/45004/tools/sharding_supervisor/... File tools/sharding_supervisor/sharding_supervisor.py (right): http://codereview.chromium.org/10441103/diff/45004/tools/sharding_supervisor/... tools/sharding_supervisor/sharding_supervisor.py:634: if test.find("browser_tests") > -1: nit: Why not 'browser_tests' in test?
I plan on committing this tomorrow. The code and behavior should suit everyone. If anyone wants to look at it, please either do so by then, or ask me to hold off on committing. http://codereview.chromium.org/10441103/diff/45004/content/test/test_launcher.cc File content/test/test_launcher.cc (right): http://codereview.chromium.org/10441103/diff/45004/content/test/test_launcher... content/test/test_launcher.cc:644: if (warmup || (!should_shard && !command_line->HasSwitch(kGTestFilterFlag))) { On 2012/06/27 07:48:52, Paweł Hajdan Jr. wrote: > nit: For consistency, please add > > bool has_filter = command_line->HasSwitch(kGTestFilterFlag); > > above, and make the condition > > if (warmup || (!should_shard && !has_filter)) Done. http://codereview.chromium.org/10441103/diff/45004/tools/sharding_supervisor/... File tools/sharding_supervisor/sharding_supervisor.py (right): http://codereview.chromium.org/10441103/diff/45004/tools/sharding_supervisor/... tools/sharding_supervisor/sharding_supervisor.py:634: if test.find("browser_tests") > -1: On 2012/06/27 07:48:52, Paweł Hajdan Jr. wrote: > nit: Why not 'browser_tests' in test? Mostly because I don't really know Python. Done.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mmenke@chromium.org/10441103/56001
Try job failure for 10441103-56001 (retry) on win_rel for step "browser_tests". It's a second try, previously, step "browser_tests" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mmenke@chromium.org/10441103/56001
Try job failure for 10441103-56001 (retry) on mac_rel for step "browser_tests". It's a second try, previously, step "browser_tests" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=mac_rel&nu...
And it turns out not to be working... Looks like the CL to enable incremental linking for browser_tests and unit_tests killed InProcessBrowserTest.Empty. I didn't notice because the output when running an empty set of tests looks very much like the output when running a single test.
doh. looking into this.
ok, supporting the empty test might be hard since test_launcher.cc is compiled in test_support_common now (as opposed to having to be added to every browser_test binary) and that target doesn't have the necessary defines to define a browser test. It seems that what you really care about is loading the binary. Whether it launches itself again (through an empty test) is unnecessary then, since the binary is already loaded. So can we just make the warmup step just launch the test binary with "gtest_list_tests" and that's it?
On 2012/06/29 18:14:42, John Abd-El-Malek wrote: > ok, supporting the empty test might be hard since test_launcher.cc is compiled > in test_support_common now (as opposed to having to be added to every > browser_test binary) and that target doesn't have the necessary defines to > define a browser test. > > It seems that what you really care about is loading the binary. Whether it > launches itself again (through an empty test) is unnecessary then, since the > binary is already loaded. So can we just make the warmup step just launch the > test binary with "gtest_list_tests" and that's it? I don't think that's enough...The binary itself is the one that launches the Empty test in another process, so the binary is actually already loaded. Also, we're actually running the binary a second time with a gtest_filter=InProcessBrowserTest.Empty, it just doesn't find the test when walking through its tests, so don't think that gtest_list_tests would add any new behavior.
On 2012/06/29 18:18:26, Matt Menke wrote: > On 2012/06/29 18:14:42, John Abd-El-Malek wrote: > > ok, supporting the empty test might be hard since test_launcher.cc is compiled > > in test_support_common now (as opposed to having to be added to every > > browser_test binary) and that target doesn't have the necessary defines to > > define a browser test. > > > > It seems that what you really care about is loading the binary. Whether it > > launches itself again (through an empty test) is unnecessary then, since the > > binary is already loaded. So can we just make the warmup step just launch the > > test binary with "gtest_list_tests" and that's it? > > I don't think that's enough...The binary itself is the one that launches the > Empty test in another process, so the binary is actually already loaded. Also, > we're actually running the binary a second time with a > gtest_filter=InProcessBrowserTest.Empty, it just doesn't find the test when > walking through its tests, so don't think that gtest_list_tests would add any > new behavior. good point, ok let me look into this more.
ok, this should work: http://codereview.chromium.org/10692048 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
