Chromium Code Reviews
Created: 7 years, 6 months ago by bajones
Modified: 7 years, 5 months ago
CC: chromium-reviews, chrome-speed-team+watch_google.com, telemetry+watch_chromium.org, Ken Russell
Base URL: svn://svn.chromium.org/chrome/trunk/src
Visibility: Public.
Description: Defining the test expectations object
BUG=245741
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=212742
Patch Set 1 #
Patch Set 2 : Updated to check OS modifiers on platform, not browser #
Patch Set 3 : Updated to lcase for consistency, removed option for array of expectations #
Patch Set 4 : Modifiers is now explicitly an array, removed "SetExpectations" method #
Patch Set 5 : Added expectations integration #
Total comments: 16
Patch Set 6 : Addressing Dave's feedback #
Patch Set 7 : Now using SetExpectations, added Fail handling. #
Total comments: 15
Patch Set 8 : Addressed feedback, Added unit tests #
Patch Set 9 : Rebase #

Messages
Total messages: 36 (0 generated)
WIP
Using this, expectations could be defined in two ways. One is on a page set JSON
entry:
{
  "url": "some_url.html",
  "expectations": [
    {"bug": 1234, "modifiers": "WIN MAC", "expect": "SKIP"},
    {"bug": null, "modifiers": "CHROMEOS", "expect": "FAIL"}
  ]
}
The second way would be via an explicit class instantiation that would be passed
to the page set (probably during construction):
class MyExpectations(PageSetExpectations):
  def SetExpectations(self, page_set):
    self.SKIP(1234, "MAC WIN", "url1")
    self.FAIL(5678, "LINUX", "url2")
    self.TIMEOUT(None, "MOUNTAINLION", "url3_*")
The page set calls SetExpectations with itself so that the class can set
expectations algorithmically if needed.
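For illustration only, an algorithmically defined expectation might look roughly like this; the page_set.pages attribute, the class name, and the URL check are assumptions for this sketch, not part of the patch:

  class MyGeneratedExpectations(PageSetExpectations):
    def SetExpectations(self, page_set):
      # Inspect the fully built page set and derive expectations from each
      # page's properties instead of hard-coding URLs.
      for page in page_set.pages:
        if page.url.endswith('_experimental.html'):
          self.FAIL(None, "CHROMEOS", page.url)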
We talked about expectations for run_benchmark. We should make sure this fits in with that. Also, why have a separate expectations file instead of just listing the expectations in the pageset itself?
On 2013/06/07 17:21:27, tonyg wrote:
> We talked about expectations for run_benchmark. We should make sure this fits in
> with that.

I'm not sure what that entails.

> Also, why have a separate expectations file instead of just listing the
> expectations in the pageset itself?

Expectations _could_ be set directly on the pageset (see "way one" in my first comment), but in the case of dynamically generated page sets (as with the WebGL conformance tests) the expectations need to reside in a separate, sheriff-friendly location and format.
On 2013/06/07 17:58:34, bajones wrote:
> On 2013/06/07 17:21:27, tonyg wrote:
> > We talked about expectations for run_benchmark. We should make sure this
> > fits in with that.
>
> I'm not sure what that entails.
>
You can set expectations at the test level (disabling the entire test) and at
the page level. Tests are defined something like this:
class Smoothness(test.TelemetryTest):
  test = measurements.Smoothness
  page_set = 'page_sets/top_25.json'
  disabled = True

class WebGlConformance(test.TelemetryTest):
  test = tests.WebGlConformance

  def CreatePageSet(self, options):
    return ...
> > Also, why have a separate expectations file instead of just listing the
> > expectations in the pageset itself?
>
> Expectations _could_ be set directly on the pageset (see "way one" in my first
> comment), but in the case of dynamically generated page sets (as with the WebGL
> conformance tests) the expectations need to reside in a separate,
> sheriff-friendly location and format.
I can see two ideas for this:
- Having a method alongside CreatePageSet() that sets the expectations for that
page set.
- Having a negative page set file that is like a page set file, but disables
pages instead of adding them.
I'm also not a big fan of having a lot of different types of expectations. What
are the use cases for having separate flaky/fail/skip/timeout? I think it could
just be either run the test normally, or run it and don't report the result.
On 2013/06/07 18:11:31, Dave Tu wrote:
> I can see two ideas for this:
> - Having a method alongside CreatePageSet() that sets the expectations for that
> page set.
> - Having a negative page set file that is like a page set file, but disables
> pages instead of adding them.

Your second idea is essentially what this code does, just stated in terms of Python rather than JSON or plain text. I prefer having the expectations listed in a separate location from the test logic because they will presumably be updated frequently to help keep the bots green. I'd rather not force people to dig through test code to mark something flaky.

> I'm also not a big fan of having a lot of different types of expectations. What
> are the use cases for having separate flaky/fail/skip/timeout? I think it could
> just be either run the test normally, or run it and don't report the result.

+1 to that. I have no idea why the current WebKit tests (which this is modeled after) have so many variants. I'm all for simplifying the Telemetry version down to "SKIP" (don't run the test at all) and "FAIL" (run the test but don't report failures/log unexpected passes).
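To make the proposed two-state model concrete, here is a rough sketch of the semantics being discussed; the function names and return values are illustrative only, not this CL's code:

  def RunPageWithExpectation(run_page, page, expectation):
    # expectation is None, 'skip', or 'fail'.
    if expectation == 'skip':
      return 'skipped'             # Don't run the page at all.
    passed = run_page(page)        # run_page stands in for the real page runner.
    if expectation == 'fail':
      # Run the page, but don't report a failure; flag unexpected passes.
      return 'unexpected pass' if passed else 'expected failure'
    return 'pass' if passed else 'failure'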
We should use the chrome form of gardening marking tests as fail when possible, and only separate out expectations when strictly necessary
On 2013/06/10 19:59:43, nduca wrote: > We should use the chrome form of gardening marking tests as fail when possible, > and only separate out expectations when strictly necessary Forgive me, but I'm not familiar with what the "Chrome form of gardening" means in this case. Is there a different way to mark tests as failed besides expectations?
On 2013/06/07 20:14:02, bajones wrote:
> On 2013/06/07 18:11:31, Dave Tu wrote:
> > I can see two ideas for this:
> > - Having a method alongside CreatePageSet() that sets the expectations for that
> > page set.
> > - Having a negative page set file that is like a page set file, but disables
> > pages instead of adding them.
>
> Your second idea is essentially what this code does, just stated in terms of
> Python rather than JSON or plain text.
>
> I prefer having the expectations listed in a separate location from the test logic
> because they will presumably be updated frequently to help keep the bots green.
> I'd rather not force people to dig through test code to mark something flaky.

The new thing will separate CreatePageSet from the measurement logic. There will be a new file that just defines a list of tests, where a test is (name + description + which measurement + which page set).

> > I'm also not a big fan of having a lot of different types of expectations. What
> > are the use cases for having separate flaky/fail/skip/timeout? I think it could
> > just be either run the test normally, or run it and don't report the result.
>
> +1 to that. I have no idea why the current WebKit tests (which this is modeled
> after) have so many variants. I'm all for simplifying the Telemetry version down
> to "SKIP" (don't run the test at all) and "FAIL" (run the test but don't report
> failures/log unexpected passes).
Now that the Telemetry test runner has been updated, it's a good time to work on
the expectations again. I've made sure my code works with the ToT Telemetry
(though that has little effect on this particular file) and would like to get
the basic structure of this class OKed before pushing the CL that integrates it
with the PageSet/PageRunner
To review, expectations would be defined using this class in one of two ways. For
PageSets defined by JSON the expectations could be defined inline:
{
  "url": "some_url.html",
  "expectations": [
    {"bug": 1234, "modifiers": "win mac", "expect": "skip"},
    {"bug": null, "modifiers": "chromeos", "expect": "fail"}
  ]
}
For dynamically generated PageSets, like the WebGL Conformance test, the
expectations would be added via a separate Python file:
class MyExpectations(PageSetExpectations):
  def SetExpectations(self, page_set):
    self.Skip(1234, "mac win", "url1")
    self.Fail(5678, "linux", "url2")
This would most likely be provided to the PageSet's constructor, and it would
call SetExpectations after the PageSet was complete (so that SetExpectations
could iterate through the pages if necessary).
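A minimal sketch of that wiring, assuming the PageSet keeps its pages in a list; the constructor signature and parameter names here are hypothetical, not the patch's code:

  class PageSet(object):
    def __init__(self, pages, expectations=None):
      self.pages = pages
      self.expectations = expectations or PageSetExpectations()
      # Called only after the page list is complete, so SetExpectations can
      # iterate over self.pages if it needs to.
      self.expectations.SetExpectations(self)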
Friendly ping after a long holiday weekend! :)
Re-Ping. dtu (or tonyg?), if you could review this it would be appreciated. nduca is swamped right now and unavailable for review.
On 2013/07/10 22:20:58, bajones wrote:
> Re-Ping.
>
> dtu (or tonyg?), if you could review this it would be appreciated. nduca is
> swamped right now and unavailable for review.

I think the place for this is to have a method alongside CreatePageSet() that also does SetExpectations(), inside the test.Test. The expectations are specific to a (PageTest, PageSet) combination, and that's exactly what a test.Test is.

I'm not convinced you need more than a boolean: either you expect it to pass and report the result, or you run it but don't report the result. I don't know if skipping should be handled for individual pages. Nat's thing (https://chromiumcodereview.appspot.com/18576004/) allows us to control which tests run on which platforms for each test.Test. I might be missing some potential use cases here, though.

Do we need to be able to express combinations of OS and GPU modifiers, e.g. we fail specifically on Intel GPUs running WinXP? So maybe instead of 'mac win' we could do ['mac', 'win'] to specify a logical OR and 'mac nvidia debug' to specify a logical AND? Just one possibility I'm throwing out there.
+zmo for requirements on the test expectations. I believe it is required that OS+GPU combinations have to be expressible.
On 2013/07/11 17:49:39, Ken Russell wrote:
> +zmo for requirements on the test expectations. I believe it is required that
> OS+GPU combinations have to be expressible.

I see there is a TODO in the file to support more dimensions. This CL is not final and not ready to be used, as it will disable more tests than necessary, but I am OK with it landing as is, if the intention is to finish the TODO before turning this on.

A few things are missing:
1) As Ken mentioned, combinations of modifiers should be supported.
2) We should be able to describe a test expectation based on a specific card, not just the vendor. This will allow us to add multiple cards from the same vendor to the lab (which we plan to do, as the infra team recently told us space is not a problem).
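As a purely hypothetical illustration of the second point, a device-level condition might eventually read something like the second line below; the device-id syntax and argument order are invented for this sketch and are not part of the CL:

  # Vendor-level condition, as the current proposal supports:
  self.Fail(12345, ['win', 'nvidia'], 'some_test.html')
  # Hypothetical device-level condition for one specific card, keyed by PCI device id:
  self.Fail(12345, ['win', ('nvidia', 0x0df8)], 'some_test.html')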
On 2013/07/11 08:22:29, Dave Tu wrote:
> On 2013/07/10 22:20:58, bajones wrote:
> > Re-Ping.
> >
> > dtu (or tonyg?), if you could review this it would be appreciated. nduca is
> > swamped right now and unavailable for review.
>
> I think the place for this is to have a method alongside CreatePageSet() that
> also does SetExpectations(), inside the test.Test. The expectations are specific
> to a (PageTest, PageSet) combination, and that's exactly what a test.Test is.

This would probably indicate that the SetExpectations method is unnecessary, but otherwise (on nduca's suggestion) I'm leaving integration details to a separate CL.

> I'm not convinced you need more than a boolean: either you expect it to pass and
> report the result, or you run it but don't report the result. I don't know if
> skipping should be handled for individual pages. Nat's thing
> (https://chromiumcodereview.appspot.com/18576004/) allows us to control which
> tests run on which platforms for each test.Test. I might be missing some
> potential use cases here, though.
>
> Do we need to be able to express combinations of OS and GPU modifiers, e.g. we
> fail specifically on Intel GPUs running WinXP? So maybe instead of 'mac win' we
> could do ['mac', 'win'] to specify a logical OR and 'mac nvidia debug' to specify
> a logical AND? Just one possibility I'm throwing out there.

We do, as it's not uncommon for us to find things like OpenGL features that are only buggy on Macs with AMD chipsets. Beyond that, the modifiers have a natural boolean order. You can't be on more than one OS at a time, you can't have more than one GPU at a time (for our purposes, anyway), and you can't be running release and debug at the same time. So it logically becomes:

if (OS or OS) and (GPU or GPU) and (CONFIG or CONFIG):

Although using actual arrays here rather than string splitting is, in retrospect, a much better way of handling this, and I'll update the code shortly.
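A small sketch of that matching logic under those assumptions; the function and attribute names here are illustrative, not the patch's actual API:

  def _CategoryMatches(actual, allowed):
    # An empty list of modifiers for a category means "no constraint".
    return not allowed or actual in allowed

  def ExpectationApplies(exp, os_name, gpu_vendor, build_config):
    # OR within each category, AND across categories.
    return (_CategoryMatches(os_name, exp.os_modifiers) and
            _CategoryMatches(gpu_vendor, exp.gpu_modifiers) and
            _CategoryMatches(build_config, exp.config_modifiers))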
On 2013/07/11 17:59:49, bajones wrote: > On 2013/07/11 08:22:29, Dave Tu wrote: > > On 2013/07/10 22:20:58, bajones wrote: > > > Re-Ping. > > > > > > dtu (or tonyg?), if you could review this it would be appreciated. nduca is > > > swamped right now and unavailable for review. > > > > I think the place for this is to have a method alongside CreatePageSet() that > > also does SetExpectations(), inside the test.Test. The expectations are > specific > > to a (PageTest, PageSet) combination, and that's exactly what a test.Test is. > > This would probably indicate that the SetExpectations method is unnecessary, but > otherwise (on nduca's suggestion) I'm leaving integration details to a separate > CL. > > > I'm not convinced you need more than a boolean, either you expect it to pass > and > > report the result, or you run it but don't report the result. I don't know if > > skipping should be handled for individual pages. Nat's thing > > (https://chromiumcodereview.appspot.com/18576004/) allows us to control which > > tests run on which platforms for each test.Test. I might be missing some > > potential use cases here, though. > > > > Do we need to be able to express combinations of OS and GPU modifiers, e.g. we > > fail specifically on Intel GPUs running WinXP? So maybe instead of 'mac win' > we > > could do ['mac, 'win'] to specify a logical OR and 'mac nvidia debug' to > specify > > a logical AND? Just one possibility I'm throwing out there. > > We do, as it's not uncommon for us to find things like OpenGL features that are > only buggy on Macs with AMD chipsets. Beyond that the modifiers have a natural > boolean order. You can't be on mor than one OS at a time, and your can't have > more than one GPU at a time (for our purposes anyway), and you can't be running > release and debug at the same time. So it logically becomes: > > if (OS or OS) and (GPU or GPU) and (CONFIG or CONFIG): > > Although using actual arrays here rather than string splitting is, in > retrospect, a much better way of handling this and I'll update the code shortly. What if you wanted to specify Windows+Intel and OSX+Nvidia, but not Windows+Nvidia?
On 2013/07/11 23:50:43, Dave Tu wrote: > On 2013/07/11 17:59:49, bajones wrote: > > On 2013/07/11 08:22:29, Dave Tu wrote: > > > On 2013/07/10 22:20:58, bajones wrote: > > > > Re-Ping. > > > > > > > > dtu (or tonyg?), if you could review this it would be appreciated. nduca > is > > > > swamped right now and unavailable for review. > > > > > > I think the place for this is to have a method alongside CreatePageSet() > that > > > also does SetExpectations(), inside the test.Test. The expectations are > > specific > > > to a (PageTest, PageSet) combination, and that's exactly what a test.Test > is. > > > > This would probably indicate that the SetExpectations method is unnecessary, > but > > otherwise (on nduca's suggestion) I'm leaving integration details to a > separate > > CL. > > > > > I'm not convinced you need more than a boolean, either you expect it to pass > > and > > > report the result, or you run it but don't report the result. I don't know > if > > > skipping should be handled for individual pages. Nat's thing > > > (https://chromiumcodereview.appspot.com/18576004/) allows us to control > which > > > tests run on which platforms for each test.Test. I might be missing some > > > potential use cases here, though. > > > > > > Do we need to be able to express combinations of OS and GPU modifiers, e.g. > we > > > fail specifically on Intel GPUs running WinXP? So maybe instead of 'mac win' > > we > > > could do ['mac, 'win'] to specify a logical OR and 'mac nvidia debug' to > > specify > > > a logical AND? Just one possibility I'm throwing out there. > > > > We do, as it's not uncommon for us to find things like OpenGL features that > are > > only buggy on Macs with AMD chipsets. Beyond that the modifiers have a natural > > boolean order. You can't be on mor than one OS at a time, and your can't have > > more than one GPU at a time (for our purposes anyway), and you can't be > running > > release and debug at the same time. So it logically becomes: > > > > if (OS or OS) and (GPU or GPU) and (CONFIG or CONFIG): > > > > Although using actual arrays here rather than string splitting is, in > > retrospect, a much better way of handling this and I'll update the code > shortly. > > What if you wanted to specify Windows+Intel and OSX+Nvidia, but not > Windows+Nvidia? In the current expectations files you would express that as two different entries, which I feel is an appropriate pattern to carry through into Telemetry.
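For example, the Windows+Intel / OSX+Nvidia case would simply be two separate entries along these lines; the argument order follows the earlier examples in this thread and is illustrative only:

  self.Fail(1234, ['win', 'intel'], 'some_test.html')   # Windows + Intel
  self.Fail(1234, ['mac', 'nvidia'], 'some_test.html')  # OSX + Nvidia
  # No entry covers Windows + Nvidia, so that combination runs normally.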
On 2013/07/12 03:35:12, bajones wrote: > On 2013/07/11 23:50:43, Dave Tu wrote: > > On 2013/07/11 17:59:49, bajones wrote: > > > On 2013/07/11 08:22:29, Dave Tu wrote: > > > > On 2013/07/10 22:20:58, bajones wrote: > > > > > Re-Ping. > > > > > > > > > > dtu (or tonyg?), if you could review this it would be appreciated. nduca > > is > > > > > swamped right now and unavailable for review. > > > > > > > > I think the place for this is to have a method alongside CreatePageSet() > > that > > > > also does SetExpectations(), inside the test.Test. The expectations are > > > specific > > > > to a (PageTest, PageSet) combination, and that's exactly what a test.Test > > is. > > > > > > This would probably indicate that the SetExpectations method is unnecessary, > > but > > > otherwise (on nduca's suggestion) I'm leaving integration details to a > > separate > > > CL. > > > > > > > I'm not convinced you need more than a boolean, either you expect it to > pass > > > and > > > > report the result, or you run it but don't report the result. I don't know > > if > > > > skipping should be handled for individual pages. Nat's thing > > > > (https://chromiumcodereview.appspot.com/18576004/) allows us to control > > which > > > > tests run on which platforms for each test.Test. I might be missing some > > > > potential use cases here, though. > > > > > > > > Do we need to be able to express combinations of OS and GPU modifiers, > e.g. > > we > > > > fail specifically on Intel GPUs running WinXP? So maybe instead of 'mac > win' > > > we > > > > could do ['mac, 'win'] to specify a logical OR and 'mac nvidia debug' to > > > specify > > > > a logical AND? Just one possibility I'm throwing out there. > > > > > > We do, as it's not uncommon for us to find things like OpenGL features that > > are > > > only buggy on Macs with AMD chipsets. Beyond that the modifiers have a > natural > > > boolean order. You can't be on mor than one OS at a time, and your can't > have > > > more than one GPU at a time (for our purposes anyway), and you can't be > > running > > > release and debug at the same time. So it logically becomes: > > > > > > if (OS or OS) and (GPU or GPU) and (CONFIG or CONFIG): > > > > > > Although using actual arrays here rather than string splitting is, in > > > retrospect, a much better way of handling this and I'll update the code > > shortly. > > > > What if you wanted to specify Windows+Intel and OSX+Nvidia, but not > > Windows+Nvidia? > > In the current expectations files you would express that as two different > entries, which I feel is an appropriate pattern to carry through into Telemetry. Okay, that sounds reasonable.
Alright, I'd really love to move on to the next stage of this feature (integration with TestRunner), so let's try and wrap this up. Please keep in mind that this patch is explicitly not intended to look at integration with the rest of the system, only the definition of the expectations themselves.

I feel there's a justification for states beyond simply "pass/fail". "skip", for example, is useful if a test is known to destabilize the system. If "fail" still runs the test but expects failure, then it doesn't handle a kernel panic gracefully. I also see a use case for "timeout", where a test is intended to succeed but runs too long. "flaky" also makes sense. I'm leaving most of this for future patches on an "as needed" basis, but since "skip" is trivial to implement I've included it to prevent anyone from aggressively refactoring this value into a bool.

Otherwise, I think the current code provides the required functionality for implementing the first pass at test expectations, and is capable of expanding to meet future needs. If there are no further concerns, can I get an LG?
On 2013/07/15 17:29:56, bajones wrote:
> Alright, I'd really love to move on to the next stage of this feature
> (integration with TestRunner), so let's try and wrap this up. Please keep in mind
> that this patch is explicitly not intended to look at integration with the rest
> of the system, only the definition of the expectations themselves.

That's not how we do things. We land things that work, or we don't land them. Please design for scale: "Does your patch make the project cleaner, or harder to understand?"
What changes are needed or desired for this CL to land? It looks like all the previous review feedback has been addressed.
On 2013/07/15 18:34:35, nduca wrote:
> On 2013/07/15 17:29:56, bajones wrote:
> > Alright, I'd really love to move on to the next stage of this feature
> > (integration with TestRunner), so let's try and wrap this up. Please keep in mind
> > that this patch is explicitly not intended to look at integration with the rest
> > of the system, only the definition of the expectations themselves.
>
> That's not how we do things. We land things that work, or we don't land them.
> Please design for scale: "Does your patch make the project cleaner, or harder to
> understand?"

@nduca: If we want to address the integration of this component before we land it, I will be happy to merge in that portion of the patch so we can examine the end-to-end story. The fact that this CL is separate was based on your suggestion of modularizing the CLs for easier reviewing. What I was trying to steer the conversation away from is getting caught up in minutiae about how the expectations will be used that doesn't directly inform the design of this object. CL feedback should be actionable, which would imply either that most integration discussion is deferred or that the scope of this CL should be expanded to include it.
After a brief face-to-face with dtu, I've added the integration code to help clarify this review.

https://codereview.chromium.org/16158006/diff/36001/tools/telemetry/telemetry...
File tools/telemetry/telemetry/page/page_runner.py (right):

https://codereview.chromium.org/16158006/diff/36001/tools/telemetry/telemetry...
tools/telemetry/telemetry/page/page_runner.py:354: expectation == "fail" or
In this revision of the code "fail" is being treated as a skip for simplicity. This is not correct. "fail" should indicate that the test is run as usual but that failures are ignored/overridden, and if the test completes successfully we should probably raise a flag about it.

It's not clear to me what the best way to handle this is, and advice on the matter is appreciated!
https://codereview.chromium.org/16158006/diff/36001/tools/telemetry/telemetry...
File tools/telemetry/telemetry/page/page_runner.py (right):

https://codereview.chromium.org/16158006/diff/36001/tools/telemetry/telemetry...
tools/telemetry/telemetry/page/page_runner.py:354: expectation == "fail" or
On 2013/07/15 23:37:17, bajones wrote:
> In this revision of the code "fail" is being treated as a skip for simplicity.
> This is not correct. "fail" should indicate that the test is run as usual but
> that failures are ignored/overridden, and if the test completes successfully we
> should probably raise a flag about it.
>
> It's not clear to me what the best way to handle this is, and advice on the
> matter is appreciated!

In the "except page_test.Failure" clause below, can the result be added as a success if the expectation is "fail", and otherwise be added as a failure? That would match how the Blink layout test harness reports its results.
Thanks! It's easier for me to understand how the Expectations object fits in now. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... File tools/telemetry/telemetry/page/page_runner.py (right): https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_runner.py:233: expectation = page_set.expectations.GetExpectationForPage( If expectations are tied to the Test instead of the PageSet (see my comment in PageSet), then you can't get page_set.expectations. Better to pass it in from Run(test, page_set, expectations, options) https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_runner.py:354: expectation == "fail" or On 2013/07/15 23:52:59, Ken Russell wrote: > On 2013/07/15 23:37:17, bajones wrote: > > In this revision of the code "fail" is being treated as a skip for simplicity. > > This is not correct. "fail" should indicate that the test is run as usual but > > that failures are ignored/overridden, and if the test completes successfully > we > > should probably raise a flag about it. > > > > It's not clear to me what the best way to handle this is, and advice on the > > matter is appreciated! > > In the "except page_test.Failure" clause below, can the result be added as a > success if the expectation is "fail", and otherwise, be added as a failure? That > would match how the Blink layout test harness reports its results. Yep, that's the way to do it. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... File tools/telemetry/telemetry/page/page_set.py (right): https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set.py:57: if 'expectations' in page_attributes: For now I'm gonna say don't allow setting expectations in the page set. You run the risk of disabling pages in benchmarks that you didn't intend do, since more and more of them are using the same page sets. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... File tools/telemetry/telemetry/page/page_set_expectations.py (right): https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:14: def __init__(self, bug, modifiers, name, expectation): "url_pattern" instead of "name". Same with "url" below. "conditions" instead of "modifiers" https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:24: if not "://" in self.name: Single quotes for all strings. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:27: if modifiers: Make the argument explicitly optional in the parameters. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:35: self.config_modifiers.append(mod) else raise ValueError https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:45: class PageSetExpectations(object): TestExpectations https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... 
tools/telemetry/telemetry/page/page_set_expectations.py:51: def Pass(self, bug, modifiers, name): I'm okay with this being extensible and not just a boolean pass/fail, but let's keep the set minimal to start with and add more as needed. Can we just stick with "Fail" for now? "Pass" is implicit from not having an expectation defined. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:60: def Expect(self, bug, modifiers, name, expectation): I don't want to allow arbitrary expectation strings in the public API, because the different behaviors have to be hard-coded into PageRunner. Merge with AddExpectation and make private. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_set_expectations.py:77: if not expectation: Disallow calling this method with no expectation argument. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... File tools/telemetry/telemetry/page/page_test.py (right): https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/page/page_test.py:148: return None Return an empty Expectations object. Then you don't need an if check in PageTestRunner. https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... File tools/telemetry/telemetry/test.py (right): https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/test.py:28: expect = self.SetExpectations(ps) Better to pass in ps.expectations instead of returning it, to make the definition side cleaner. def SetExpectations(expect): expect.Fail(123456, ['win'], 'file:///*.html') instead of def SetExpectations(ps): ps.expectations.Fail(123456, ['win'], 'file:///*.html') return ps.expectations https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te... tools/telemetry/telemetry/test.py:42: return page_set.PageSet.FromFile(self.page_set) And stick in a default SetExpectations() definition that returns an empty Expectations object.
Thanks for the detailed feedback! I've got a couple changes that I still need to make, but I wanted to bounce some thoughts off you first:

On 2013/07/16 09:50:49, Dave Tu wrote:
> def SetExpectations(expect):
>   expect.Fail(123456, ['win'], 'file:///*.html')
>
> instead of
>
> def SetExpectations(ps):
>   ps.expectations.Fail(123456, ['win'], 'file:///*.html')
>   return ps.expectations
>
> https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te...
> tools/telemetry/telemetry/test.py:42: return page_set.PageSet.FromFile(self.page_set)
> And stick in a default SetExpectations() definition that returns an empty
> Expectations object.

For one, I changed up the argument order in the latest patch. You would now declare them as:

  expectations.Fail("file:///*.html", ['win'], bug=123456)

Which feels a bit more logical to me. It allows you to omit conditions entirely if you choose, so that the test is always marked for failure.

Secondly, I feel it's pretty important that we be able to define the expectations in a separate file from the test itself. This prevents large amounts of clutter in the middle of the code and makes it easier for gardeners to manipulate. As such, my current plan was to change this to "CreateExpectations" and have a separate file defined like so:

  class WebGLConformanceExpectations(test_expectations.TestExpectations):
    def __init__(self):
      super(WebGLConformanceExpectations, self).__init__()
      self.Fail("gl-enable-vertex-attrib.html", ["mac", "win", "linux"])
      self.Fail("conformance_extensions_oes_standard_derivatives", ["win"],
                bug=134743)
      # etc...

Which was then included in the test like so:

  class WebglConformance(test_module.Test):
    # Blah blah blah
    def CreateExpectations(self, page_set):
      return WebGLConformanceExpectations()

It's a little awkward to have the constructor boilerplate in the Expectations file, but I think it's reasonable. Thoughts?
On 2013/07/16 18:29:10, bajones wrote:
> Thanks for the detailed feedback! I've got a couple changes that I still need to
> make, but I wanted to bounce some thoughts off you first:
>
> On 2013/07/16 09:50:49, Dave Tu wrote:
> > def SetExpectations(expect):
> >   expect.Fail(123456, ['win'], 'file:///*.html')
> >
> > instead of
> >
> > def SetExpectations(ps):
> >   ps.expectations.Fail(123456, ['win'], 'file:///*.html')
> >   return ps.expectations
> >
> > https://chromiumcodereview.appspot.com/16158006/diff/36001/tools/telemetry/te...
> > tools/telemetry/telemetry/test.py:42: return page_set.PageSet.FromFile(self.page_set)
> > And stick in a default SetExpectations() definition that returns an empty
> > Expectations object.
>
> For one, I changed up the argument order in the latest patch. You would now
> declare them as:
>
>   expectations.Fail("file:///*.html", ['win'], bug=123456)
>
> Which feels a bit more logical to me. It allows you to omit conditions entirely
> if you choose, so that the test is always marked for failure.

Cool, sgtm.

> Secondly, I feel it's pretty important that we be able to define the
> expectations in a separate file from the test itself. This prevents large
> amounts of clutter in the middle of the code and makes it easier for gardeners
> to manipulate. As such, my current plan was to change this to
> "CreateExpectations" and have a separate file defined like so:
>
>   class WebGLConformanceExpectations(test_expectations.TestExpectations):
>     def __init__(self):
>       super(WebGLConformanceExpectations, self).__init__()
>       self.Fail("gl-enable-vertex-attrib.html", ["mac", "win", "linux"])
>       self.Fail("conformance_extensions_oes_standard_derivatives", ["win"],
>                 bug=134743)
>       # etc...
>
> Which was then included in the test like so:
>
>   class WebglConformance(test_module.Test):
>     # Blah blah blah
>     def CreateExpectations(self, page_set):
>       return WebGLConformanceExpectations()
>
> It's a little awkward to have the constructor boilerplate in the Expectations
> file, but I think it's reasonable. Thoughts?

This also sounds good. Measurements and PageSets are also done in a similar way. You could reduce the amount of boilerplate in two ways:

1) Have the constructor call a method that you can override to set the expectations.

  # definition
  class TestExpectations(object):
    def __init__(self):
      self.SetExpectations()

    def SetExpectations(self):
      pass

  # user
  class WebGLConformanceExpectations(test_expectations.TestExpectations):
    def SetExpectations(self):
      self.Fail('gl-enable-vertex-attrib.html', ['mac', 'win', 'linux'])

2) For the Test, just assign the expectations class to a member variable.

  # definition
  class Test(object):
    # Existing code...
    expectations = test_expectations.TestExpectations()

    def Run(self, options):
      # Existing code...
      results = page_runner.Run(test, ps, self.expectations(), options)
      # Existing code...

  # user
  class WebglConformance(test_module.Test):
    expectations = WebGLConformanceExpectations
TestExpectations are now set by calling SetExpectations in the constructor.

https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry...
File tools/telemetry/telemetry/page/page_runner.py (right):

https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry...
tools/telemetry/telemetry/page/page_runner.py:366: except page_test.Failure:
I would appreciate a logic check here. My tests with the WebGL conformance tests don't seem to be triggering failures properly, but I suspect that's a problem with the WebGL tests, not this code.

https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry...
File tools/telemetry/telemetry/page/test_expectations.py (right):

https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry...
tools/telemetry/telemetry/page/test_expectations.py:4:
Is this file in the right location?
Getting close :) https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... File tools/telemetry/telemetry/page/page_runner.py (right): https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_runner.py:238: expectation) nit: indent https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_runner.py:352: def _RunPage(test, page, tab, results, options, expectation): nit: test, page, tab, expectation, results https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_runner.py:366: except page_test.Failure: On 2013/07/16 20:24:07, bajones wrote: > I would appreciate a logic check here. My tests with the WebGL conformance tests > don't seem to be triggering failures properly, but I suspect that's a problem > with the WebGL tests, not this code. Oh, I see. In the WebGL tests, instead of results.AddFailureMessage, it should be raise page_test.Failure https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_runner.py:368: if expectation == "fail": nit: single quotes, below too https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... File tools/telemetry/telemetry/page/page_runner_unittest.py (right): https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_runner_unittest.py:15: from telemetry.page import test_expectations Looks like we need to add unit tests for test expectations too. https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... File tools/telemetry/telemetry/page/page_test_runner.py (right): https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_test_runner.py:39: test, ps, expectations, = self.ParseCommandLine(sys.argv, base_dir, nit: extra comma https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/page_test_runner.py:171: ps, expectations = self.GetPageSet(test, page_set_filenames) It doesn't make sense for GetPageSet to return expectations. Just call test.CreateExpectations(ps) here. https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... File tools/telemetry/telemetry/page/test_expectations.py (right): https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/test_expectations.py:4: On 2013/07/16 20:24:07, bajones wrote: > Is this file in the right location? Yup! https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/test_expectations.py:37: raise ValueError Sorry, it's raise ValueError('Unknown expectation condition: "%s"' % condition) https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/test_expectations.py:40: def FromDict(cls, url_pattern, data): Remove now-unused method. https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/page/test_expectations.py:59: self._Expect("fail", url_pattern, conditions, bug) nit: single quotes https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... 
tools/telemetry/telemetry/page/test_expectations.py:74: os_matches = (len(expectation.os_conditions) == 0 or nit: os_matches = (not expectations.os_conditions or https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... File tools/telemetry/telemetry/test.py (right): https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... tools/telemetry/telemetry/test.py:49: return test_expectations.TestExpectations() if hasattr(self, 'expectations'): return self.expectations else: return test_expectations.TestExpectations()
Feedback addressed, thanks! Sorry for the delayed response, I'm OOO at the moment.

On 2013/07/16 20:58:02, Dave Tu wrote:
> https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry...
> tools/telemetry/telemetry/page/page_runner.py:366: except page_test.Failure:
> On 2013/07/16 20:24:07, bajones wrote:
> > I would appreciate a logic check here. My tests with the WebGL conformance tests
> > don't seem to be triggering failures properly, but I suspect that's a problem
> > with the WebGL tests, not this code.
>
> Oh, I see. In the WebGL tests, instead of results.AddFailureMessage, it should
> be raise page_test.Failure

That's pretty simple to fix for the WebGL tests (separate CL), but is this a pattern that all tests are expected to follow?
On 2013/07/19 19:04:25, bajones wrote: > Feedback addressed, thanks! Sorry for the delayed response, I'm OOO at the > moment. > > On 2013/07/16 20:58:02, Dave Tu wrote: > > > https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... > > tools/telemetry/telemetry/page/page_runner.py:366: except page_test.Failure: > > On 2013/07/16 20:24:07, bajones wrote: > > > I would appreciate a logic check here. My tests with the WebGL conformance > > tests > > > don't seem to be triggering failures properly, but I suspect that's a > problem > > > with the WebGL tests, not this code. > > > > Oh, I see. In the WebGL tests, instead of results.AddFailureMessage, it should > > be raise page_test.Failure > > That's pretty simple to fix for the WebGL tests (separate CL), but is this a > pattern that all tests are expected to follow? Yes, for now. It's true that it's a little weird that perf benchmarks use results.Add() but validation tests don't use the results object. I think it looks good, but this patch set is acting weird. When I try to diff against Base, it just diffs against PS7. Can you try re-uploading?
On 2013/07/19 20:03:40, Dave Tu wrote: > On 2013/07/19 19:04:25, bajones wrote: > > Feedback addressed, thanks! Sorry for the delayed response, I'm OOO at the > > moment. > > > > On 2013/07/16 20:58:02, Dave Tu wrote: > > > > > > https://codereview.chromium.org/16158006/diff/52001/tools/telemetry/telemetry... > > > tools/telemetry/telemetry/page/page_runner.py:366: except page_test.Failure: > > > On 2013/07/16 20:24:07, bajones wrote: > > > > I would appreciate a logic check here. My tests with the WebGL conformance > > > tests > > > > don't seem to be triggering failures properly, but I suspect that's a > > problem > > > > with the WebGL tests, not this code. > > > > > > Oh, I see. In the WebGL tests, instead of results.AddFailureMessage, it > should > > > be raise page_test.Failure > > > > That's pretty simple to fix for the WebGL tests (separate CL), but is this a > > pattern that all tests are expected to follow? > > Yes, for now. It's true that it's a little weird that perf benchmarks use > results.Add() but validation tests don't use the results object. > > I think it looks good, but this patch set is acting weird. When I try to diff > against Base, it just diffs against PS7. Can you try re-uploading? Eh, whatever, it's fine. lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/bajones@chromium.org/16158006/63001
Failed to apply patch for tools/telemetry/telemetry/page/page_runner.py:
While running patch -p1 --forward --force --no-backup-if-mismatch;
patching file tools/telemetry/telemetry/page/page_runner.py
Hunk #1 FAILED at 234.
Hunk #2 FAILED at 349.
Hunk #3 FAILED at 365.
Hunk #4 FAILED at 381.
4 out of 4 hunks FAILED -- saving rejects to file
tools/telemetry/telemetry/page/page_runner.py.rej
Patch: tools/telemetry/telemetry/page/page_runner.py
Index: tools/telemetry/telemetry/page/page_runner.py
diff --git a/tools/telemetry/telemetry/page/page_runner.py b/tools/telemetry/telemetry/page/page_runner.py
index 554fb5eced4533879e06b0bb828b5a9194650f2c..c9df8bc5f8163cca53210148caaae87afed105b1 100644
--- a/tools/telemetry/telemetry/page/page_runner.py
+++ b/tools/telemetry/telemetry/page/page_runner.py
@@ -234,8 +234,8 @@ def Run(test, page_set, expectations, options):
state.browser.platform, page)
try:
- _RunPage(test, page, state.tab, results_for_current_run, options,
- expectation)
+ _RunPage(test, page, state.tab, expectation,
+ results_for_current_run, options)
_CheckThermalThrottling(state.browser.platform)
except exceptions.TabCrashException:
stdout = ''
@@ -349,7 +349,7 @@ def _CheckArchives(page_set, pages, results):
pages_missing_archive_path + pages_missing_archive_data]
-def _RunPage(test, page, tab, results, options, expectation):
+def _RunPage(test, page, tab, expectation, results, options):
if not test.CanRunForPage(page):
logging.warning('Skipping test: it cannot run for %s', page.url)
results.AddSkip(page, 'Test cannot run')
@@ -365,7 +365,7 @@ def _RunPage(test, page, tab, results, options, expectation):
util.CloseConnections(tab)
except page_test.Failure:
logging.warning('%s:\n%s', page.url, traceback.format_exc())
- if expectation == "fail":
+ if expectation == 'fail':
logging.info('Failure was expected\n')
results.AddSuccess(page)
else:
@@ -381,7 +381,7 @@ def _RunPage(test, page, tab, results, options, expectation):
except Exception:
raise
else:
- if expectation == "fail":
+ if expectation == 'fail':
logging.warning('%s was expected to fail, but passed.\n', page.url)
results.AddSuccess(page)
finally:
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/bajones@chromium.org/16158006/73001
Message was sent while issue was closed.
Change committed as 212742
