Created: 8 years, 4 months ago by csharp
Modified: 8 years, 4 months ago
CC: chromium-reviews, pam+watch_chromium.org, M-A Ruel
Base URL: http://git.chromium.org/chromium/src.git@master
Visibility: Public.
Description: Repeat Failed Tests in Serial
Before considering a test failed in run_test_cases.py, rerun the failed tests serially, since they may have failed only because they conflicted with other tests running at the same time.
BUG=
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=151893
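A minimal sketch of the retry idea described above, assuming a hypothetical run_parallel() helper that returns the names of the failed tests (the real run_test_cases.py logic differs in detail):

import subprocess

def run_single_test(binary, test_name):
  """Runs one gtest case by itself and returns True if it passed."""
  return subprocess.call(
      [binary, '--gtest_filter=%s' % test_name]) == 0

def run_with_serial_retry(binary, test_names, run_parallel):
  """Runs the tests in parallel, then reruns failures one at a time.

  run_parallel is assumed to return the list of failed test names.
  Only tests that also fail in isolation are reported as failures.
  """
  failures = run_parallel(binary, test_names)
  still_failing = []
  for test_name in failures:
    # A test that only failed because it conflicted with another test
    # running at the same time should pass now that it runs alone.
    if not run_single_test(binary, test_name):
      still_failing.append(test_name)
  return still_failing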
Patch Set 1: #
Messages
Total messages: 16 (0 generated)
This should fix the flakiness that is currently visible on the swarm bots when running net_unittests (the DiskCacheBackendTests don't like running at the same time).
lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/csharp@chromium.org/10831330/2001
Change committed as 151893
hmm, this seems like it's masking over the problem. all of our tests (with the exception of interactive_ui_tests) should be written so that they can run in parallel. what are DiskCacheBackendTests doing that isn't parallelizable?
Sharding supervisor does the same thing, right? It retries failed tests at the end serially.

+rvargas, who probably knows about the cache tests. But there was a thread earlier, I believe, and the consensus was that it was hard to have those tests use a temporary folder for the cache data, or something like that.
On 2012/08/16 18:21:25, nsylvain wrote:
> Sharding supervisor does the same thing, right? It retries failed tests at the
> end serially.
>
> +rvargas, who probably knows about the cache tests. But there was a thread
> earlier, I believe, and the consensus was that it was hard to have those tests
> use a temporary folder for the cache data, or something like that.

In the little digging I did, the tests were usually failing when they attempted to set up their cache.

The output was usually:
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DiskCacheBackendTest
[ RUN ] DiskCacheBackendTest.RecoverRemove
[28352:28360:0816/105211:80282438598:WARNING:backend_impl.cc(1916)] Destroying invalid entry.
[28352:28360:0816/105211:80282443094:WARNING:backend_impl.cc(1916)] Destroying invalid entry.
[28352:28352:0816/105211:80282447120:ERROR:backend_impl.cc(221)] Unable to create cache
../../net/disk_cache/disk_cache_test_base.cc:273: Failure
Value of: cb.GetResult(rv)
  Actual: -2
Expected: net::OK
Which is: 0
../../net/disk_cache/disk_cache_test_base.cc:74: Failure
Value of: NULL != cache_
  Actual: false
Expected: true
base::debug::StackTrace::StackTrace() [0x12614ce]
base::(anonymous namespace)::StackDumpSignalHandler() [0x1283a09]
0x7fde3b6e8af0
DiskCacheBackendTest::BackendTransaction() [0x6555aa]
DiskCacheBackendTest::BackendRecoverRemove() [0x655c7f]
testing::Test::Run() [0xd0a9b1]
testing::TestInfo::Run() [0xd0aa7a]
testing::TestCase::Run() [0xd0abc7]
testing::internal::UnitTestImpl::RunAllTests() [0xd0ddad]
testing::UnitTest::Run() [0xd079d3]
base::TestSuite::Run() [0xd2cf58]
main [0x537a0e]
0x7fde3b6d3c4d
0x4160a9
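(For illustration only, and not the actual C++ fix referenced later in this thread: the "Unable to create cache" failures above are what you get when concurrent test processes collide on a shared on-disk cache location. The generic way to make such tests parallel-safe is to give each run its own temporary directory; a minimal Python sketch of that pattern, with a made-up class name:)

import os
import shutil
import tempfile

class IsolatedScratchDir(object):
  """Context manager giving each test run a private scratch directory,
  so concurrent runs cannot trample each other's on-disk state."""

  def __enter__(self):
    self._path = tempfile.mkdtemp(prefix='disk_cache_test_')
    return self._path

  def __exit__(self, exc_type, exc_value, traceback):
    shutil.rmtree(self._path, ignore_errors=True)
    return False

if __name__ == '__main__':
  # Every invocation gets a distinct path, so creating the cache cannot
  # fail because another process already owns the same files.
  with IsolatedScratchDir() as cache_dir:
    assert os.path.isdir(cache_dir)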
On 2012/08/16 18:21:25, nsylvain wrote:
> Sharding supervisor does the same thing, right? It retries failed tests at the
> end serially.

I think that just ends up dealing with general flakiness, since flaky tests will usually pass a second time. As opposed to this cache case, though, I don't know of any tests in browser_tests or content_browsertests that would fail when run in parallel with other tests.

> +rvargas, who probably knows about the cache tests. But there was a thread
> earlier, I believe, and the consensus was that it was hard to have those tests
> use a temporary folder for the cache data, or something like that.

Can you please point me to the other thread? I'm curious to learn more about this.
On 2012/08/16 18:23:18, csharp wrote:
> On 2012/08/16 18:21:25, nsylvain wrote:
> > Sharding supervisor does the same thing, right? It retries failed tests at
> > the end serially.
> >
> > +rvargas, who probably knows about the cache tests. But there was a thread
> > earlier, I believe, and the consensus was that it was hard to have those
> > tests use a temporary folder for the cache data, or something like that.
>
> In the little digging I did, the tests were usually failing when they
> attempted to set up their cache.
>
> The output was usually:
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from DiskCacheBackendTest
> [ RUN ] DiskCacheBackendTest.RecoverRemove
> [28352:28360:0816/105211:80282438598:WARNING:backend_impl.cc(1916)] Destroying
> invalid entry.
> [28352:28360:0816/105211:80282443094:WARNING:backend_impl.cc(1916)] Destroying
> invalid entry.
> [28352:28352:0816/105211:80282447120:ERROR:backend_impl.cc(221)] Unable to
> create cache
> ../../net/disk_cache/disk_cache_test_base.cc:273: Failure
> Value of: cb.GetResult(rv)
>   Actual: -2
> Expected: net::OK
> Which is: 0
> ../../net/disk_cache/disk_cache_test_base.cc:74: Failure
> Value of: NULL != cache_
>   Actual: false
> Expected: true
> base::debug::StackTrace::StackTrace() [0x12614ce]
> base::(anonymous namespace)::StackDumpSignalHandler() [0x1283a09]
> 0x7fde3b6e8af0
> DiskCacheBackendTest::BackendTransaction() [0x6555aa]
> DiskCacheBackendTest::BackendRecoverRemove() [0x655c7f]
> testing::Test::Run() [0xd0a9b1]
> testing::TestInfo::Run() [0xd0aa7a]
> testing::TestCase::Run() [0xd0abc7]
> testing::internal::UnitTestImpl::RunAllTests() [0xd0ddad]
> testing::UnitTest::Run() [0xd079d3]
> base::TestSuite::Run() [0xd2cf58]
> main [0x537a0e]
> 0x7fde3b6d3c4d
> 0x4160a9

I see. looks like 139272 is filed for the disk cache tests, which need to be fixed. can we just not run these specific tests in this new system until they're fixed? i'm worried that by doing this workaround in the higher-level scripts, more tests that don't run in parallel will sneak through.
It should be possible to modify the new system to run those tests separately, but I'm not sure how well that would work. maruel@ probably has better thoughts about this, so I'll check with him next week and see what he thinks.
On 2012/08/16 19:13:08, csharp wrote:
> It should be possible to modify the new system to run those tests separately,
> but I'm not sure how well that would work.
>
> maruel@ probably has better thoughts about this, so I'll check with him next
> week and see what he thinks.

fix on the way for the tests here: http://codereview.chromium.org/10824336
On 2012/08/16 20:12:17, John Abd-El-Malek wrote:
> On 2012/08/16 19:13:08, csharp wrote:
> > It should be possible to modify the new system to run those tests
> > separately, but I'm not sure how well that would work.
> >
> > maruel@ probably has better thoughts about this, so I'll check with him next
> > week and see what he thinks.
>
> fix on the way for the tests here: http://codereview.chromium.org/10824336

since the fix landed, do we still want to do this change? again, my worry is that we'll miss bugs with tests like this.
On 2012/08/17 15:58:40, John Abd-El-Malek wrote:
> On 2012/08/16 20:12:17, John Abd-El-Malek wrote:
> > On 2012/08/16 19:13:08, csharp wrote:
> > > It should be possible to modify the new system to run those tests
> > > separately, but I'm not sure how well that would work.
> > >
> > > maruel@ probably has better thoughts about this, so I'll check with him
> > > next week and see what he thinks.
> >
> > fix on the way for the tests here: http://codereview.chromium.org/10824336
>
> since the fix landed, do we still want to do this change? again, my worry is
> that we'll miss bugs with tests like this.

If we revert this change, should we also modify the sharding_supervisor to drop this behavior? Ideally, run_test_cases and sharding_supervisor should have the same conditions for the tests they run.
Thanks Chris for trying to work around the issue, but the previous behavior was intended: I wanted the flaky tests to show up as very flaky and to get the maximum performance possible. Technically, the failed tests should be added to the queue, but that's a separate issue.

Thanks John for fixing the tests, that is really appreciated.

Also, this adds yet another retry, raising the number of retries to 4, which is too high at that point IMHO.

I don't think we should spend time modifying sharding_supervisor.py, since the way shards are done is significantly different. run_test_cases.py hides all inter-test issues, so I think it's preferable to use only it long-term.
On 2012/08/18 01:30:31, Marc-Antoine Ruel wrote:
> Thanks Chris for trying to work around the issue, but the previous behavior
> was intended: I wanted the flaky tests to show up as very flaky and to get the
> maximum performance possible. Technically, the failed tests should be added to
> the queue, but that's a separate issue.
>
> Thanks John for fixing the tests, that is really appreciated.
>
> Also, this adds yet another retry, raising the number of retries to 4, which
> is too high at that point IMHO.
>
> I don't think we should spend time modifying sharding_supervisor.py, since the
> way shards are done is significantly different. run_test_cases.py hides all
> inter-test issues, so I think it's preferable to use only it long-term.

+1 to everything you said.

(in reply to csharp's email): rerunning tests just masks flakiness, which we should fix. i don't think sharding_supervisor should do the current behavior, and I'd be happy to see it stop. note that sharding_supervisor is only used on browser_tests and content_browsertests; all the unit tests, interactive tests, sync, nacl, etc. don't rerun failed tests.