|
|
Created:
7 years, 4 months ago by navabi Modified:
7 years, 2 months ago CC:
chromium-reviews, craigdh+watch_chromium.org, bulach+watch_chromium.org, yfriedman+watch_chromium.org, ilevy-cc_chromium.org, klundberg+watch_chromium.org, frankf+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Visibility:
Public. |
DescriptionPut cleanup steps in finally block so cleanup happens after halt on failure.
The device_status_check hangs because there are still processes launched by the
buildbot script that are not cleaned up. E.g.
https://chromegw.corp.google.com/i/clank/builders/manta-sharded-official-perf-clankium/builds/2359
This makes sure the post commands are executed after halt on failure.
BUG=265578
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=215365
Patch Set 1 #Patch Set 2 : Use try/finally block instead of cleanup thunk. #Patch Set 3 : Use finally block. #
Total comments: 3
Patch Set 4 : Removed GetPostTestStepCmds(), call functions directly. #Patch Set 5 : Rebased. #Messages
Total messages: 16 (0 generated)
I think easier would be to set a process group in bb_run_bot and then kill the group on exit..
On 2013/07/30 23:40:52, Isaac wrote: > I think easier would be to set a process group in bb_run_bot and then kill the > group on exit.. But we don't just want to kill. e.g. just killing the logcat monitor would not give us a logcat dump when the device_status_check step halts on failure. In general, the logcat dump is useful when a step has to halt on failure.
On 2013/07/31 00:08:46, navabi wrote: > On 2013/07/30 23:40:52, Isaac wrote: > > I think easier would be to set a process group in bb_run_bot and then kill the > > group on exit.. > > But we don't just want to kill. e.g. just killing the logcat monitor would not > give us a logcat dump when the device_status_check step halts on failure. In > general, the logcat dump is useful when a step has to halt on failure. But in this case, we want to just kill processes, correct? So let's not add plumbing until we're using it. It seems like a process group would be a good match for what this CL is trying to accomplish. If the old process was not hanging around, you wouldn't get stalls on the next build and so would get build logs.
On 2013/07/31 00:41:11, Isaac wrote: > On 2013/07/31 00:08:46, navabi wrote: > > On 2013/07/30 23:40:52, Isaac wrote: > > > I think easier would be to set a process group in bb_run_bot and then kill > the > > > group on exit.. > > > > But we don't just want to kill. e.g. just killing the logcat monitor would not > > give us a logcat dump when the device_status_check step halts on failure. In > > general, the logcat dump is useful when a step has to halt on failure. > > But in this case, we want to just kill processes, correct? So let's not add > plumbing > until we're using it. It seems like a process group would be a good match for > what > this CL is trying to accomplish. If the old process was not hanging around, you > wouldn't > get stalls on the next build and so would get build logs. No, I don't want to 'just' kill the processes. It is true that killing the process would make it not hang. But the extra plumbing you are referring to would be used (i.e. there would be a Logcat Dump step at the end and full_log would be written) vs. just killing it, which would leave full_log incomplete. If its the implementation you have a problem with, please feel free to provide suggestions, but I would rather not just kill the processes without proper cleanup.
Why would killing the process group when the build ends affect the log dump step? The kill would not happen until all the steps finished.
PTAL. https://codereview.chromium.org/21261006/diff/12001/build/android/buildbot/bb... File build/android/buildbot/bb_device_steps.py (left): https://codereview.chromium.org/21261006/diff/12001/build/android/buildbot/bb... build/android/buildbot/bb_device_steps.py:396: This moved to GetPostTestStepCmds.
lgtm w/ nit https://codereview.chromium.org/21261006/diff/12001/build/android/buildbot/bb... File build/android/buildbot/bb_device_steps.py (right): https://codereview.chromium.org/21261006/diff/12001/build/android/buildbot/bb... build/android/buildbot/bb_device_steps.py:356: finally: Could we just call the functions here and remove GetPostTestStepCmds()? Seems unneeded.
https://codereview.chromium.org/21261006/diff/12001/build/android/buildbot/bb... File build/android/buildbot/bb_device_steps.py (right): https://codereview.chromium.org/21261006/diff/12001/build/android/buildbot/bb... build/android/buildbot/bb_device_steps.py:356: finally: On 2013/08/02 02:20:26, Isaac wrote: > Could we just call the functions here and remove GetPostTestStepCmds()? Seems > unneeded. Sure. That way we also don't need KillHostHeartbeat() function, since it only exists to take an options argument and ignore it.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/navabi@google.com/21261006/18001
Failed to apply patch for build/android/buildbot/bb_device_steps.py: While running patch -p1 --forward --force --no-backup-if-mismatch; patching file build/android/buildbot/bb_device_steps.py Hunk #1 FAILED at 320. Hunk #2 succeeded at 377 (offset -15 lines). 1 out of 2 hunks FAILED -- saving rejects to file build/android/buildbot/bb_device_steps.py.rej Patch: build/android/buildbot/bb_device_steps.py Index: build/android/buildbot/bb_device_steps.py diff --git a/build/android/buildbot/bb_device_steps.py b/build/android/buildbot/bb_device_steps.py index 9ec3fe3dcf48579350e651343a303660d74ec79b..3d447052e4b245a241d7bc64b1eaad5b81cde75d 100755 --- a/build/android/buildbot/bb_device_steps.py +++ b/build/android/buildbot/bb_device_steps.py @@ -320,35 +320,32 @@ def GenerateTestReport(options): os.remove(report) -def GetPostTestStepCmds(): - return [ - ('logcat_dump', LogcatDump), - ('test_report', GenerateTestReport) - ] - - def MainTestWrapper(options): - # Spawn logcat monitor - SpawnLogcatMonitor() - - # Run all device setup steps - for _, cmd in GetDeviceSetupStepCmds(): - cmd(options) + try: + # Spawn logcat monitor + SpawnLogcatMonitor() - if options.install: - test_obj = INSTRUMENTATION_TESTS[options.install] - InstallApk(options, test_obj, print_step=True) + # Run all device setup steps + for _, cmd in GetDeviceSetupStepCmds(): + cmd(options) - if options.test_filter: - bb_utils.RunSteps(options.test_filter, GetTestStepCmds(), options) + if options.install: + test_obj = INSTRUMENTATION_TESTS[options.install] + InstallApk(options, test_obj, print_step=True) - if options.experimental: - RunTestSuites(options, gtest_config.EXPERIMENTAL_TEST_SUITES) - RunBrowserTestSuite(options) + if options.test_filter: + bb_utils.RunSteps(options.test_filter, GetTestStepCmds(), options) - # Run all post test steps - for _, cmd in GetPostTestStepCmds(): - cmd(options) + if options.experimental: + RunTestSuites(options, gtest_config.EXPERIMENTAL_TEST_SUITES) + RunBrowserTestSuite(options) + finally: + # Run all post test steps + LogcatDump(options) + GenerateTestReport(options) + # KillHostHeartbeat() has logic to check if heartbeat process is running, + # and kills only if it finds the process is running on the host. + provision_devices.KillHostHeartbeat() def GetDeviceStepsOptParser(): @@ -392,7 +389,6 @@ def main(argv): setattr(options, 'target', options.factory_properties.get('target', 'Debug')) MainTestWrapper(options) - provision_devices.KillHostHeartbeat() if __name__ == '__main__':
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/navabi@google.com/21261006/22001
Message was sent while issue was closed.
Change committed as 215365
Message was sent while issue was closed.
On 2013/08/02 21:08:09, I haz the power (commit-bot) wrote: > Change committed as 215365 I suspect this change makes all android bots hanging with the following messages: < build/android/test_runner.py instrumentation --test-apk ChromeTest --test_data clank:clank/test/data/android/device_files --verbose --flakiness-dashboard-server=clank-test-results.appspot.com --python_test_root=/b/build/slave/instrumentation-yakju-clankium/build/src/clank/java/apps/tests/src -A LargeTest --screenshot Traceback (most recent call last): File "/b/build/slave/instrumentation-yakju-clankium/build/src/clank/build/buildbot/bb_device_steps_internal.py", line 292, in <module> sys.exit(main(sys.argv)) File "/b/build/slave/instrumentation-yakju-clankium/build/src/clank/build/buildbot/bb_device_steps_internal.py", line 283, in main bb_device_steps.GetPostTestStepCmds() + AttributeError: 'module' object has no attribute 'GetPostTestStepCmds' # 08/02/13 19:10:16: Waiting for device package manager... # 08/02/13 19:10:16: Waiting for device package manager... # 08/02/13 19:10:16: Waiting for device package manager... # 08/02/13 19:10:36: No response for adb -s 014E04E119016009 wait-for-device, retrying # 08/02/13 19:10:16: Waiting for device package manager... # 08/02/13 19:10:36: No response for adb -s 0149CD260E017014 wait-for-device, retrying
Message was sent while issue was closed.
On 2013/08/03 02:40:25, qinmin wrote: > On 2013/08/02 21:08:09, I haz the power (commit-bot) wrote: > > Change committed as 215365 > > I suspect this change makes all android bots hanging with the following > messages: > > < build/android/test_runner.py instrumentation --test-apk ChromeTest --test_data > clank:clank/test/data/android/device_files --verbose > --flakiness-dashboard-server=clank-test-results.appspot.com > --python_test_root=/b/build/slave/instrumentation-yakju-clankium/build/src/clank/java/apps/tests/src > -A LargeTest --screenshot > Traceback (most recent call last): > File > "/b/build/slave/instrumentation-yakju-clankium/build/src/clank/build/buildbot/bb_device_steps_internal.py", > line 292, in <module> > sys.exit(main(sys.argv)) > File > "/b/build/slave/instrumentation-yakju-clankium/build/src/clank/build/buildbot/bb_device_steps_internal.py", > line 283, in main > bb_device_steps.GetPostTestStepCmds() + > AttributeError: 'module' object has no attribute 'GetPostTestStepCmds' > # 08/02/13 19:10:16: Waiting for device package manager... > # 08/02/13 19:10:16: Waiting for device package manager... > # 08/02/13 19:10:16: Waiting for device package manager... > # 08/02/13 19:10:36: No response for adb -s 014E04E119016009 wait-for-device, > retrying > # 08/02/13 19:10:16: Waiting for device package manager... > # 08/02/13 19:10:36: No response for adb -s 0149CD260E017014 wait-for-device, > retrying I believe Isaac fixed this. |