Created: 6 years, 3 months ago by Owen Lin
Modified: 6 years, 2 months ago
CC: chromium-reviews, posciak+watch_chromium.org, jam, mcasas+watch_chromium.org, feature-media-reviews_chromium.org, darin-cc_chromium.org, piman+watch_chromium.org, wjia+watch_chromium.org
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Project: chromium
Visibility: Public.
Description
rendering_helper - Warm up the rendering.
The rendering of the first few frames is much slower (as much as
100 ms on Peach Pit). To get stable and correct numbers in performance
tests, we warm up the rendering by drawing several frames at the
beginning.
BUG=411123
TEST=Run the vda_unittest on Peach Pit.
Committed: https://crrev.com/7a41151b1c94a470c1259dacfe69259fa5610096
Cr-Commit-Position: refs/heads/master@{#299861}
Patch Set 1 #
Patch Set 2 : pass warm_up_iter from command line #
Total comments: 5
Patch Set 3 : address review comments #
Total comments: 1
Patch Set 4 : address review comment #
Patch Set 5 : #
Total comments: 1
Patch Set 6 : rebase and nit #
Messages
Total messages: 20 (3 generated)
owenlin@chromium.org changed reviewers: + posciak@chromium.org, wuchengli@chromium.org
PTAL. Thanks.
I don't understand this CL. What does it actually do, and how does rendering 5 empty frames at the beginning (and why 5?) help with the issue?
owenlin@chromium.org changed reviewers: + djkurtz@google.com, piman@chromium.org
Hi Antoine and Daniel, I found that GL takes more time to render the first few textures. This affects the performance numbers we get. (For example, the test always drops the first few frames because they are rendered late.) In this CL, I tried to warm up the rendering. Is there a better way to do this? A more fundamental question is whether the slowness at the beginning is expected. Thanks for the help.
+ihf, +marcheu

Yes, there are several reasons why rendering will be slower at first. Reliably testing performance is non-trivial.

(1) Both the CPU and the GPU have frequency governors. If the system is initially idle, both are probably being clocked at their slowest frequency. Once system utilization increases, the governors increase the clocks to speed things up. For the Mali GPU, this is known as "Dynamic Voltage and Frequency Scaling" (DVFS). The algorithm used by the Samsung exynos driver is relatively slow to ramp up/down, and tries more to guarantee a given frequency for an on-average constant load.

To see if slow DVFS ramp-up is affecting your results, you can try disabling DVFS and running the test with fixed GPU frequencies. Similarly, if the test is limited by CPU performance, you can try disabling cpufreq and running at fixed CPU clocks.

(2) Fixed overhead: the first time the GPU processes a command stream, it may need to do (a lot of) extra work to set things up. Thus, the first iteration can sometimes be much, much worse.

(3) Overheating: the system temperature can have a significant impact on performance.

Ilja has put a lot of time and effort into making the graphics tests more robust in the face of these issues. To combat these effects we do a couple of things for graphics tests (particularly in glbench):
(1) first run a few iterations and then discard the results
(2) average over a large number of iterations
(3) run several rounds with an exponentially increasing number of iterations, and look for a linear "trend line" to appear in the data
(4) monitor temperature and cool down the system between test runs

Take a look at:
autotest/client/deps/glbench/src/testbase.cc
autotest/client/site_tests/graphics_GLBench/graphics_GLBench.py
autotest/client/cros/perf.py

Perhaps you can reuse some of the ideas or common code from here.
-Dan
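The first three glbench techniques listed above (discard warm-up iterations, average the rest, and fit a trend line over rounds of increasing length) can be sketched roughly as follows. This is an illustration only, not the glbench code: the function names, the sample values, and the choice of a plain least-squares fit are all assumptions made for the example.

```python
from statistics import mean


def steady_state_mean(frame_times_ms, warmup_count):
    """Average frame times after dropping the first warmup_count samples."""
    if warmup_count < 0:
        raise ValueError("warmup_count must be non-negative")
    measured = frame_times_ms[warmup_count:]
    if not measured:
        raise ValueError("no samples left after discarding warm-up")
    return mean(measured)


def per_iteration_cost(rounds):
    """Least-squares slope of total time against iteration count.

    rounds is a list of (iterations, total_time_ms) pairs, for example from
    rounds with 1, 2, 4, 8, ... iterations. The slope estimates the cost of
    one iteration with the fixed setup overhead (the intercept) factored out.
    """
    n = len(rounds)
    if n < 2:
        raise ValueError("need at least two rounds to fit a line")
    mean_x = sum(x for x, _ in rounds) / n
    mean_y = sum(y for _, y in rounds) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rounds)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rounds)
    if sxx == 0:
        raise ValueError("rounds must use at least two distinct iteration counts")
    return sxy / sxx


# Made-up numbers: the first three frames are slow, then rendering settles.
samples = [100.0, 60.0, 20.0, 16.0, 17.0, 16.0, 15.0]
print(steady_state_mean(samples, warmup_count=3))

# Made-up rounds with 50 ms of fixed overhead and 2 ms per iteration.
rounds = [(1, 52.0), (2, 54.0), (4, 58.0), (8, 66.0)]
print(per_iteration_cost(rounds))
```

Fitting a slope rather than dividing total time by iteration count keeps one-off setup cost out of the per-iteration number, which is the point of looking for a linear trend.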
Pass the number of warm-up frames on the command line, so that you can set it from the testing infrastructure, possibly per board if needed?
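As a rough illustration of this suggestion, a test driver could accept the warm-up count as an option so that the infrastructure can choose a value per board. The flag name `--warm_up_frames` and the driver itself are hypothetical; the actual test in this CL is a C++ binary, and its real switch name is not shown in this thread.

```python
import argparse


def parse_args(argv):
    """Parse options for a hypothetical rendering test driver."""
    parser = argparse.ArgumentParser(
        description="Hypothetical rendering test driver")
    # Hypothetical flag: number of frames to render and discard before
    # measurement starts. Defaults to 0, meaning no warm-up.
    parser.add_argument("--warm_up_frames", type=int, default=0,
                        help="frames to render before measuring")
    args = parser.parse_args(argv)
    if args.warm_up_frames < 0:
        parser.error("--warm_up_frames must be >= 0")
    return args


args = parse_args(["--warm_up_frames", "30"])
print(args.warm_up_frames)
```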
I think Daniel explained some aspects well. If you want to look at the pure graphics performance framework, take a look at
client/site_tests/graphics_PerfControl/graphics_PerfControl.py
This ensures that the governors are in a high performance state, the CPU is idle and the machine temperature is reasonably low before proceeding to the test. It also checks that temperatures don't get too high during the test and fails it otherwise. This causes more failures than otherwise, but the numbers/graphs it returns are cleaner. One thing that I would like to change in the future is that, instead of putting the governors into the highest performance state at test begin as the toolchain team does, I would put them into a medium performance state that does not cause long-term overheating.

A further idea for noise reduction is in graphics_WebGLPerformance.py, starting with
variant = utils.get_board_with_frequency_and_memory()
It changes the plotting on the dashboard based on available memory/CPU frequency, but the dashboard graphs are still kind of ugly as they don't expect discontinuous data. I will have to play with the plotting some more, but it would be nice to have this as the default for all plots.

Antoine is right with regard to parameterizing the loops. I think 5 is kind of short; I would probably throw away a second's worth of frames. Some drivers may set up a lot of state initially (caching) before they get fast.

Ilja.
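Throwing away "a second's worth of frames" means the warm-up count depends on the frame rate rather than being a fixed 5. A minimal sketch of that arithmetic, with a hypothetical helper name and the assumption that partial frames are rounded up:

```python
import math


def warm_up_frame_count(fps, warm_up_seconds=1.0):
    """Number of frames that cover warm_up_seconds at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    if warm_up_seconds < 0:
        raise ValueError("warm_up_seconds must be non-negative")
    # Round up so the warm-up never ends short of the requested duration.
    return math.ceil(fps * warm_up_seconds)


print(warm_up_frame_count(30))       # one second at 30 fps
print(warm_up_frame_count(60, 0.5))  # half a second at 60 fps
```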
Thank you, guys. This is valuable information. It seems we can learn a lot from the graphics tests.

I am going to take the following actions:

Step 1. Change the warm-up iteration count to a command line option and pass it from autotest. For now, I will warm up the rendering for one second.

Step 2. For the frame delivery time and CPU usage tests, set the governor to performance. For the frame drop test, since it's more like a user experience monitor, I don't think I should change the governor settings.

Step 3. Monitor the temperature before running a test and try to stabilize the performance stats by averaging the numbers over more iterations.

-- Owen Lin
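The temperature check in Step 3 could look roughly like the sketch below: poll a temperature reading until it drops under a threshold, and give up after a bounded number of polls. Everything here is an assumption made for illustration, including the function name, the 60 C threshold, and the injected `read_temp_c` callable. It does not read any real sensor and is not the autotest implementation.

```python
import time


def wait_until_cool(read_temp_c, max_temp_c, max_polls=60,
                    poll_interval_s=1.0, sleep=time.sleep):
    """Return True once read_temp_c() is at or below max_temp_c.

    Polls at most max_polls times and returns False if the reading never
    drops far enough. The sleep function is injectable so the loop can be
    exercised without real delays.
    """
    for _ in range(max_polls):
        if read_temp_c() <= max_temp_c:
            return True
        sleep(poll_interval_s)
    return False


# Made-up readings: the device cools below 60 C on the third poll.
readings = iter([71.0, 66.5, 58.0])
print(wait_until_cool(lambda: next(readings), max_temp_c=60.0,
                      sleep=lambda seconds: None))
```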
You can do 2) and most of 3) just by changing "run_my_test()" to

    with perf.PerfControl() as pc:
        if not pc.verify_is_valid():
            raise error.TestError(pc.get_error_reason())
        run_my_test()
        if not pc.verify_is_valid():
            raise error.TestError(pc.get_error_reason())

Please let me know if it doesn't work for you and we can refactor it.

Ilja.
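To try the verify-before-and-after pattern from the message above outside autotest, a self-contained stand-in can be used. `FakePerfControl` and `PerfTestError` below are hypothetical substitutes that only borrow the method names quoted in the thread (`verify_is_valid`, `get_error_reason`); the real `perf.PerfControl` also manages governors and CPU idleness, which this sketch does not attempt.

```python
class PerfTestError(Exception):
    """Stand-in for autotest's error.TestError."""


class FakePerfControl:
    """Hypothetical stand-in that only checks a temperature reading."""

    def __init__(self, read_temp_c, max_temp_c):
        self._read_temp_c = read_temp_c
        self._max_temp_c = max_temp_c
        self._reason = ""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions raised inside the block

    def verify_is_valid(self):
        temp = self._read_temp_c()
        if temp > self._max_temp_c:
            self._reason = "temperature %.1f C exceeds %.1f C" % (
                temp, self._max_temp_c)
            return False
        return True

    def get_error_reason(self):
        return self._reason


def run_guarded(test_fn, perf_control):
    """Run test_fn, failing if conditions are bad before or after it."""
    with perf_control as pc:
        if not pc.verify_is_valid():
            raise PerfTestError(pc.get_error_reason())
        result = test_fn()
        if not pc.verify_is_valid():
            raise PerfTestError(pc.get_error_reason())
    return result


# Made-up readings: 45 C before the test and 50 C after it, both acceptable.
temps = iter([45.0, 50.0])
print(run_guarded(lambda: "ok", FakePerfControl(lambda: next(temps), 60.0)))
```

Checking again after the test matters because a run that overheats midway would otherwise report throttled numbers as if they were valid.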
PTAL. Thanks.
https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.cc (right):

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.cc:326: NULL);
It'd be better not to render an uninitialized texture.

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.h (right):

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.h:57: int warm_up_iterations;
new field needs new initializer.
PTAL. Thanks.

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.cc (right):

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.cc:326: NULL);
On 2014/10/07 19:31:07, piman (Very slow to review) wrote:
> It'd be better not to render an uninitialized texture.

Done.

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.h (right):

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.h:57: int warm_up_iterations;
On 2014/10/07 19:31:07, piman (Very slow to review) wrote:
> new field needs new initializer.

We didn't use initializer for other fields.
https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.h (right):

https://codereview.chromium.org/583503002/diff/20001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.h:57: int warm_up_iterations;
On 2014/10/14 06:01:50, Owen Lin wrote:
> On 2014/10/07 19:31:07, piman (Very slow to review) wrote:
> > new field needs new initializer.
>
> We didn't use initializer for other fields.

Other POD fields need initializers too.

https://codereview.chromium.org/583503002/diff/40001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.cc (right):

https://codereview.chromium.org/583503002/diff/40001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.cc:316: std::vector<GLubyte> emptyData(screen_size_.GetArea() * 2);
nit: scoped_ptr<GLubyte[]> empty_data(new GLubyte[screen_size_.GetArea() * 2]);
std::vector doesn't guarantee contiguous storage (before the C++11 STL, which we don't use consistently yet).
Thanks. PTAL.
lgtm

https://codereview.chromium.org/583503002/diff/80001/content/common/gpu/media...
File content/common/gpu/media/rendering_helper.cc (right):

https://codereview.chromium.org/583503002/diff/80001/content/common/gpu/media...
content/common/gpu/media/rendering_helper.cc:318: scoped_ptr<GLubyte[]> emptyData(new GLubyte[screen_size_.GetArea() * 2]());
nit: no need for () after the new GLubyte[...]
The CQ bit was checked by owenlin@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/583503002/100001
Message was sent while issue was closed.
Committed patchset #6 (id:100001)
Message was sent while issue was closed.
Patchset 6 (id:??) landed as https://crrev.com/7a41151b1c94a470c1259dacfe69259fa5610096
Cr-Commit-Position: refs/heads/master@{#299861}