|
|
Created:
8 years, 2 months ago by sheu Modified:
7 years, 11 months ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam, apatrick_chromium, feature-media-reviews_chromium.org Base URL:
https://git.chromium.org/git/chromium/src@git-svn Visibility:
Public. |
DescriptionVDA implementation for Exynos, using V4L2
This is a VDA implementation for Exynos that does not use the OMX layer,
instead interfacing directly with the V4L2 devices exported by the
s5p-mfc and gsc-m2m kernel drivers.
TEST=local build, run on snow, unittests
BUG=chrome-os-partner:10057
BUG=chromium-os:37807
Change-Id: Id39a38ba739bab166a484190d9e207c5523ef345
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=177141
Patch Set 1 #
Total comments: 96
Patch Set 2 : Update.wq #Patch Set 3 : And one more. #
Total comments: 114
Patch Set 4 : Halfway through .cc review.wq #
Total comments: 74
Patch Set 5 : Final? #
Total comments: 20
Patch Set 6 : Final?? #
Total comments: 56
Patch Set 7 : ioctl bits? #Patch Set 8 : <WIP> #Patch Set 9 : Rebase. #Patch Set 10 : Switch to eventfd for device-polling interrupt. #
Total comments: 13
Patch Set 11 : Rebase. #Patch Set 12 : Added H264 parsing for frame boundaries. #
Total comments: 1
Patch Set 13 : Rebase. #Patch Set 14 : H264 parsing. poll(), Flush() fixes. #Patch Set 15 : Rebase. #Patch Set 16 : scoped_array<> deprecated. #
Total comments: 3
Patch Set 17 : Rebase. #Patch Set 18 : Replaced one LOG() with DLOG() per posciak@ #
Total comments: 142
Patch Set 19 : Rebase. #Patch Set 20 : Revisions from @fischman #Patch Set 21 : Rebase. #Patch Set 22 : Flush() path rework. #
Total comments: 31
Patch Set 23 : Rebase. #Patch Set 24 : Comments and code movement. #
Total comments: 4
Patch Set 25 : Removed Flush() early-completion for resetting case. #Patch Set 26 : Bug number update. #
Total comments: 71
Patch Set 27 : content:: fixes from piman@. #
Total comments: 4
Patch Set 28 : Rebase. #Patch Set 29 : Comment removal. #
Total comments: 4
Patch Set 30 : Made FindFrameFragment check a CHECK. #Patch Set 31 : mark kUseExynosVda as CONTENT_EXPORT. #Patch Set 32 : Qualify use of kUseExynosVda symbol in content::switches namespace. #Patch Set 33 : Put check for kUseExynoVda behind OS_CHROMEOS/ARCH_CPU_ARMEL defs. #Patch Set 34 : Rebase. #Patch Set 35 : Reuploading so I can resbumit because the Windows builders were broken due to a WebKit roll, and I … #Messages
Total messages: 87 (0 generated)
It's been awhile.
And it's a beaut. I'm looping Youtube videos overnight to check for leaks. But it does HTML5 and Flash, H264 and VP8, and doesn't jank nearly as much while seeking. I haven't traced it yet for perf, so there's likely some room for improvement there too.
lolwat? I had no idea anyone was working on this. I haven't gone through the meat of the CL yet (EVDA*) but the linked bug says m24; do you really expect to get it in for 24 (branch cut in 10 days)? That seems pretty (extremely) aggressive considering the amount of code here. Is this really an ExynosVDA or is it a V4L2VDA? (if the latter, could it replace VAVDA?) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:6: // that utilizes hardware video decoder present on Intel CPUs. no it doesn't ;) https://chromiumcodereview.appspot.com/11198060/diff/1/content/gpu/gpu_main.cc File content/gpu/gpu_main.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/gpu/gpu_main.c... content/gpu/gpu_main.cc:275: ExynosVideoDecodeAccelerator::PreSandboxInitialization(); we should only load the relevant .so's.
Heh. Secret Santa. I'm not aligning this for any particular branch point actually, but I have been working nights on this for a while, and finally tracked down the last issue tonight. (If you're curious -- the last bit was http://crosbug.com/p/14521.) I'm loading both the OMX and Exynos sets for now, for testing purposes. This CL adds a "use-exynos-vda" runtime switch so we can switch between them without a recompile.
It should be possible to generalize this to V4L2 devices in general, though my focus right now is more in killing sec_omx :-)
On 2012/10/18 08:48:27, sheu wrote: > It should be possible to generalize this to V4L2 devices in general, though my > focus right now is more in killing sec_omx :-) I hear ya, but my focus always is minimizing the number of VDA impls we carry around in chrome ;)
On 2012/10/18 08:30:25, Ami Fischman wrote: > lolwat? I had no idea anyone was working on this. > > I haven't gone through the meat of the CL yet (EVDA*) but the linked bug says > m24; do you really expect to get it in for 24 (branch cut in 10 days)? That > seems pretty (extremely) aggressive considering the amount of code here. > > Is this really an ExynosVDA or is it a V4L2VDA? > (if the latter, could it replace VAVDA?) > It is a V4L2VDA, but only Samsung codecs support V4L2 for now. It can't replace VAVDA and please, please, don't even think about it :) > https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... > File content/common/gpu/media/exynos_video_decode_accelerator.h (right): > > https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... > content/common/gpu/media/exynos_video_decode_accelerator.h:6: // that utilizes > hardware video decoder present on Intel CPUs. > no it doesn't ;) > > https://chromiumcodereview.appspot.com/11198060/diff/1/content/gpu/gpu_main.cc > File content/gpu/gpu_main.cc (right): > > https://chromiumcodereview.appspot.com/11198060/diff/1/content/gpu/gpu_main.c... > content/gpu/gpu_main.cc:275: > ExynosVideoDecodeAccelerator::PreSandboxInitialization(); > we should only load the relevant .so's.
On 2012/10/18 08:18:04, sheu wrote: > It's been awhile. Great! I will do a thorough review, but it will take me awhile.
> > It is a V4L2VDA, but only Samsung codecs support V4L2 for now. > It can't replace VAVDA and please, please, don't even think about it :) Why not? Will this at least replace OVDA (someday)? Carrying 3 VDA's in perpetuity for CrOS smells wrong.
> TEST=local build, run on lucas You should definitely make content/common/gpu/media/video_decode_accelerator_unittest.cc use this and test that way, too.
On 2012/10/29 20:54:09, Ami Fischman wrote: > > > > It is a V4L2VDA, but only Samsung codecs support V4L2 for now. > > It can't replace VAVDA and please, please, don't even think about it :) > > > Why not? > Will this at least replace OVDA (someday)? > Carrying 3 VDA's in perpetuity for CrOS smells wrong. Today (and I think in the near and far future), Intel implements video decode with VAAPI on top of drm. In short, today intel is: chrome -> vavda -> vaapi -> drm (in the kernel) -> hardware this CL works as follows: chrome -> exynos VDA -> v4l2 (in the kernel) -> hardware So you can't replace vavda with this code, since that'd mean you'd need v4l2 in the kernel for intel. That doesn't exist and probably will never exist.
Thanks for the explanation, Stephane. Do you know whether tegra exposes v4l2? (would we want to keep OVDA if this landed?)
On 2012/10/29 21:54:27, Ami Fischman wrote: > Thanks for the explanation, Stephane. > Do you know whether tegra exposes v4l2? (would we want to keep OVDA if > this landed?) Sorry, I have no idea what tegra does. Someone should dig in the chrome os kernel-3.2 branch...
Mostly only reviewed EVDA.h in this pass, with just a few comments in the .cc file where I happened to look while reading the .h. Per IM holding off on reviewing the .cc file until these are addressed to avoid the naming churn etc that'll be coming. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:44: shm.reset(NULL); unnecessary https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:46: input_id = -1; nit: both of these can be done as initializers https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:91: if (egl_images != NULL) { can reverse the test and early-return, to reduce indentation. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:197: goto error; No goto, please. You probably want something like RETURN_ON_{,OMX_}FAILURE() from omx_video_decode_accelerator.cc https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:491: " driver handle"; here and elsewhere, please use << explicitly instead of taking advantage of implicit string literal concatenation. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:575: decoder_decode_buffer_tasks_scheduled_ += 1; here and elsewhere, why do you use +=1 or -=1 instead of ++/--? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:690: // We don't handle midstream resizes right now. You shouldn't expose DCHECKs & crashes to the web; something not supported should result in a NOTIMPLEMENTED & decode-error. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:39: // * The API thread, which is the main GPU process thread which calls the Chromium terminology calls this thread the ChildThread (the "main" thread of each of the child processes, such as renderer, gpu, plugin, etc). I don't think it serves any purpose to refer to it as the "API thread". Replace references throughout? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:57: ExynosVideoDecodeAccelerator(gfx::GLContext* gl_context, style: here & elsewhere, arg lists should be either all +4 indented or indented to the opening paren, not mixed (so here you want to add a newline before the first arg) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:80: // Call through the static function pointers we hold. Can't these be private? For that matter, they seem like they could just live in the .cc file since they are fully static. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:88: // These are rather subjectively tuned. Have you tested flash? ihf@ found he needed 6-8 output buffers in flight to keep smoothness in some cases. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:103: kAfterReset, // After Reset(), ready to start decoding again. I believe the only diff between this and kInitialized is buffer-allocation stuff, which makes me think that this state should be deleted and instead have a kNeedBuffers state between kInitialized and kDecoding. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:107: // Record for decoder input buffers. Comments that apply to each of the structs defined below: - Can move into the .cc file w/ a fwd-decl here? - Declare the dtor explicitly instead of accepting a compiler-generated dtor in every compilation unit that includes this file. - Any reason not to initialize the values through the ctor? That'll let you make the fields const and simplify call-sites, I think. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:118: bool queued : 1; // queued to device. here and below: why bother with the :1? If the compiler actually obeyed you it'd make the following fields unaligned, and probably cost in performance. Generally unless you're going to COMPILE_ASSERT about the size of the struct, specifying field widths is unwarranted (at least in chrome code). https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:118: bool queued : 1; // queued to device. s/queued/at_device/ ? (ditto for other "queued" fields below) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:131: size_t length[2]; // mmap() length for each plane. what are these dmabufs/planes? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:139: int mfc_output; // MFC output to recycle when this input is complete s/MFC output/MfcOutputRecord/ https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:146: bool in_vda : 1; // held by VDA. _this_ is the VDA :) Do you mean at_client? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:176: // DecodeBufferTask() to actually decoe the buffer. typo: decoe https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:177: void DecodeTask(scoped_ptr<BitstreamBufferRecord> bitstream_record); I think you can improve function naming here. WDYT of: rename: EnqueueDecodeBuffer https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:181: void DecodeBufferTask(); ...and rename: DoSomeDecodeWork (and below) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:207: // receives the NotifyFlushDone callback. this is fine, but make sure you can deal with a Reset() or Destroy() coming in before NotifyFlushDone fires. (this is tested by v_d_a_unittest.cc, I believe) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:223: bool StopDevice(); You should doco the return value of these methods, but more alarming is that almost all of their callsites ignore the return value. Either void'ify if safe, or make use of ret code everywhere (or ignore_result() it). https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:253: bool DestroyGscOutputBuffers(); Again, return values seem to be silently ignored sometimes. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:256: scoped_refptr<base::MessageLoopProxy> message_loop_proxy_; name it something more distinct, like {main,child}_loop_proxy_ ? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:263: // decoder thread to the API thread should use |weak_this_|. Comment that |device_thread_| follows the same logic as |decoder_thread_| ? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:271: // This thread handles VDA API entry points and callbacks. I think this comment is misleading. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:273: // Decoder state. Owned by decoder_thread_. Replace "Owned by..." with a blanket comment at l.271 since all of this state below is read/written only by decoder_thread_ ? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:274: State decoder_state_; ...except this seems to be read from both the decoder_thread_ and the main thread. What's up with that? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:276: scoped_ptr<BitstreamBufferRecord> decoder_current_bitstream_buffer_; s/decoder_// https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:278: int decoder_current_input_buffer_; ditto s/decoder_// here and below. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:283: // Buffers in the pipe, five-by-five. lolwat? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:288: bool decoder_notify_flush_; Can you make a pass and align these variable names with their purposes? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:300: media::VideoCodecProfile video_codec_; if you passed this to CreateMfcInputBuffers as a param you wouldn't need this member. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:300: media::VideoCodecProfile video_codec_; it's a profile, not a codec. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:303: EGLContext egl_context_; you could assign these in the ctor and avoid the need for gl_context_ as a member https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:309: // Completed decode buffers, waiting for MSC. typo: MSC https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:309: // Completed decode buffers, waiting for MSC. I'm a bit confused about what makes an input decode buffer "complete". Is there a reference you can link to from the top of this file for how the V4L2 decode API is used? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:313: int mfc_fd_; Use base::ScopedFD, per IM convo. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:318: // Mapping to track MFC input buffers. There's a lot of "map" commentary/variable-naming below, but it's all applied to vectors. You want to clarify that other code refers to elements of these vectors by their index? If that's the case, you should describe when these are populated and the fact that they are never reordered or otherwise modified until tear-down time. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:327: size_t mfc_output_buffer_size_[2]; doco what the [2] is about. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:334: int gsc_fd_; scoped https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:339: std::deque<int> gsc_free_input_buffers_; does this really need to be a FIFO or could it just as well be a LIFO and use a vector (pushing & popping on the back)? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:350: std::deque<int> gsc_free_output_buffers_; ditto https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:353: ISTM there's a ton of repetition among the above 4 blocks. Would it make sense to have a common DeviceState struct and just have 4 of them, one per MFC/GSC input/output? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:355: unsigned int frame_width_; gfx::Size https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:358: // "private" API pointers as exported by libmali, that we require. see above; l.358-369 belong only in the .cc file IMO
No, Tegra does not support V4L2. If we ever decide to go with them again, I'll see if it'd be possible to have it use V4L2, also depending on the quality of their openmax libs. On Tue, Oct 30, 2012 at 6:00 AM, <marcheu@chromium.org> wrote: > On 2012/10/29 21:54:27, Ami Fischman wrote: > >> Thanks for the explanation, Stephane. >> Do you know whether tegra exposes v4l2? (would we want to keep OVDA if >> this landed?) >> > > Sorry, I have no idea what tegra does. Someone should dig in the chrome os > kernel-3.2 branch... > > https://chromiumcodereview.**appspot.com/11198060/<https://chromiumcodereview... >
Commentated, updated. PTAL. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:91: if (egl_images != NULL) { On 2012/10/29 23:30:16, Ami Fischman wrote: > can reverse the test and early-return, to reduce indentation. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:197: goto error; On 2012/10/29 23:30:16, Ami Fischman wrote: > No goto, please. > > You probably want something like RETURN_ON_{,OMX_}FAILURE() from > omx_video_decode_accelerator.cc Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:491: " driver handle"; On 2012/10/29 23:30:16, Ami Fischman wrote: > here and elsewhere, please use << explicitly instead of taking advantage of > implicit string literal concatenation. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:575: decoder_decode_buffer_tasks_scheduled_ += 1; On 2012/10/29 23:30:16, Ami Fischman wrote: > here and elsewhere, why do you use +=1 or -=1 instead of ++/--? Habit, I guess. I tend not to use postincrement, and preincrement looks ugly. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:690: // We don't handle midstream resizes right now. On 2012/10/29 23:30:16, Ami Fischman wrote: > You shouldn't expose DCHECKs & crashes to the web; something not supported > should result in a NOTIMPLEMENTED & decode-error. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:6: // that utilizes hardware video decoder present on Intel CPUs. On 2012/10/18 08:30:26, Ami Fischman wrote: > no it doesn't ;) Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:39: // * The API thread, which is the main GPU process thread which calls the On 2012/10/29 23:30:16, Ami Fischman wrote: > Chromium terminology calls this thread the ChildThread (the "main" thread of > each of the child processes, such as renderer, gpu, plugin, etc). I don't think > it serves any purpose to refer to it as the "API thread". Replace references > throughout? Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:57: ExynosVideoDecodeAccelerator(gfx::GLContext* gl_context, On 2012/10/29 23:30:16, Ami Fischman wrote: > style: here & elsewhere, arg lists should be either all +4 indented or indented > to the opening paren, not mixed (so here you want to add a newline before the > first arg) Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:80: // Call through the static function pointers we hold. On 2012/10/29 23:30:16, Ami Fischman wrote: > Can't these be private? > For that matter, they seem like they could just live in the .cc file since they > are fully static. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:88: // These are rather subjectively tuned. On 2012/10/29 23:30:16, Ami Fischman wrote: > Have you tested flash? ihf@ found he needed 6-8 output buffers in flight to > keep smoothness in some cases. I tested flash. Over and over again. I can turn things down pretty well on this VDA :-) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:103: kAfterReset, // After Reset(), ready to start decoding again. On 2012/10/29 23:30:16, Ami Fischman wrote: > I believe the only diff between this and kInitialized is buffer-allocation > stuff, which makes me think that this state should be deleted and instead have a > kNeedBuffers state between kInitialized and kDecoding. I think the kInitialized/kDecoding/kAfterReset stages match the logical flow of the state machine better. Something like kNeedsBuffers would be better suited for an independent flag, but I don't see that we need this. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:107: // Record for decoder input buffers. On 2012/10/29 23:30:16, Ami Fischman wrote: > Comments that apply to each of the structs defined below: > > - Can move into the .cc file w/ a fwd-decl here? > - Declare the dtor explicitly instead of accepting a compiler-generated dtor in > every compilation unit that includes this file. > - Any reason not to initialize the values through the ctor? That'll let you > make the fields const and simplify call-sites, I think. I suppose I could move BitstreamBufferRecord, but the rest have containers declared on them and have to be declared here. I'd keep them in the same place for consistency. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:118: bool queued : 1; // queued to device. On 2012/10/29 23:30:16, Ami Fischman wrote: > here and below: why bother with the :1? > If the compiler actually obeyed you it'd make the following fields unaligned, > and probably cost in performance. > > Generally unless you're going to COMPILE_ASSERT about the size of the struct, > specifying field widths is unwarranted (at least in chrome code). Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:131: size_t length[2]; // mmap() length for each plane. On 2012/10/29 23:30:16, Ami Fischman wrote: > what are these dmabufs/planes? It's hardware-specific business; I'm not sure that a comment here would help if you didn't already know what was up with dmabufs. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:139: int mfc_output; // MFC output to recycle when this input is complete On 2012/10/29 23:30:16, Ami Fischman wrote: > s/MFC output/MfcOutputRecord/ Well, we're tracking the buffer index here, not really the record itself. Updated slightly. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:139: int mfc_output; // MFC output to recycle when this input is complete On 2012/10/29 23:30:16, Ami Fischman wrote: > s/MFC output/MfcOutputRecord/ We're tracking the output buffer itself, not the record. Updated slightly. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:146: bool in_vda : 1; // held by VDA. On 2012/10/29 23:30:16, Ami Fischman wrote: > _this_ is the VDA :) > Do you mean at_client? Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:176: // DecodeBufferTask() to actually decoe the buffer. On 2012/10/29 23:30:16, Ami Fischman wrote: > typo: decoe Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:177: void DecodeTask(scoped_ptr<BitstreamBufferRecord> bitstream_record); On 2012/10/29 23:30:16, Ami Fischman wrote: > I think you can improve function naming here. WDYT of: > rename: EnqueueDecodeBuffer Generally I'm matching each "Foo" call from the API with a "FooTask" call to the decoder thread, so in this case it's named DecodeTask because it comes from Decode. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:181: void DecodeBufferTask(); On 2012/10/29 23:30:16, Ami Fischman wrote: > ...and rename: DoSomeDecodeWork > (and below) They're pretty closely associated with DecodeBufferTask(). Did you mean to rename all the helper functions to Do*() calls? https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:207: // receives the NotifyFlushDone callback. On 2012/10/29 23:30:16, Ami Fischman wrote: > this is fine, but make sure you can deal with a Reset() or Destroy() coming in > before NotifyFlushDone fires. (this is tested by v_d_a_unittest.cc, I believe) Yep, that case should work. It just sets a flag actually, that either Reset or Destroy will unset. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:223: bool StopDevice(); On 2012/10/29 23:30:16, Ami Fischman wrote: > You should doco the return value of these methods, but more alarming is that > almost all of their callsites ignore the return value. Either void'ify if safe, > or make use of ret code everywhere (or ignore_result() it). Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:253: bool DestroyGscOutputBuffers(); On 2012/10/29 23:30:16, Ami Fischman wrote: > Again, return values seem to be silently ignored sometimes. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:256: scoped_refptr<base::MessageLoopProxy> message_loop_proxy_; On 2012/10/29 23:30:16, Ami Fischman wrote: > name it something more distinct, like {main,child}_loop_proxy_ ? > Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:263: // decoder thread to the API thread should use |weak_this_|. On 2012/10/29 23:30:16, Ami Fischman wrote: > Comment that |device_thread_| follows the same logic as |decoder_thread_| ? Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:271: // This thread handles VDA API entry points and callbacks. On 2012/10/29 23:30:16, Ami Fischman wrote: > I think this comment is misleading. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:273: // Decoder state. Owned by decoder_thread_. On 2012/10/29 23:30:16, Ami Fischman wrote: > Replace "Owned by..." with a blanket comment at l.271 since all of this state > below is read/written only by decoder_thread_ ? Done. I did some declaration moving-around to hopefully make this more clear. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:274: State decoder_state_; On 2012/10/29 23:30:16, Ami Fischman wrote: > ...except this seems to be read from both the decoder_thread_ and the main > thread. What's up with that? Probably the trickiest bit of state. It's only read and written from the decoder thread, except for when the decoder thread hasn't started yet, at which time the child thread does read/write it. Not even NotifyError touches it. I should probably document this better. I'm not sure though how I should explicitly mark ownership of this by one thread. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:278: int decoder_current_input_buffer_; On 2012/10/29 23:30:16, Ami Fischman wrote: > ditto s/decoder_// here and below. I was trying to mark these as more closely "decoder" state, as opposed to all the mfc_ and gsc_ state bits which are also owned by the decoder thread, but are more closely associated with the device. I've updated the comments and moved things around. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:283: // Buffers in the pipe, five-by-five. On 2012/10/29 23:30:16, Ami Fischman wrote: > lolwat? http://www.youtube.com/watch?v=fOZk--oZdQk (My recollection of it is mostly from StarCraft, but the original is Aliens) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:288: bool decoder_notify_flush_; On 2012/10/29 23:30:16, Ami Fischman wrote: > Can you make a pass and align these variable names with their purposes? Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:300: media::VideoCodecProfile video_codec_; On 2012/10/29 23:30:16, Ami Fischman wrote: > it's a profile, not a codec. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:303: EGLContext egl_context_; On 2012/10/29 23:30:16, Ami Fischman wrote: > you could assign these in the ctor and avoid the need for gl_context_ as a > member A little forward-thinking: I'd like to standardize the constructors that the various VDAs present in GpuVideoDecodeAccelerator::Initialize(). The lowest common denominator seems to be a gfx::GLContext*, a Client, and a make_current callback. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:309: // Completed decode buffers, waiting for MSC. On 2012/10/29 23:30:16, Ami Fischman wrote: > typo: MSC Done. The bit about "decode_output_mfc_input_queue_" was a hold-over from when I did some pre-processing before I stuffed it into the MFC. Renamed now. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:313: int mfc_fd_; On 2012/10/29 23:30:16, Ami Fischman wrote: > Use base::ScopedFD, per IM convo. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:318: // Mapping to track MFC input buffers. On 2012/10/29 23:30:16, Ami Fischman wrote: > There's a lot of "map" commentary/variable-naming below, but it's all applied to > vectors. You want to clarify that other code refers to elements of these > vectors by their index? > If that's the case, you should describe when these are populated and the fact > that they are never reordered or otherwise modified until tear-down time. I tweaked the comment a bit. I think the fact that they're called a mapping and accessed with the usual [] business might be self-documenting enough. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:327: size_t mfc_output_buffer_size_[2]; On 2012/10/29 23:30:16, Ami Fischman wrote: > doco what the [2] is about. Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:334: int gsc_fd_; On 2012/10/29 23:30:16, Ami Fischman wrote: > scoped Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:339: std::deque<int> gsc_free_input_buffers_; On 2012/10/29 23:30:16, Ami Fischman wrote: > does this really need to be a FIFO or could it just as well be a LIFO and use a > vector (pushing & popping on the back)? Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:350: std::deque<int> gsc_free_output_buffers_; On 2012/10/29 23:30:16, Ami Fischman wrote: > ditto This one does have to be a FIFO -- before we can reuse a buffer, we have to do an expensive client sync on it. The longer the buffer's been in the queue, the less likely it is that we'll actually have to stall during the sync. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:353: On 2012/10/29 23:30:16, Ami Fischman wrote: > ISTM there's a ton of repetition among the above 4 blocks. > Would it make sense to have a common DeviceState struct and just have 4 of them, > one per MFC/GSC input/output? The only things really common to them are *_streamon_ and *_buffer_count_. If I end up generalizing this to a V4L VDA, then it would most likely make sense to break off that state into a separate *_decoder class like VAAPI has it at the moment. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:355: unsigned int frame_width_; On 2012/10/29 23:30:16, Ami Fischman wrote: > gfx::Size Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:358: // "private" API pointers as exported by libmali, that we require. On 2012/10/29 23:30:16, Ami Fischman wrote: > see above; l.358-369 belong only in the .cc file IMO Done. https://chromiumcodereview.appspot.com/11198060/diff/1/content/gpu/gpu_main.cc File content/gpu/gpu_main.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/gpu/gpu_main.c... content/gpu/gpu_main.cc:275: ExynosVideoDecodeAccelerator::PreSandboxInitialization(); On 2012/10/18 08:30:26, Ami Fischman wrote: > we should only load the relevant .so's. Done.
https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:575: decoder_decode_buffer_tasks_scheduled_ += 1; On 2012/10/30 09:05:03, sheu wrote: > On 2012/10/29 23:30:16, Ami Fischman wrote: > > here and elsewhere, why do you use +=1 or -=1 instead of ++/--? > > Habit, I guess. I tend not to use postincrement, and preincrement looks ugly. I'm going to cite "consistency with surrounding code" and ask you to replace +=1 and -=1 with pre-{in,de}crement, resp. ISTM +=1 and -=1 are so un-idiomatic in chromium code as to make the reader wonder what's going on (since if it was as simple as it seems, surely you'd have used ++ or --). https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.h:274: State decoder_state_; On 2012/10/30 09:05:03, sheu wrote: > On 2012/10/29 23:30:16, Ami Fischman wrote: > > ...except this seems to be read from both the decoder_thread_ and the main > > thread. What's up with that? > > Probably the trickiest bit of state. It's only read and written from the > decoder thread, except for when the decoder thread hasn't started yet, at which > time the child thread does read/write it. Not even NotifyError touches it. > > I should probably document this better. I'm not sure though how I should > explicitly mark ownership of this by one thread. If you wanted to be super-XTREME about it, you could pass it in to the decoder_thread_'s ctor, and make it ThreadLocal. Just sayin'. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:37: (EGLImageKHR, EGLint*, void*) = NULL; style: here and below, opening paren belongs on previous line https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:47: : shm(shm), self-assign? (here and below) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:53: at_device = false; here and below, use initializer lists for stuff that can use it. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:91: egl_images_count(egl_images_count) { ditto https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:102: if (egl_image_fds[i] != -1) If it possible for the test here and the one two lines up to disagree? (worth DCHECK'ing?) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:172: DCHECK_EQ(child_message_loop_proxy_, base::MessageLoopProxy::current()); here and below, DCHECK_EQ(child_message_loop_proxy_, base::MessageLoopProxy::current()); is more readable as: DCHECK(child_message_loop_proxy_->BelongsToCurrentThread()); https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:174: int ret; style: declare vars as near to first use as possible. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:175: const __u32 caps_required = consts get kCapsRequired-style naming https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:175: const __u32 caps_required = Please move south to first use https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:176: V4L2_CAP_VIDEO_CAPTURE_MPLANE | why do we care about video /capture/ for this file? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:202: DLOG(ERROR) << "Initialize(): PostSandboxInitialization() failed"; nit: if failure was in PreSandbox msg is slightly misleading. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:210: DLOG(ERROR) << "Initialize(): could not set EGLContext"; s/set/get/? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:220: return false; wouldn't life be nicer if NOTIFY_ERROR also took care of setting the decoder_state_ and DLOG'ing like OVDA.cc's macros do? (although I note that sometimes you set decoder_state_ explicitly and other times you let NOTIFY_ERROR->NotifyError->CleanupTask do it for you; IWBN to have consistency) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:269: DLOG(ERROR) << "Initialize(): ioctl() failed: VIDIOC_QUERYCAP" errors from from MFC & GSC fd's will look the same except for line number. Disambiguate? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:280: control.value = 8; // Magic number. random/magic doco'd somewhere? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:281: errno = 0; here and elsewhere, is the errno=0 really required, considering you only read it (with DPLOG) if ret says to read it? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:322: decoder_state_ = kInitialized; didn't somebody tell me the child thread only touches this var until decoder_thread_ is Start()ed? :) You could post this assignment to the decoder_thread_ and then trigger the NotifyInitializeDone from there, or you could just move this assignment above the Start() at l.315 and let the assignment at l.317 overwrite in case Start() fails. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:330: DVLOG(1) << "Decode(): input_id=" << bitstream_buffer.id() << style: << goes on next line (even though it's an operator, and normally operators go on previous line, stream operators go on next line *and* line up with each other instead of following usual +4 indent; crazy!) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:335: bitstream_record(new BitstreamBufferRecord( bitstream_record could go on previous line (and de-indent -4 the following 2 lines) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:362: DCHECK_EQ(buffers.size(), (size_t)gsc_output_buffer_count_); style: no c-style casts; reinterpret_cast<>, unfortch (or +0 trickery) (here and elsewhere; please make a pass) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:372: const static EGLint image_attrs[] = { Should this be using Gles2TextureToEglImageTranslator::TranslateToEglImage() instead? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:391: egl_image = eglCreateImageKHR(egl_display_, EGL_NO_CONTEXT, style: don't mix arg styles; either everything's +4 indent or everything's indented to opening paren (can't ahve first arg on same line and rest +4) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:403: ret = (*mali_egl_image_get_buffer_ext_phandle)( (*foo)(...) == foo(...) and the latter is less messy. (here and elsewhere) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:421: // Must be run on API thread, as we're inserting a sync in the EGL context. s/API/Child/ but I think this comment really applies to the sync part below, not the API as a whole (which must run on the child thread b/c that's where GVDA will run it ;)) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:443: return; drop https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:468: if (decoder_thread_.IsRunning()) { This /looks/ racy (although is probably not b/c the only way for decoder_thread_ to stop/be joined is to come through here). Better would be to PostTask unconditionally and inspect the return value. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:475: // Otherwise, destroy directly. This implies DestroyTask is named unfortunately. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:640: // If we're behind on tasks, schedule another one. Is this equivalent to testing whether the input buffer just failed to get fully consumed? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:641: if ((size_t)decoder_decode_buffer_tasks_scheduled_ < c-style cast https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:692: if (frame_buffer_size_.width() != 0 || frame_buffer_size_.height() != 0) { !IsEmpty() https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:693: if (frame_buffer_size_.width() != (int)format.fmt.pix_mp.width || frame_buffer_size_ != gfx::Size(format.fmt.pix_mp.width, format.fmt.pix_mp.height) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:750: DCHECK_EQ(device_thread_.IsRunning(), true); What changed since l.656 where this was false, and then we're called at l.662? (how is this not crashing?) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:750: DCHECK_EQ(device_thread_.IsRunning(), true); DCHECK_EQ(Foo(), true) is more readable as DCHECK(Foo()) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:789: LOG(ERROR) << "AppendToInputFrame(): over-size frame, truncating"; where is input_record.length populated? Is it really ok to truncate randomly like this? Shouldn't it be a terminal decode error? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:810: // Queue it to MSC. typo: MSC https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:818: if (decoder_state_ == kError) l.818-821 is: return decoder_state_ != kError; https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:857: StartDevice(); We can only get here through AssignPictureBuffers which is only sent in response to RequestPictureBuffers which is only sent from CreateGscOutputBuffers() which is only called from DecodeBufferInitial which already calls StartDevice, so why do it again here? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:865: DCHECK_NE(decoder_state_, kAfterReset); How can you guarantee this? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:882: mfc_input_ready_queue_.size() << "] => MSC[" << typo: MSC (just do a global search for this, wouldja? :)) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:905: while (!mfc_input_ready_queue_.empty()) { how can this queue ever contain more than a single element? (and not have the decoder in error state) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:909: DCHECK_EQ(input_record.at_device, false); drop the _EQ https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:913: qbuf.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; I'm going to defer to posciak@ for reviewing the v4l2-specific stuff. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:933: ret = ioctl(mfc_fd_, VIDIOC_STREAMON, &type); Shouldn't this be done much earlier, during initialization? (ditto for the other _streamon_'s below) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:987: DCHECK_NE(decoder_state_, kAfterReset); I don't think you can guarantee this https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1000: decoder_buffers + mfc_free_input_buffers_.size() + 1) { I don't understand l.993-1000. Can you explain? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1038: ret = ioctl(mfc_fd_, VIDIOC_DQBUF, &dqbuf); move the ioctl into the while condition to avoid having to repeat it at l.1055? (ret is unused) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1054: errno = 0; unfortch that l.1048-1054 is a copy of l.1031-1037 https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1058: // We're not streaming this queue; skip. what does this mean? In what scenario can this happen? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1068: void ExynosVideoDecodeAccelerator::EnqueueGsc() { Made it down to here and ran out of steam. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1391: decoder_state_ = kError; what's this? more decoder_state_ writing on the child thread after the decoder thread has been started! https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:5: // This file contains an implementation of VideoDecoderAccelerator OOC how does perf of this VDA compare to OVDA on exynos? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:110: ~BitstreamBufferRecord() {} here and below, dtors should be defined out-of-line, in the cc file. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:166: const scoped_array<int> egl_image_fds; this and the previous are just arrays of PODs, so not sure why you wouldn't instead keep them in a std::vector<> (and copy as necessary). https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:189: bool DecodeBufferInitial(const void* data, size_t size); doco ret values of these two. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:226: void CleanupTask(); This is named "Task" implying running on decoder_thread_ but in reality it runs on the child thread (and asserts that fact, too!) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/gpu_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/gpu_video_decode_accelerator.cc:28: #include "content/common/gpu/media/exynos_video_decode_accelerator.h" goes above previous line for sorting https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/video_decode_accelerator_unittest.cc:52: #include "content/common/gpu/media/exynos_video_decode_accelerator.h" above prev line https://chromiumcodereview.appspot.com/11198060/diff/19001/content/gpu/gpu_ma... File content/gpu/gpu_main.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/gpu/gpu_ma... content/gpu/gpu_main.cc:37: #include "content/common/gpu/media/exynos_video_decode_accelerator.h" above prev line
Updated. PTALSMKTHXBAI (PTAL some more *) https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/1/content/common/gpu/med... content/common/gpu/media/exynos_video_decode_accelerator.cc:575: decoder_decode_buffer_tasks_scheduled_ += 1; For POD it doesn't matter much, but the return-copy-of-original _could_ be a problem for types that overload post-ops. But sure, I'll replace them here :-P https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:37: (EGLImageKHR, EGLint*, void*) = NULL; On 2012/10/31 01:06:50, Ami Fischman wrote: > style: here and below, opening paren belongs on previous line Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:47: : shm(shm), On 2012/10/31 01:06:50, Ami Fischman wrote: > self-assign? > (here and below) Not actually an error: the outer resolves in class scope and the inner resolves in the function call scope (see: http://stackoverflow.com/questions/2227244/what-if-a-constructor-parameter-ha...) The style guide has has no guidance on this, and I thought this was cleaner. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:53: at_device = false; On 2012/10/31 01:06:50, Ami Fischman wrote: > here and below, use initializer lists for stuff that can use it. I figured this was more uniform-looking, but if you insist... :) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:102: if (egl_image_fds[i] != -1) On 2012/10/31 01:06:50, Ami Fischman wrote: > If it possible for the test here and the one two lines up to disagree? > (worth DCHECK'ing?) It can happen if we fail in AssignPictureBuffers while creating the images. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:172: DCHECK_EQ(child_message_loop_proxy_, base::MessageLoopProxy::current()); On 2012/10/31 01:06:50, Ami Fischman wrote: > here and below, > DCHECK_EQ(child_message_loop_proxy_, base::MessageLoopProxy::current()); > is more readable as: > DCHECK(child_message_loop_proxy_->BelongsToCurrentThread()); Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:174: int ret; On 2012/10/31 01:06:50, Ami Fischman wrote: > style: declare vars as near to first use as possible. Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:175: const __u32 caps_required = On 2012/10/31 01:06:50, Ami Fischman wrote: > consts get kCapsRequired-style naming Done. Used to be a problem when we had gotos :-) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:176: V4L2_CAP_VIDEO_CAPTURE_MPLANE | On 2012/10/31 01:06:50, Ami Fischman wrote: > why do we care about video /capture/ for this file? Hehe. Blame V4L2. VIDEO_OUTPUT == decoder input VIDEO_CAPTURE == decoder output https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:202: DLOG(ERROR) << "Initialize(): PostSandboxInitialization() failed"; On 2012/10/31 01:06:50, Ami Fischman wrote: > nit: if failure was in PreSandbox msg is slightly misleading. It'll report what failed, which should be fairly obviously traceable to PreSandbox. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:210: DLOG(ERROR) << "Initialize(): could not set EGLContext"; On 2012/10/31 01:06:50, Ami Fischman wrote: > s/set/get/? Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:220: return false; Removed CleanupTask (see .h). The NOTIFY_ERROR should be callable from any thread, so it can't set decoder_state_ directly. Also, sometimes I DLOG and sometimes DPLOG, so I can't fold that one in readily either. (The macro does have a LOG, so it's possible to track the line number at which the actual failure occured.) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:269: DLOG(ERROR) << "Initialize(): ioctl() failed: VIDIOC_QUERYCAP" On 2012/10/31 01:06:50, Ami Fischman wrote: > errors from from MFC & GSC fd's will look the same except for line number. > Disambiguate? Yeah, I'm relying on line number here, but I don't think that's too much an issue. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:280: control.value = 8; // Magic number. On 2012/10/31 01:06:50, Ami Fischman wrote: > random/magic doco'd somewhere? Updated comment. It's from Samsung. :-) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:281: errno = 0; Paranoia I guess. I like to initialize to known value before I call out to external functions. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:322: decoder_state_ = kInitialized; On 2012/10/31 01:06:50, Ami Fischman wrote: > didn't somebody tell me the child thread only touches this var until > decoder_thread_ is Start()ed? :) > > You could post this assignment to the decoder_thread_ and then trigger the > NotifyInitializeDone from there, or you could just move this assignment above > the Start() at l.315 and let the assignment at l.317 overwrite in case Start() > fails. Heh. Well, at this point (until return of this function), no tasks can have been posted to the decoder_task_ yet, so there's still no race. I guess I should annotate all naughty touches of decoder_state_ with a comment. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:330: DVLOG(1) << "Decode(): input_id=" << bitstream_buffer.id() << On 2012/10/31 01:06:50, Ami Fischman wrote: > style: << goes on next line (even though it's an operator, and normally > operators go on previous line, stream operators go on next line *and* line up > with each other instead of following usual +4 indent; crazy!) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:335: bitstream_record(new BitstreamBufferRecord( On 2012/10/31 01:06:50, Ami Fischman wrote: > bitstream_record could go on previous line (and de-indent -4 the following 2 > lines) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:362: DCHECK_EQ(buffers.size(), (size_t)gsc_output_buffer_count_); On 2012/10/31 01:06:50, Ami Fischman wrote: > style: no c-style casts; reinterpret_cast<>, unfortch (or +0 trickery) > > (here and elsewhere; please make a pass) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:372: const static EGLint image_attrs[] = { My original version of EVDA did; I removed it to remove header dependencies and some overhead. The ImageTranslator keeps around the X pixmap until the EGLImageKHR is destroyed, which is actually not necessary according to spec (see l.393). Perhaps the NVIDIA driver required it (for OVDA), so I haven't changed ImageTranslator itself. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:391: egl_image = eglCreateImageKHR(egl_display_, EGL_NO_CONTEXT, On 2012/10/31 01:06:50, Ami Fischman wrote: > style: don't mix arg styles; either everything's +4 indent or everything's > indented to opening paren (can't ahve first arg on same line and rest +4) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:403: ret = (*mali_egl_image_get_buffer_ext_phandle)( On 2012/10/31 01:06:50, Ami Fischman wrote: > (*foo)(...) == foo(...) > and the latter is less messy. > > (here and elsewhere) Another bit of pedantry -- it's a function pointer, it deserves to be dereferenced first :-) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:421: // Must be run on API thread, as we're inserting a sync in the EGL context. On 2012/10/31 01:06:50, Ami Fischman wrote: > s/API/Child/ > but I think this comment really applies to the sync part below, not the API as a > whole (which must run on the child thread b/c that's where GVDA will run it ;)) Updated comment, but I think it still makes sense. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:443: return; On 2012/10/31 01:06:50, Ami Fischman wrote: > drop Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:468: if (decoder_thread_.IsRunning()) { On 2012/10/31 01:06:50, Ami Fischman wrote: > This /looks/ racy (although is probably not b/c the only way for decoder_thread_ > to stop/be joined is to come through here). > > Better would be to PostTask unconditionally and inspect the return value. Good point. I presume that message_loop_proxy() is preferable then. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:475: // Otherwise, destroy directly. On 2012/10/31 01:06:50, Ami Fischman wrote: > This implies DestroyTask is named unfortunately. Not sure what you mean here. DestroyTask() is a task to be run from the decoder thread when possible -- otherwise we'll just run it directly. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:640: // If we're behind on tasks, schedule another one. On 2012/10/31 01:06:50, Ami Fischman wrote: > Is this equivalent to testing whether the input buffer just failed to get fully > consumed? Not necessarily; each run through DecodeBufferTask only (potentially) consumes one buffer, but more than one input buffer may be queued by the client. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:641: if ((size_t)decoder_decode_buffer_tasks_scheduled_ < On 2012/10/31 01:06:50, Ami Fischman wrote: > c-style cast Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:692: if (frame_buffer_size_.width() != 0 || frame_buffer_size_.height() != 0) { On 2012/10/31 01:06:50, Ami Fischman wrote: > !IsEmpty() Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:693: if (frame_buffer_size_.width() != (int)format.fmt.pix_mp.width || On 2012/10/31 01:06:50, Ami Fischman wrote: > frame_buffer_size_ != gfx::Size(format.fmt.pix_mp.width, > format.fmt.pix_mp.height) I'm a bit disinclined to create objects just to compare here, or this the Chromium way? https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:750: DCHECK_EQ(device_thread_.IsRunning(), true); On 2012/10/31 01:06:50, Ami Fischman wrote: > DCHECK_EQ(Foo(), true) > is more readable as > DCHECK(Foo()) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:789: LOG(ERROR) << "AppendToInputFrame(): over-size frame, truncating"; On 2012/10/31 01:06:50, Ami Fischman wrote: > where is input_record.length populated? > Is it really ok to truncate randomly like this? Shouldn't it be a terminal > decode error? The length gets filled in in CreateMfcInputBuffers(). I we can make this a fatal error then. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:818: if (decoder_state_ == kError) On 2012/10/31 01:06:50, Ami Fischman wrote: > l.818-821 is: > return decoder_state_ != kError; Sure. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:857: StartDevice(); On 2012/10/31 01:06:50, Ami Fischman wrote: > We can only get here through AssignPictureBuffers which is only sent in response > to RequestPictureBuffers which is only sent from CreateGscOutputBuffers() which > is only called from DecodeBufferInitial which already calls StartDevice, so why > do it again here? Right after initialization, we get calls to Decode() that set up the decoding pipe. At this point, the calls to StartDevice() in DecodeBufferInitial() will fail because we don't have output buffers yet, because we haven't called AssignPictureBuffers() yet, because we have not requested any buffers yet, because we don't know the image size until we run DecodeBufferInitial() successfully at least once. During a Reset(), we call StopDevice(). Then, if we resume decoding, DecodeBufferInitial() will be called again, and in this case the StartDevice() will succeed there. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:865: DCHECK_NE(decoder_state_, kAfterReset); On 2012/10/31 01:06:50, Ami Fischman wrote: > How can you guarantee this? We only get put into kAfterReset after a ResetDoneTask(). It goes like this: 1. Reset() calls StopDevice(), which stops the DeviceTask(), so no more ServiceDeviceTask() calls get posted to the decoder_thread_ 2. Reset() posts a ResetDoneTask(), and then drains all intervening tasks up to the ResetDoneTask 3. ResetDoneTask() then sets device state to kAfterReset. After step (2), there are no ServiceDeviceTask()s outstanding, so at no point should an outstanding ServiceDeviceTask() ever see kAfterReset state. I've updated the comment in the .h about ResetTask(). https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:882: mfc_input_ready_queue_.size() << "] => MSC[" << On 2012/10/31 01:06:50, Ami Fischman wrote: > typo: MSC > (just do a global search for this, wouldja? :)) :-) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:905: while (!mfc_input_ready_queue_.empty()) { On 2012/10/31 01:06:50, Ami Fischman wrote: > how can this queue ever contain more than a single element? > (and not have the decoder in error state) If the decoder is backed up, we can have the client sending us BitstreamBuffers while we haven't finished decoding the data from previous buffers. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:909: DCHECK_EQ(input_record.at_device, false); On 2012/10/31 01:06:50, Ami Fischman wrote: > drop the _EQ Did some regexing on this. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:933: ret = ioctl(mfc_fd_, VIDIOC_STREAMON, &type); On 2012/10/31 01:06:50, Ami Fischman wrote: > Shouldn't this be done much earlier, during initialization? > (ditto for the other _streamon_'s below) Nope. V4L2 requires that you have at least one buffer queued before you can STREAMON. Dumb, I know. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:987: DCHECK_NE(decoder_state_, kAfterReset); On 2012/10/31 01:06:50, Ami Fischman wrote: > I don't think you can guarantee this Ah, turns out you are correct here, since we can get here through DecodeBufferInitial() after a reset. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1000: decoder_buffers + mfc_free_input_buffers_.size() + 1) { On 2012/10/31 01:06:50, Ami Fischman wrote: > I don't understand l.993-1000. Can you explain? So. We don't want to completely drain MFC. So we need to count how many buffers we have in it. We do this by counting the number of buffers we _don't_ have in it, which is the sum of the ready queue size + [1 if there is a buffer currently being filled] + number of free buffers. We'll keep draining MFC unti draining the MFC would cause this number to be equal to the total number of buffers. You know what? That's confusing and error-prone. I've updated to use an explicit count. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1038: ret = ioctl(mfc_fd_, VIDIOC_DQBUF, &dqbuf); On 2012/10/31 01:06:50, Ami Fischman wrote: > move the ioctl into the while condition to avoid having to repeat it at l.1055? > (ret is unused) Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1058: // We're not streaming this queue; skip. On 2012/10/31 01:06:50, Ami Fischman wrote: > what does this mean? In what scenario can this happen? We could be streaming the OUTPUT but not the CAPTURE queue presently, or vice-versa. Don't completely wig out if this is the case, and go on to servicing the other queue. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1068: void ExynosVideoDecodeAccelerator::EnqueueGsc() { On 2012/10/31 01:06:50, Ami Fischman wrote: > Made it down to here and ran out of steam. Halfway there! https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1391: decoder_state_ = kError; On 2012/10/31 01:06:50, Ami Fischman wrote: > what's this? more decoder_state_ writing on the child thread after the decoder > thread has been started! Removed, see .h. This one's an actual bug :-) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:5: // This file contains an implementation of VideoDecoderAccelerator I haven't tuned it, but it's looking to be about 5 percentage points less CPU usage (as measured by top) when decoding 1080p. Also, it's not the buggy piece of thread-dangerous crap that secomx on top of OVDA is. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:166: const scoped_array<int> egl_image_fds; To keep things lightweight? I initially inclined to do this also for the maps. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:189: bool DecodeBufferInitial(const void* data, size_t size); On 2012/10/31 01:06:50, Ami Fischman wrote: > doco ret values of these two. Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:226: void CleanupTask(); On 2012/10/31 01:06:50, Ami Fischman wrote: > This is named "Task" implying running on decoder_thread_ but in reality it runs > on the child thread (and asserts that fact, too!) Removed entirely. Wasn't doing much anyways, and was touching decoder_state_. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:228: // Start/stop DeviceTask() running on device_thread_. Renamed to StartDevicePoll/StopDevicePoll. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:237: void DeviceTask(); Renamed to DevicePollLoop(). https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/gpu_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/gpu_video_decode_accelerator.cc:28: #include "content/common/gpu/media/exynos_video_decode_accelerator.h" On 2012/10/31 01:06:50, Ami Fischman wrote: > goes above previous line for sorting Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/video_decode_accelerator_unittest.cc:52: #include "content/common/gpu/media/exynos_video_decode_accelerator.h" On 2012/10/31 01:06:50, Ami Fischman wrote: > above prev line Done. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/gpu/gpu_ma... File content/gpu/gpu_main.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/gpu/gpu_ma... content/gpu/gpu_main.cc:37: #include "content/common/gpu/media/exynos_video_decode_accelerator.h" On 2012/10/31 01:06:50, Ami Fischman wrote: > above prev line Done.
Also: I did rename DeviceTask() and associated functions to DevicePollLoop(), since it isn't a decoder task.
Also: I did rename DeviceTask() and associated functions to DevicePollLoop(), since it isn't a decoder task.
https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1005: // queue. Note Not uploaded yet, but fixed https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1630: // on their queuespoll . Not uploaded yet, but fixed. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:2017: << ", height=" << frame_buffer_size_.height(); Not uploaded yet, but fixed
Still haven't made it into the second half of the .cc file. https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:53: at_device = false; On 2012/11/01 02:16:08, sheu wrote: > On 2012/10/31 01:06:50, Ami Fischman wrote: > > here and below, use initializer lists for stuff that can use it. > > I figured this was more uniform-looking, but if you insist... :) FWIW, initializer lists have the benefit over explicit assignments that they both support const members better and enforce initialization order (helpful for avoiding out-of-order bugs). https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:176: V4L2_CAP_VIDEO_CAPTURE_MPLANE | On 2012/11/01 02:16:08, sheu wrote: > On 2012/10/31 01:06:50, Ami Fischman wrote: > > why do we care about video /capture/ for this file? > > Hehe. Blame V4L2. > > VIDEO_OUTPUT == decoder input > VIDEO_CAPTURE == decoder output Like I said, gonna let posciak@ review for v4l2 correctness ;) https://chromiumcodereview.appspot.com/11198060/diff/19001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:857: StartDevice(); On 2012/11/01 02:16:08, sheu wrote: > On 2012/10/31 01:06:50, Ami Fischman wrote: > > We can only get here through AssignPictureBuffers which is only sent in > response > > to RequestPictureBuffers which is only sent from CreateGscOutputBuffers() > which > > is only called from DecodeBufferInitial which already calls StartDevice, so > why > > do it again here? > > Right after initialization, we get calls to Decode() that set up the decoding > pipe. At this point, the calls to StartDevice() in DecodeBufferInitial() will > fail because we don't have output buffers yet, because we haven't called > AssignPictureBuffers() yet, because we have not requested any buffers yet, > because we don't know the image size until we run DecodeBufferInitial() > successfully at least once. > > During a Reset(), we call StopDevice(). Then, if we resume decoding, > DecodeBufferInitial() will be called again, and in this case the StartDevice() > will succeed there. Wow. Please comment both calls to indicate which of the two codepaths lead there (something as simple as "needed after reset" and "needed by first decode") https://chromiumcodereview.appspot.com/11198060/diff/23001/content/browser/gp... File content/browser/gpu/gpu_process_host.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/browser/gp... content/browser/gpu/gpu_process_host.cc:49: namespace content { If you rebase during a CR, your reviewer is gonna have a bad time. Proper rebase/CR etiquette is: avoid rebasing during the review if possible. If you need to rebase (usually b/c the patch is very big and you want to avoid later harder conflicts), the thing to do is respond to all extant comments, upload new patchset titled something like "CR responses", *then* rebase and upload another patchset titled "rebase only" which only has the rebase. That way your reviewer can confirm CR responses and just skim over the rebase. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:114: for (int i = 0; i < egl_images_count; i++) { pre-increment, not post- (everywhere) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:257: int ret; declare at first use (l.264 below https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:390: const static EGLint kImageAttrs[] = { The translator only has support for backing pixmaps for exynos boards; NVIDIA doesn't need pixmaps at all: http://code.google.com/searchframe#OAMlx_jo-ck/src/content/common/gpu/media/o... So you can actually make the change you want in the translator without worrying about nvidia, and use it here. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:492: DestroyTask(); On 2012/11/01 02:16:08, sheu wrote: > On 2012/10/31 01:06:50, Ami Fischman wrote: > > This implies DestroyTask is named unfortunately. > > Not sure what you mean here. DestroyTask() is a task to be run from the decoder > thread when possible -- otherwise we'll just run it directly. My point is that all other FooTask methods are unconditionally run on the decoder thread, but this is only best-effort that way. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:659: static_cast<int>(decoder_input_queue_.size())) { I wonder whether you'd be happier by turning a bunch of the state-tracking members into appropriate types for the comparisons they end up in (probably mostly size_t). https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:685: int ret; decl at first assignment at l.689 below. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:711: static_cast<int>(format.fmt.pix_mp.width) || On 2012/11/01 02:16:08, sheu wrote: > On 2012/10/31 01:06:50, Ami Fischman wrote: > > frame_buffer_size_ != gfx::Size(format.fmt.pix_mp.width, > > format.fmt.pix_mp.height) > > I'm a bit disinclined to create objects just to compare here, or this the > Chromium way? The Chromium Way is to strongly prefer more-readable code. I doubt "creating the object" will actually translate into less efficient object code once -O2 is done with it. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1003: // device, epoll() will return EPOLLERR and our DevicePollLoop() will exit It seems kinda crazy to have this affordance here. Is it stupid to suggest that DevicePollLoop() should instead be aware of this condition and re-enter the epoll? If you're worried about burning CPU going into no-op epoll's repeatedly, my suggestion is to use a condvar/notification/waitableevent. I guess you might be averse to that as an explicit piece of synchronization in a class that is otherwise beautifully sync-free, but do you really think this bunch of code is a better answer? Does this prevent receiving the last picture fed to Decode() when decode is paused (i.e. the client sends a bunch of buffers, but doesn't request Flush or Reset, it just wants to receive pictures)? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1013: int ret; decl at first use (I'm going to stop saying that now; please make a pass) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1070: // We're not streaming this queue; skip. does this happen only during startup/shutdown, and only during the first call to ioctl (not subsequent loop iterations)? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1707: NOTIFY_ERROR(PLATFORM_FAILURE); Your rationale for not including decoder_state_=kError; in NOTIFY_ERROR is that you want N_E to be callable from any thread. But in these calls to N_E from the device thread, what is charged with putting the decoder into an error state? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:189: // There return true if an input buffer is consumed, otherwise false. It may s/There/These/ but more stylish to use the active voice: Return true if... https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:248: // Other utility functions doco which thread they run on?
Got through it all! https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:175: Destroy(); Part of VDA::Destroy()'s API is that it deletes |this| (and in fact that's the only correct way to enter the dtor) so not sure what's up with this explicit call, and the following DCHECKs will be use-after-free's if Destroy obeys its contract. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:510: decoder_state_ = kError; where's the delete this; ? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1094: break; return (it's a long way down to the closing braces of the broken while, and then nothing else happens) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1098: break; ditto return https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1103: break; ditto https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1204: // Work around this by never _completely_ draining the GSC input queue. ditto point about synchronization vs. holding an extra buffer in the device https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1239: // the VDA. Don't recycle to its free list yet -- we can't do that until s/VDA/VDA::Client/ https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1363: // jobs will early-out in the kResetting state. Do you want to make a pass and verify this? Many FooTask() methods don't actually early-return in kResetting. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1370: DVLOG(3) << "ResetDoneTask()"; DCHECK decoder_thread https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1417: return true; why true? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1459: mfc_output_streamon_ = true; can most/all of the last 30 lines be deduped against another place in this file? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1495: DPLOG(ERROR) << "StartDevicePoll(): ioctl() failed: VIDIOC_STREAMON"; ditto dedup this ~30 lines https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1529: return false; This ioctl'ing business look really ripe for a helper to me. (here and in lots of other places) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1682: } this if block can be extracted out of the fd check to dedup with the gsc_fd_ version? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1688: if (decoder_current_bitstream_buffer_ != NULL) why is it safe to inspect this on this thread? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1690: if (decoder_decode_buffer_tasks_scheduled_ < buffers_pending) { ditto thread safety here and below. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1708: return; I think you can simplify & shorten this function significantly by early-detecting event.data.fd != gsc_fd_ && event.data.fd != mfc_fd_ and otherwise scheduling a ServiceDoneTask which can also take care of requesting DecodeBufferTask if necessary. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1739: video_profile_ <= media::VP8PROFILE_MAX) { indent https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1741: } else NOTREACHED? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1959: } man, I *really* hope you figure out a way to do ioctl's in fewer LOC... https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1985: format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_RGB32; Is V_P_F_RGB32 == RGBA8888 ? https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:229: // Start/stop DevicePollLoop() running on device_monitor_thread_. doco return value. in particular ISTM Start() can return true w/out the thread being started (???)
https://chromiumcodereview.appspot.com/11198060/diff/23001/content/browser/gp... File content/browser/gpu/gpu_process_host.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/browser/gp... content/browser/gpu/gpu_process_host.cc:49: namespace content { On 2012/11/02 17:57:06, Ami Fischman wrote: > If you rebase during a CR, your reviewer is gonna have a bad time. > > Proper rebase/CR etiquette is: avoid rebasing during the review if possible. If > you need to rebase (usually b/c the patch is very big and you want to avoid > later harder conflicts), the thing to do is respond to all extant comments, > upload new patchset titled something like "CR responses", *then* rebase and > upload another patchset titled "rebase only" which only has the rebase. That > way your reviewer can confirm CR responses and just skim over the rebase. Gotcha, thanks. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:114: for (int i = 0; i < egl_images_count; i++) { On 2012/11/02 17:57:06, Ami Fischman wrote: > pre-increment, not post- > (everywhere) Style guide says both are fine for POD scalar types. For loop iteration though, I guess there _is_ an argument to be made for it to be treated like any other iterator type, i.e. always use pre-increment. So ok. Pre-increment in the loops, but post-increment other places because it still looks funny to me. (I still expect to see operators on the rhs of an l-value.) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:175: Destroy(); On 2012/11/02 21:11:08, Ami Fischman wrote: > Part of VDA::Destroy()'s API is that it deletes |this| (and in fact that's the > only correct way to enter the dtor) so not sure what's up with this explicit > call, and the following DCHECKs will be use-after-free's if Destroy obeys its > contract. Did not see that. Alright, I'll have to run the VDA a few more times to check that we're not actually leaking stuff. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:257: int ret; On 2012/11/02 17:57:06, Ami Fischman wrote: > declare at first use (l.264 below Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:510: decoder_state_ = kError; On 2012/11/02 21:11:08, Ami Fischman wrote: > where's the > delete this; > ? Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:685: int ret; On 2012/11/02 17:57:06, Ami Fischman wrote: > decl at first assignment at l.689 below. Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:711: static_cast<int>(format.fmt.pix_mp.width) || On 2012/11/02 17:57:06, Ami Fischman wrote: > On 2012/11/01 02:16:08, sheu wrote: > > On 2012/10/31 01:06:50, Ami Fischman wrote: > > > frame_buffer_size_ != gfx::Size(format.fmt.pix_mp.width, > > > format.fmt.pix_mp.height) > > > > I'm a bit disinclined to create objects just to compare here, or this the > > Chromium way? > > The Chromium Way is to strongly prefer more-readable code. > I doubt "creating the object" will actually translate into less efficient object > code once -O2 is done with it. Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1003: // device, epoll() will return EPOLLERR and our DevicePollLoop() will exit On 2012/11/02 17:57:06, Ami Fischman wrote: > It seems kinda crazy to have this affordance here. > Is it stupid to suggest that DevicePollLoop() should instead be aware of this > condition and re-enter the epoll? > If you're worried about burning CPU going into no-op epoll's repeatedly, my > suggestion is to use a condvar/notification/waitableevent. I guess you might be > averse to that as an explicit piece of synchronization in a class that is > otherwise beautifully sync-free, but do you really think this bunch of code is a > better answer? > > Does this prevent receiving the last picture fed to Decode() when decode is > paused (i.e. the client sends a bunch of buffers, but doesn't request Flush or > Reset, it just wants to receive pictures)? If DevicePollLoop() wants to be aware of this, it's going to have to touch decoder state, and we're going to have to do some locking. I'm not sure how you would intend for the DevicePollLoop to wait on an event, when it's already epoll()-ing on an FD. I know this seems strange, but this doesn't affect the last picture returned to client, since this buffer retention is on the OUTPUT queue (the decoder input queue), and the CAPTURE queue (the decoder output queue) is fully drained as far as it will go. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1013: int ret; On 2012/11/02 17:57:06, Ami Fischman wrote: > decl at first use (I'm going to stop saying that now; please make a pass) Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1070: // We're not streaming this queue; skip. On 2012/11/02 17:57:06, Ami Fischman wrote: > does this happen only during startup/shutdown, and only during the first call to > ioctl (not subsequent loop iterations)? Again, the OUTPUT and CAPTURE queues for each device are independent, so I want to be able to run one without the other. From the logical flow of the code, I don't think this particular case can happen, but it's not really an error if it does. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1094: break; On 2012/11/02 21:11:08, Ami Fischman wrote: > return (it's a long way down to the closing braces of the broken while, and then > nothing else happens) It is a long way down, but this isn't like error-out-return, it's more like we're-done-here-break, and we just happen not to do anything afterwards. I don't think returning explicitly quite documents that, and it would have to be changed if we were to want to add more work at the bottom of this function. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1204: // Work around this by never _completely_ draining the GSC input queue. On 2012/11/02 21:11:08, Ami Fischman wrote: > ditto point about synchronization vs. holding an extra buffer in the device See above https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1239: // the VDA. Don't recycle to its free list yet -- we can't do that until On 2012/11/02 21:11:08, Ami Fischman wrote: > s/VDA/VDA::Client/ Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1363: // jobs will early-out in the kResetting state. On 2012/11/02 21:11:08, Ami Fischman wrote: > Do you want to make a pass and verify this? Many FooTask() methods don't > actually early-return in kResetting. Done. FlushTask() was missing the early-return; everything else was there already. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1370: DVLOG(3) << "ResetDoneTask()"; On 2012/11/02 21:11:08, Ami Fischman wrote: > DCHECK decoder_thread Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1417: return true; On 2012/11/02 21:11:08, Ami Fischman wrote: > why true? Added comment. Basically, if we don't have output buffers yet, this isn't necessarily an error; see the case of running StartDevicePoll from an initial DecodeBufferInitial(). https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1459: mfc_output_streamon_ = true; On 2012/11/02 21:11:08, Ami Fischman wrote: > can most/all of the last 30 lines be deduped against another place in this file? It could be, but I'd then be breaking the QBUF for MFC CAPTURE and GSC CAPTURE out into its own helper function, which means I'd probably break out MFC OUTPUT and GSC OUTPUT as well for consistency. I could, but I think the StartDevicePoll() and path is conceptually different enough that it should be explicitly copied out. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1495: DPLOG(ERROR) << "StartDevicePoll(): ioctl() failed: VIDIOC_STREAMON"; On 2012/11/02 21:11:08, Ami Fischman wrote: > ditto dedup this ~30 lines See above https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1529: return false; On 2012/11/02 21:11:08, Ami Fischman wrote: > This ioctl'ing business look really ripe for a helper to me. > (here and in lots of other places) Don't know that I can actually make this very much more compact. I've collapsed the call into the conditional, but past that, every ioctl call is a unique and special snowflake. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1682: } On 2012/11/02 21:11:08, Ami Fischman wrote: > this if block can be extracted out of the fd check to dedup with the gsc_fd_ > version? done (see below) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1688: if (decoder_current_bitstream_buffer_ != NULL) On 2012/11/02 21:11:08, Ami Fischman wrote: > why is it safe to inspect this on this thread? Dang. A real live thread-unsafety. Fixed by moving to ServiceDeviceTask(), and refactored this function. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1690: if (decoder_decode_buffer_tasks_scheduled_ < buffers_pending) { On 2012/11/02 21:11:08, Ami Fischman wrote: > ditto thread safety here and below. (see above) https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1708: return; On 2012/11/02 21:11:08, Ami Fischman wrote: > I think you can simplify & shorten this function significantly by > early-detecting > event.data.fd != gsc_fd_ && event.data.fd != mfc_fd_ > and otherwise scheduling a ServiceDoneTask which can also take care of > requesting DecodeBufferTask if necessary. My thoughts exactly. Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1739: video_profile_ <= media::VP8PROFILE_MAX) { On 2012/11/02 21:11:08, Ami Fischman wrote: > indent Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1741: } On 2012/11/02 21:11:08, Ami Fischman wrote: > else NOTREACHED? Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1959: } On 2012/11/02 21:11:08, Ami Fischman wrote: > man, I *really* hope you figure out a way to do ioctl's in fewer LOC... I don't really know that there is, unfortunately. Each ioctl call needs its control struct filled out, and after collapsing the ioctl call into the condition, there's not actually any more compactness to be had. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1985: format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_RGB32; On 2012/11/02 21:11:08, Ami Fischman wrote: > Is V_P_F_RGB32 == RGBA8888 ? Yep. There's also V4L2_PIX_FMT_BGR32 for the other way. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:189: // There return true if an input buffer is consumed, otherwise false. It may On 2012/11/02 17:57:06, Ami Fischman wrote: > s/There/These/ but more stylish to use the active voice: > Return true if... Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:229: // Start/stop DevicePollLoop() running on device_monitor_thread_. On 2012/11/02 21:11:08, Ami Fischman wrote: > doco return value. in particular ISTM Start() can return true w/out the thread > being started (???) Done. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:248: // Other utility functions On 2012/11/02 17:57:06, Ami Fischman wrote: > doco which thread they run on? Done.
Last revision had an issue. This one should be better. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1707: NOTIFY_ERROR(PLATFORM_FAILURE); On 2012/11/02 17:57:06, Ami Fischman wrote: > Your rationale for not including > decoder_state_=kError; > in NOTIFY_ERROR is that you want N_E to be callable from any thread. But in > these calls to N_E from the device thread, what is charged with putting the > decoder into an error state? That's error-prone, I agree. I was just counting on this not to happen, which it shouldn't (unless a kernel bug comes up or something), but still... Fixed by allowing a decoder_state_ update through PostTask(). https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:656: } Whoops, didn't mean to delete this. Uploading new version.
Left comments in PS 5 and 6. In general, there is quite a bit of code duplication that we could perhaps extract into functions? Other than that, it looks pretty good, really happy to see this! https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:36: // This class handls Exynos video acceleration directly through the V4L2 devices s/handls/handles https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:53: // the V4L2 device and notify the decoder_thread_ when something interesting s/device/devices ? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:83: // Lazily initialize static data after sandbox is enabled. Return false on Any reason why you seem to like leaving two spaces between sentences? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:91: kMfcOutputBufferExtraCount = 2, Document what "Extra" is? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:112: const size_t size; Could we use SharedMemory.size()? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:132: size_t bytes_used[2]; // bytes used in each dmabuf. Perhaps consider have kNumOutputPlanes = 2 constant? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:354: // GSC input buffer state. s/buffer/queue https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:366: int gsc_output_buffer_prepared_count_; Could you document better what "queued and not in use" means? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:369: // Output buffers ready to use. We need a LIFO here. Why LIFO? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:33: #define EXYNOS_GSC_DEVICE "/dev/gsc1" We'd like to be using all gsc nodes available. Could we have some logic to assign them equally among instances of this class? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:232: errno = 0; Why zero-out errno? Your are not using it... https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:256: errno = 0; You have quite a lot of those 6 lines everywhere. Personally, I'd suggest wrapping all ioctl calls into something like this: IOCTL_SUCCESS_OR_NOTIFY_AND_RETURN(fd, type, arg) \ do { \ errno = 0; \ if (ioctl(fd, type, arg) != 0) { \ DPLOG(ERROR) << __func__ << "(): failed: " #type; \ NOTIFY_ERROR(PLATFORM_FAILURE); \ return false; \ } \ while(0) https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:324: } If this function fails somewhere along the way, will Destroy() be called? It doesn't seem so... Will we properly clean up then (free buffers, close mfc/gsc fds, etc?). https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:607: size_t size = decoder_current_bitstream_buffer_->size; Why not use shm->size() ? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:684: DCHECK_EQ(format.fmt.pix_mp.num_planes, 2); I'm wondering if maybe this should be more than a DCHECK... https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:878: // If we're behind on decode, schedule another one. Extract l.878-885 into a function and use it here and at l.657? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:990: // Done all that we could. s/Done all that we could./No buffers to dequeue./ https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:992: } else if (errno == EINVAL) { s/errno == EINVAL/errno == EINVAL && !mfc_output_streamon_/ just to be precise, EINVAL can mean other errors as well. Here and everywhere else. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1029: memset(&dqbuf, 0, sizeof(dqbuf)); Perhaps refactor the while loop to be: while(1) { memset(&dqbuf, 0, sizeof(dqbuf)); memset(planes, 0, sizeof(planes)); dqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; dqbuf.memory = V4L2_MEMORY_MMAP; dqbuf.m.planes = planes; dqbuf.length = 2; errno = 0; ret = ioctl(); if (!ret) break; // handle successful dqbuf } to avoid repeating l.1013-1019? Here and everywhere else? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1061: // Bug workaround: GSC is liable to race conditions if more than one This is bug-worthy for SLSI I'd say... https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1079: egl_destroy_sync_khr(egl_display_, output_record.egl_sync); Is it valid to do this on decoder thread, i.e. a different thread that it was created on and/or without making context current? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1288: // If we don't have anything actually queued, we can notify immediatey. s/immediatey/immediately https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1307: decoder_current_bitstream_buffer_.reset(NULL); Enough to say reset() https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1404: qbuf.index = buffer; There is some code duplication here, this is basically the same as l.938. Perhaps we could extract this into a function? And in other similar places. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1419: if (ioctl(mfc_fd_, VIDIOC_STREAMON, &type) != 0) { Same here, streamon-related code is duplicated in a few places, could be extracted to a DeviceStreamon(fd, type). https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1481: __u32 type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; Wrap all ifs into DeviceStreamoff(fd, type) ? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1667: &ExynosVideoDecodeAccelerator::SetDecoderState, Indent error? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1721: for (int i = 0; i < mfc_input_buffer_count_; ++i) { This whole sequence could be extracted into a function, MapDeviceBuffers(fd, type, memory, num_planes) perhaps and replace l.1721- and l.1792-? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1742: void* offset = mmap(NULL, buffer.m.planes[0].length, Nit: calling this an "offset" is a bit misleading, it's a pointer. Especially since it's different from mem_offset. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1860: memset(&control, 0, sizeof(control)); Wrap all of those sequences into DeviceSetCtrl(fd, ctrl) ? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1977: void ExynosVideoDecodeAccelerator::DestroyMfcInputBuffers() { DestroyDeviceBuffers(fd, type, memory, num_planes) in all below functions perhaps? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:152: bool at_client; // held by client. Since at_device and at_client are mutually exclusive, should this perhaps be one enum state {kUnused, kAtDevice, kAtClient }? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/public/com... File content/public/common/content_switches.h (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/public/com... content/public/common/content_switches.h:240: extern const char kUseExynosVda[]; Wrap in #if defined for ARMEL/CHROMEOS? If not it'd have to go into alphabetical list at the top of the file.
Imma hold off reviewing the rest of this until posciak@ LG's. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1003: // device, epoll() will return EPOLLERR and our DevicePollLoop() will exit On 2012/11/03 00:37:14, sheu wrote: > On 2012/11/02 17:57:06, Ami Fischman wrote: > > It seems kinda crazy to have this affordance here. > > Is it stupid to suggest that DevicePollLoop() should instead be aware of this > > condition and re-enter the epoll? > > If you're worried about burning CPU going into no-op epoll's repeatedly, my > > suggestion is to use a condvar/notification/waitableevent. I guess you might > be > > averse to that as an explicit piece of synchronization in a class that is > > otherwise beautifully sync-free, but do you really think this bunch of code is > a > > better answer? > > > > Does this prevent receiving the last picture fed to Decode() when decode is > > paused (i.e. the client sends a bunch of buffers, but doesn't request Flush or > > Reset, it just wants to receive pictures)? > > If DevicePollLoop() wants to be aware of this, it's going to have to touch > decoder state, and we're going to have to do some locking. > > I'm not sure how you would intend for the DevicePollLoop to wait on an event, > when it's already epoll()-ing on an FD. An example of what I meant is: have a mutex dedicated to buffer deque. Acquire the mutex here for the duration of the draining, and release/acquire the same mutex in the device thread around the epoll; that way the code handling EPOLLERR can know whether it was triggered by the last buffer being yanked away from the epoll. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1094: break; On 2012/11/03 00:37:14, sheu wrote: > On 2012/11/02 21:11:08, Ami Fischman wrote: > > return (it's a long way down to the closing braces of the broken while, and > then > > nothing else happens) > > It is a long way down, but this isn't like error-out-return, it's more like > we're-done-here-break, and we just happen not to do anything afterwards. I > don't think returning explicitly quite documents that, and it would have to be > changed if we were to want to add more work at the bottom of this function. The flip side is code added at the bottom of this function (far away from here) will be run by paths through these breaks. I think this is a mistake but will not force the issue. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1959: } On 2012/11/03 00:37:14, sheu wrote: > On 2012/11/02 21:11:08, Ami Fischman wrote: > > man, I *really* hope you figure out a way to do ioctl's in fewer LOC... > > I don't really know that there is, unfortunately. Each ioctl call needs its > control struct filled out, and after collapsing the ioctl call into the > condition, there's not actually any more compactness to be had. Even if you can't reduce the LOCs, making dedicated helper functions for the ioctls will seriously improve readability. Can you take a crack at that?
A major theme seems to be "break out all those ioctls into helper functions". Well, first I did the ioctl() macro thing, and a lot of the boilerplate fell away. As for the rest, here's an accounting of the ioctls that we repeat: VIDIOC_QBUF: each of the four (MFC/GSC output/capture) have different requirements, and I'd end up with a pretty large argument list for a helper function that I don't think would really help readability. VIDIOC_STREAMON: with the ioctl macro now this is two lines and I don't think quite worth it. VIDIOC_STREAMOFF: similarly small now. VIDIOC_REQBUFS: small now, but I could see this getting helper-fied. VIDIOC_S_CTRL: similarly could use a helper, but again this is small now. I still would like to see them broken out separately, especially since this way the NOTIFY_ERROR macro will report the line number of the actual failure, instead of the line number of the multiplexed helper function. https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/23001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1003: // device, epoll() will return EPOLLERR and our DevicePollLoop() will exit On 2012/11/06 19:50:10, Ami Fischman wrote: > On 2012/11/03 00:37:14, sheu wrote: > > On 2012/11/02 17:57:06, Ami Fischman wrote: > > > It seems kinda crazy to have this affordance here. > > > Is it stupid to suggest that DevicePollLoop() should instead be aware of > this > > > condition and re-enter the epoll? > > > If you're worried about burning CPU going into no-op epoll's repeatedly, my > > > suggestion is to use a condvar/notification/waitableevent. I guess you > might > > be > > > averse to that as an explicit piece of synchronization in a class that is > > > otherwise beautifully sync-free, but do you really think this bunch of code > is > > a > > > better answer? > > > > > > Does this prevent receiving the last picture fed to Decode() when decode is > > > paused (i.e. the client sends a bunch of buffers, but doesn't request Flush > or > > > Reset, it just wants to receive pictures)? > > > > If DevicePollLoop() wants to be aware of this, it's going to have to touch > > decoder state, and we're going to have to do some locking. > > > > I'm not sure how you would intend for the DevicePollLoop to wait on an event, > > when it's already epoll()-ing on an FD. > > An example of what I meant is: have a mutex dedicated to buffer deque. > Acquire the mutex here for the duration of the draining, and release/acquire the > same mutex in the device thread around the epoll; that way the code handling > EPOLLERR can know whether it was triggered by the last buffer being yanked away > from the epoll. Right, that would let us know if our EINVAL was due to streamoff or queue empty. That doesn't solve the problem though of having the device poll thread spin on the epoll in the case of an empty buffer. https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:36: // This class handls Exynos video acceleration directly through the V4L2 devices On 2012/11/06 19:26:16, posciak wrote: > s/handls/handles Done. https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:53: // the V4L2 device and notify the decoder_thread_ when something interesting On 2012/11/06 19:26:16, posciak wrote: > s/device/devices ? Done. https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:83: // Lazily initialize static data after sandbox is enabled. Return false on On 2012/11/06 19:26:16, posciak wrote: > Any reason why you seem to like leaving two spaces between sentences? I picked up the habit in grade school. Is this a style thing? There seems to be a lively debate about this: http://en.wikipedia.org/wiki/Sentence_spacing https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:91: kMfcOutputBufferExtraCount = 2, On 2012/11/06 19:26:16, posciak wrote: > Document what "Extra" is? Done. https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:112: const size_t size; On 2012/11/06 19:26:16, posciak wrote: > Could we use SharedMemory.size()? SharedMemory doesn't have a size(). Shall I add one? https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:132: size_t bytes_used[2]; // bytes used in each dmabuf. On 2012/11/06 19:26:16, posciak wrote: > Perhaps consider have kNumOutputPlanes = 2 constant? Just added a comment here. Otherwise I'd be looping (for int i = 0; i < kNumOutputPlanes) in more places than I care to do... https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:354: // GSC input buffer state. On 2012/11/06 19:26:16, posciak wrote: > s/buffer/queue I'm referring to the input buffers that the HW read/writes from not the queues of buffers that this is managing. https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:369: // Output buffers ready to use. We need a LIFO here. On 2012/11/06 19:26:16, posciak wrote: > Why LIFO? Doh. I even discussed this with fischman@. Should be "FIFO"; fixed. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:33: #define EXYNOS_GSC_DEVICE "/dev/gsc1" On 2012/11/06 19:26:17, posciak wrote: > We'd like to be using all gsc nodes available. Could we have some logic to > assign them equally among instances of this class? I did some tracing, and GSC time barely factors into the hardware usage (about 2.4 ms per frame). We could probably load-balance, but I don't see it benefitting performance, even with multiple VDA instances. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:232: errno = 0; On 2012/11/06 19:26:17, posciak wrote: > Why zero-out errno? Your are not using it... Paranoia, and DPLOG uses it. Fine, I'll take it out :-P. It does touch TLS so that's probably a tad more expensive than the usual "variable = 0". https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:256: errno = 0; On 2012/11/06 19:26:17, posciak wrote: > You have quite a lot of those 6 lines everywhere. Personally, I'd suggest > wrapping all ioctl calls into something like this: > > IOCTL_SUCCESS_OR_NOTIFY_AND_RETURN(fd, type, arg) \ > do { \ > errno = 0; \ > if (ioctl(fd, type, arg) != 0) { \ > DPLOG(ERROR) << __func__ << "(): failed: " #type; \ > NOTIFY_ERROR(PLATFORM_FAILURE); \ > return false; \ > } \ > while(0) Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:324: } On 2012/11/06 19:26:17, posciak wrote: > If this function fails somewhere along the way, will Destroy() be called? It > doesn't seem so... Will we properly clean up then (free buffers, close mfc/gsc > fds, etc?). I assume that the client will take care of calling Destroy() if Initialize() fails. That's what VAVDA assumes. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:607: size_t size = decoder_current_bitstream_buffer_->size; On 2012/11/06 19:26:17, posciak wrote: > Why not use shm->size() ? shm->size() doesn't exist. I could add it. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:684: DCHECK_EQ(format.fmt.pix_mp.num_planes, 2); On 2012/11/06 19:26:17, posciak wrote: > I'm wondering if maybe this should be more than a DCHECK... Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:878: // If we're behind on decode, schedule another one. On 2012/11/06 19:26:17, posciak wrote: > Extract l.878-885 into a function and use it here and at l.657? It's just two lines -- that seems kind of excessive. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:990: // Done all that we could. On 2012/11/06 19:26:17, posciak wrote: > s/Done all that we could./No buffers to dequeue./ Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:992: } else if (errno == EINVAL) { On 2012/11/06 19:26:17, posciak wrote: > s/errno == EINVAL/errno == EINVAL && !mfc_output_streamon_/ > > just to be precise, EINVAL can mean other errors as well. > > Here and everywhere else. Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1029: memset(&dqbuf, 0, sizeof(dqbuf)); On 2012/11/06 19:26:17, posciak wrote: > Perhaps refactor the while loop to be: > > while(1) { > memset(&dqbuf, 0, sizeof(dqbuf)); > memset(planes, 0, sizeof(planes)); > dqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; > dqbuf.memory = V4L2_MEMORY_MMAP; > dqbuf.m.planes = planes; > dqbuf.length = 2; > errno = 0; > ret = ioctl(); > if (!ret) > break; > // handle successful dqbuf > } > > to avoid repeating l.1013-1019? > Here and everywhere else? Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1061: // Bug workaround: GSC is liable to race conditions if more than one On 2012/11/06 19:26:17, posciak wrote: > This is bug-worthy for SLSI I'd say... They are aware of this bug, and are as usual not doing anything about it. It bites secomx in the multi-video case. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1079: egl_destroy_sync_khr(egl_display_, output_record.egl_sync); On 2012/11/06 19:26:17, posciak wrote: > Is it valid to do this on decoder thread, i.e. a different thread that it was > created on and/or without making context current? Yes. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1288: // If we don't have anything actually queued, we can notify immediatey. On 2012/11/06 19:26:17, posciak wrote: > s/immediatey/immediately Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1307: decoder_current_bitstream_buffer_.reset(NULL); On 2012/11/06 19:26:17, posciak wrote: > Enough to say reset() Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1419: if (ioctl(mfc_fd_, VIDIOC_STREAMON, &type) != 0) { On 2012/11/06 19:26:17, posciak wrote: > Same here, streamon-related code is duplicated in a few places, could be > extracted to a DeviceStreamon(fd, type). Replaced with the IOCTL macro, and it's much smaller now. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1481: __u32 type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; On 2012/11/06 19:26:17, posciak wrote: > Wrap all ifs into DeviceStreamoff(fd, type) ? Replaced with the IOCTL macro, and it's much smaller now. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1667: &ExynosVideoDecodeAccelerator::SetDecoderState, On 2012/11/06 19:26:17, posciak wrote: > Indent error? Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1721: for (int i = 0; i < mfc_input_buffer_count_; ++i) { On 2012/11/06 19:26:17, posciak wrote: > This whole sequence could be extracted into a function, > MapDeviceBuffers(fd, type, memory, num_planes) perhaps and replace l.1721- and > l.1792-? The *_buffer_map_ map that we're updating is different between here and there, so we'd have to break this out into a separate loop. At this point I 'd just live with this. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1742: void* offset = mmap(NULL, buffer.m.planes[0].length, On 2012/11/06 19:26:17, posciak wrote: > Nit: calling this an "offset" is a bit misleading, it's a pointer. Especially > since it's different from mem_offset. Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1860: memset(&control, 0, sizeof(control)); On 2012/11/06 19:26:17, posciak wrote: > Wrap all of those sequences into DeviceSetCtrl(fd, ctrl) ? Possibly, but I replaced these with the IOCTL macro and they're much smaller now. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1977: void ExynosVideoDecodeAccelerator::DestroyMfcInputBuffers() { On 2012/11/06 19:26:17, posciak wrote: > DestroyDeviceBuffers(fd, type, memory, num_planes) in all below functions > perhaps? Unfortunately each device buffer type has its own map to also destroy, and that code is different between devices. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/public/com... File content/public/common/content_switches.h (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/public/com... content/public/common/content_switches.h:240: extern const char kUseExynosVda[]; On 2012/11/06 19:26:17, posciak wrote: > Wrap in #if defined for ARMEL/CHROMEOS? If not it'd have to go into alphabetical > list at the top of the file. Done.
https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:132: size_t bytes_used[2]; // bytes used in each dmabuf. On 2012/11/08 21:16:30, sheu wrote: > On 2012/11/06 19:26:16, posciak wrote: > > Perhaps consider have kNumOutputPlanes = 2 constant? > > Just added a comment here. Otherwise I'd be looping (for int i = 0; i < > kNumOutputPlanes) in more places than I care to do... Not sure how would that differ from looping over i<2, but not really that important I guess... https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:33: #define EXYNOS_GSC_DEVICE "/dev/gsc1" On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > We'd like to be using all gsc nodes available. Could we have some logic to > > assign them equally among instances of this class? > > I did some tracing, and GSC time barely factors into the hardware usage (about > 2.4 ms per frame). We could probably load-balance, but I don't see it > benefitting performance, even with multiple VDA instances. Hm, what resolution was that? What was the MFC time? https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:324: } On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > If this function fails somewhere along the way, will Destroy() be called? It > > doesn't seem so... Will we properly clean up then (free buffers, close mfc/gsc > > fds, etc?). > > I assume that the client will take care of calling Destroy() if Initialize() > fails. That's what VAVDA assumes. Not exactly, in VAVDA, VaapiH264Decoder::Destroy() is called on ~VaapiH264Decoder() and the buffers are freed from there. This is independent from VaapiVideoDecodeAccelerator::Destroy(). https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:607: size_t size = decoder_current_bitstream_buffer_->size; On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > Why not use shm->size() ? > > shm->size() doesn't exist. I could add it. Oh hah, I was seeing things. Nevermind. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:878: // If we're behind on decode, schedule another one. On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > Extract l.878-885 into a function and use it here and at l.657? > > It's just two lines -- that seems kind of excessive. I meant extracting 8 lines and having: void ScheduleDecodeBufferTaskIfNeeded() { // If we're behind on tasks, schedule another one. if (decoder_decode_buffer_tasks_scheduled_ < static_cast<int>(decoder_input_queue_.size())) { decoder_decode_buffer_tasks_scheduled_++; decoder_thread_.message_loop()->PostTask(FROM_HERE, base::Bind( &ExynosVideoDecodeAccelerator::DecodeBufferTask, base::Unretained(this))); } } And using it here and at l.657. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1721: for (int i = 0; i < mfc_input_buffer_count_; ++i) { On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > This whole sequence could be extracted into a function, > > MapDeviceBuffers(fd, type, memory, num_planes) perhaps and replace l.1721- and > > l.1792-? > > The *_buffer_map_ map that we're updating is different between here and there, > so we'd have to break this out into a separate loop. At this point I 'd just > live with this. I'd really suggest making them the same class. The only difference is the number of elements in arrays. I think we are ok with wasting 3 ints for mfc records to save quite a bit of code, here and all around. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1977: void ExynosVideoDecodeAccelerator::DestroyMfcInputBuffers() { On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > DestroyDeviceBuffers(fd, type, memory, num_planes) in all below functions > > perhaps? > > Unfortunately each device buffer type has its own map to also destroy, and that > code is different between devices. Please see comment at l.1721. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/public/com... File content/public/common/content_switches.h (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/public/com... content/public/common/content_switches.h:240: extern const char kUseExynosVda[]; On 2012/11/08 21:16:31, sheu wrote: > On 2012/11/06 19:26:17, posciak wrote: > > Wrap in #if defined for ARMEL/CHROMEOS? If not it'd have to go into > alphabetical > > list at the top of the file. > > Done. Hm, although I think we might prefer #if ARMEL. I wonder what's Ami's opinion about this?
On 2012/11/08 21:16:30, sheu wrote: > A major theme seems to be "break out all those ioctls into helper functions". > Well, first I did the ioctl() macro thing, and a lot of the boilerplate fell > away. As for the rest, here's an accounting of the ioctls that we repeat: > > VIDIOC_QBUF: each of the four (MFC/GSC output/capture) have different > requirements, and I'd end up with a pretty large argument list for a helper > function that I don't think would really help readability. > VIDIOC_STREAMON: with the ioctl macro now this is two lines and I don't think > quite worth it. > VIDIOC_STREAMOFF: similarly small now. > VIDIOC_REQBUFS: small now, but I could see this getting helper-fied. > VIDIOC_S_CTRL: similarly could use a helper, but again this is small now. > > I still would like to see them broken out separately, especially since this way > the NOTIFY_ERROR macro will report the line number of the actual failure, > instead of the line number of the multiplexed helper function. Looks better. You can have the line number of the actual failure by adding __LINE__ to the error macro, similarly to __func__.
BTW, IWBN to run EVDA and OVDA through vdatest and see the FPS diff before we enable this.
Alright, the latest patch isn't just <WIP>, it should be good. Haven't run VDATest yet; current ToT Flash video playback is hanging (even with sec_omx). https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/31003/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:366: int gsc_output_buffer_prepared_count_; On 2012/11/06 19:26:16, posciak wrote: > Could you document better what "queued and not in use" means? Done. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:324: } On 2012/11/13 18:10:39, posciak wrote: > On 2012/11/08 21:16:31, sheu wrote: > > On 2012/11/06 19:26:17, posciak wrote: > > > If this function fails somewhere along the way, will Destroy() be called? It > > > doesn't seem so... Will we properly clean up then (free buffers, close > mfc/gsc > > > fds, etc?). > > > > I assume that the client will take care of calling Destroy() if Initialize() > > fails. That's what VAVDA assumes. > > Not exactly, in VAVDA, VaapiH264Decoder::Destroy() is called on > ~VaapiH264Decoder() and the buffers are freed from there. This is independent > from VaapiVideoDecodeAccelerator::Destroy(). Ami mentioned that the destructor isn't called explicitly (except through Destroy()), so on initialization failure we do eventually get the Destroy() call. VaapiVideoDecodeAccelerator has the similar empty destructor, and the buffer cleanup is done through VAVDA::Destroy() there (which calls Cleanup(), which calls Destroy() on the decoder). https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:878: // If we're behind on decode, schedule another one. On 2012/11/13 18:10:39, posciak wrote: > On 2012/11/08 21:16:31, sheu wrote: > > On 2012/11/06 19:26:17, posciak wrote: > > > Extract l.878-885 into a function and use it here and at l.657? > > > > It's just two lines -- that seems kind of excessive. > > I meant extracting 8 lines and having: > > void ScheduleDecodeBufferTaskIfNeeded() { > // If we're behind on tasks, schedule another one. > if (decoder_decode_buffer_tasks_scheduled_ < > static_cast<int>(decoder_input_queue_.size())) { > decoder_decode_buffer_tasks_scheduled_++; > decoder_thread_.message_loop()->PostTask(FROM_HERE, base::Bind( > &ExynosVideoDecodeAccelerator::DecodeBufferTask, > base::Unretained(this))); > } > } > > And using it here and at l.657. Fine :-P
lgtm with one comment and putting the ball back in Ami's court. https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/34002/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:324: } On 2012/11/17 03:25:08, sheu wrote: > On 2012/11/13 18:10:39, posciak wrote: > > On 2012/11/08 21:16:31, sheu wrote: > > > On 2012/11/06 19:26:17, posciak wrote: > > > > If this function fails somewhere along the way, will Destroy() be called? > It > > > > doesn't seem so... Will we properly clean up then (free buffers, close > > mfc/gsc > > > > fds, etc?). > > > > > > I assume that the client will take care of calling Destroy() if Initialize() > > > fails. That's what VAVDA assumes. > > > > Not exactly, in VAVDA, VaapiH264Decoder::Destroy() is called on > > ~VaapiH264Decoder() and the buffers are freed from there. This is independent > > from VaapiVideoDecodeAccelerator::Destroy(). > > Ami mentioned that the destructor isn't called explicitly (except through > Destroy()), so on initialization failure we do eventually get the Destroy() > call. VaapiVideoDecodeAccelerator has the similar empty destructor, and the > buffer cleanup is done through VAVDA::Destroy() there (which calls Cleanup(), > which calls Destroy() on the decoder). My point is, this is very different from VAVDA. In VAVDA, whether VAVDA::Destroy() is called or not doesn't matter, VaapiH264Decoder::~VaapiH264Decoder will be called always and it will clean up. Here, you depend on EVDA::Destroy() being called explicitly. I'm just worried if it always is...
On 2012/11/20 18:36:00, posciak wrote: > My point is, this is very different from VAVDA. In VAVDA, whether > VAVDA::Destroy() is called or not doesn't matter, > VaapiH264Decoder::~VaapiH264Decoder will be called always and it will clean up. > Here, you depend on EVDA::Destroy() being called explicitly. I'm just worried if > it always is... Ah, that's a good point. I suppose I should move things into the destructor then. Also: was running vdatest, and came up with even more fun issues. More change a'comin.
PTAL. I rebased the previous CL first for an easier diff. * Change to using ANGLE headers, so unittests link correctly. This is not strictly correct -- after discussion with @gman, I'll be fixing up GL header dependencies in general, but for now this will do. (This is what the rest of the VDAs use.) * Rename BitstreamBufferRecord to BitstreamBufferRef, and make it handle the NotifyEndOfBitstreamBuffer callback on destruction. This avoids some potential leaks. * Switch to using an eventfd FD to signal the device-poll thread to stop. This allows us to epoll() only on the V4L2 devices that require it, and do away with the hacky dequeue-all-buffers-except-one business. This also requires some tweaking to the relationship between ServiceDeviceTask() and DevicePollTask(); the latter is no longer a loop. * Handle Flush() correctly. This is fixed after VDA unittesting.
Looking good, just a few nits. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:355: epoll_ctl(device_poll_epoll_fd_, EPOLL_CTL_ADD, device_poll_interrupt_fd_, Should we test the return value for failures? https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:730: decoder_current_bitstream_buffer_.reset(); Perhaps IWBN to have a doc saying destructor will call the client? https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:832: // This routine should handle data == NULL and size == 0, which occurs when Should this be a TODO():? Seems scary not to do this already if it's possible, since we are doing a memcpy(, data,) below... https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:976: if (mfc_input_buffer_queued_count_ != 0) { What if we always said EPOLLOUT | EPOLLIN? If there were no buffers of each type, wouldn't that be ok anyway, since we wouldn't get any signals? https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:978: } And nit: no braces needed. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1186: // Enqueue a GSC input (VIDEO_OUTPUT) buffer for a complete MFC output TBH I liked the old Enqueue*() functions, even if they were to be called only once, I feel they would improve the readability of these functions...
Fixed up, and added in a bit of H264 parsing (since VDA unittest likes to squish frames together into one decode buffer). We should be good here! https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:355: epoll_ctl(device_poll_epoll_fd_, EPOLL_CTL_ADD, device_poll_interrupt_fd_, On 2012/11/29 19:19:53, posciak wrote: > Should we test the return value for failures? Done. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:730: decoder_current_bitstream_buffer_.reset(); On 2012/11/29 19:19:53, posciak wrote: > Perhaps IWBN to have a doc saying destructor will call the client? Done. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:832: // This routine should handle data == NULL and size == 0, which occurs when On 2012/11/29 19:19:53, posciak wrote: > Should this be a TODO():? Seems scary not to do this already if it's possible, > since we are doing a memcpy(, data,) below... I was relying on memcpy() to handle NULL pointers when the length is 0. I went by and read the spec, and it looks like it works for glibc, but I can't rely on it... So, fixed up, also because I have to add some other fragment-finding code. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:976: if (mfc_input_buffer_queued_count_ != 0) { On 2012/11/29 19:19:53, posciak wrote: > What if we always said EPOLLOUT | EPOLLIN? If there were no buffers of each > type, wouldn't that be ok anyway, since we wouldn't get any signals? We can't poll if there are no buffers in both input or output (or we return EPOLLERR right off), but it looks like we can do that if there is at least one in either. Fixed this up. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:978: } On 2012/11/29 19:19:53, posciak wrote: > And nit: no braces needed. Done. https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1186: // Enqueue a GSC input (VIDEO_OUTPUT) buffer for a complete MFC output On 2012/11/29 19:19:53, posciak wrote: > TBH I liked the old Enqueue*() functions, even if they were to be called only > once, I feel they would improve the readability of these functions... Fine :-P
https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/53001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1186: // Enqueue a GSC input (VIDEO_OUTPUT) buffer for a complete MFC output On 2012/11/30 23:48:21, sheu wrote: > On 2012/11/29 19:19:53, posciak wrote: > > TBH I liked the old Enqueue*() functions, even if they were to be called only > > once, I feel they would improve the readability of these functions... > > Fine :-P Thanks :) Much easier to read now :) https://chromiumcodereview.appspot.com/11198060/diff/47004/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/47004/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:753: bool ExynosVideoDecodeAccelerator::FindFrameFragment( Please use content::H264Parser for this. Something like what is in VaapiH264Decoder::Decode{Initial,OneFrame}() should do: parser_.AdvanceToNextNALU(&nalu); (error check) switch (nalu.nal_unit_type) { case H264NALU::kNonIDRSlice: case H264NALU::kIDRSlice: ... etc. } AdvanceToNextNALU() will take care of getting to the NALU as well, so all this will not be needed.
Rebased and updated. * Now using H264Parser for parsing. * Moved away from using epoll() in favor of using plain poll(). Saves us two syscalls and a fd. * Some more optimization on the Flush() path. * Fixed picture buffer ids returned to client; we were using [0, n), not the IDs supplied by the client. * Added TRACE_EVENT tracing. This is it, folks. Passes all unittests and all the YouTube (HTML5 and Flash) I care to throw at it. PTAL.
Please also test with vimeo (HTML5 & flash) and dailymotion.com/html5effects as well as the normal dailymotion.com site (with flash enabled/disabled to toggle flash/html5). We've had VAVDA trouble in the past that came from different encoding pipelines, and it would be nice to vet this approach widely. (might also try the video from crbug 164053) John/Pawel: let me know when I should look at this CL again.
You can take a look at this CL again right about now :-) I've tried vimeo HTML5 and dailymotion HTML5 already, and both fail at ToT, even with secomx. Yet another thing to debug, but at least this one's not my fault!
> You can take a look at this CL again right about now :-) > You sure? :) (i.e. is posciak@ done reviewing?) > I've tried vimeo HTML5 and dailymotion HTML5 already, and both fail at > ToT, even > with secomx. Yet another thing to debug, but at least this one's not my > fault! > lol "fault". Isn't m25 branch cut today? Are there bugs filed for the failures you've seen?
On 2012/12/17 21:17:47, Ami Fischman wrote: > lol "fault". Isn't m25 branch cut today? Are there bugs filed for the > failures you've seen? I'm pulling down a canary image to check if it's just me or something.
Updating due to deprecation of scoped_array<>: see http://src.chromium.org/viewvc/chrome?view=rev&revision=174057. PTAL.
lgtm https://chromiumcodereview.appspot.com/11198060/diff/72001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/72001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:255: LOG(ERROR) << "Initialize(): unsupported profile=" << profile; This is probably too verbose (learned from VAVDA reviews). https://chromiumcodereview.appspot.com/11198060/diff/72001/content/common/gpu... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/72001/content/common/gpu... content/common/gpu/media/video_decode_accelerator_unittest.cc:525: GetBytesForNextNALU(*end_pos, end_pos); Now that I think of it all this parsing stuff could also use H264Parser... just a thought though, something for later maybe.
Ami: Pawel is fine with this. Your turn :-) https://chromiumcodereview.appspot.com/11198060/diff/72001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/72001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:255: LOG(ERROR) << "Initialize(): unsupported profile=" << profile; On 2012/12/27 16:38:37, posciak wrote: > This is probably too verbose (learned from VAVDA reviews). Done.
Looking great, though I haven't finished the evda.cc file yet. I think there's only one non-nit comment so far (exynos_video_decode_accelerator.h:51). https://chromiumcodereview.appspot.com/11198060/diff/92001/content/browser/gp... File content/browser/gpu/gpu_process_host.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/browser/gp... content/browser/gpu/gpu_process_host.cc:1: // Copyright (c) 2012 The Chromium Authors. All rights reserved. As I'm reviewing PS#18 I'm feeling like I've asked some of these questions before, but I'm only finding some of them in the reviewlog record. Please be patient with me if I'm re-asking something you've answered already. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:29: #define NOTIFY_ERROR(x) \ What does this macro buy you over simply moving the SetDecoderState() call into NotifyError()? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:32: LOG(ERROR) << "calling NotifyError(): " << x; \ DLOG ? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:39: DPLOG(ERROR) << __func__ << "(): ioctl() failed: " << #type; \ FWIW the last << is unnecessary. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:54: typedef void* GLeglImageOES; Any reason this doesn't come from a GLES2 header? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:56: static EGLBoolean(*mali_egl_image_get_buffer_ext_phandle)( here and below having no space between the return type and the (* looks weird. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:84: if (input_id != -1 && client_message_loop_proxy != NULL) when can the proxy be NULL? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:136: input_id(-1) { could zero-initialize the [2] members, as well. (using initializer syntax) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:199: frame_buffer_size_(0, 0), equiv to default ctor? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:209: // Nuke the entire site from orbit -- it's the only way to be sure. dcheck that the member threads have already joined? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:423: Pixmap pixmap = XCreatePixmap(x_display, RootWindow(x_display, 0), I still think you can use Gles2TextureToEglImageTranslator. Previously you said you didn't want to b/c the translator held on to pixmaps too long, possibly b/c of tegra considerations, but the pixmap-using path of the translator was only added for exynos' benefit (tegra doesn't need pixmaps allocated explicitly). https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:516: // Otherwise, call the destroy task directly. How can this happen? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:525: PS#18 reviewed down to here. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1982: DCHECK_EQ(mfc_output_buffer_pixelformat_, V4L2_PIX_FMT_NV12MT_16X16); Any reason to DCHECK this here instead of where the LHS is assigned (l.858)? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:6: // that utilizes hardware video decoder present on the Exynos SoC. s/hardware/the hardware/ https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:51: // decoder_thread_ when something interesting happens. Can this thread be eliminated entirely (and with it some amount of trampolining/boilerplate code) by making decoder_thread_ use a TYPE_IO MessageLoop and watching the MFC/GSC fd's in that loop? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:98: kInitialized, // Initialize() called. Ready to start decoding. s/called/returned true/ https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:113: struct BitstreamBufferRef { This and the other inner classes could as well move to the .cc file (with fwd-decls here). Up to you. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:137: const scoped_ptr<EGLImageKHR[]> egl_images; does scoped_ptr<[]> really work? IOW does it call delete[] instead of delete? Either way, wouldn't the calling code be simpler if you just used std::vector<>'s? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:149: }; append newline before next comment https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:167: void* address[2]; // mmap() address for each plane. I wish there was a different word to "plane" here (because "plane" sounds YUVish). s/plane/dmabuf plane/ maybe? (up to you) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:190: int32 picture_id; // picture ID as returned to PictureReady(). picture _buffer_ id? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:286: // these (e.g. in Initialize() or Destroy()). Can it happen that Initialize() didn't return true but Destroy() should be processed? (same comment at l.516 of the .cc file) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:302: scoped_refptr<base::MessageLoopProxy> child_message_loop_proxy_; How is this different from ChildThread::current()->message_loop() ? (why is it worth caching?) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:338: // Buffers held by the client. picturebuffers ? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:343: std::deque<linked_ptr<BitstreamBufferRef> > decoder_input_queue_; here and elsewhere I wonder at the use of deque<>. For this one, where you use FIFO semantics, a list<> would do. Is there some subtle win in deque for its uses in this class? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:365: // Input buffers ready to use, as a FIFO since we don't care about ordering. s/FIFO/LIFO/ I think, here and at l.376 and l.396. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:407: // Output buffers ready to use. We need a LIFO here. Ironically, this is a FIFO :) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/gpu_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/gpu_video_decode_accelerator.cc:180: if (CommandLine::ForCurrentProcess()->HasSwitch(switches::kUseExynosVda)) { I don't expect the answer to be shocking, but OOC, what's the impact on the size of a binary Release build of incl. EVDA? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/public/com... File content/public/common/content_switches.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/public/com... content/public/common/content_switches.cc:758: // die secomx die Do you mean Die secomx, die. ? http://www.youtube.com/watch?v=dMkzdn2TDuI Probably want to reword.
https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:167: void* address[2]; // mmap() address for each plane. On 2013/01/04 01:57:05, Ami Fischman wrote: > I wish there was a different word to "plane" here (because "plane" sounds > YUVish). > s/plane/dmabuf plane/ maybe? (up to you) It is just that (well, almost), those two are a Y-plane and an interleaved UV-plane. It also comes from V4L2's struct v4l2_plane.
Finished reviewing PS#18. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:403: DCHECK_EQ(gsc_output_buffer_count_, static_cast<int>(buffers.size())); This should be stronger than a DCHECK - you don't want an untrusted renderer messing up the GPU process in the wild. if != then NOTIFY_ERROR (and DLOG(FATAL) if you like). https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:529: dlerror(); Is this standard operating procedure in some codebases? Seems icky to me to be silently dropping previous dlerrors... (ditto l.546 below) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:609: } l.547-609 is pretty boilerplateiriffic, making it harder than necessary to see what's going on. If you typedef the signatures at the global declarations then you get much shorter reinterpret_cast<> expressions, at little or no cost of vertical space at the declarations (which get to use the typedefs). Devoting 5 lines to each error-handling clause seems crazy given the only value is the name of the failing symbol, but this will only be emitted in Debug (and only be interesting when dropping a new libmali.so). Replace with a single if (!mali_egl_image_get_buffer_ext_phandle || !egl_create_image_khr || ...) ? If you really want to keep the name of the missing symbol and the dlerror(), maybe use a macro to reduce copy/pasta? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:623: DVLOG(2) << "DecodeTask(): early out: kResetting state"; I think this is wrong. A client should be able to call Reset() followed immediately by Decode(), before NotifyResetDone is triggered, to mean "drop decoder state and then decode this other buffer". Search OmxVideoDecodeAccelerator::Decode() for RESETTING to see what is done there. I'm surprised vda_test doesn't test for this; want to add that? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:630: decoder_input_queue_.push_front( I think this is correct, but surprising. Why push_front and pop_back instead of push_back and pop_front? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:654: // We're waiting for a new buffer -- exit without scheduling a new task. Can this really happen? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:680: if (!FlushInputFrame() || !AppendToInputFrame(NULL, 0) These methods are confusing because unlike the rest of the bool-returning private methods in this class, returning false from them is *not* a cause for NotifyError. Please doco return values. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:681: || !FlushInputFrame()) { || goes on previous line https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:681: || !FlushInputFrame()) { What does short-circuiting mean here? IOW, if one of the !'s evals true and the other one or two is not eval'd. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:685: schedule_task = true; FWIW this if/else is equiv to: schedule_task = FlushInputFrame() && AppendToInputFrame(NULL, 0) && FlushInputFrame(); https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:689: // This is a buffer queued from the client, with actual `contents. Decode. s/`// https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:739: decoder_h264_parser_->SetStream(reinterpret_cast<const uint8*>(data), size); You can save the cast here and twice more in this method by passing data as a const uint8* in the first place, no? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:786: } else { You could save a +2 indent for l.736-785 by re-organizing this method as: if (VP8) { ... } if (!H264) { DLOG(FATAL) << "lolwat: " << video_profile_) return false; } ...h264 case, de-indented... https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:795: void ExynosVideoDecodeAccelerator::ScheduleDecodeBufferTaskIfNeeded() { DCHECK on decoder_thread_? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:847: if (!frame_buffer_size_.IsEmpty() && frame_buffer_size_ != How can we be mid-stream if decoder_state_==kInitialized? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:890: *endpos = size; lol to always setting this output param to an input param (meaning the caller could do the same) and double-lol to the input & output params being the same :) Drop? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:905: // Flush if we're too big comment belongs at l.910? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:943: DCHECK(data != NULL); WDYT of DCHECK_EQ(data == NULL, size == 0); going at l.936 above? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1005: static_cast<int>(gsc_output_buffer_map_.size())); indent to opening paren (don't mix +4 and opening paren styles). https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1153: DCHECK(mfc_input_streamon_); move to DCHECK pile at top of function? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1194: const long int input_id = dqbuf.timestamp.tv_sec; why "long int" and not "int"? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1210: output_record.bytes_used[1] = dqbuf.m.planes[1].bytesused; AFAICT l.1207-1210 can be moved outside the if/else and the corresponding 4 lines in the "then" clause can be dropped (greatly clarifying the difference between the flush and non-flush cases, which is whether the output_record is added to gsc_input or mfc_free). https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1221: NotifyFlushDoneIfNeeded(); Move to l.1203? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1232: const int old_gsc_inputs_queued_ = gsc_input_buffer_queued_count_; s/old_gsc_inputs_queued_/old_gsc_inputs_queued/ same sin is committed elsewhere; search for int old_ and drop trailing underscores from function-local variables. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1234: if (gsc_free_input_buffers_.empty()) while (foo()) { if (!bar()) break; ... } is equivalent to the shorter (and, IMO, clearer): while (foo() && bar()) { ... } https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1357: // * All GSC input (VIDEO_OUTPUT) buffers are returned. Does this list mean that there can be pictures in the GSC output that haven't been sent to the client yet when you consider the pipeline empty (and thus notify of FlushDone)? That would be wrong. FlushDone should only be sent after all PictureReady()s that are the result of Decode() calls made before the Flush() call. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1480: // buffer that has been in th queue the longest. s/th/the/ https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1483: egl_destroy_sync_khr(egl_display_, output_record.egl_sync); Why not pass ownership of the EGLSyncKHRRef to GscOutputRecord (instead of just handing over the native token) and then give EGLSyncKHRRef a Wait() method that waits & deletes the token if there is one? This is motivated partially by wanting to keep the sync logic out of this method, but also by the if test at l.1474 looking funny; I believe the only way for it to fail is on the initial AssignPictureBuffers; the vast majortity of the passes through that test will be in reaction to ReusePictureBuffers. (the way the code is written now makes it look like when the sync token floats by the GL engine some magic hand will come down and clear the output_record.egl_sync field, so it'll be NO_SYNC by the time l.1474 is hit) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1542: DVLOG(3) << "FlushTask()"; Why no decoder_thread_-is-current DCHECK here? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1545: // Flush the currently-building frame. Comment is strange/confusing. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1547: DVLOG(2) << "FlushTask(): early out: kResetting state"; See the comment about deferring Decode during Reset. ISTM we should defer Flush the same way (though OVDA doesn't seem to, instead DCHECKing). https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1558: decoder_notify_flush_requested_ = true; DCHECK it wasn't already true? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1631: // early-exit. what subsequent tasks? :) This is the last thing that will run on decoder_thread_, device_poll_thread_ is already Stop()'d, and the child thread will |delete this| as soon as we return... https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1639: // Early-out if we're already running. How can this happen? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1742: DVLOG(3) << "SetDevicePollInterrupt()"; I believe it's true that this and CDPI below can only be called on the device_thread_; DCHECK that fact? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1744: const uint64 buf = 1; Is there some reason to use a uint64 to carry a single bit? (why not a uint8?) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1756: int ret; declare at first use two lines down https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1759: if (ret == -1) { l.1759-1769 can be summarized as if (ret == -1 && errno != EAGAIN) { DPLOG(ERROR) << "ClearDevicePollInterrupt(): read() failed"; NOTIFY_ERROR(PLATFORM_FAILURE); return false; } return true; https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1777: // This routine just polls and the set of device fds, and schedules a s/and the/the/ https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1782: nfds_t nfds; declare at first assignment, or init to 0 here and ++ for interrupt_fd_ https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:2155: size_t i = 0; any reason not to replace this with a for loop? https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:186: bool at_client; // held by client. These two fields could be replaced with a single field of type: enum Disposition { FREE, AT_DEVICE, AT_CLIENT }; Up to you. (maybe it used to be like that in a much earlier patchset?)
Revision is up. It compiles but I haven't tested yet -- consider this a preview. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:29: #define NOTIFY_ERROR(x) \ On 2013/01/04 01:57:05, Ami Fischman wrote: > What does this macro buy you over simply moving the SetDecoderState() call into > NotifyError()? NotifyError() calls itself if called not-on the child thread, so a SetDecoderState() in that case would be called twice. I broke out the NotifyError into a macro also so LOG(ERROR) would pick up the actual source line where the macro is invoked from. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:32: LOG(ERROR) << "calling NotifyError(): " << x; \ On 2013/01/04 01:57:05, Ami Fischman wrote: > DLOG ? Something actually broke. I think it's import enough to LOG for. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:39: DPLOG(ERROR) << __func__ << "(): ioctl() failed: " << #type; \ On 2013/01/04 01:57:05, Ami Fischman wrote: > FWIW the last << is unnecessary. There's a couple other bits (e.g. l.280) where I was advised to use it even for string concat. Your call. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:54: typedef void* GLeglImageOES; On 2013/01/04 01:57:05, Ami Fischman wrote: > Any reason this doesn't come from a GLES2 header? It's not in any one of them that we use. I'd like to go do a GLES2 header cleanup at some point. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:56: static EGLBoolean(*mali_egl_image_get_buffer_ext_phandle)( On 2013/01/04 01:57:05, Ami Fischman wrote: > here and below having no space between the return type and the (* looks weird. I was doing it analogously with function calls. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:84: if (input_id != -1 && client_message_loop_proxy != NULL) On 2013/01/04 01:57:05, Ami Fischman wrote: > when can the proxy be NULL? I suppose not. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:136: input_id(-1) { On 2013/01/04 01:57:05, Ami Fischman wrote: > could zero-initialize the [2] members, as well. > (using initializer syntax) I don't think you can do that in an initializer list. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:199: frame_buffer_size_(0, 0), On 2013/01/04 01:57:05, Ami Fischman wrote: > equiv to default ctor? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:209: // Nuke the entire site from orbit -- it's the only way to be sure. On 2013/01/04 01:57:05, Ami Fischman wrote: > dcheck that the member threads have already joined? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:403: DCHECK_EQ(gsc_output_buffer_count_, static_cast<int>(buffers.size())); On 2013/01/04 21:38:55, Ami Fischman wrote: > This should be stronger than a DCHECK - you don't want an untrusted renderer > messing up the GPU process in the wild. > if != then NOTIFY_ERROR (and DLOG(FATAL) if you like). Good point. Tightening up ReusePictureBufferTask() as well while I'm at it. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:423: Pixmap pixmap = XCreatePixmap(x_display, RootWindow(x_display, 0), On 2013/01/04 01:57:05, Ami Fischman wrote: > I still think you can use Gles2TextureToEglImageTranslator. > Previously you said you didn't want to b/c the translator held on to pixmaps too > long, possibly b/c of tegra considerations, but the pixmap-using path of the > translator was only added for exynos' benefit (tegra doesn't need pixmaps > allocated explicitly). It adds a std::map<> and some other stuff we don't need, even as we're not using it. I'd kinda prefer to kill that dead too, and (maybe) nuke OMX VDA as well. If this becomes a common pattern I guess I could clean up the translator later and add the use back in. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:516: // Otherwise, call the destroy task directly. On 2013/01/04 01:57:05, Ami Fischman wrote: > How can this happen? If Initialize() fails. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:529: dlerror(); On 2013/01/04 21:38:55, Ami Fischman wrote: > Is this standard operating procedure in some codebases? > Seems icky to me to be silently dropping previous dlerrors... > > (ditto l.546 below) If previous dlerror() calls aren't being checked, this isn't going to help them anyways. I'm doing it here to clear dlerror() to a known state, since dlsym() potentially returns NULL even on success (if the symbol requested is NULL), and in that case whether the following call to dlerror() returns NULL or not does have meaning. Check the manpage, I guess. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:609: } On 2013/01/04 21:38:55, Ami Fischman wrote: > l.547-609 is pretty boilerplateiriffic, making it harder than necessary to see > what's going on. > > If you typedef the signatures at the global declarations then you get much > shorter reinterpret_cast<> expressions, at little or no cost of vertical space > at the declarations (which get to use the typedefs). > > Devoting 5 lines to each error-handling clause seems crazy given the only value > is the name of the failing symbol, but this will only be emitted in Debug (and > only be interesting when dropping a new libmali.so). Replace with a single > if (!mali_egl_image_get_buffer_ext_phandle || !egl_create_image_khr || ...) > ? > > If you really want to keep the name of the missing symbol and the dlerror(), > maybe use a macro to reduce copy/pasta? Typedef-ed and macro-ed. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:623: DVLOG(2) << "DecodeTask(): early out: kResetting state"; On 2013/01/04 21:38:55, Ami Fischman wrote: > I think this is wrong. > A client should be able to call Reset() followed immediately by Decode(), before > NotifyResetDone is triggered, to mean "drop decoder state and then decode this > other buffer". > Search OmxVideoDecodeAccelerator::Decode() for RESETTING to see what is done > there. > > I'm surprised vda_test doesn't test for this; want to add that? Alrighty then. Good catch! Updated, and associated commentary added. vda_test tests for this somewhat indirectly in the MidStreamReset tests by counting frames, but it's a bit timing-dependent (if the Reset() and associated tasks complete quickly, before the next Decode() comes in) whether vda_test will catch this. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:630: decoder_input_queue_.push_front( On 2013/01/04 21:38:55, Ami Fischman wrote: > I think this is correct, but surprising. > Why push_front and pop_back instead of push_back and pop_front? Not sure what the convention was. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:654: // We're waiting for a new buffer -- exit without scheduling a new task. On 2013/01/04 21:38:55, Ami Fischman wrote: > Can this really happen? I don't think it can happen, but it's not an error if it does. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:680: if (!FlushInputFrame() || !AppendToInputFrame(NULL, 0) On 2013/01/04 21:38:55, Ami Fischman wrote: > These methods are confusing because unlike the rest of the bool-returning > private methods in this class, returning false from them is *not* a cause for > NotifyError. Please doco return values. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:681: || !FlushInputFrame()) { On 2013/01/04 21:38:55, Ami Fischman wrote: > || goes on previous line Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:681: || !FlushInputFrame()) { On 2013/01/04 21:38:55, Ami Fischman wrote: > What does short-circuiting mean here? > IOW, if one of the !'s evals true and the other one or two is not eval'd. If we can't enqueue a completely empty frame and guarantee that it's flushed down (so nobody else tries to append to it), then try again later. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:685: schedule_task = true; On 2013/01/04 21:38:55, Ami Fischman wrote: > FWIW this if/else is equiv to: > > schedule_task = FlushInputFrame() && AppendToInputFrame(NULL, 0) && > FlushInputFrame(); A little easier to document and comment this way. :-) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:689: // This is a buffer queued from the client, with actual `contents. Decode. On 2013/01/04 21:38:55, Ami Fischman wrote: > s/`// Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:739: decoder_h264_parser_->SetStream(reinterpret_cast<const uint8*>(data), size); On 2013/01/04 21:38:55, Ami Fischman wrote: > You can save the cast here and twice more in this method by passing data as a > const uint8* in the first place, no? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:786: } else { On 2013/01/04 21:38:55, Ami Fischman wrote: > You could save a +2 indent for l.736-785 by re-organizing this method as: > if (VP8) { > ... > } > if (!H264) { > DLOG(FATAL) << "lolwat: " << video_profile_) > return false; > } > ...h264 case, de-indented... I like symmetry, which leads me to excessive use of "else" cases sometimes. For situations like this (essentially a "switch"), I'd rather that all cases get indented the same. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:795: void ExynosVideoDecodeAccelerator::ScheduleDecodeBufferTaskIfNeeded() { On 2013/01/04 21:38:55, Ami Fischman wrote: > DCHECK on decoder_thread_? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:847: if (!frame_buffer_size_.IsEmpty() && frame_buffer_size_ != On 2013/01/04 21:38:55, Ami Fischman wrote: > How can we be mid-stream if decoder_state_==kInitialized? This used to be on the outside of this conditional. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:890: *endpos = size; On 2013/01/04 21:38:55, Ami Fischman wrote: > lol to always setting this output param to an input param (meaning the caller > could do the same) and double-lol to the input & output params being the same :) > Drop? Symmetry with DecodeBufferInitial()? That's kinda my thing. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:905: // Flush if we're too big On 2013/01/04 21:38:55, Ami Fischman wrote: > comment belongs at l.910? It's the purpose of the whole block here, really. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:943: DCHECK(data != NULL); On 2013/01/04 21:38:55, Ami Fischman wrote: > WDYT of > DCHECK_EQ(data == NULL, size == 0); > going at l.936 above? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1005: static_cast<int>(gsc_output_buffer_map_.size())); On 2013/01/04 21:38:55, Ami Fischman wrote: > indent to opening paren (don't mix +4 and opening paren styles). Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1153: DCHECK(mfc_input_streamon_); On 2013/01/04 21:38:55, Ami Fischman wrote: > move to DCHECK pile at top of function? In case we get called before STREAMON occurs. I'd like the Dequeue{Mfc,Gsc}() functions to be callable regardless of the states of the queues. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1194: const long int input_id = dqbuf.timestamp.tv_sec; On 2013/01/04 21:38:55, Ami Fischman wrote: > why "long int" and not "int"? tv_sec is a long int, and I'm caching that value really. The conversion happens when this value gets assigned to something else. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1221: NotifyFlushDoneIfNeeded(); On 2013/01/04 21:38:55, Ami Fischman wrote: > Move to l.1203? I figure I'll do it here once all state is resolved, in case the logic for flush-done changes. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1232: const int old_gsc_inputs_queued_ = gsc_input_buffer_queued_count_; On 2013/01/04 21:38:55, Ami Fischman wrote: > s/old_gsc_inputs_queued_/old_gsc_inputs_queued/ > same sin is committed elsewhere; search for > int old_ > and drop trailing underscores from function-local variables. Done (and elsewhere) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1357: // * All GSC input (VIDEO_OUTPUT) buffers are returned. On 2013/01/04 21:38:55, Ami Fischman wrote: > Does this list mean that there can be pictures in the GSC output that haven't > been sent to the client yet when you consider the pipeline empty (and thus > notify of FlushDone)? That would be wrong. > FlushDone should only be sent after all PictureReady()s that are the result of > Decode() calls made before the Flush() call. Right now I rely on the GSC input and output queues advancing in lockstep. I agree that this is maybe brittle, so I've added gsc_output_buffer_queued_count_ to this. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1480: // buffer that has been in th queue the longest. On 2013/01/04 21:38:55, Ami Fischman wrote: > s/th/the/ Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1483: egl_destroy_sync_khr(egl_display_, output_record.egl_sync); On 2013/01/04 21:38:55, Ami Fischman wrote: > Why not pass ownership of the EGLSyncKHRRef to GscOutputRecord (instead of just > handing over the native token) and then give EGLSyncKHRRef a Wait() method that > waits & deletes the token if there is one? > > This is motivated partially by wanting to keep the sync logic out of this > method, but also by the if test at l.1474 looking funny; I believe the only way > for it to fail is on the initial AssignPictureBuffers; the vast majortity of the > passes through that test will be in reaction to ReusePictureBuffers. > > (the way the code is written now makes it look like when the sync token floats > by the GL engine some magic hand will come down and clear the > output_record.egl_sync field, so it'll be NO_SYNC by the time l.1474 is hit) I think we'd end up (assuming that egl_sync becomes a scoped_ptr<EGLSyncKHRRef>) either checking egl_sync != NULL up at l1474 anyways, or constructing empty refs in AssignPictureBuffersTask() (so that Wait() completes immediately), so I don't see how this would make things less confusing. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1542: DVLOG(3) << "FlushTask()"; On 2013/01/04 21:38:55, Ami Fischman wrote: > Why no decoder_thread_-is-current DCHECK here? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1545: // Flush the currently-building frame. On 2013/01/04 21:38:55, Ami Fischman wrote: > Comment is strange/confusing. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1547: DVLOG(2) << "FlushTask(): early out: kResetting state"; On 2013/01/04 21:38:55, Ami Fischman wrote: > See the comment about deferring Decode during Reset. ISTM we should defer Flush > the same way (though OVDA doesn't seem to, instead DCHECKing). Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1558: decoder_notify_flush_requested_ = true; On 2013/01/04 21:38:55, Ami Fischman wrote: > DCHECK it wasn't already true? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1631: // early-exit. On 2013/01/04 21:38:55, Ami Fischman wrote: > what subsequent tasks? :) > This is the last thing that will run on decoder_thread_, device_poll_thread_ is > already Stop()'d, and the child thread will |delete this| as soon as we > return... Paranoia. :-) https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1639: // Early-out if we're already running. On 2013/01/04 21:38:55, Ami Fischman wrote: > How can this happen? I guess not. DCHECK()ed. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1742: DVLOG(3) << "SetDevicePollInterrupt()"; On 2013/01/04 21:38:55, Ami Fischman wrote: > I believe it's true that this and CDPI below can only be called on the > device_thread_; DCHECK that fact? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1744: const uint64 buf = 1; On 2013/01/04 21:38:55, Ami Fischman wrote: > Is there some reason to use a uint64 to carry a single bit? > (why not a uint8?) The eventfd interface requires uint64. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1756: int ret; On 2013/01/04 21:38:55, Ami Fischman wrote: > declare at first use two lines down I kinda like to declare them in use order, and in this case it's ret and then buf. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1759: if (ret == -1) { On 2013/01/04 21:38:55, Ami Fischman wrote: > l.1759-1769 can be summarized as > if (ret == -1 && errno != EAGAIN) { > DPLOG(ERROR) << "ClearDevicePollInterrupt(): read() failed"; > NOTIFY_ERROR(PLATFORM_FAILURE); > return false; > } > return true; Yep, but I wanted to explicitly document the EAGAIN case. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1777: // This routine just polls and the set of device fds, and schedules a On 2013/01/04 21:38:55, Ami Fischman wrote: > s/and the/the/ Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1782: nfds_t nfds; On 2013/01/04 21:38:55, Ami Fischman wrote: > declare at first assignment, or init to 0 here and ++ for interrupt_fd_ Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:1982: DCHECK_EQ(mfc_output_buffer_pixelformat_, V4L2_PIX_FMT_NV12MT_16X16); On 2013/01/04 01:57:05, Ami Fischman wrote: > Any reason to DCHECK this here instead of where the LHS is assigned (l.858)? I guess I could do that. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:2155: size_t i = 0; On 2013/01/04 21:38:55, Ami Fischman wrote: > any reason not to replace this with a for loop? Save on the first iteration, checking against gsc_output_buffer_map_.size(). We already know it's >= 1, since we checked that before running make_current_context_. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:6: // that utilizes hardware video decoder present on the Exynos SoC. On 2013/01/04 01:57:05, Ami Fischman wrote: > s/hardware/the hardware/ Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:51: // decoder_thread_ when something interesting happens. On 2013/01/04 01:57:05, Ami Fischman wrote: > Can this thread be eliminated entirely (and with it some amount of > trampolining/boilerplate code) by making decoder_thread_ use a TYPE_IO > MessageLoop and watching the MFC/GSC fd's in that loop? That's an interesting idea actually, but in the interest of holding off on churn I'm going to punt that until after the initial commit. It'll help with the latency of the decode thread, and might be good for some perf. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:98: kInitialized, // Initialize() called. Ready to start decoding. On 2013/01/04 01:57:05, Ami Fischman wrote: > s/called/returned true/ Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:113: struct BitstreamBufferRef { On 2013/01/04 01:57:05, Ami Fischman wrote: > This and the other inner classes could as well move to the .cc file (with > fwd-decls here). Up to you. That's a good idea actually; I wasn't aware that you can fwd-declare nested classes at all. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:137: const scoped_ptr<EGLImageKHR[]> egl_images; On 2013/01/04 01:57:05, Ami Fischman wrote: > does scoped_ptr<[]> really work? > IOW does it call delete[] instead of delete? > > Either way, wouldn't the calling code be simpler if you just used > std::vector<>'s? I figured I'd just hold one count for all of them. See: http://src.chromium.org/viewvc/chrome?view=rev&revision=174057 https://code.google.com/p/chromium/issues/detail?id=109874 https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:149: }; On 2013/01/04 01:57:05, Ami Fischman wrote: > append newline before next comment Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:190: int32 picture_id; // picture ID as returned to PictureReady(). On 2013/01/04 01:57:05, Ami Fischman wrote: > picture _buffer_ id? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:286: // these (e.g. in Initialize() or Destroy()). On 2013/01/04 01:57:05, Ami Fischman wrote: > Can it happen that Initialize() didn't return true but Destroy() should be > processed? > (same comment at l.516 of the .cc file) That's pretty much it. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:302: scoped_refptr<base::MessageLoopProxy> child_message_loop_proxy_; On 2013/01/04 01:57:05, Ami Fischman wrote: > How is this different from ChildThread::current()->message_loop() ? > (why is it worth caching?) Robustness? I'm caching the proxy (not the message loop itself) in case the child thread goes down (with exit() or some such), before the decoder thread is destructed. Basically, paranoia. And because VAVDA does it this way. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:338: // Buffers held by the client. On 2013/01/04 01:57:05, Ami Fischman wrote: > picturebuffers ? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:343: std::deque<linked_ptr<BitstreamBufferRef> > decoder_input_queue_; On 2013/01/04 01:57:05, Ami Fischman wrote: > here and elsewhere I wonder at the use of deque<>. > For this one, where you use FIFO semantics, a list<> would do. > Is there some subtle win in deque for its uses in this class? Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:365: // Input buffers ready to use, as a FIFO since we don't care about ordering. On 2013/01/04 01:57:05, Ami Fischman wrote: > s/FIFO/LIFO/ I think, here and at l.376 and l.396. Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.h:407: // Output buffers ready to use. We need a LIFO here. On 2013/01/04 01:57:05, Ami Fischman wrote: > Ironically, this is a FIFO :) Done. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/public/com... File content/public/common/content_switches.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/public/com... content/public/common/content_switches.cc:758: // die secomx die On 2013/01/04 01:57:05, Ami Fischman wrote: > Do you mean > Die secomx, die. > ? > http://www.youtube.com/watch?v=dMkzdn2TDuI > > Probably want to reword. I wasn't aware of the Simpsons connection actually. Done.
I reworked the Flush() path a bit to take into account fischman@'s points about ordering. PTAL. FYI: this CL can't actually land until I land: https://gerrit.chromium.org/gerrit/#/c/40498/ https://gerrit.chromium.org/gerrit/#/c/39643/
https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:32: LOG(ERROR) << "calling NotifyError(): " << x; \ On 2013/01/05 00:51:49, sheu wrote: > On 2013/01/04 01:57:05, Ami Fischman wrote: > > DLOG ? > > Something actually broke. I think it's import enough to LOG for. We don't LOG for invalid .png's; we shouldn't LOG for invalid .mp4 streams. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:2155: size_t i = 0; On 2013/01/05 00:51:49, sheu wrote: > On 2013/01/04 21:38:55, Ami Fischman wrote: > > any reason not to replace this with a for loop? > > Save on the first iteration, checking against gsc_output_buffer_map_.size(). We > already know it's >= 1, since we checked that before running > make_current_context_. lolwat? Are you saying you're using i=0; do { ...1 ++i; } while (...2); instead of for (i=0; ...2; ++i) { ...1 } to save the one comparison of 0<gsc_output_buffer_map_.size() (and its attendant eval of size()) ? That's crazy. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:87: const int egl_images_count; My point about vector<> was that if you used it you could avoid having an explicit count at all. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:97: typedef void* GLeglImageOES; What I meant was: are you sure it's legit / standardized? If it is, it should be in a standards .h. Add a TODO/file a crbug? https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:258: DCHECK(!decoder_thread_.message_loop()); s/message_loop/IsRunning/ here and in the next line https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:719: // current input, enqueue no data to the next flame, then flush that down. typo: flame https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:725: // reprocessed when the pipeline frees up. So it's ok that if the first FIF() and ATIF() return true, but the last FIF() returns false, then the next run through this codepath will call all of FIF/ATIF/FIF again? (ends up with two flushing frames in the pipeline) https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:743: return; Should this notify error? E.g. if the bitstream is invalid or unsupported, what's the desired/actual behavior? https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:930: *endpos = size; Symmetry is nice and all, but this is just a pure head-scratcher. I'd drop the out param entirely. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1011: // If input_id >= 0, this input buffer was promped by a bitstream buffer we typo: promped https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1012: // got from the client. We can skip it if it is empty. I don't understand this comment and the following line (esp. given I thought I understood PS#18's version of this code at l.971-973). Can you clarify? https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1251: output_record.bytes_used[1] = dqbuf.m.planes[1].bytesused; I think you missed this comment: AFAICT l.1207-1210 can be moved outside the if/else and the corresponding 4 lines in the "then" clause can be dropped (greatly clarifying the difference between the flush and non-flush cases, which is whether the output_record is added to gsc_input or mfc_free). https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1275: if (gsc_free_input_buffers_.empty()) I think you missed the comment: while (foo()) { if (!bar()) break; ... } is equivalent to the shorter (and, IMO, clearer): while (foo() && bar()) { ... } https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1597: // Nothing in the pipe; return done immediately. In kResetting how do you know there's "nothing in the pipe"? Is there any reason to do this special-casing short-circuit? https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1817: ret = read(device_poll_interrupt_fd_, &buf, sizeof(buf)); On 2013/01/05 00:51:49, sheu wrote: > On 2013/01/04 21:38:55, Ami Fischman wrote: > > declare at first use two lines down > > I kinda like to declare them in use order, and in this case it's ret and then > buf. buf is "used" before ret (param is eval'd pre-dispatch, which happens before assignment). Either way, this contravenes chrome style guidelines. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:51: // decoder_thread_ when something interesting happens. Please add a TODO to kill this thread (TODO in this CL, actually do it in a follow-on CL :)) https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:98: kInitialized, // Initialize() returned. Ready to start decoding. s/returned/returned true/ https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... File content/common/gpu/media/gpu_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/gpu_video_decode_accelerator.cc:180: if (CommandLine::ForCurrentProcess()->HasSwitch(switches::kUseExynosVda)) { I think you missed my question: I don't expect the answer to be shocking, but OOC, what's the impact on the size of a binary Release build of incl. EVDA?
I'll get the code size measurements sometime. But I have it in mind to remove the OMX VDA from the build, so I expect an even smaller change. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:32: LOG(ERROR) << "calling NotifyError(): " << x; \ On 2013/01/11 00:46:33, Ami Fischman wrote: > On 2013/01/05 00:51:49, sheu wrote: > > On 2013/01/04 01:57:05, Ami Fischman wrote: > > > DLOG ? > > Something actually broke. I think it's import enough to LOG for. > We don't LOG for invalid .png's; we shouldn't LOG for invalid .mp4 streams. Still think it should be a LOG, but I'm not going to argue. I was tempted to make it a VLOG but then I figured that's worse for clutter. https://chromiumcodereview.appspot.com/11198060/diff/92001/content/common/gpu... content/common/gpu/media/exynos_video_decode_accelerator.cc:2155: size_t i = 0; On 2013/01/11 00:46:33, Ami Fischman wrote: > On 2013/01/05 00:51:49, sheu wrote: > > On 2013/01/04 21:38:55, Ami Fischman wrote: > > > any reason not to replace this with a for loop? > > > > Save on the first iteration, checking against gsc_output_buffer_map_.size(). > We > > already know it's >= 1, since we checked that before running > > make_current_context_. > > lolwat? Are you saying you're using > i=0; > do { > ...1 > ++i; > } while (...2); > instead of > for (i=0; ...2; ++i) { > ...1 > } > > to save the one comparison of 0<gsc_output_buffer_map_.size() (and its attendant > eval of size()) ? > That's crazy. Yeah, it's crazy. Don't think it's incorrect though :-) https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:87: const int egl_images_count; On 2013/01/11 00:46:33, Ami Fischman wrote: > My point about vector<> was that if you used it you could avoid having an > explicit count at all. Alright, I'll just fold it into an array-of-structs type. Now that this class def is out of the main header file, I feel a little less bad about defining a nested struct for it :-) https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:97: typedef void* GLeglImageOES; On 2013/01/11 00:46:33, Ami Fischman wrote: > What I meant was: are you sure it's legit / standardized? > If it is, it should be in a standards .h. Add a TODO/file a crbug? Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:258: DCHECK(!decoder_thread_.message_loop()); On 2013/01/11 00:46:33, Ami Fischman wrote: > s/message_loop/IsRunning/ here and in the next line Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:719: // current input, enqueue no data to the next flame, then flush that down. On 2013/01/11 00:46:33, Ami Fischman wrote: > typo: flame Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:725: // reprocessed when the pipeline frees up. On 2013/01/11 00:46:33, Ami Fischman wrote: > So it's ok that if the first FIF() and ATIF() return true, but the last FIF() > returns false, then the next run through this codepath will call all of > FIF/ATIF/FIF again? > > (ends up with two flushing frames in the pipeline) We'll just end up with two flushing frame in the pipe. This should not be a problem. There is a rather unlikely condition where we manage to keep trying to queue a flush faster than the hardware can process completely empty buffers, which might livelock. Rather far-fetched (I think), but well, better to be safe. Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:743: return; On 2013/01/11 00:46:33, Ami Fischman wrote: > Should this notify error? > E.g. if the bitstream is invalid or unsupported, what's the desired/actual > behavior? I suppose we can error here, instead of trying to continue. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:930: *endpos = size; On 2013/01/11 00:46:33, Ami Fischman wrote: > Symmetry is nice and all, but this is just a pure head-scratcher. I'd drop the > out param entirely. Fine :-P https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1011: // If input_id >= 0, this input buffer was promped by a bitstream buffer we On 2013/01/11 00:46:33, Ami Fischman wrote: > typo: promped Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1012: // got from the client. We can skip it if it is empty. On 2013/01/11 00:46:33, Ami Fischman wrote: > I don't understand this comment and the following line (esp. given I thought I > understood PS#18's version of this code at l.971-973). Can you clarify? I reworded and reversed the order of the condition because I thought it would make things less confusing. It looks like that didn't help :-( The logic is still the same. Adding comment. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1251: output_record.bytes_used[1] = dqbuf.m.planes[1].bytesused; On 2013/01/11 00:46:33, Ami Fischman wrote: > I think you missed this comment: > AFAICT l.1207-1210 can be moved outside the if/else and the corresponding 4 > lines in the "then" clause can be dropped (greatly clarifying the difference > between the flush and non-flush cases, which is whether the output_record is > added to gsc_input or mfc_free). Sounds good. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1275: if (gsc_free_input_buffers_.empty()) On 2013/01/11 00:46:33, Ami Fischman wrote: > I think you missed the comment: > while (foo()) { > if (!bar()) > break; > ... > } > > is equivalent to the shorter (and, IMO, clearer): > > while (foo() && bar()) { > ... > } Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1597: // Nothing in the pipe; return done immediately. On 2013/01/11 00:46:33, Ami Fischman wrote: > In kResetting how do you know there's "nothing in the pipe"? > Is there any reason to do this special-casing short-circuit? If we're resetting, ResetTask() has already wiped the pipeline, so we can early-out. Evidently this calls for a comment. Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1817: ret = read(device_poll_interrupt_fd_, &buf, sizeof(buf)); On 2013/01/11 00:46:33, Ami Fischman wrote: > On 2013/01/05 00:51:49, sheu wrote: > > On 2013/01/04 21:38:55, Ami Fischman wrote: > > > declare at first use two lines down > > > > I kinda like to declare them in use order, and in this case it's ret and then > > buf. > > buf is "used" before ret (param is eval'd pre-dispatch, which happens before > assignment). Either way, this contravenes chrome style guidelines. Alrighty then, fixed here and above. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:51: // decoder_thread_ when something interesting happens. On 2013/01/11 00:46:33, Ami Fischman wrote: > Please add a TODO to kill this thread (TODO in this CL, actually do it in a > follow-on CL :)) Done. https://chromiumcodereview.appspot.com/11198060/diff/120001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:98: kInitialized, // Initialize() returned. Ready to start decoding. On 2013/01/11 00:46:33, Ami Fischman wrote: > s/returned/returned true/ Done.
So close! https://chromiumcodereview.appspot.com/11198060/diff/138001/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/138001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:148: buffer.egl_image = EGL_NO_IMAGE_KHR; If PictureBufferRef had an explicit ctor w/ an initializer list of these three assignments, this loop would be unnecessary. Just sayin'. Up to you. (ditto for dtor) https://chromiumcodereview.appspot.com/11198060/diff/138001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1577: // * kResetting, then ResetTask() already wiped all buffers. I think this is wrong. If the client calls: Reset, Decode, Flush and the decoder thread sees ResetTask, DecodeTask, FlushTask before NotifyResetDone is called, then this code will immediately notify FlushDone even though the Decode hasn't completed yet.
https://chromiumcodereview.appspot.com/11198060/diff/138001/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/138001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:148: buffer.egl_image = EGL_NO_IMAGE_KHR; On 2013/01/11 20:14:42, Ami Fischman wrote: > If PictureBufferRef had an explicit ctor w/ an initializer list of these three > assignments, this loop would be unnecessary. Just sayin'. Up to you. > > (ditto for dtor) Destructor won't work, since it needs egl_display from the outer class. (Which is the whole reason it's wrapped, instead of just a naked std::vector; I didn't want to have to stuff an EGLDisplay with every ref.) I'd have auto-destructed it otherwise. And since we're looping in the destructor, I figured I'd do it in the constructor, for... symmetry. (If I were a supervillain I'd be SuperSymmetryMan, with the power to drive particle physicists and software engineers crazy) https://chromiumcodereview.appspot.com/11198060/diff/138001/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1577: // * kResetting, then ResetTask() already wiped all buffers. On 2013/01/11 20:14:42, Ami Fischman wrote: > I think this is wrong. If the client calls: > Reset, Decode, Flush > and the decoder thread sees > ResetTask, DecodeTask, FlushTask > before NotifyResetDone is called, then this code will immediately notify > FlushDone even though the Decode hasn't completed yet. Bleargh. I hate state machines. Done.
LGTM! Before submitting, though, please open a tracking crbug for converting snow from OVDA to EVDA, and point at that bug from the description of this CL. I'm hoping that crbug will contain info on your plans for effecting the switchover, esp. in terms of target mstones for the flag-flip.
Actually, before submitting, I've been thinking about this quite a bit and I think we should call this V4L2VDA after all. We want to make it universal and the remaining Exynos bits will have to be standardized and upstreamed. Would you agree? On 2013/01/11 22:12:36, Ami Fischman wrote: > LGTM! > > Before submitting, though, please open a tracking crbug for converting snow from > OVDA to EVDA, and point at that bug from the description of this CL. > > I'm hoping that crbug will contain info on your plans for effecting the > switchover, esp. in terms of target mstones for the flag-flip.
I've been tracking the effort in the bug referenced: https://code.google.com/p/chrome-os-partner/issues/detail?id=10057. Should I open another one?
On 2013/01/11 22:21:50, posciak wrote: > Actually, before submitting, I've been thinking about this quite a bit and I > think we should call this V4L2VDA after all. We want to make it universal and > the remaining Exynos bits will have to be standardized and upstreamed. Would you > agree? I"ve been thinking about it as well, and it seems doable, but until we actually have another V4L2 platform we want to run it on and compare against, I think the refactor (or even the rename) is premature. For Exynos in particular it will be a lot easier to do this once we move to YUV->RGB conversion in the GPU (see: http://crbug.com/167417), since we can then drop gSC and just have one device to track the state of.
On 2013/01/11 22:30:43, sheu wrote: > On 2013/01/11 22:21:50, posciak wrote: > > Actually, before submitting, I've been thinking about this quite a bit and I > > think we should call this V4L2VDA after all. We want to make it universal and > > the remaining Exynos bits will have to be standardized and upstreamed. Would > you > > agree? > > I"ve been thinking about it as well, and it seems doable, but until we actually > have another V4L2 platform we want to run it on and compare against, I think the > refactor (or even the rename) is premature. For Exynos in particular it will be > a lot easier to do this once we move to YUV->RGB conversion in the GPU (see: > http://crbug.com/167417), since we can then drop gSC and just have one device to > track the state of. Good point. So in the future we should completely take out conversion code (gsc) out of this class, have a separate v4l2 converter class and set up a pipeline between them on as needed basis (if conversion is needed), passing only /dev/* configuration to initialization of each class per platform. Afterwards we can even get rid of that per-platform configuration and query the nodes ourselves. This would make it truly universal.
John: I think it'd be useful to use a public bug instead of the restricted one for the going-forward work. Pawel: I don't see a reason not to do the rename as a followup.
LET'S LAND THIS BUFF
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/145005
Presubmit check for 11198060-145005 failed and returned exit status 1. Running presubmit commit checks ... ** Presubmit ERRORS ** Missing LGTM from an OWNER for files in these directories: content content/public content/browser/gpu content/gpu Presubmit checks took 2.4s to calculate.
Missing another reviewer. piman@: can you check this one? The big chunks (exynos_video_decode_accelerator.*) have been reviewed to death already, so I guess you can nitpick those or not, at your leisure.
Bunch of nits, couple of possible problems (but nothing fundamental). In particular, we should use HANDLE_EINTR where it makes sense around syscalls that we may need to restart in case of signals. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/browser/g... File content/browser/gpu/gpu_process_host.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/browser/g... content/browser/gpu/gpu_process_host.cc:994: switches::kUseExynosVda, note: you will need it also in LoginUtilsImpl::GetOffTheRecordCommandLine in chrome/browser/chromeos/login/login_utils.cc, otherwise the flag will not be enabled in guest mode. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:25: #define EXYNOS_MFC_DEVICE "/dev/mfc-dec" nit: const char [] kExynosMfcDevice = "/dev/mfc-dec"; Same below. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:122: gl_egl_image_target_texture_2d_oes = NULL; nit: you can put all of these into an anonymous namespace to not need the 'static' everywhere. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:137: if (input_id >= 0) nit: need {} if the statement spans more than 1 line https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:160: close(buffer.egl_image_fd); Use HANDLE_EINTR from base/posix/eintr_wrapper.h on all syscalls (where it makes sense). https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:263: // Nuke the entire site from orbit -- it's the only way to be sure. nit: This comment is pretty alien to me, I'm not sure what it refers to. It sounds like this just does normal cleanup, so I don't think the comment brings anything. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:265: close(device_poll_interrupt_fd_); HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:271: close(gsc_fd_); HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:277: close(mfc_fd_); HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:335: mfc_fd_ = open(EXYNOS_MFC_DEVICE, O_RDWR | O_NONBLOCK | O_CLOEXEC); HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:343: gsc_fd_ = open(EXYNOS_GSC_DEVICE, O_RDWR | O_NONBLOCK | O_CLOEXEC); HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:353: device_poll_interrupt_fd_ = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC); Is HANDLE_EINTR needed? https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:366: IOCTL_OR_ERROR_RETURN_FALSE(mfc_fd_, VIDIOC_QUERYCAP, &caps); Should HANDLE_EINTR be used on this syscall, and/or other ones? https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:433: if (!bitstream_record->shm->Map(bitstream_buffer.size())) { Would it make sense to do the Map on the decoder thread? That way if it takes time, you get a bit more parallelism. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:471: glActiveTexture(GL_TEXTURE0); If you change the state of the current context (active texture unit, currently bound texture), it will be confused. You need to either restore it, or let the GLES2CmdDecoder that you messed with its state so that it can restore it (assuming it tracks it). Make sure you handle the failure/early return cases as well... https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:486: egl_image = egl_create_image_khr( nit: declaration + initialization on same statement. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:527: egl_sync = egl_create_sync_khr(egl_display_, EGL_SYNC_FENCE_KHR, NULL); nit: declaration + initialization on same statement. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:704: size_t decoded_size; please initialize here - it's hard to track that all paths do initialize it. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:712: schedule_task = true; nit: this looks redundant with all the paths below. You can either remove this, or maybe remove some branches below? https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:736: const void* const data = make this a const uint8* const data, it saves a cast below. (or remove the reinterpret_cast on this line). https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:781: bool ExynosVideoDecodeAccelerator::FindFrameFragment( Could we have a unit test for this function? https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:996: memcpy((char*)input_record.address + input_record.bytes_used, data, size); nit: no c-style cast please https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1646: decoder_decode_buffer_tasks_scheduled_ = 0; This is suspicious... It's incremented when scheduling a DecodeBufferTask, and decremented when the task executes. You may be executing this while one (or several) DecodeBufferTask are in the queue, and they will decrement before testing for kResetting, so you'll end up with a negative decoder_decode_buffer_tasks_scheduled_, is that really what you want? https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1811: if (write(device_poll_interrupt_fd_, &buf, sizeof(buf)) == -1) { HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1824: if (read(device_poll_interrupt_fd_, &buf, sizeof(buf)) == -1) { HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1873: } while (ret < 1 && errno == EINTR); You can just use HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:2223: close(output_record.fd); HANDLE_EINTR https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:38: // is waited on with epoll(). There are three threads involved in this class: note: you can register a fd into the MessageLoop (in different ways whether it's a UI loop or an IO loop), the core of which is a poll. It can save you one thread (i.e. you can most likely fold the device_poll_thread_ into the decoder_thread_). See MessagePumpLibevent::WatchFileDescriptor https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:40: // * The child thread, which is the main GPU process thread which calls the nit: we commonly use "main" thread to mean the ChildThread (historically that thread was not the main one, but it changed in, like, Chrome 2), it would be nice to be consistent. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:57: // right order, not fiddling with locks. Note: the last paragraph is basically the modus operandi in most chrome code, so it's a bit superfluous... https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:94: }; nit: we don't typically use anonymous enums with values to define constants, rather static const (unsigned) ints. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:312: // child thread and device service callbacks posted from the device thread. This class is pretty big. It could be nice to separate into a tear-off class everything that runs on the decoder thread (and/or the polling thread). It makes it harder to screw up which field can be accessed on which thread, and clearer which method is called on which thread (you can even enforce using base::NonThreadSafe). https://chromiumcodereview.appspot.com/11198060/diff/145005/content/public/co... File content/public/common/content_switches.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/public/co... content/public/common/content_switches.cc:764: const char kUseExynosVda[] = "use-exynos-vda"; Note: it would be nice at some point to have a selection switch --use-vda=exynos|omx|libva|dxva|... like we have --use-gl, rather than an explosion of switches with undocumented precedence as we add more paths. Not necessarily for this CL, but maybe as a follow-up?
nit/note meta-review ;) > https://chromiumcodereview.**appspot.com/11198060/diff/** > 145005/content/common/gpu/**media/exynos_video_decode_**accelerator.h<https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gpu/media/exynos_video_decode_accelerator.h> > File content/common/gpu/media/**exynos_video_decode_**accelerator.h > (right): > > https://chromiumcodereview.**appspot.com/11198060/diff/** > 145005/content/common/gpu/**media/exynos_video_decode_** > accelerator.h#newcode57<https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gpu/media/exynos_video_decode_accelerator.h#newcode57> > content/common/gpu/media/**exynos_video_decode_**accelerator.h:57: // > right > order, not fiddling with locks. > Note: the last paragraph is basically the modus operandi in most chrome > code, so it's a bit superfluous... > Sadly that's not the case in media code. > https://chromiumcodereview.**appspot.com/11198060/diff/** > 145005/content/common/gpu/**media/exynos_video_decode_** > accelerator.h#newcode94<https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gpu/media/exynos_video_decode_accelerator.h#newcode94> > content/common/gpu/media/**exynos_video_decode_**accelerator.h:94: }; > nit: we don't typically use anonymous enums with values to define > constants, rather static const (unsigned) ints. > This is a typical usage for media code. It has the advantage of avoiding duplication between decl/definition (as well avoiding unnecessary storage). > https://chromiumcodereview.**appspot.com/11198060/diff/** > 145005/content/public/common/**content_switches.cc<https://chromiumcodereview.appspot.com/11198060/diff/145005/content/public/common/content_switches.cc> > File content/public/common/content_**switches.cc (right): > > https://chromiumcodereview.**appspot.com/11198060/diff/** > 145005/content/public/common/**content_switches.cc#newcode764<https://chromiumcodereview.appspot.com/11198060/diff/145005/content/public/common/content_switches.cc#newcode764> > content/public/common/content_**switches.cc:764: const char > kUseExynosVda[] = "use-exynos-vda"; > Note: it would be nice at some point to have a selection switch > --use-vda=exynos|omx|libva|**dxva|... like we have --use-gl, rather than > an explosion of switches with undocumented precedence as we add more > paths. Not necessarily for this CL, but maybe as a follow-up? > Before this CL there is no platform with more than one VDA (so none of libva/dxva/... would be sensible values for such a flag). The hope is that this flag will not be long-lived, and that the VDA used will be fixed per-platform (with OVDA probably being killed off until/unless we have a viable platform needing it).
Updated, PTAL. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/browser/g... File content/browser/gpu/gpu_process_host.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/browser/g... content/browser/gpu/gpu_process_host.cc:994: switches::kUseExynosVda, On 2013/01/12 03:24:58, piman wrote: > note: you will need it also in LoginUtilsImpl::GetOffTheRecordCommandLine in > chrome/browser/chromeos/login/login_utils.cc, otherwise the flag will not be > enabled in guest mode. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:25: #define EXYNOS_MFC_DEVICE "/dev/mfc-dec" On 2013/01/12 03:24:58, piman wrote: > nit: const char [] kExynosMfcDevice = "/dev/mfc-dec"; > > Same below. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:122: gl_egl_image_target_texture_2d_oes = NULL; On 2013/01/12 03:24:58, piman wrote: > nit: you can put all of these into an anonymous namespace to not need the > 'static' everywhere. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:137: if (input_id >= 0) On 2013/01/12 03:24:58, piman wrote: > nit: need {} if the statement spans more than 1 line Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:160: close(buffer.egl_image_fd); On 2013/01/12 03:24:58, piman wrote: > Use HANDLE_EINTR from base/posix/eintr_wrapper.h on all syscalls (where it makes > sense). Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:263: // Nuke the entire site from orbit -- it's the only way to be sure. On 2013/01/12 03:24:58, piman wrote: > nit: This comment is pretty alien to me Not sure if being ironic or just hilarious coincidence it _is_ the destructor so it should be nuking everything. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:265: close(device_poll_interrupt_fd_); On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:271: close(gsc_fd_); On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:277: close(mfc_fd_); On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:335: mfc_fd_ = open(EXYNOS_MFC_DEVICE, O_RDWR | O_NONBLOCK | O_CLOEXEC); On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:343: gsc_fd_ = open(EXYNOS_GSC_DEVICE, O_RDWR | O_NONBLOCK | O_CLOEXEC); On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:353: device_poll_interrupt_fd_ = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC); On 2013/01/12 03:24:58, piman wrote: > Is HANDLE_EINTR needed? eventfd does not return EINTR. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:366: IOCTL_OR_ERROR_RETURN_FALSE(mfc_fd_, VIDIOC_QUERYCAP, &caps); On 2013/01/12 03:24:58, piman wrote: > Should HANDLE_EINTR be used on this syscall, and/or other ones? The V4L2 API docs don't mention their ioctls being interruptible, but some of the driver code does seem to return EINTR on occasion. To be safe then I'm going to wrap the ioctl calls with HANDLE_EINTR. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:433: if (!bitstream_record->shm->Map(bitstream_buffer.size())) { On 2013/01/12 03:24:58, piman wrote: > Would it make sense to do the Map on the decoder thread? That way if it takes > time, you get a bit more parallelism. Since the decoder thread also services the device, that would add more average latency to the decode than it would save by offloading the child thread, I think. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:486: egl_image = egl_create_image_khr( On 2013/01/12 03:24:58, piman wrote: > nit: declaration + initialization on same statement. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:527: egl_sync = egl_create_sync_khr(egl_display_, EGL_SYNC_FENCE_KHR, NULL); On 2013/01/12 03:24:58, piman wrote: > nit: declaration + initialization on same statement. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:704: size_t decoded_size; On 2013/01/12 03:24:58, piman wrote: > please initialize here - it's hard to track that all paths do initialize it. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:712: schedule_task = true; On 2013/01/12 03:24:58, piman wrote: > nit: this looks redundant with all the paths below. You can either remove this, > or maybe remove some branches below? Not quite redundant. I've moved it down a few lines though to make the purpose more clear. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:736: const void* const data = On 2013/01/12 03:24:58, piman wrote: > make this a const uint8* const data, it saves a cast below. (or remove the > reinterpret_cast on this line). Yeah. Leftover from fischman@'s earlier review. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:781: bool ExynosVideoDecodeAccelerator::FindFrameFragment( On 2013/01/12 03:24:58, piman wrote: > Could we have a unit test for this function? vda_unittests already tests multi-fragment decode. (Hence the whole reason I put this in -- usual VDA usage through Chrome doesn't actually require this frame fragment splitting.) https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:996: memcpy((char*)input_record.address + input_record.bytes_used, data, size); On 2013/01/12 03:24:58, piman wrote: > nit: no c-style cast please Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1646: decoder_decode_buffer_tasks_scheduled_ = 0; On 2013/01/12 03:24:58, piman wrote: > This is suspicious... It's incremented when scheduling a DecodeBufferTask, and > decremented when the task executes. > You may be executing this while one (or several) DecodeBufferTask are in the > queue, and they will decrement before testing for kResetting, so you'll end up > with a negative decoder_decode_buffer_tasks_scheduled_, is that really what you > want? Aaaand... that's a bug. Thanks. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1811: if (write(device_poll_interrupt_fd_, &buf, sizeof(buf)) == -1) { On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1824: if (read(device_poll_interrupt_fd_, &buf, sizeof(buf)) == -1) { On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1873: } while (ret < 1 && errno == EINTR); On 2013/01/12 03:24:58, piman wrote: > You can just use HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:1873: } while (ret < 1 && errno == EINTR); On 2013/01/12 03:24:58, piman wrote: > You can just use HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:2223: close(output_record.fd); On 2013/01/12 03:24:58, piman wrote: > HANDLE_EINTR Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:38: // is waited on with epoll(). There are three threads involved in this class: On 2013/01/12 03:24:58, piman wrote: > note: you can register a fd into the MessageLoop (in different ways whether it's > a UI loop or an IO loop), the core of which is a poll. It can save you one > thread (i.e. you can most likely fold the device_poll_thread_ into the > decoder_thread_). > > See MessagePumpLibevent::WatchFileDescriptor fischman@ suggested this previously. I will do this in a follow-up; see TODO at end of comment block. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:40: // * The child thread, which is the main GPU process thread which calls the On 2013/01/12 03:24:58, piman wrote: > nit: we commonly use "main" thread to mean the ChildThread (historically that > thread was not the main one, but it changed in, like, Chrome 2), it would be > nice to be consistent. The terminology I picked up from the VAAPI decoder. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:57: // right order, not fiddling with locks. On 2013/01/12 03:24:58, piman wrote: > Note: the last paragraph is basically the modus operandi in most chrome code, so > it's a bit superfluous... Yeah, and I'm glad it's this way. It's not this way in sec_omx, which is what this code is replacing... https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:94: }; On 2013/01/12 03:24:58, piman wrote: > nit: we don't typically use anonymous enums with values to define constants, > rather static const (unsigned) ints. It seems to be the convention with media files, so I'm using that. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.h:312: // child thread and device service callbacks posted from the device thread. On 2013/01/12 03:24:58, piman wrote: > This class is pretty big. It could be nice to separate into a tear-off class > everything that runs on the decoder thread (and/or the polling thread). It makes > it harder to screw up which field can be accessed on which thread, and clearer > which method is called on which thread (you can even enforce using > base::NonThreadSafe). If/when this becomes a general V4L2 video decoder class, I will do the refactor then. I agree that right now the non-enforcement of thread-safe access is potentially dangerous. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/public/co... File content/public/common/content_switches.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/public/co... content/public/common/content_switches.cc:764: const char kUseExynosVda[] = "use-exynos-vda"; On 2013/01/12 03:24:58, piman wrote: > Note: it would be nice at some point to have a selection switch > --use-vda=exynos|omx|libva|dxva|... like we have --use-gl, rather than an > explosion of switches with undocumented precedence as we add more paths. Not > necessarily for this CL, but maybe as a follow-up? This is just a temporary flag, which I intend to remove as soon as we finish the switchover from sec_omx.
https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:263: // Nuke the entire site from orbit -- it's the only way to be sure. On 2013/01/14 23:49:49, sheu wrote: > On 2013/01/12 03:24:58, piman wrote: > > nit: This comment is pretty alien to me > > Not sure if being ironic or just hilarious coincidence It has to be a coincidence, there's no other conceivable explanation. > it _is_ the destructor so it should be nuking everything. What I meant is that we usually eschew comments that don't bring anything. It is the destructor, the reader expects it to clean everything up. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:433: if (!bitstream_record->shm->Map(bitstream_buffer.size())) { On 2013/01/14 23:49:49, sheu wrote: > On 2013/01/12 03:24:58, piman wrote: > > Would it make sense to do the Map on the decoder thread? That way if it takes > > time, you get a bit more parallelism. > > Since the decoder thread also services the device, that would add more average > latency to the decode than it would save by offloading the child thread, I > think. Ok, it's a minor thing in any case, but I was thinking about offloading the main thread (throughput concern). If you feel strongly, leave it as is. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:781: bool ExynosVideoDecodeAccelerator::FindFrameFragment( On 2013/01/14 23:49:49, sheu wrote: > On 2013/01/12 03:24:58, piman wrote: > > Could we have a unit test for this function? > > vda_unittests already tests multi-fragment decode. (Hence the whole reason I > put this in -- usual VDA usage through Chrome doesn't actually require this > frame fragment splitting.) The intent is that here, you have a bit parsing function (parsing untrusted data), in a weakly sandboxed process, that may have security impact (it returns a range of bytes to be consumed by the rest of the code). At the same time, it should be relatively easy to write a unit test for it (you can make it a free function by injecting the decoder_h264_parser_ and the profile), so that we have a stronger guarantee that this code, or a future refactoring thereof, can't return crap. https://chromiumcodereview.appspot.com/11198060/diff/147002/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/147002/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:200: length[1] = 0; nit: indentation was right before... https://chromiumcodereview.appspot.com/11198060/diff/147002/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:730: schedule_task = true; nit: you can skip this since it's already true.
https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:263: // Nuke the entire site from orbit -- it's the only way to be sure. On 2013/01/15 01:36:51, piman wrote: > On 2013/01/14 23:49:49, sheu wrote: > > On 2013/01/12 03:24:58, piman wrote: > > > nit: This comment is pretty alien to me > > > > Not sure if being ironic or just hilarious coincidence > > It has to be a coincidence, there's no other conceivable explanation. > > > it _is_ the destructor so it should be nuking everything. > > What I meant is that we usually eschew comments that don't bring anything. It is > the destructor, the reader expects it to clean everything up. Done. https://chromiumcodereview.appspot.com/11198060/diff/145005/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:781: bool ExynosVideoDecodeAccelerator::FindFrameFragment( On 2013/01/15 01:36:51, piman wrote: > On 2013/01/14 23:49:49, sheu wrote: > > On 2013/01/12 03:24:58, piman wrote: > > > Could we have a unit test for this function? > > > > vda_unittests already tests multi-fragment decode. (Hence the whole reason I > > put this in -- usual VDA usage through Chrome doesn't actually require this > > frame fragment splitting.) > > The intent is that here, you have a bit parsing function (parsing untrusted > data), in a weakly sandboxed process, that may have security impact (it returns > a range of bytes to be consumed by the rest of the code). > At the same time, it should be relatively easy to write a unit test for it (you > can make it a free function by injecting the decoder_h264_parser_ and the > profile), so that we have a stronger guarantee that this code, or a future > refactoring thereof, can't return crap. video_decode_accelerator_unittest covers the multi-fragment case, and h264_parser_unittest covers the H264 parser; between the two I think we should have coverage. The only failure case really is overrunning the bounds, so I added a DCHECK to the caller of this code just in case. Adding a unittest specific to Exynos VDA I could do, but I a bit resistant to do the extra work, especially given that none of the other ones (Mac, OMX, even VAAPI) have VDA-specific unittests, and VAAPI in particular is pretty hairy with the parsing. https://chromiumcodereview.appspot.com/11198060/diff/147002/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/147002/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:200: length[1] = 0; On 2013/01/15 01:36:51, piman wrote: > nit: indentation was right before... The more things I touch, the more stuff breaks... https://chromiumcodereview.appspot.com/11198060/diff/147002/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:730: schedule_task = true; On 2013/01/15 01:36:51, piman wrote: > nit: you can skip this since it's already true. I figure the compiler would take care of it, and I wanted to explicitly document the final output value here.
LGTM+nits https://chromiumcodereview.appspot.com/11198060/diff/145007/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145007/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:752: DCHECK_LE(decoded_size, data_size); nit: make it a real CHECK and I'll take it. https://chromiumcodereview.appspot.com/11198060/diff/145007/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:839: *endpos = (nalu.data + nalu.size) - reinterpret_cast<const uint8*>(data); nit: actually data is a const uint8* already so no need for the reinterpret_cast
Done. https://chromiumcodereview.appspot.com/11198060/diff/145007/content/common/gp... File content/common/gpu/media/exynos_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/11198060/diff/145007/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:752: DCHECK_LE(decoded_size, data_size); On 2013/01/15 05:11:29, piman wrote: > nit: make it a real CHECK and I'll take it. Done. https://chromiumcodereview.appspot.com/11198060/diff/145007/content/common/gp... content/common/gpu/media/exynos_video_decode_accelerator.cc:839: *endpos = (nalu.data + nalu.size) - reinterpret_cast<const uint8*>(data); On 2013/01/15 05:11:29, piman wrote: > nit: actually data is a const uint8* already so no need for the reinterpret_cast Done.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/158001
Presubmit check for 11198060-158001 failed and returned exit status 1. Running presubmit commit checks ... ** Presubmit ERRORS ** Missing LGTM from an OWNER for files in these directories: chrome/browser/chromeos/login Presubmit checks took 3.6s to calculate.
login lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/158001
Retried try job too often on mac for step(s) compile
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/171002
Retried try job too often on win for step(s) compile
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/188001
Failed to trigger a try job on mac HTTP Error 400: Bad Request
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/186004
Retried try job too often on win for step(s) compile
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/186004
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/181005
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/sheu@chromium.org/11198060/186014
Message was sent while issue was closed.
Change committed as 177141 |