Created: 6 years, 10 months ago by shivdasp
Modified: 6 years, 8 months ago
Reviewers: vpagar, Ami GONE FROM CHROMIUM, sheu, piman, Jorge Lucangeli Obes, wuchengli, Pawel Osciak
CC: chromium-reviews, fischman+watch_chromium.org, jam, mcasas+watch_chromium.org, joi+watch-content_chromium.org, feature-media-reviews_chromium.org, darin-cc_chromium.org, piman+watch_chromium.org, wjia+watch_chromium.org, jln+watch_chromium.org
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Visibility: Public.
Description: Add support for Tegra V4L2 VDA

This change adds a TegraV4L2Device and extends V4L2 VDA support to the Tegra platform.

BUG=chromium-os-partner:23082
TEST=Run video playback

Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=260661
Patch Set 1
Patch Set 2
Patch Set 3
Patch Set 4
Patch Set 5 (Total comments: 90)
Patch Set 6 (Total comments: 28)
Patch Set 7 (Total comments: 7)
Patch Set 8 (Total comments: 7)
Patch Set 9
Patch Set 10: Fixed minor nit
Patch Set 11: fixed a small issue (Total comments: 31)
Patch Set 12 (Total comments: 26)
Patch Set 13
Patch Set 14 (Total comments: 16)
Patch Set 15: use scopedFD for dmabuf fds (Total comments: 16)
Patch Set 16: Addressed a few more comments (Total comments: 2)
Patch Set 17 (Total comments: 37)
Patch Set 18 (Total comments: 9)
Patch Set 19: LazyInstance related changes
Patch Set 20: rebased (Total comments: 6)
Patch Set 21: addressed nits (Total comments: 2)
Patch Set 22: minor nit

Messages
Total messages: 155 (0 generated)
Do not review yet. This is a first draft; some more changes are still needed.
This change adds Tegra V4L2Device support to the V4L2 VDA. Please have a look.
Has there been a change in gles2_cmd_decoder.cc recently? This CL used to work on code I synced about 1.5-2 weeks ago. After rebasing, I get errors from gles2_cmd_decoder.cc line #5967 ("Texture is not renderable") and from line #10043 ("glConsumeTextureCHROMIUM", "invalid mailbox name"). I thought I would check with you before I start debugging. Thanks,
On Thu, Feb 6, 2014 at 9:37 AM, <shivdasp@nvidia.com> wrote:
> Has there been a change in gles2_cmd_decoder.cc recently?

Always... http://src.chromium.org/viewvc/chrome/trunk/src/gpu/command_buffer/service/gl...
That was the first place I checked, but there have been 4-5 recent changes in that file, and since I am not very familiar with that part of the code I thought I would ask here whether there has been any decision to drop support for TEXTURE_2D. I will dig through. Thanks.
@posciak: as usual I rely on you to review the called platform code for calling correctness :) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/exynos_v4l2_video_device.cc:122: unsigned int buffer_index) { Please /* comment out */ unused params. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:1: #include <dlfcn.h> missing copyright header https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:62: bool TegraV4L2Device::SetDevicePollInterrupt(void) { drop "void" https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:63: if (HANDLE_EINTR(TegraV4L2SetDevicePollInterrupt(device_fd_) == -1)) { bug: HANDLE_EINTR should not be accepting the ==-1 as part of its arg. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:78: bool TegraV4L2Device::Initialize(void) { drop "void" https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:79: TegraV4L2Open = reinterpret_cast<TegraV4L2OpenFunc>( Please avoid the sort of code duplication below (see vaapi_wrapper.cc for example of dlsym'ing). https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:134: EGLint attrib[], EGLint[] /* attrib */ https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:142: (EGLClientBuffer)(texture_id), static_cast https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:153: memset(planes, 0, sizeof(planes)); swap w/ previous line to match declaration order https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); how wide is mem_offset? I'm worried about this cast on a 64-bit platform. If mem_offset is intptr_t-sized, please use that in the cast. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:165: return EGL_NO_IMAGE_KHR; leak? (no eglDestroyImageKHR needed?) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:9: class TegraV4L2Device : public V4L2Device { Class commentary please. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:11: TegraV4L2Device(EGLContext egl_context); explicit https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/tegra_v4l2_video_device.h:14: // Tries to create and initialize an TegraV4L2Device, returns s/an/a/ but this comment seems like leftover copy/pasta... Should instead just be // V4L2Device implementation. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:26: // Does all the initialization of device fds , returns true on success. precede by newline https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:29: EGLImageKHR CreateEGLImage(EGLDisplay egl_display, this and the next method belong above Initialize as part of the V4L2Device block https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:57: TegraV4L2ClearDevicePollInterruptFunc TegraV4L2ClearDevicePollInterrupt; l.43-57 could be file-static in the .cc file, right? (see how vaapi_wrapper.cc does it for an example) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:376: DLOG(ERROR) << "AssignPictureBuffers(): could not create EGLImageKHR"; already logged in the Device method; unnecessary https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1642: if (HANDLE_EINTR(device_->Ioctl(VIDIOC_G_FMT, format) != 0)) { This is a bug! https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1654: // Since the underlying library at the moment is not updated this hack This comment is opaque to me. Also, if something is a hack, usually there should be a TODO/crbug to go along with it. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.cc:21: exynos_device.reset(NULL); Can drop NULL (scoped_ptr::reset(NULL) is equiv to scoped_ptr::reset()). https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.cc:29: return exynos_device.PassAs<V4L2Device>(); l.19-29 would be clearer as: scoped_ptr<EV4L2D> e_d(...); if (e_d->Initialize()) return e_d.PassAs<V4L2D>(); DLOG(ERROR) << "Failed to open exynos v4l2 device."; scoped_ptr<TV4L2D> t_d(...); if (t_d->Initialize(e_c)) return t_d.PassAs<V4L2D>(); DLOG(ERROR) << "Failed to open tegra v4l2 device."; return scoped_ptr<V4L2D>(NULL); https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.h:56: virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display, Need to document these methods & parameters. I'm especially unclear at this point in the review what |buffer_index| is supposed to be (I see how it's used, but I don't understand what it means in the context of a generic V4L2Device). https://codereview.chromium.org/137023008/diff/80001/content/common/sandbox_l... 
File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/sandbox_l... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:223: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); I assume failure here is non-fatal? (should errno be unconditionally cleared after this?) https://codereview.chromium.org/137023008/diff/80001/content/content_common.gypi File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/80001/content/content_common.g... content/content_common.gypi:560: 'common/gpu/media/tegra_v4l2_video_device.h', please keep lists in alphabetical order.
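To make the HANDLE_EINTR comments above concrete, here is a minimal sketch (assuming Chromium's HANDLE_EINTR macro, which retries the wrapped expression while it returns -1 with errno set to EINTR; the function name follows the patch):

  // Broken: the comparison sits inside the macro, so the wrapped expression
  // evaluates to 0 or 1 and the EINTR retry never applies to the call itself.
  if (HANDLE_EINTR(TegraV4L2SetDevicePollInterrupt(device_fd_) == -1)) {
    DLOG(ERROR) << "SetDevicePollInterrupt(): failed";
    return false;
  }

  // Fixed: retry the call, then compare its return value.
  if (HANDLE_EINTR(TegraV4L2SetDevicePollInterrupt(device_fd_)) == -1) {
    DLOG(ERROR) << "SetDevicePollInterrupt(): failed";
    return false;
  }

Similarly, a hedged sketch of the table-driven dlsym() loading suggested in place of the per-symbol duplication (in the spirit of vaapi_wrapper.cc; the symbol names and loader shape are placeholders, not the actual libtegrav4l2 interface):

  void* handle = dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW);
  if (!handle) {
    DLOG(ERROR) << "Failed to dlopen libtegrav4l2";
    return false;
  }
  const struct {
    const char* name;
    void** ptr;
  } kStubs[] = {
    { "TegraV4L2_Open", reinterpret_cast<void**>(&TegraV4L2Open) },
    { "TegraV4L2_Close", reinterpret_cast<void**>(&TegraV4L2Close) },
    { "TegraV4L2_Ioctl", reinterpret_cast<void**>(&TegraV4L2Ioctl) },
    // ... remaining entry points ...
  };
  for (size_t i = 0; i < arraysize(kStubs); ++i) {
    *kStubs[i].ptr = dlsym(handle, kStubs[i].name);
    if (*kStubs[i].ptr == NULL) {
      DLOG(ERROR) << "Failed to dlsym " << kStubs[i].name;
      return false;
    }
  }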
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/exynos_v4l2_video_device.cc:123: EGLImageKHR egl_image = eglCreateImageKHR( We need to make GL context current to be able to call this. We should at least require this in the doc that it should be done before calling this method. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/exynos_v4l2_video_device.h:40: unsigned int GetTextureTarget(); Documentation for methods please. Also, texture target should be a GLenum I think... https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:4: #include <libdrm/drm_fourcc.h> Not needed? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:6: #include <poll.h> Not needed. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:7: #include <sys/eventfd.h> Not needed. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:9: #include <sys/mman.h> Not needed? Please clean up the headers in this file to a minimal set of required ones only (also applies to the Chrome headers below). https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; Is this is the codec device exposed by Tegra kernel driver? You can't assume it will be this on all configurations. Please use udev rules to create a codec specific device (see Exynos example at https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from CreateEGLImage to pass them for texture binding to the library? I'm guessing that this works, because for the former you call QUERYBUFS with OUTPUT buffer, while in the latter you call it with CAPTURE? Please don't do this. Suggested changes for EGLImages in my comments below. As for mmap, how (and from what) is the memory mapping actually acquired for the buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it managed and when it is destroyed? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. If so, how is unmapping handled then? What if we want to free the buffers and reallocate them? You cannot call REQBUFS(0) without unmapping the buffers first... https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:136: unsigned int buffer_index) { This method should take a v4l2_buffer instead. 
Depending on format and other circumstances format, memory type, etc. change. We shouldn't hardcode this in the device class. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:158: capture_buffer.length = 2; arraysize(planes) https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, there are two planes, but passing only one offset is a bit inconsistent. Although, why have two planes, if only one is used? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } After reading through it, this method feels unnecessary. I'm assuming querybufs implementation in the library calls a method in the GPU driver anyway? This is basically redefining querybufs to do something completely different than it normally does, turning it into a custom call. The offsets should be coming from the callee of QUERYBUFS, not the other way around, and there should be no custom side effects. A non-V4L2, library-specific custom call would be better than this. But why not implement eglCreateImageKHR that accepts dmabufs (or even offsets) that come from the v4l2 library to create EGL images, just like we do on Mali on Exynos? Would it be possible to have an extension for eglCreateImage like Exynos does instead please? It doesn't seem to be much of a difference, instead of calling querybufs with custom arguments and having the library call something in the driver, call eglCreateImage instead with custom arguments and have it do everything? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.h:41: EGLContext egl_context_; To be honest I'm not especially excited with EGLContext becoming a part of V4L2Device, since the two are unrelated. A V4L2Device should have no need for an EGL context. Please see my comments in .cc for details, I think CreateEGLImage should not be in this class. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.h:61: } Empty line above and // namespace content please. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); I would like to understand the big picture here please. We strive to stay as close as possible to using (and/or creating) platform-independent standards where we can, like the sequence above, instead of providing custom calls for each platform. Removing this from here and TVDA is a step into an opposite direction, and I would like to understand what technical difficulties force us to do this first. Binding textures to EGLImages also serves to keep track of ownership. There are multiple users of the shared buffer, the renderer, the GPU that renders the textures, this class and the HW codec. 
How is ownership/destruction managed and how is it ensured that the buffer is valid while any of the users are still referring to/using it (both in userspace and in kernel)? What happens if the renderer crashes and the codec is writing to the textures? What happens when this class is destroyed, but the texture is in the renderer? What happens when the whole Chrome crashes, but the HW codec is using a buffer (i.e. kernel has ownership)? Could you please explain how is ownership managed for shared buffers on Tegra? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1654: // Since the underlying library at the moment is not updated this hack Could this hack be moved into the library itself then? This class should not have any awareness of device-specific issues. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_device.cc:29: return exynos_device.PassAs<V4L2Device>(); On 2014/02/07 09:09:30, Ami Fischman wrote: > l.19-29 would be clearer as: > > scoped_ptr<EV4L2D> e_d(...); > if (e_d->Initialize()) > return e_d.PassAs<V4L2D>(); > DLOG(ERROR) << "Failed to open exynos v4l2 device."; > > scoped_ptr<TV4L2D> t_d(...); > if (t_d->Initialize(e_c)) > return t_d.PassAs<V4L2D>(); > DLOG(ERROR) << "Failed to open tegra v4l2 device."; > > return scoped_ptr<V4L2D>(NULL); +1 to this, but I don't think DLOGging failing to open a particular device is good here. It would always log it on Tegra, even though it wouldn't be an error. I feel we should only log an error if we return NULL and say no device could be opened.
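Combining the refactoring suggestion with the note on logging, V4L2Device::Create() could end up looking roughly like the sketch below (a sketch only; the exact signature, and whether the EGL context goes to the constructor or to Initialize(), varies across the patch sets):

  scoped_ptr<V4L2Device> V4L2Device::Create(EGLContext egl_context) {
    scoped_ptr<ExynosV4L2Device> exynos_device(new ExynosV4L2Device());
    if (exynos_device->Initialize())
      return exynos_device.PassAs<V4L2Device>();

    scoped_ptr<TegraV4L2Device> tegra_device(new TegraV4L2Device(egl_context));
    if (tegra_device->Initialize())
      return tegra_device.PassAs<V4L2Device>();

    // Log only once, when neither device could be opened.
    DLOG(ERROR) << "Could not create a V4L2Device";
    return scoped_ptr<V4L2Device>();
  }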
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; This is a v4l2 decoder device name which we use to initialize a decoder context within the libtegrav4l2 library. This can be anything really as long as decoder and encoder device names are different since we do not open a v4l2 video device underneath. Libtegrav4l2 is really a pseudo implementation. I can change it /dev/tegra-dec and /dev/tegra-enc for it to mean tegra specific. On 2014/02/10 06:36:17, Pawel Osciak wrote: > Is this is the codec device exposed by Tegra kernel driver? > > You can't assume it will be this on all configurations. > Please use udev rules to create a codec specific device (see Exynos example at > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; When REQBUFS is called for OUTPUT_PLANE, the library creates internal buffers which can be shared with the AVP for decoding. AVP is a video processor which runs the firmware that actually does all the decoding. While creating this buffers, they are already mmapped to get a virtual address. The library returns this address in QUERYBUF API. Hence there is not real need for mmap. When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally unmapped and destroyed. I will explain the need for CreateEGLImage in later comments. On 2014/02/10 06:36:17, Pawel Osciak wrote: > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > CreateEGLImage to pass them for texture binding to the library? I'm guessing > that this works, because for the former you call QUERYBUFS with OUTPUT buffer, > while in the latter you call it with CAPTURE? > > Please don't do this. Suggested changes for EGLImages in my comments below. > > As for mmap, how (and from what) is the memory mapping actually acquired for the > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it > managed and when it is destroyed? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. Buffers are unmapped in REQBUFS(0) call and destroyed. Since there is no real need for mmap and munmap, we did not implement it in the library. So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the buffer whereas REQBUF(0) unmaps and destroys them. On 2014/02/10 06:36:17, Pawel Osciak wrote: > If so, how is unmapping handled then? What if we want to free the buffers and > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > first... https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:79: TegraV4L2Open = reinterpret_cast<TegraV4L2OpenFunc>( Yes I will do this. That looks very clean. On 2014/02/07 09:09:30, Ami Fischman wrote: > Please avoid the sort of code duplication below (see vaapi_wrapper.cc for > example of dlsym'ing). https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/tegra_v4l2_video_device.cc:136: unsigned int buffer_index) { Since ExynosV4L2Device does not need the v4l2_buffer like TegraV4L2Device, I thought it would add less noise to the V4L2VDA code. Also the fields within the v4l2_buffer (the eglimage handle) is created within this method so can't fully initialize the structure here. On 2014/02/10 06:36:17, Pawel Osciak wrote: > This method should take a v4l2_buffer instead. > Depending on format and other circumstances format, memory type, etc. change. We > shouldn't hardcode this in the device class. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); We are really passing in the EglImage handle here to the library. The library associates this with the corresponding v4l2_buffer on the CAPTURE plane and use the underlying conversion APIs to transform the decoder's yuv output into the egl image. We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the number of planes are checked with 2 (line #1660 of V4L2VDA). On 2014/02/10 06:36:17, Pawel Osciak wrote: > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, > there are two planes, but passing only one offset is a bit inconsistent. > Although, why have two planes, if only one is used? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:165: return EGL_NO_IMAGE_KHR; Yes will fix this. On 2014/02/07 09:09:30, Ami Fischman wrote: > leak? (no eglDestroyImageKHR needed?) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } Since we started with implementing this as a V4L2-Like library we have tried to follow V4L2 syntax to provide the input & output buffers. QUERYBUF can be made into a custom call since it is doing very custom thing here. If introducing another API is acceptable I can do that. We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with the our Graphics team but I don't there is any such plan to implement such extension. On 2014/02/10 06:36:17, Pawel Osciak wrote: > After reading through it, this method feels unnecessary. > > I'm assuming querybufs implementation in the library calls a method in the GPU > driver anyway? > > This is basically redefining querybufs to do something completely different than > it normally does, turning it into a custom call. The offsets should be coming > from the callee of QUERYBUFS, not the other way around, and there should be no > custom side effects. > A non-V4L2, library-specific custom call would be better than this. > > But why not implement eglCreateImageKHR that accepts dmabufs (or even offsets) > that come from the v4l2 library to create EGL images, just like we do on Mali on > Exynos? > > Would it be possible to have an extension for eglCreateImage like Exynos does > instead please? It doesn't seem to be much of a difference, instead of calling > querybufs with custom arguments and having the library call something in the > driver, call eglCreateImage instead with custom arguments and have it do > everything? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); The decoder's output buffers are created when REQBUFS(x) is called on CAPTURE_PLANE. These buffers are hardware buffers which can be shared with the AVP processor for decoder to write into. Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created and sent back in AssignPictureBuffers(). Now V4L2VDA creates EglImages from these textures and sends each EglImage handle to library using the QUERYBUF (but can use a custom call too). The tegrav4l2 library cannot create EglImages from DMABUFS like in Exynos since there is no such extension. We create EglImage from this texture itself so there is a binding between texture and eglImage. Now when this EglImage is sent to libtegrav4l2, it is mapped with the corresponding decoder buffer created in REQBUF() call. This way there is one map of EglImage, texture and decoder buffer. When any buffer is enqueued in QBUF, the library sends it down to the decoder. Once the decoder buffer is ready, the library uses graphics apis to populate the corresponding EglImage with the RGB data and then pushes into a queue thereby making it available for DQBUF after which this buffer can be used only when it is back in QBUF call. This way the buffer ownership is managed. So in summary the library uses queues and does all the buffer management between decoder and the graphics stack for conversion. On 2014/02/10 06:36:17, Pawel Osciak wrote: > I would like to understand the big picture here please. > > We strive to stay as close as possible to using (and/or creating) > platform-independent standards where we can, like the sequence above, instead of > providing custom calls for each platform. Removing this from here and TVDA is a > step into an opposite direction, and I would like to understand what technical > difficulties force us to do this first. > > Binding textures to EGLImages also serves to keep track of ownership. There are > multiple users of the shared buffer, the renderer, the GPU that renders the > textures, this class and the HW codec. How is ownership/destruction managed and > how is it ensured that the buffer is valid while any of the users are still > referring to/using it (both in userspace and in kernel)? > > What happens if the renderer crashes and the codec is writing to the textures? > What happens when this class is destroyed, but the texture is in the renderer? > What happens when the whole Chrome crashes, but the HW codec is using a buffer > (i.e. kernel has ownership)? > > Could you please explain how is ownership managed for shared buffers on Tegra? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1654: // Since the underlying library at the moment is not updated this hack Apologies. This code was not meant to be here. Will remove it. On 2014/02/10 06:36:17, Pawel Osciak wrote: > Could this hack be moved into the library itself then? > This class should not have any awareness of device-specific issues. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/v4l2_video_device.cc:21: exynos_device.reset(NULL); On 2014/02/07 09:09:30, Ami Fischman wrote: > Can drop NULL (scoped_ptr::reset(NULL) is equiv to scoped_ptr::reset()). Done. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.cc:29: return exynos_device.PassAs<V4L2Device>(); Agreed. Will make this change in next patchset. On 2014/02/10 06:36:17, Pawel Osciak wrote: > On 2014/02/07 09:09:30, Ami Fischman wrote: > > l.19-29 would be clearer as: > > > > scoped_ptr<EV4L2D> e_d(...); > > if (e_d->Initialize()) > > return e_d.PassAs<V4L2D>(); > > DLOG(ERROR) << "Failed to open exynos v4l2 device."; > > > > scoped_ptr<TV4L2D> t_d(...); > > if (t_d->Initialize(e_c)) > > return t_d.PassAs<V4L2D>(); > > DLOG(ERROR) << "Failed to open tegra v4l2 device."; > > > > return scoped_ptr<V4L2D>(NULL); > > +1 to this, but I don't think DLOGging failing to open a particular device is > good here. It would always log it on Tegra, even though it wouldn't be an error. > I feel we should only log an error if we return NULL and say no device could be > opened. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.h:56: virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display, This method exists because of the difference in how Exynos and Tegra create the EglImages. See previous comments. On 2014/02/07 09:09:30, Ami Fischman wrote: > Need to document these methods & parameters. > I'm especially unclear at this point in the review what |buffer_index| is > supposed to be (I see how it's used, but I don't understand what it means in the > context of a generic V4L2Device). https://codereview.chromium.org/137023008/diff/80001/content/content_common.gypi File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/80001/content/content_common.g... content/content_common.gypi:560: 'common/gpu/media/tegra_v4l2_video_device.h', On 2014/02/07 09:09:30, Ami Fischman wrote: > please keep lists in alphabetical order. Done.
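For readers following the EGLImage discussion, the difference between the two paths can be sketched as follows (attribute lists abbreviated; the Exynos path relies on the EGL_EXT_image_dma_buf_import extension, while the Tegra path described in this thread wraps the client texture itself; variable names are illustrative):

  // Exynos: import the decoder buffer's dmabuf as an EGLImage, then bind it
  // to the picture-buffer texture.
  EGLint attrs[] = {
    EGL_WIDTH, width, EGL_HEIGHT, height,
    EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_NV12,
    EGL_DMA_BUF_PLANE0_FD_EXT, dmabuf_fd,
    // ... plane offsets and pitches ...
    EGL_NONE };
  EGLImageKHR exynos_image = eglCreateImageKHR(
      egl_display, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT, NULL, attrs);
  glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture_id);
  glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, exynos_image);

  // Tegra (as described above): no dmabuf-import extension is available, so
  // the EGLImage is created from the GL texture and the resulting handle is
  // handed to libtegrav4l2, which fills it with converted RGB output.
  EGLImageKHR tegra_image = eglCreateImageKHR(
      egl_display, egl_context, EGL_GL_TEXTURE_2D_KHR,
      reinterpret_cast<EGLClientBuffer>(texture_id), NULL);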
Hi Pawel, could you check my comments, especially those regarding the EGLImage section? Your views on it will help me make changes more quickly in the underlying library and in my next patch set. Thanks.
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; On 2014/02/10 13:31:17, shivdasp wrote: > This is a v4l2 decoder device name which we use to initialize a decoder context > within the libtegrav4l2 library. > This can be anything really as long as decoder and encoder device names are > different since we do not open a v4l2 video device underneath. Libtegrav4l2 is > really a pseudo implementation. I can change it /dev/tegra-dec and > /dev/tegra-enc for it to mean tegra specific. Which device is actually being used? Does the library just talk to DRM driver via custom ioctls? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > Is this is the codec device exposed by Tegra kernel driver? > > > > You can't assume it will be this on all configurations. > > Please use udev rules to create a codec specific device (see Exynos example at > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; On 2014/02/10 13:31:17, shivdasp wrote: > When REQBUFS is called for OUTPUT_PLANE, the library creates internal buffers > which can be shared with the AVP for decoding. AVP is a video processor which > runs the firmware that actually does all the decoding. While creating this > buffers, they are already mmapped to get a virtual address. The library returns > this address in QUERYBUF API. Hence there is not real need for mmap. > When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally > unmapped and destroyed. The library must be using some kind of an mmap call to get a mapping though. Would it be possible to move it to be done on this call, as expected? Also, how will this work for VEA, where CAPTURE buffers need to be mapped instead? > I will explain the need for CreateEGLImage in later comments. > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > > CreateEGLImage to pass them for texture binding to the library? I'm guessing > > that this works, because for the former you call QUERYBUFS with OUTPUT buffer, > > while in the latter you call it with CAPTURE? > > > > Please don't do this. Suggested changes for EGLImages in my comments below. > > > > As for mmap, how (and from what) is the memory mapping actually acquired for > the > > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it > > managed and when it is destroyed? > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/10 13:31:17, shivdasp wrote: > Buffers are unmapped in REQBUFS(0) call and destroyed. > Since there is no real need for mmap and munmap, we did not implement it in the > library. > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the > buffer whereas REQBUF(0) unmaps and destroys them. We should not rely on V4L2VDA to be the only place where the underlying memory will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer process may still be keeping ownership of the textures bound to them. 
Is this taken into account? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > If so, how is unmapping handled then? What if we want to free the buffers and > > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > > first... > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/10 13:31:17, shivdasp wrote: > We are really passing in the EglImage handle here to the library. EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an u32 variable. The whole idea behind offsets is that they are usually not really offsets, but sort of platform-independent 32bit IDs. They are acquired from the V4L2 driver (or library) via QUERYBUFS, and can be passed back to other calls to uniquely identify the buffers (e.g. to mmap). The client is not supposed to generate them by itself and pass them to QUERYBUFS. > The library > associates this with the corresponding v4l2_buffer on the CAPTURE plane and use > the underlying conversion APIs to transform the decoder's yuv output into the > egl image. > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the number > of planes are checked with 2 (line #1660 of V4L2VDA). You mean https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... ? This is an overassumption from the time where there was only one format supported. The number of planes to be used should be taken from the v4l2_format struct, returned from G_FMT. This assumption should be fixed. From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? Which fourcc format does it use? Are the planes separate memory buffers? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, > > there are two planes, but passing only one offset is a bit inconsistent. > > Although, why have two planes, if only one is used? > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } On 2014/02/10 13:31:17, shivdasp wrote: > Since we started with implementing this as a V4L2-Like library we have tried to > follow V4L2 syntax to provide the input & output buffers. > QUERYBUF can be made into a custom call since it is doing very custom thing > here. Please understand that: 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A call documented to work in the same way regardless of buffer type passed should not do otherwise, if it can be prevented. If it's expected to return values, it shouldn't be accepting them instead. And so on. Of course, this is an adapter class, so the actual inner workings may be different and it's not always possible to do exactly the same thing, but from the point of view of the client the effects should be as close to what each call is documented to do, as possible. Otherwise this whole exercise of using V4L2 API is doing us more bad than good. Please understand that the V4L2VDA and V4L2VEA classes will live on and will work with multiple platforms. There will be many changes to them. People working on them will expect behavior as documented in V4L2 API. Otherwise things will break (and not only for other platforms, but Tegra too) and it will be very difficult to reason why. 
So it's very important for Tegra V4L2Device to behave like V4L2 API specifies and not be tailored to how things are laid out in V4L2VDA currently. 2. This V4L2Device class should work with V4L2VEA class as well. I don't think we can make it work if this hack on QUERYBUFS is here. > If introducing another API is acceptable I can do that. > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with the > our Graphics team but I don't there is any such plan to implement such > extension. > That's why I gave the option of using offsets. If you prefer not to use dmabufs, could we please: - provide offsets via querybufs from the driver/library - pass those offsets to a new eglCreateImage extension and move it back to V4L2VDA - keep using texture binding API? This should eliminate the need for this method as well. > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > After reading through it, this method feels unnecessary. > > > > I'm assuming querybufs implementation in the library calls a method in the GPU > > driver anyway? > > > > This is basically redefining querybufs to do something completely different > than > > it normally does, turning it into a custom call. The offsets should be coming > > from the callee of QUERYBUFS, not the other way around, and there should be no > > custom side effects. > > A non-V4L2, library-specific custom call would be better than this. > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even offsets) > > that come from the v4l2 library to create EGL images, just like we do on Mali > on > > Exynos? > > > > Would it be possible to have an extension for eglCreateImage like Exynos does > > instead please? It doesn't seem to be much of a difference, instead of calling > > querybufs with custom arguments and having the library call something in the > > driver, call eglCreateImage instead with custom arguments and have it do > > everything? > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/10 13:31:17, shivdasp wrote: > The decoder's output buffers are created when REQBUFS(x) is called on > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with the > AVP processor for decoder to write into. By decoder do you mean V4L2VDA class? > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created and > sent back in AssignPictureBuffers(). > Now V4L2VDA creates EglImages from these textures and sends each EglImage handle > to library using the QUERYBUF (but can use a custom call too). The tegrav4l2 > library cannot create EglImages from DMABUFS like in Exynos since there is no > such extension. We create EglImage from this texture itself so there is a > binding between texture and eglImage. Sounds like the eglCreateImage extension taking offsets I described in the comment in tegra_v4l2_video_device.cc could work for this? > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > corresponding decoder buffer created in REQBUF() call. > This way there is one map of EglImage, texture and decoder buffer. My understanding is you mean the buffer is bound to a texture? If so, then it also seems like we could use the current bind texture to eglimage calls? 
> When any buffer is enqueued in QBUF, the library sends it down to the decoder. > Once the decoder buffer is ready, the library uses graphics apis to populate the > corresponding EglImage with the RGB data and then pushes into a queue thereby > making it available for DQBUF after which this buffer can be used only when it > is back in QBUF call. > This way the buffer ownership is managed. > So in summary the library uses queues and does all the buffer management between > decoder and the graphics stack for conversion. What happens when this class calls REQBUFS(0), but the corresponding textures are being rendered to the screen? How will the buffers be freed if the GPU process crashes without calling REQBUFS(0)? What happens when the bound textures are deleted, but the HW codec is still using them? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > I would like to understand the big picture here please. > > > > We strive to stay as close as possible to using (and/or creating) > > platform-independent standards where we can, like the sequence above, instead > of > > providing custom calls for each platform. Removing this from here and TVDA is > a > > step into an opposite direction, and I would like to understand what technical > > difficulties force us to do this first. > > > > Binding textures to EGLImages also serves to keep track of ownership. There > are > > multiple users of the shared buffer, the renderer, the GPU that renders the > > textures, this class and the HW codec. How is ownership/destruction managed > and > > how is it ensured that the buffer is valid while any of the users are still > > referring to/using it (both in userspace and in kernel)? > > > > What happens if the renderer crashes and the codec is writing to the textures? > > What happens when this class is destroyed, but the texture is in the renderer? > > What happens when the whole Chrome crashes, but the HW codec is using a buffer > > (i.e. kernel has ownership)? > > > > Could you please explain how is ownership managed for shared buffers on Tegra? >
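As background for the point above about offsets, the standard V4L2 multi-planar MMAP flow looks roughly like this (plain V4L2 usage, not Tegra-specific; the Mmap() wrapper signature here mirrors mmap() and is an assumption about the V4L2Device interface):

  // VIDIOC_QUERYBUF fills in mem_offset for each plane; the client treats it
  // as an opaque ID handed out by the driver and passes it back to mmap().
  struct v4l2_plane planes[VIDEO_MAX_PLANES];
  struct v4l2_buffer buffer;
  memset(&buffer, 0, sizeof(buffer));
  memset(planes, 0, sizeof(planes));
  buffer.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
  buffer.memory = V4L2_MEMORY_MMAP;
  buffer.index = index;
  buffer.m.planes = planes;
  buffer.length = arraysize(planes);
  if (device->Ioctl(VIDIOC_QUERYBUF, &buffer) != 0)
    return false;
  void* address = device->Mmap(NULL,
                               buffer.m.planes[0].length,
                               PROT_READ | PROT_WRITE,
                               MAP_SHARED,
                               buffer.m.planes[0].m.mem_offset);
  if (address == MAP_FAILED)
    return false;
  // The client never invents mem_offset values itself; they always originate
  // from the driver (or, here, the library) via QUERYBUF.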
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; This library internally talks to MM layer which talks to the device (/dev/tegra_avpchannel) which is the nvavp driver. On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > This is a v4l2 decoder device name which we use to initialize a decoder > context > > within the libtegrav4l2 library. > > This can be anything really as long as decoder and encoder device names are > > different since we do not open a v4l2 video device underneath. Libtegrav4l2 is > > really a pseudo implementation. I can change it /dev/tegra-dec and > > /dev/tegra-enc for it to mean tegra specific. > > Which device is actually being used? Does the library just talk to DRM driver > via custom ioctls? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > You can't assume it will be this on all configurations. > > > Please use udev rules to create a codec specific device (see Exynos example > at > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; Okay I will add Mmap and Munmap calls to the library and have it return the appropriate value internally. On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > When REQBUFS is called for OUTPUT_PLANE, the library creates internal buffers > > which can be shared with the AVP for decoding. AVP is a video processor which > > runs the firmware that actually does all the decoding. While creating this > > buffers, they are already mmapped to get a virtual address. The library > returns > > this address in QUERYBUF API. Hence there is not real need for mmap. > > When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally > > unmapped and destroyed. > > The library must be using some kind of an mmap call to get a mapping though. > Would it be possible to move it to be done on this call, as expected? > > Also, how will this work for VEA, where CAPTURE buffers need to be mapped > instead? > > > I will explain the need for CreateEGLImage in later comments. > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > > > CreateEGLImage to pass them for texture binding to the library? I'm guessing > > > that this works, because for the former you call QUERYBUFS with OUTPUT > buffer, > > > while in the latter you call it with CAPTURE? > > > > > > Please don't do this. Suggested changes for EGLImages in my comments below. > > > > > > As for mmap, how (and from what) is the memory mapping actually acquired for > > the > > > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it > > > managed and when it is destroyed? > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. If not in REQBUFS(0) then what will be the appropriate place to destroy the buffers ? 
V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it there. How does the renderer then inform the ownership of textures ? On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > Since there is no real need for mmap and munmap, we did not implement it in > the > > library. > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the > > buffer whereas REQBUF(0) unmaps and destroys them. > > We should not rely on V4L2VDA to be the only place where the underlying memory > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer process > may still be keeping ownership of the textures bound to them. Is this taken into > account? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > If so, how is unmapping handled then? What if we want to free the buffers > and > > > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > > > first... > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); The output is YUV420 planar. I think rather than using the QUERYBUF to pass the EglImage handles and stuffing the required information I would rather introduce a custom API (UseEglImage ?). I hope that is fine. On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > We are really passing in the EglImage handle here to the library. > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an u32 > variable. > > The whole idea behind offsets is that they are usually not really offsets, but > sort of platform-independent 32bit IDs. They are acquired from the V4L2 driver > (or library) via QUERYBUFS, and can be passed back to other calls to uniquely > identify the buffers (e.g. to mmap). > > The client is not supposed to generate them by itself and pass them to > QUERYBUFS. > > > The library > > associates this with the corresponding v4l2_buffer on the CAPTURE plane and > use > > the underlying conversion APIs to transform the decoder's yuv output into the > > egl image. > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the > number > > of planes are checked with 2 (line #1660 of V4L2VDA). > > You mean > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > ? > > This is an overassumption from the time where there was only one format > supported. > The number of planes to be used should be taken from the v4l2_format struct, > returned from G_FMT. This assumption should be fixed. > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? Which > fourcc format does it use? Are the planes separate memory buffers? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, > > > there are two planes, but passing only one offset is a bit inconsistent. > > > Although, why have two planes, if only one is used? > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } There is no extension to create EglImages from dmabufs or the offsets at the moment unfortunately. I agree using the QUERYBUF for sending the EglImage can be misleading and I will change it. 
V4L2VEA should not affected since we use the QUERYBUF for providing the actual offsets. This hack was only for sending the EglImages in case of CAPTURE PLANE of decoder. As I said earlier will adding a custom API (for now) to send the EglImages be okay ? On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > Since we started with implementing this as a V4L2-Like library we have tried > to > > follow V4L2 syntax to provide the input & output buffers. > > QUERYBUF can be made into a custom call since it is doing very custom thing > > here. > > Please understand that: > 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A call > documented to work in the same way regardless of buffer type passed should not > do otherwise, if it can be prevented. If it's expected to return values, it > shouldn't be accepting them instead. And so on. > Of course, this is an adapter class, so the actual inner workings may be > different and it's not always possible to do exactly the same thing, but from > the point of view of the client the effects should be as close to what each call > is documented to do, as possible. > > Otherwise this whole exercise of using V4L2 API is doing us more bad than good. > > Please understand that the V4L2VDA and V4L2VEA classes will live on and will > work with multiple platforms. There will be many changes to them. People working > on them will expect behavior as documented in V4L2 API. Otherwise things will > break (and not only for other platforms, but Tegra too) and it will be very > difficult to reason why. > > So it's very important for Tegra V4L2Device to behave like V4L2 API specifies > and not be tailored to how things are laid out in V4L2VDA currently. > > 2. This V4L2Device class should work with V4L2VEA class as well. I don't think > we can make it work if this hack on QUERYBUFS is here. > > > > If introducing another API is acceptable I can do that. > > > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with > the > > our Graphics team but I don't there is any such plan to implement such > > extension. > > > > That's why I gave the option of using offsets. If you prefer not to use dmabufs, > could we please: > > - provide offsets via querybufs from the driver/library > - pass those offsets to a new eglCreateImage extension and move it back to > V4L2VDA > - keep using texture binding API? > > This should eliminate the need for this method as well. > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > After reading through it, this method feels unnecessary. > > > > > > I'm assuming querybufs implementation in the library calls a method in the > GPU > > > driver anyway? > > > > > > This is basically redefining querybufs to do something completely different > > than > > > it normally does, turning it into a custom call. The offsets should be > coming > > > from the callee of QUERYBUFS, not the other way around, and there should be > no > > > custom side effects. > > > A non-V4L2, library-specific custom call would be better than this. > > > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even > offsets) > > > that come from the v4l2 library to create EGL images, just like we do on > Mali > > on > > > Exynos? > > > > > > Would it be possible to have an extension for eglCreateImage like Exynos > does > > > instead please? 
It doesn't seem to be much of a difference, instead of > calling > > > querybufs with custom arguments and having the library call something in the > > > driver, call eglCreateImage instead with custom arguments and have it do > > > everything? > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > The decoder's output buffers are created when REQBUFS(x) is called on > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with the > > AVP processor for decoder to write into. > > By decoder do you mean V4L2VDA class? No I meant the decoder entity within the library. > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created > and > > sent back in AssignPictureBuffers(). > > Now V4L2VDA creates EglImages from these textures and sends each EglImage > handle > > to library using the QUERYBUF (but can use a custom call too). The tegrav4l2 > > library cannot create EglImages from DMABUFS like in Exynos since there is no > > such extension. We create EglImage from this texture itself so there is a > > binding between texture and eglImage. > > Sounds like the eglCreateImage extension taking offsets I described in the > comment in tegra_v4l2_video_device.cc could work for this? Unfortunately there is no such extension today. > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > corresponding decoder buffer created in REQBUF() call. > > This way there is one map of EglImage, texture and decoder buffer. > > My understanding is you mean the buffer is bound to a texture? If so, then it > also seems like we could use the current bind texture to eglimage calls? The libtegrav4l2 talks to another internal library which actually creates the YUV buffer. This is what is given to the AVP and where the decoded output is actually filled. There is a corresponding RGB buffer created when the EGLImage is called, this is owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, there is a conversion performed to do YUV to RGB. > > > When any buffer is enqueued in QBUF, the library sends it down to the decoder. > > Once the decoder buffer is ready, the library uses graphics apis to populate > the > > corresponding EglImage with the RGB data and then pushes into a queue thereby > > making it available for DQBUF after which this buffer can be used only when it > > is back in QBUF call. > > This way the buffer ownership is managed. > > So in summary the library uses queues and does all the buffer management > between > > decoder and the graphics stack for conversion. > > What happens when this class calls REQBUFS(0), but the corresponding textures > are being rendered to the screen? > How will the buffers be freed if the GPU process crashes without calling > REQBUFS(0)? > What happens when the bound textures are deleted, but the HW codec is still > using them? I guess I am missing something here. I did not understand "REQBUFS(0) is called but corresponding textures are being rendered ?". Doesn't DestroyOutputBuffers() call guarantee that buffers on CAPTURE plane are no longer used. 
I will confirm about the buffer freeing in the gpu process crash scenario. The last scenario (bound textures are deleted but the HW codec is still using them) is taken care of by the conversion step performed using the library. The texture is bound to the EGLImage, so that binding will fail. Since libtegrav4l2 has the EGLImage backed by an RGB buffer, the conversion can happen. How can I test this scenario ? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > I would like to understand the big picture here please. > > > > > > We strive to stay as close as possible to using (and/or creating) > > > platform-independent standards where we can, like the sequence above, > instead > > of > > > providing custom calls for each platform. Removing this from here and TVDA > is > > a > > > step into an opposite direction, and I would like to understand what > technical > > > difficulties force us to do this first. > > > > > > Binding textures to EGLImages also serves to keep track of ownership. There > > are > > > multiple users of the shared buffer, the renderer, the GPU that renders the > > > textures, this class and the HW codec. How is ownership/destruction managed > > and > > > how is it ensured that the buffer is valid while any of the users are still > > > referring to/using it (both in userspace and in kernel)? > > > > > > What happens if the renderer crashes and the codec is writing to the > textures? > > > What happens when this class is destroyed, but the texture is in the > renderer? > > > What happens when the whole Chrome crashes, but the HW codec is using a > buffer > > > (i.e. kernel has ownership)? > > > > > > Could you please explain how is ownership managed for shared buffers on > Tegra? > > >
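(For reference, a minimal sketch of the dmabuf-import path mentioned above as the Exynos/Mali approach, assuming the EGL implementation exposes EGL_EXT_image_dma_buf_import. The helper name ImportDmabufAndBind, the dmabuf_fd/stride parameters and the single-plane ARGB8888 format are illustrative only, not the actual Exynos code; a two-plane format such as NV12 would also need the EGL_DMA_BUF_PLANE1_* attributes.)

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#include <drm_fourcc.h>

// Import a dmabuf exported by the V4L2 driver and bind it to the texture that
// was handed over in AssignPictureBuffers(). The GL context must be current.
EGLImageKHR ImportDmabufAndBind(EGLDisplay display, int dmabuf_fd, int width,
                                int height, int stride, GLuint texture_id) {
  PFNEGLCREATEIMAGEKHRPROC create_image =
      reinterpret_cast<PFNEGLCREATEIMAGEKHRPROC>(
          eglGetProcAddress("eglCreateImageKHR"));
  PFNGLEGLIMAGETARGETTEXTURE2DOESPROC bind_image =
      reinterpret_cast<PFNGLEGLIMAGETARGETTEXTURE2DOESPROC>(
          eglGetProcAddress("glEGLImageTargetTexture2DOES"));
  const EGLint attrs[] = {
      EGL_WIDTH,                     width,
      EGL_HEIGHT,                    height,
      EGL_LINUX_DRM_FOURCC_EXT,      DRM_FORMAT_ARGB8888,
      EGL_DMA_BUF_PLANE0_FD_EXT,     dmabuf_fd,
      EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
      EGL_DMA_BUF_PLANE0_PITCH_EXT,  stride,
      EGL_NONE,
  };
  EGLImageKHR image = create_image(display, EGL_NO_CONTEXT,
                                   EGL_LINUX_DMA_BUF_EXT,
                                   /* buffer */ NULL, attrs);
  if (image == EGL_NO_IMAGE_KHR)
    return EGL_NO_IMAGE_KHR;
  // Binding shares the underlying memory with the texture; destroying the
  // EGLImage later does not free that memory while the texture still holds it.
  glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture_id);
  bind_image(GL_TEXTURE_EXTERNAL_OES, image);
  return image;
}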
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; On 2014/02/12 10:11:55, shivdasp wrote: > This library internally talks to MM layer which talks to the device > (/dev/tegra_avpchannel) which is the nvavp driver. This means you will have to add it to sandbox rules in Chrome, right? So the library should actually use the device path string provided from Chrome to Open() and not have the string hardcoded in the library please. > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > This is a v4l2 decoder device name which we use to initialize a decoder > > context > > > within the libtegrav4l2 library. > > > This can be anything really as long as decoder and encoder device names are > > > different since we do not open a v4l2 video device underneath. Libtegrav4l2 > is > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > /dev/tegra-enc for it to mean tegra specific. > > > > Which device is actually being used? Does the library just talk to DRM driver > > via custom ioctls? > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > You can't assume it will be this on all configurations. > > > > Please use udev rules to create a codec specific device (see Exynos > example > > at > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; On 2014/02/12 10:11:55, shivdasp wrote: > Okay I will add Mmap and Munmap calls to the library and have it return the > appropriate value internally. > Great. Thank you! > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > When REQBUFS is called for OUTPUT_PLANE, the library creates internal > buffers > > > which can be shared with the AVP for decoding. AVP is a video processor > which > > > runs the firmware that actually does all the decoding. While creating this > > > buffers, they are already mmapped to get a virtual address. The library > > returns > > > this address in QUERYBUF API. Hence there is not real need for mmap. > > > When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally > > > unmapped and destroyed. > > > > The library must be using some kind of an mmap call to get a mapping though. > > Would it be possible to move it to be done on this call, as expected? > > > > Also, how will this work for VEA, where CAPTURE buffers need to be mapped > > instead? > > > > > I will explain the need for CreateEGLImage in later comments. > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > > > > CreateEGLImage to pass them for texture binding to the library? I'm > guessing > > > > that this works, because for the former you call QUERYBUFS with OUTPUT > > buffer, > > > > while in the latter you call it with CAPTURE? > > > > > > > > Please don't do this. Suggested changes for EGLImages in my comments > below. 
> > > > > > > > As for mmap, how (and from what) is the memory mapping actually acquired > for > > > the > > > > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is > it > > > > managed and when it is destroyed? > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/12 10:11:55, shivdasp wrote: > If not in REQBUFS(0) then what will be the appropriate place to destroy the > buffers ? Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I mean the underlying memory. REQBUFS(0) may be called, but the actual memory that backed the v4l2_buffers may have to live on if it's still tied to the textures. This will be a common case actually, because we don't explicitly destroy textures first unless they are dismissed. The memory should be then freed when the textures are deleted, not on REQBUFS(0). I'm wondering if the library/driver take this into account. Of course, it's still possible for REQUBFS(0) to have to trigger destruction of underlying memory, in case the textures get unbound and deleted before REQBUFS(0) is called. > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it there. > How does the renderer then inform the ownership of textures ? glDeleteTextures(). So the textures and the underlying memory may have to outlive REQBUFS(0). > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > Since there is no real need for mmap and munmap, we did not implement it in > > the > > > library. > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > We should not rely on V4L2VDA to be the only place where the underlying memory > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > process > > may still be keeping ownership of the textures bound to them. Is this taken > into > > account? > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > If so, how is unmapping handled then? What if we want to free the buffers > > and > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > > > > first... > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/12 10:11:55, shivdasp wrote: > The output is YUV420 planar. Are all planes non-interleaved and contiguous in memory? If so, then you need to use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), please see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. > I think rather than using the QUERYBUF to pass the > EglImage handles and stuffing the required information I would rather introduce > a custom API (UseEglImage ?). > I hope that is fine. Yes, it's preferable over using QUERYBUF for this. But let's agree on the shape of it. What would UseEglImage do? Could we instead pass the offsets to eglCreateImageKHR? Will we be able to also retain texture binding in V4L2VDA then? > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > We are really passing in the EglImage handle here to the library. 
> > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an > u32 > > variable. > > > > The whole idea behind offsets is that they are usually not really offsets, but > > sort of platform-independent 32bit IDs. They are acquired from the V4L2 driver > > (or library) via QUERYBUFS, and can be passed back to other calls to uniquely > > identify the buffers (e.g. to mmap). > > > > The client is not supposed to generate them by itself and pass them to > > QUERYBUFS. > > > > > The library > > > associates this with the corresponding v4l2_buffer on the CAPTURE plane and > > use > > > the underlying conversion APIs to transform the decoder's yuv output into > the > > > egl image. > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the > > number > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > You mean > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > ? > > > > This is an overassumption from the time where there was only one format > > supported. > > The number of planes to be used should be taken from the v4l2_format struct, > > returned from G_FMT. This assumption should be fixed. > > > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? > Which > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. > Also, > > > > there are two planes, but passing only one offset is a bit inconsistent. > > > > Although, why have two planes, if only one is used? > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } On 2014/02/12 10:11:55, shivdasp wrote: > There is no extension to create EglImages from dmabufs or the offsets at the > moment unfortunately. > I agree using the QUERYBUF for sending the EglImage can be misleading and I will > change it. V4L2VEA should not affected since we use the QUERYBUF for providing > the actual offsets. This hack was only for sending the EglImages in case of > CAPTURE PLANE of decoder. Could you explain why is it not affected? VEA calls QUERYBUF on CAPTURE buffers. > As I said earlier will adding a custom API (for now) to send the EglImages be > okay ? > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > Since we started with implementing this as a V4L2-Like library we have tried > > to > > > follow V4L2 syntax to provide the input & output buffers. > > > QUERYBUF can be made into a custom call since it is doing very custom thing > > > here. > > > > Please understand that: > > 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A > call > > documented to work in the same way regardless of buffer type passed should not > > do otherwise, if it can be prevented. If it's expected to return values, it > > shouldn't be accepting them instead. And so on. > > Of course, this is an adapter class, so the actual inner workings may be > > different and it's not always possible to do exactly the same thing, but from > > the point of view of the client the effects should be as close to what each > call > > is documented to do, as possible. > > > > Otherwise this whole exercise of using V4L2 API is doing us more bad than > good. > > > > Please understand that the V4L2VDA and V4L2VEA classes will live on and will > > work with multiple platforms. 
There will be many changes to them. People > working > > on them will expect behavior as documented in V4L2 API. Otherwise things will > > break (and not only for other platforms, but Tegra too) and it will be very > > difficult to reason why. > > > > So it's very important for Tegra V4L2Device to behave like V4L2 API specifies > > and not be tailored to how things are laid out in V4L2VDA currently. > > > > 2. This V4L2Device class should work with V4L2VEA class as well. I don't think > > we can make it work if this hack on QUERYBUFS is here. > > > > > > > If introducing another API is acceptable I can do that. > > > > > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with > > the > > > our Graphics team but I don't there is any such plan to implement such > > > extension. > > > > > > > That's why I gave the option of using offsets. If you prefer not to use > dmabufs, > > could we please: > > > > - provide offsets via querybufs from the driver/library > > - pass those offsets to a new eglCreateImage extension and move it back to > > V4L2VDA > > - keep using texture binding API? > > > > This should eliminate the need for this method as well. > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > After reading through it, this method feels unnecessary. > > > > > > > > I'm assuming querybufs implementation in the library calls a method in the > > GPU > > > > driver anyway? > > > > > > > > This is basically redefining querybufs to do something completely > different > > > than > > > > it normally does, turning it into a custom call. The offsets should be > > coming > > > > from the callee of QUERYBUFS, not the other way around, and there should > be > > no > > > > custom side effects. > > > > A non-V4L2, library-specific custom call would be better than this. > > > > > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even > > offsets) > > > > that come from the v4l2 library to create EGL images, just like we do on > > Mali > > > on > > > > Exynos? > > > > > > > > Would it be possible to have an extension for eglCreateImage like Exynos > > does > > > > instead please? It doesn't seem to be much of a difference, instead of > > calling > > > > querybufs with custom arguments and having the library call something in > the > > > > driver, call eglCreateImage instead with custom arguments and have it do > > > > everything? > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/12 10:11:55, shivdasp wrote: > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with > the > > > AVP processor for decoder to write into. > > > > By decoder do you mean V4L2VDA class? > > No I meant the decoder entity within the library. > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created > > and > > > sent back in AssignPictureBuffers(). > > > Now V4L2VDA creates EglImages from these textures and sends each EglImage > > handle > > > to library using the QUERYBUF (but can use a custom call too). 
The tegrav4l2 > > > library cannot create EglImages from DMABUFS like in Exynos since there is > no > > > such extension. We create EglImage from this texture itself so there is a > > > binding between texture and eglImage. > > > > Sounds like the eglCreateImage extension taking offsets I described in the > > comment in tegra_v4l2_video_device.cc could work for this? > Unfortunately there is no such extension today. > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > corresponding decoder buffer created in REQBUF() call. > > > This way there is one map of EglImage, texture and decoder buffer. > > > > My understanding is you mean the buffer is bound to a texture? If so, then it > > also seems like we could use the current bind texture to eglimage calls? > The libtegrav4l2 talks to another internal library which actually creates the > YUV buffer. This is what is given to the AVP and where the decoded output is > actually filled. > There is a corresponding RGB buffer created when the EGLImage is called, this is > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, there > is a conversion performed to do YUV to RGB. So the YUV buffers are tied to the textures somehow? > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > decoder. > > > Once the decoder buffer is ready, the library uses graphics apis to populate > > the > > > corresponding EglImage with the RGB data and then pushes into a queue > thereby > > > making it available for DQBUF after which this buffer can be used only when > it > > > is back in QBUF call. > > > This way the buffer ownership is managed. > > > So in summary the library uses queues and does all the buffer management > > between > > > decoder and the graphics stack for conversion. > > > > What happens when this class calls REQBUFS(0), but the corresponding textures > > are being rendered to the screen? > > How will the buffers be freed if the GPU process crashes without calling > > REQBUFS(0)? > > What happens when the bound textures are deleted, but the HW codec is still > > using them? > > I guess I am missing something here. I did not understand "REQBUFS(0) is called > but corresponding textures are being rendered ?". Doesn't DestroyOutputBuffers() > call guarantee that buffers on CAPTURE plane are no longer used. The underlying memory can still be used as textures in the client of VDA class. It only guarantees that they are not used anymore by the codec class as v4l2_buffers. > I will confirm about the buffer freeing in gpu process crash scenario. Thanks. > The last scenario (bound texture are deleted but HW codec is still using them) > is taken care by the conversion step performed using the library. > The texture is > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 has the > EglImage backed by a RGB buffer the conversion can happen. How can I test this > scenario ? This is just a case where there is a bug in the code, but my point is that the ownership should be shared with the kernel as well, so if the userspace (Chrome) dies, the kernel will properly clean up. > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > I would like to understand the big picture here please. > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > platform-independent standards where we can, like the sequence above, > > instead > > > of > > > > providing custom calls for each platform. 
Removing this from here and TVDA > > is > > > a > > > > step into an opposite direction, and I would like to understand what > > technical > > > > difficulties force us to do this first. > > > > > > > > Binding textures to EGLImages also serves to keep track of ownership. > There > > > are > > > > multiple users of the shared buffer, the renderer, the GPU that renders > the > > > > textures, this class and the HW codec. How is ownership/destruction > managed > > > and > > > > how is it ensured that the buffer is valid while any of the users are > still > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > textures? > > > > What happens when this class is destroyed, but the texture is in the > > renderer? > > > > What happens when the whole Chrome crashes, but the HW codec is using a > > buffer > > > > (i.e. kernel has ownership)? > > > > > > > > Could you please explain how is ownership managed for shared buffers on > > Tegra? > > > > > >
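(For reference, a minimal sketch of the QUERYBUF-then-mmap sequence the V4L2 multi-planar API documents, which is the behaviour being requested from the adapter's Mmap()/Munmap(). The helper name MapOutputBuffer and the single-plane assumption are illustrative.)

#include <cstddef>
#include <cstring>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev2.h>

// Query the driver-assigned offset for OUTPUT buffer |index| and map it.
// The offset is an opaque cookie produced by the driver/library; the client
// never generates it, it only passes it back to mmap().
void* MapOutputBuffer(int device_fd, unsigned int index, size_t* length) {
  struct v4l2_plane planes[VIDEO_MAX_PLANES];
  struct v4l2_buffer buffer;
  memset(&buffer, 0, sizeof(buffer));
  memset(planes, 0, sizeof(planes));
  buffer.index = index;
  buffer.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
  buffer.memory = V4L2_MEMORY_MMAP;
  buffer.m.planes = planes;
  buffer.length = 1;  // Single-plane format assumed for brevity.
  if (ioctl(device_fd, VIDIOC_QUERYBUF, &buffer) != 0)
    return MAP_FAILED;
  *length = planes[0].length;
  return mmap(NULL, planes[0].length, PROT_READ | PROT_WRITE, MAP_SHARED,
              device_fd, planes[0].m.mem_offset);
}
// Unmapping is the mirror image: munmap(address, length) before REQBUFS(0).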
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/exynos_v4l2_video_device.cc:123: EGLImageKHR egl_image = eglCreateImageKHR( make_context_current_ is already done before calling this function. That should be sufficient I believe. Or it has to be done again here ? Also relatedly , how do I get the egl_context in which the textures are created ? I am using eglGetCurrentContext() in the TegraV4L2Device. On 2014/02/10 06:36:17, Pawel Osciak wrote: > We need to make GL context current to be able to call this. > We should at least require this in the doc that it should be done before calling > this method. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; Yes /dev/tegra_avpchannel will be added in the sandbox whitelist for now. Work is in progress to make this sandbox friendly in our MM stack. Most probably by the time we get this change merged we should have completed it so even whitelisting may not be required. In worst case we will whitelist it sandbox code. As I said earlier this "device name" sent to the library is just dummy for the libtegrav4l2 to create a decoder instance. It just has to be different than the encoder "device name". Shall I just keep it "tegra_dec" and "tegra_enc" to not mislead it for something like a true v4l2 device name ? On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > This library internally talks to MM layer which talks to the device > > (/dev/tegra_avpchannel) which is the nvavp driver. > > This means you will have to add it to sandbox rules in Chrome, right? So the > library should actually use the device path string provided from Chrome to > Open() and not have the string hardcoded in the library please. > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > This is a v4l2 decoder device name which we use to initialize a decoder > > > context > > > > within the libtegrav4l2 library. > > > > This can be anything really as long as decoder and encoder device names > are > > > > different since we do not open a v4l2 video device underneath. > Libtegrav4l2 > > is > > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > > /dev/tegra-enc for it to mean tegra specific. > > > > > > Which device is actually being used? Does the library just talk to DRM > driver > > > via custom ioctls? > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > > > You can't assume it will be this on all configurations. > > > > > Please use udev rules to create a codec specific device (see Exynos > > example > > > at > > > > > > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. Understood. 
I believe the dmabuf export mechanism takes care of this in Exynos since the buffer backed memory is in kernel and hence the deletion is kind of synchronized. I see in DestroyOutputbuffers(), before calling DismissPicture(), the eglDestroyImageKHR() is called thereby the textures are unbound or perhaps the deletion there is also delayed ? How is the texture being rendered if the eglImage it is bound to is also destroyed ? On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > If not in REQBUFS(0) then what will be the appropriate place to destroy the > > buffers ? > > Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I mean > the underlying memory. REQBUFS(0) may be called, but the actual memory that > backed the v4l2_buffers may have to live on if it's still tied to the textures. > This will be a common case actually, because we don't explicitly destroy > textures first unless they are dismissed. The memory should be then freed when > the textures are deleted, not on REQBUFS(0). I'm wondering if the library/driver > take this into account. > Of course, it's still possible for REQUBFS(0) to have to trigger destruction of > underlying memory, in case the textures get unbound and deleted before > REQBUFS(0) is called. > > > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it > there. > > How does the renderer then inform the ownership of textures ? > > glDeleteTextures(). So the textures and the underlying memory may have to > outlive REQBUFS(0). > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > > Since there is no real need for mmap and munmap, we did not implement it > in > > > the > > > > library. > > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps > the > > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > > > We should not rely on V4L2VDA to be the only place where the underlying > memory > > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > > process > > > may still be keeping ownership of the textures bound to them. Is this taken > > into > > > account? > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > If so, how is unmapping handled then? What if we want to free the > buffers > > > and > > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the > buffers > > > > > first... > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > The output is YUV420 planar. > > Are all planes non-interleaved and contiguous in memory? If so, then you need to > use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), please > see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. > > Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. Okay I will change the pixel format. However there are some DHECK_EQ() code in V4L2VDA to check against V4L2_PIX_FMT_NV12M. They also exist for num_planes. I will have to introduce private member functions for ExynosV4L2Device and TegraV4L2Device to check against them rather than hardcoded values in V4L2VDA. Will that be fine ? 
> > > I think rather than using the QUERYBUF to pass the > > EglImage handles and stuffing the required information I would rather > introduce > > a custom API (UseEglImage ?). > > I hope that is fine. > > Yes, it's preferable over using QUERYBUF for this. But let's agree on the shape > of it. What would UseEglImage do? > Could we instead pass the offsets to eglCreateImageKHR? > Will we be able to also retain texture binding in V4L2VDA then? I was thinking of int UseEglImage(int buffer_index, EGLImageKHR egl_image); We basically need to send the EglImage created for a particular buffer_index so the library can convert YUV into its respective EglImage. We cannot send offsets to eglCreateImageKHR unless we have extension. However the buffer_index internally is the mapping for identifying the eglImage so in a way that will work like an offset. > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > We are really passing in the EglImage handle here to the library. > > > > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an > > u32 > > > variable. > > > > > > The whole idea behind offsets is that they are usually not really offsets, > but > > > sort of platform-independent 32bit IDs. They are acquired from the V4L2 > driver > > > (or library) via QUERYBUFS, and can be passed back to other calls to > uniquely > > > identify the buffers (e.g. to mmap). > > > > > > The client is not supposed to generate them by itself and pass them to > > > QUERYBUFS. > > > > > > > The library > > > > associates this with the corresponding v4l2_buffer on the CAPTURE plane > and > > > use > > > > the underlying conversion APIs to transform the decoder's yuv output into > > the > > > > egl image. > > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the > > > number > > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > > > You mean > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > ? > > > > > > This is an overassumption from the time where there was only one format > > > supported. > > > The number of planes to be used should be taken from the v4l2_format struct, > > > returned from G_FMT. This assumption should be fixed. > > > > > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? > > Which > > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. > > Also, > > > > > there are two planes, but passing only one offset is a bit inconsistent. > > > > > Although, why have two planes, if only one is used? > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } The fd returned by TegraV4L2Open() knows whether it was for a decoder instance or encoder instance. Hence QUERYBUF on CAPTURE_PLANE for an encoder will behave as per the V4L2 specification. An alternate hacky behavior was needed for decoder instance to send EglImages which we are now sending in custom call. On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > There is no extension to create EglImages from dmabufs or the offsets at the > > moment unfortunately. > > I agree using the QUERYBUF for sending the EglImage can be misleading and I > will > > change it. 
V4L2VEA should not affected since we use the QUERYBUF for providing > > the actual offsets. This hack was only for sending the EglImages in case of > > CAPTURE PLANE of decoder. > > Could you explain why is it not affected? VEA calls QUERYBUF on CAPTURE buffers. > > > As I said earlier will adding a custom API (for now) to send the EglImages be > > okay ? > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > Since we started with implementing this as a V4L2-Like library we have > tried > > > to > > > > follow V4L2 syntax to provide the input & output buffers. > > > > QUERYBUF can be made into a custom call since it is doing very custom > thing > > > > here. > > > > > > Please understand that: > > > 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A > > call > > > documented to work in the same way regardless of buffer type passed should > not > > > do otherwise, if it can be prevented. If it's expected to return values, it > > > shouldn't be accepting them instead. And so on. > > > Of course, this is an adapter class, so the actual inner workings may be > > > different and it's not always possible to do exactly the same thing, but > from > > > the point of view of the client the effects should be as close to what each > > call > > > is documented to do, as possible. > > > > > > Otherwise this whole exercise of using V4L2 API is doing us more bad than > > good. > > > > > > Please understand that the V4L2VDA and V4L2VEA classes will live on and will > > > work with multiple platforms. There will be many changes to them. People > > working > > > on them will expect behavior as documented in V4L2 API. Otherwise things > will > > > break (and not only for other platforms, but Tegra too) and it will be very > > > difficult to reason why. > > > > > > So it's very important for Tegra V4L2Device to behave like V4L2 API > specifies > > > and not be tailored to how things are laid out in V4L2VDA currently. > > > > > > 2. This V4L2Device class should work with V4L2VEA class as well. I don't > think > > > we can make it work if this hack on QUERYBUFS is here. > > > > > > > > > > If introducing another API is acceptable I can do that. > > > > > > > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check > with > > > the > > > > our Graphics team but I don't there is any such plan to implement such > > > > extension. > > > > > > > > > > That's why I gave the option of using offsets. If you prefer not to use > > dmabufs, > > > could we please: > > > > > > - provide offsets via querybufs from the driver/library > > > - pass those offsets to a new eglCreateImage extension and move it back to > > > V4L2VDA > > > - keep using texture binding API? > > > > > > This should eliminate the need for this method as well. > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > After reading through it, this method feels unnecessary. > > > > > > > > > > I'm assuming querybufs implementation in the library calls a method in > the > > > GPU > > > > > driver anyway? > > > > > > > > > > This is basically redefining querybufs to do something completely > > different > > > > than > > > > > it normally does, turning it into a custom call. The offsets should be > > > coming > > > > > from the callee of QUERYBUFS, not the other way around, and there should > > be > > > no > > > > > custom side effects. > > > > > A non-V4L2, library-specific custom call would be better than this. 
> > > > > > > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even > > > offsets) > > > > > that come from the v4l2 library to create EGL images, just like we do on > > > Mali > > > > on > > > > > Exynos? > > > > > > > > > > Would it be possible to have an extension for eglCreateImage like Exynos > > > does > > > > > instead please? It doesn't seem to be much of a difference, instead of > > > calling > > > > > querybufs with custom arguments and having the library call something in > > the > > > > > driver, call eglCreateImage instead with custom arguments and have it do > > > > > everything? > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with > > the > > > > AVP processor for decoder to write into. > > > > > > By decoder do you mean V4L2VDA class? > > > > No I meant the decoder entity within the library. > > > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are > created > > > and > > > > sent back in AssignPictureBuffers(). > > > > Now V4L2VDA creates EglImages from these textures and sends each EglImage > > > handle > > > > to library using the QUERYBUF (but can use a custom call too). The > tegrav4l2 > > > > library cannot create EglImages from DMABUFS like in Exynos since there is > > no > > > > such extension. We create EglImage from this texture itself so there is a > > > > binding between texture and eglImage. > > > > > > Sounds like the eglCreateImage extension taking offsets I described in the > > > comment in tegra_v4l2_video_device.cc could work for this? > > Unfortunately there is no such extension today. > > > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > > corresponding decoder buffer created in REQBUF() call. > > > > This way there is one map of EglImage, texture and decoder buffer. > > > > > > My understanding is you mean the buffer is bound to a texture? If so, then > it > > > also seems like we could use the current bind texture to eglimage calls? > > The libtegrav4l2 talks to another internal library which actually creates the > > YUV buffer. This is what is given to the AVP and where the decoded output is > > actually filled. > > There is a corresponding RGB buffer created when the EGLImage is called, this > is > > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, > there > > is a conversion performed to do YUV to RGB. > > So the YUV buffers are tied to the textures somehow? We send texture_id to eglCreateImageKHR and bind it there. And eglImage is sent to the library which maps it to its YUV buffer. My subsequent patch will probably make this clearer. > > > > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > > decoder. 
> > > > Once the decoder buffer is ready, the library uses graphics apis to > populate > > > the > > > > corresponding EglImage with the RGB data and then pushes into a queue > > thereby > > > > making it available for DQBUF after which this buffer can be used only > when > > it > > > > is back in QBUF call. > > > > This way the buffer ownership is managed. > > > > So in summary the library uses queues and does all the buffer management > > > between > > > > decoder and the graphics stack for conversion. > > > > > > What happens when this class calls REQBUFS(0), but the corresponding > textures > > > are being rendered to the screen? > > > How will the buffers be freed if the GPU process crashes without calling > > > REQBUFS(0)? > > > What happens when the bound textures are deleted, but the HW codec is still > > > using them? > > > > I guess I am missing something here. I did not understand "REQBUFS(0) is > called > > but corresponding textures are being rendered ?". Doesn't > DestroyOutputBuffers() > > call guarantee that buffers on CAPTURE plane are no longer used. > > The underlying memory can still be used as textures in the client of VDA class. > It only guarantees that they are not used anymore by the codec class as > v4l2_buffers. > > > I will confirm about the buffer freeing in gpu process crash scenario. > > Thanks. If the EGLimage is destroyed I think the texture becomes unbound. I was debugging some scenario and I get errors as "texture not bound or texture id 0" kind of errors from gles2_cmd_decoder.cc. These I guess represent this crash scenario. So it is taken care of already while validating the texture before rendering ? And I observe similar kind of logs on Exynos too. Do you have a test case or steps of validating this ? Will killing gpu process while video playback validate this path ? > > > The last scenario (bound texture are deleted but HW codec is still using them) > > is taken care by the conversion step performed using the library. > > The texture is > > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 has > the > > EglImage backed by a RGB buffer the conversion can happen. How can I test this > > scenario ? > > This is just a case where there is a bug in the code, but my point is that the > ownership should be shared with the kernel as well, so if the userspace (Chrome) > dies, the kernel will properly clean up. > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > I would like to understand the big picture here please. > > > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > > platform-independent standards where we can, like the sequence above, > > > instead > > > > of > > > > > providing custom calls for each platform. Removing this from here and > TVDA > > > is > > > > a > > > > > step into an opposite direction, and I would like to understand what > > > technical > > > > > difficulties force us to do this first. > > > > > > > > > > Binding textures to EGLImages also serves to keep track of ownership. > > There > > > > are > > > > > multiple users of the shared buffer, the renderer, the GPU that renders > > the > > > > > textures, this class and the HW codec. How is ownership/destruction > > managed > > > > and > > > > > how is it ensured that the buffer is valid while any of the users are > > still > > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > > textures? 
> > > > > What happens when this class is destroyed, but the texture is in the > > > renderer? > > > > > What happens when the whole Chrome crashes, but the HW codec is using a > > > buffer > > > > > (i.e. kernel has ownership)? > > > > > > > > > > Could you please explain how is ownership managed for shared buffers on > > > Tegra? > > > > > > > > > >
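(To make the UseEglImage proposal above concrete, a rough sketch of the shape being discussed. The library entry point name TegraV4L2_UseEglImage and the return convention are assumptions for illustration, not a final API.)

#include <EGL/egl.h>
#include <EGL/eglext.h>

// Hypothetical libtegrav4l2 entry point: associates an already-created
// EGLImage with the CAPTURE buffer at |buffer_index| so the library can
// convert the decoded YUV output into it before DQBUF.
extern "C" int TegraV4L2_UseEglImage(int fd, unsigned int buffer_index,
                                     EGLImageKHR egl_image);

// Device-adapter wrapper, roughly int UseEglImage(buffer_index, egl_image) as
// proposed above. The caller is expected to have created the EGLImage from the
// texture received in AssignPictureBuffers(), with the GL context current.
int UseEglImage(int device_fd, unsigned int buffer_index,
                EGLImageKHR egl_image) {
  return TegraV4L2_UseEglImage(device_fd, buffer_index, egl_image);
}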
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/exynos_v4l2_video_device.cc:123: EGLImageKHR egl_image = eglCreateImageKHR( On 2014/02/14 03:06:45, shivdasp wrote: > make_context_current_ is already done before calling this function. That should > be sufficient I believe. Or it has to be done again here ? Yes, I'm just saying this method should explicitly state so in its documentation at least. But it's not really relevant anymore, this method should be no longer needed. > Also relatedly , how do I get the egl_context in which the textures are created > ? > I am using eglGetCurrentContext() in the TegraV4L2Device. The context in which textures are created is the one you get as an argument to the device right now, you restored that argument recently. > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > We need to make GL context current to be able to call this. > > We should at least require this in the doc that it should be done before > calling > > this method. > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; On 2014/02/14 03:06:45, shivdasp wrote: > Yes /dev/tegra_avpchannel will be added in the sandbox whitelist for now. > Work is in progress to make this sandbox friendly in our MM stack. Most probably > by the time we get this change merged we should have completed it so even > whitelisting may not be required. In worst case we will whitelist it sandbox > code. Why it may not be required? How else would it be accessible if it's not whitelisted? > As I said earlier this "device name" sent to the library is just dummy for the > libtegrav4l2 to create a decoder instance. It just has to be different than the > encoder "device name". > Shall I just keep it "tegra_dec" and "tegra_enc" to not mislead it for something > like a true v4l2 device name ? Please, the library has to open the device with the same name as this method provides to it. > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > This library internally talks to MM layer which talks to the device > > > (/dev/tegra_avpchannel) which is the nvavp driver. > > > > This means you will have to add it to sandbox rules in Chrome, right? So the > > library should actually use the device path string provided from Chrome to > > Open() and not have the string hardcoded in the library please. > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > This is a v4l2 decoder device name which we use to initialize a decoder > > > > context > > > > > within the libtegrav4l2 library. > > > > > This can be anything really as long as decoder and encoder device names > > are > > > > > different since we do not open a v4l2 video device underneath. > > Libtegrav4l2 > > > is > > > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > > > /dev/tegra-enc for it to mean tegra specific. > > > > > > > > Which device is actually being used? Does the library just talk to DRM > > driver > > > > via custom ioctls? 
> > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > > > > > You can't assume it will be this on all configurations. > > > > > > Please use udev rules to create a codec specific device (see Exynos > > > example > > > > at > > > > > > > > > > > > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > > > > > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/14 03:06:45, shivdasp wrote: > Understood. I believe the dmabuf export mechanism takes care of this in Exynos > since the buffer backed memory is in kernel and hence the deletion is kind of > synchronized. Yes, although it's not "synchronized", but the ownership is managed and refcounted in the driver and the memory is freed when there is no more users. How does Tegra manage this? > > I see in DestroyOutputbuffers(), before calling DismissPicture(), the > eglDestroyImageKHR() is called thereby the textures are unbound or perhaps the > deletion there is also delayed ? Destroying the image doesn't destroy the textures, but it does unbind them. > How is the texture being rendered if the eglImage it is bound to is also > destroyed ? Texture is a separate entity from the eglImage, they share the underlying memory after binding, but after eglImage is destroyed, the memory is not freed and it lives on in the texture. > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > If not in REQBUFS(0) then what will be the appropriate place to destroy the > > > buffers ? > > > > Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I mean > > the underlying memory. REQBUFS(0) may be called, but the actual memory that > > backed the v4l2_buffers may have to live on if it's still tied to the > textures. > > This will be a common case actually, because we don't explicitly destroy > > textures first unless they are dismissed. The memory should be then freed when > > the textures are deleted, not on REQBUFS(0). I'm wondering if the > library/driver > > take this into account. > > Of course, it's still possible for REQUBFS(0) to have to trigger destruction > of > > underlying memory, in case the textures get unbound and deleted before > > REQBUFS(0) is called. > > > > > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it > > there. > > > How does the renderer then inform the ownership of textures ? > > > > glDeleteTextures(). So the textures and the underlying memory may have to > > outlive REQBUFS(0). > > > > > > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > > > Since there is no real need for mmap and munmap, we did not implement it > > in > > > > the > > > > > library. > > > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps > > the > > > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > > > > > We should not rely on V4L2VDA to be the only place where the underlying > > memory > > > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > > > process > > > > may still be keeping ownership of the textures bound to them. Is this > taken > > > into > > > > account? 
> > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > If so, how is unmapping handled then? What if we want to free the > > buffers > > > > and > > > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the > > buffers > > > > > > first... > > > > > > > > > > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/14 03:06:45, shivdasp wrote: > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > The output is YUV420 planar. > > > > Are all planes non-interleaved and contiguous in memory? If so, then you need > to > > use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), > please > > see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. > > > > Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. > Okay I will change the pixel format. However there are some DHECK_EQ() code in > V4L2VDA to check against V4L2_PIX_FMT_NV12M. > They also exist for num_planes. I will have to introduce private member > functions for ExynosV4L2Device and TegraV4L2Device to check against them rather > than hardcoded values in V4L2VDA. Will that be fine ? As I said, those checks and should be fixed to use the actual format given by the device. There is no need to have private member methods for devices. V4L2 API gives you methods to query and get formats as well as information how many planes each format uses. Please see documentation for v4l2_pix_format{,_mplane} in http://linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-fmt.html. > > > > > I think rather than using the QUERYBUF to pass the > > > EglImage handles and stuffing the required information I would rather > > introduce > > > a custom API (UseEglImage ?). > > > I hope that is fine. > > > > Yes, it's preferable over using QUERYBUF for this. But let's agree on the > shape > > of it. What would UseEglImage do? > > Could we instead pass the offsets to eglCreateImageKHR? > > Will we be able to also retain texture binding in V4L2VDA then? > I was thinking of int UseEglImage(int buffer_index, EGLImageKHR egl_image); > We basically need to send the EglImage created for a particular buffer_index so > the library can convert YUV into its respective EglImage. > We cannot send offsets to eglCreateImageKHR unless we have extension. However > the buffer_index internally is the mapping for identifying the eglImage so in a > way that will work like an offset. I really don't see why it should be an issue to create such an extension with very minimal effort. It should be a trivial wrapper around eglCreateImageKHR. Your library has to call some function in the driver anyway to do this, so why not extract that code and move it to a special case in eglCreateImageKHR instead? The code is already written I assume, since you use it, it just needs to be moved to a different location (i.e. eglCreateImageKHR implementation). > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > We are really passing in the EglImage handle here to the library. > > > > > > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in > an > > > u32 > > > > variable. > > > > > > > > The whole idea behind offsets is that they are usually not really offsets, > > but > > > > sort of platform-independent 32bit IDs. 
They are acquired from the V4L2 > > driver > > > > (or library) via QUERYBUFS, and can be passed back to other calls to > > uniquely > > > > identify the buffers (e.g. to mmap). > > > > > > > > The client is not supposed to generate them by itself and pass them to > > > > QUERYBUFS. > > > > > > > > > The library > > > > > associates this with the corresponding v4l2_buffer on the CAPTURE plane > > and > > > > use > > > > > the underlying conversion APIs to transform the decoder's yuv output > into > > > the > > > > > egl image. > > > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where > the > > > > number > > > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > > > > > You mean > > > > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > > ? > > > > > > > > This is an overassumption from the time where there was only one format > > > > supported. > > > > The number of planes to be used should be taken from the v4l2_format > struct, > > > > returned from G_FMT. This assumption should be fixed. > > > > > > > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? > > > Which > > > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. > > > Also, > > > > > > there are two planes, but passing only one offset is a bit > inconsistent. > > > > > > Although, why have two planes, if only one is used? > > > > > > > > > > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/14 03:06:45, shivdasp wrote: > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared > with > > > the > > > > > AVP processor for decoder to write into. > > > > > > > > By decoder do you mean V4L2VDA class? > > > > > > No I meant the decoder entity within the library. > > > > > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are > > created > > > > and > > > > > sent back in AssignPictureBuffers(). > > > > > Now V4L2VDA creates EglImages from these textures and sends each > EglImage > > > > handle > > > > > to library using the QUERYBUF (but can use a custom call too). The > > tegrav4l2 > > > > > library cannot create EglImages from DMABUFS like in Exynos since there > is > > > no > > > > > such extension. We create EglImage from this texture itself so there is > a > > > > > binding between texture and eglImage. > > > > > > > > Sounds like the eglCreateImage extension taking offsets I described in the > > > > comment in tegra_v4l2_video_device.cc could work for this? > > > Unfortunately there is no such extension today. > > > > > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > > > corresponding decoder buffer created in REQBUF() call. 
> > > > > This way there is one map of EglImage, texture and decoder buffer. > > > > > > > > My understanding is you mean the buffer is bound to a texture? If so, then > > it > > > > also seems like we could use the current bind texture to eglimage calls? > > > The libtegrav4l2 talks to another internal library which actually creates > the > > > YUV buffer. This is what is given to the AVP and where the decoded output is > > > actually filled. > > > There is a corresponding RGB buffer created when the EGLImage is called, > this > > is > > > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, > > there > > > is a conversion performed to do YUV to RGB. > > > > So the YUV buffers are tied to the textures somehow? > We send texture_id to eglCreateImageKHR and bind it there. And eglImage is sent > to the library which maps it to its YUV buffer. > My subsequent patch will probably make this clearer. Wait, where do you send texture_id to eglCreateImageKHR? I don't see that in the code above. Do you have an extension for eglCreateImageKHR to also accept texture ids and bind during creation? Why not do this in the standard way, i.e. by using GL_OES_EGL_image_external? I would expect your EGL implementation already has it for other things (and it's an extension created by NVIDIA too)... Or did you mean some other function? > > > > > > > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > > > decoder. > > > > > Once the decoder buffer is ready, the library uses graphics apis to > > populate > > > > the > > > > > corresponding EglImage with the RGB data and then pushes into a queue > > > thereby > > > > > making it available for DQBUF after which this buffer can be used only > > when > > > it > > > > > is back in QBUF call. > > > > > This way the buffer ownership is managed. > > > > > So in summary the library uses queues and does all the buffer management > > > > between > > > > > decoder and the graphics stack for conversion. > > > > > > > > What happens when this class calls REQBUFS(0), but the corresponding > > textures > > > > are being rendered to the screen? > > > > How will the buffers be freed if the GPU process crashes without calling > > > > REQBUFS(0)? > > > > What happens when the bound textures are deleted, but the HW codec is > still > > > > using them? > > > > > > I guess I am missing something here. I did not understand "REQBUFS(0) is > > called > > > but corresponding textures are being rendered ?". Doesn't > > DestroyOutputBuffers() > > > call guarantee that buffers on CAPTURE plane are no longer used. > > > > The underlying memory can still be used as textures in the client of VDA > class. > > It only guarantees that they are not used anymore by the codec class as > > v4l2_buffers. > > > > > I will confirm about the buffer freeing in gpu process crash scenario. > > > > Thanks. > If the EGLimage is destroyed I think the texture becomes unbound. I was > debugging some scenario and I get errors as "texture not bound or texture id 0" > kind of errors from gles2_cmd_decoder.cc. These I guess represent this crash > scenario. So it is taken care of already while validating the texture before > rendering ? If you are getting those errors, then there is definitely something wrong going on. > And I observe similar kind of logs on Exynos too. That is even more worrying. Could you please submit a bug for Exynos with repro steps? > Do you have a test case or steps of validating this ? 
Will killing gpu process > while video playback validate this path ? It should. > > > > > The last scenario (bound texture are deleted but HW codec is still using > them) > > > is taken care by the conversion step performed using the library. > > > The texture is > > > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 has > > the > > > EglImage backed by a RGB buffer the conversion can happen. How can I test > this > > > scenario ? > > > > This is just a case where there is a bug in the code, but my point is that the > > ownership should be shared with the kernel as well, so if the userspace > (Chrome) > > dies, the kernel will properly clean up. > > > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > I would like to understand the big picture here please. > > > > > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > > > platform-independent standards where we can, like the sequence above, > > > > instead > > > > > of > > > > > > providing custom calls for each platform. Removing this from here and > > TVDA > > > > is > > > > > a > > > > > > step into an opposite direction, and I would like to understand what > > > > technical > > > > > > difficulties force us to do this first. > > > > > > > > > > > > Binding textures to EGLImages also serves to keep track of ownership. > > > There > > > > > are > > > > > > multiple users of the shared buffer, the renderer, the GPU that > renders > > > the > > > > > > textures, this class and the HW codec. How is ownership/destruction > > > managed > > > > > and > > > > > > how is it ensured that the buffer is valid while any of the users are > > > still > > > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > > > textures? > > > > > > What happens when this class is destroyed, but the texture is in the > > > > renderer? > > > > > > What happens when the whole Chrome crashes, but the HW codec is using > a > > > > buffer > > > > > > (i.e. kernel has ownership)? > > > > > > > > > > > > Could you please explain how is ownership managed for shared buffers > on > > > > Tegra? > > > > > > > > > > > > > > >
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; Whitelisted will not be required because of pre-acquire the resources (including open() for /dev/tegra_avpchannel) in the libtegrav4l2.so which we load before the sandbox will kick in. Hence the only place we will have this device name will be in TegraV4L2Device which will be used to open the decoder instance. (The pre-opened fd will be used). Okay I will keep the same device name (/dev/tegra_avpchannel). On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > Yes /dev/tegra_avpchannel will be added in the sandbox whitelist for now. > > Work is in progress to make this sandbox friendly in our MM stack. Most > probably > > by the time we get this change merged we should have completed it so even > > whitelisting may not be required. In worst case we will whitelist it sandbox > > code. > > Why it may not be required? How else would it be accessible if it's not > whitelisted? > > > As I said earlier this "device name" sent to the library is just dummy for the > > libtegrav4l2 to create a decoder instance. It just has to be different than > the > > encoder "device name". > > Shall I just keep it "tegra_dec" and "tegra_enc" to not mislead it for > something > > like a true v4l2 device name ? > > Please, the library has to open the device with the same name as this method > provides to it. > > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > This library internally talks to MM layer which talks to the device > > > > (/dev/tegra_avpchannel) which is the nvavp driver. > > > > > > This means you will have to add it to sandbox rules in Chrome, right? So the > > > library should actually use the device path string provided from Chrome to > > > Open() and not have the string hardcoded in the library please. > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > This is a v4l2 decoder device name which we use to initialize a > decoder > > > > > context > > > > > > within the libtegrav4l2 library. > > > > > > This can be anything really as long as decoder and encoder device > names > > > are > > > > > > different since we do not open a v4l2 video device underneath. > > > Libtegrav4l2 > > > > is > > > > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > > > > /dev/tegra-enc for it to mean tegra specific. > > > > > > > > > > Which device is actually being used? Does the library just talk to DRM > > > driver > > > > > via custom ioctls? > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > > > > > > > You can't assume it will be this on all configurations. > > > > > > > Please use udev rules to create a codec specific device (see Exynos > > > > example > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > Understood. I believe the dmabuf export mechanism takes care of this in Exynos > > since the buffer backed memory is in kernel and hence the deletion is kind of > > synchronized. > > Yes, although it's not "synchronized", but the ownership is managed and > refcounted in the driver and the memory is freed when there is no more users. > How does Tegra manage this? I checked with the graphics team here. This is handled. Since the memory for EglImage is refcounted and is backed-up by the graphics library itself, the texture bound to a destroyed eglImage can still be rendered. The V4L2 REQBUFS(0) shall de-allocate only the buffer allocated by the "actual decoder" which are the YUV buffers and since there is no more conversion happening after REQBUFS(0) this is handled too. > > > > > I see in DestroyOutputbuffers(), before calling DismissPicture(), the > > eglDestroyImageKHR() is called thereby the textures are unbound or perhaps the > > deletion there is also delayed ? > > Destroying the image doesn't destroy the textures, but it does unbind them. > > > How is the texture being rendered if the eglImage it is bound to is also > > destroyed ? > > Texture is a separate entity from the eglImage, they share the underlying memory > after binding, but after eglImage is destroyed, the memory is not freed and it > lives on in the texture. > > > > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > If not in REQBUFS(0) then what will be the appropriate place to destroy > the > > > > buffers ? > > > > > > Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I > mean > > > the underlying memory. REQBUFS(0) may be called, but the actual memory that > > > backed the v4l2_buffers may have to live on if it's still tied to the > > textures. > > > This will be a common case actually, because we don't explicitly destroy > > > textures first unless they are dismissed. The memory should be then freed > when > > > the textures are deleted, not on REQBUFS(0). I'm wondering if the > > library/driver > > > take this into account. > > > Of course, it's still possible for REQUBFS(0) to have to trigger destruction > > of > > > underlying memory, in case the textures get unbound and deleted before > > > REQBUFS(0) is called. > > > > > > > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it > > > there. > > > > How does the renderer then inform the ownership of textures ? > > > > > > glDeleteTextures(). So the textures and the underlying memory may have to > > > outlive REQBUFS(0). > > > > > > > > > > > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > > > > Since there is no real need for mmap and munmap, we did not implement > it > > > in > > > > > the > > > > > > library. > > > > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and > mmaps > > > the > > > > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > > > > > > > We should not rely on V4L2VDA to be the only place where the underlying > > > memory > > > > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > > > > process > > > > > may still be keeping ownership of the textures bound to them. 
Is this > > taken > > > > into > > > > > account? > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > If so, how is unmapping handled then? What if we want to free the > > > buffers > > > > > and > > > > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the > > > buffers > > > > > > > first... > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > The output is YUV420 planar. > > > > > > Are all planes non-interleaved and contiguous in memory? If so, then you > need > > to > > > use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), > > please > > > see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. > > > > > > Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. > > Okay I will change the pixel format. However there are some DHECK_EQ() code in > > V4L2VDA to check against V4L2_PIX_FMT_NV12M. > > They also exist for num_planes. I will have to introduce private member > > functions for ExynosV4L2Device and TegraV4L2Device to check against them > rather > > than hardcoded values in V4L2VDA. Will that be fine ? > > As I said, those checks and should be fixed to use the actual format given by > the device. > There is no need to have private member methods for devices. V4L2 API gives you > methods to query and get formats as well as information how many planes each > format uses. Please see documentation for v4l2_pix_format{,_mplane} > in http://linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-fmt.html. Okay will use the V4L2 API. > > > > > > > > I think rather than using the QUERYBUF to pass the > > > > EglImage handles and stuffing the required information I would rather > > > introduce > > > > a custom API (UseEglImage ?). > > > > I hope that is fine. > > > > > > Yes, it's preferable over using QUERYBUF for this. But let's agree on the > > shape > > > of it. What would UseEglImage do? > > > Could we instead pass the offsets to eglCreateImageKHR? > > > Will we be able to also retain texture binding in V4L2VDA then? > > I was thinking of int UseEglImage(int buffer_index, EGLImageKHR egl_image); > > We basically need to send the EglImage created for a particular buffer_index > so > > the library can convert YUV into its respective EglImage. > > We cannot send offsets to eglCreateImageKHR unless we have extension. However > > the buffer_index internally is the mapping for identifying the eglImage so in > a > > way that will work like an offset. > > I really don't see why it should be an issue to create such an extension with > very minimal effort. > It should be a trivial wrapper around eglCreateImageKHR. Your library has to > call some function in the driver anyway to do this, so why not extract that code > and move it to a special case in eglCreateImageKHR instead? The code is already > written I assume, since you use it, it just needs to be moved to a different > location (i.e. eglCreateImageKHR implementation). The graphics stack is owned by a separate team so I don't really understand the implementation issues if any. 
I will check if there is such plan in the meanwhile let me address all these review comments and send out second patchset. > > > > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > We are really passing in the EglImage handle here to the library. > > > > > > > > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed > in > > an > > > > u32 > > > > > variable. > > > > > > > > > > The whole idea behind offsets is that they are usually not really > offsets, > > > but > > > > > sort of platform-independent 32bit IDs. They are acquired from the V4L2 > > > driver > > > > > (or library) via QUERYBUFS, and can be passed back to other calls to > > > uniquely > > > > > identify the buffers (e.g. to mmap). > > > > > > > > > > The client is not supposed to generate them by itself and pass them to > > > > > QUERYBUFS. > > > > > > > > > > > The library > > > > > > associates this with the corresponding v4l2_buffer on the CAPTURE > plane > > > and > > > > > use > > > > > > the underlying conversion APIs to transform the decoder's yuv output > > into > > > > the > > > > > > egl image. > > > > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where > > the > > > > > number > > > > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > > > > > > > You mean > > > > > > > > > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > > > ? > > > > > > > > > > This is an overassumption from the time where there was only one format > > > > > supported. > > > > > The number of planes to be used should be taken from the v4l2_format > > struct, > > > > > returned from G_FMT. This assumption should be fixed. > > > > > > > > > > From what I'm seeing here, your HW doesn't really use > V4L2_PIX_FMT_NV12M? > > > > Which > > > > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to > it. > > > > Also, > > > > > > > there are two planes, but passing only one offset is a bit > > inconsistent. > > > > > > > Although, why have two planes, if only one is used? > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared > > with > > > > the > > > > > > AVP processor for decoder to write into. > > > > > > > > > > By decoder do you mean V4L2VDA class? > > > > > > > > No I meant the decoder entity within the library. > > > > > > > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are > > > created > > > > > and > > > > > > sent back in AssignPictureBuffers(). 
> > > > > > Now V4L2VDA creates EglImages from these textures and sends each > > EglImage > > > > > handle > > > > > > to library using the QUERYBUF (but can use a custom call too). The > > > tegrav4l2 > > > > > > library cannot create EglImages from DMABUFS like in Exynos since > there > > is > > > > no > > > > > > such extension. We create EglImage from this texture itself so there > is > > a > > > > > > binding between texture and eglImage. > > > > > > > > > > Sounds like the eglCreateImage extension taking offsets I described in > the > > > > > comment in tegra_v4l2_video_device.cc could work for this? > > > > Unfortunately there is no such extension today. > > > > > > > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > > > > corresponding decoder buffer created in REQBUF() call. > > > > > > This way there is one map of EglImage, texture and decoder buffer. > > > > > > > > > > My understanding is you mean the buffer is bound to a texture? If so, > then > > > it > > > > > also seems like we could use the current bind texture to eglimage calls? > > > > The libtegrav4l2 talks to another internal library which actually creates > > the > > > > YUV buffer. This is what is given to the AVP and where the decoded output > is > > > > actually filled. > > > > There is a corresponding RGB buffer created when the EGLImage is called, > > this > > > is > > > > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, > > > there > > > > is a conversion performed to do YUV to RGB. > > > > > > So the YUV buffers are tied to the textures somehow? > > We send texture_id to eglCreateImageKHR and bind it there. And eglImage is > sent > > to the library which maps it to its YUV buffer. > > My subsequent patch will probably make this clearer. > > Wait, where do you send texture_id to eglCreateImageKHR? I don't see that in the > code above. > Do you have an extension for eglCreateImageKHR to also accept texture ids and > bind during creation? Why not do this in the standard way, i.e. by using > GL_OES_EGL_image_external? I would expect your EGL implementation already has it > for other things (and it's an extension created by NVIDIA too)... > Or did you mean some other function? The texture_id is sent in eglCreateImageKHR parameter. See TegraV4L2Device implementation of CreateEGLImage(). I will submit a bug with my findings and repro steps. > > > > > > > > > > > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > > > > decoder. > > > > > > Once the decoder buffer is ready, the library uses graphics apis to > > > populate > > > > > the > > > > > > corresponding EglImage with the RGB data and then pushes into a queue > > > > thereby > > > > > > making it available for DQBUF after which this buffer can be used only > > > when > > > > it > > > > > > is back in QBUF call. > > > > > > This way the buffer ownership is managed. > > > > > > So in summary the library uses queues and does all the buffer > management > > > > > between > > > > > > decoder and the graphics stack for conversion. > > > > > > > > > > What happens when this class calls REQBUFS(0), but the corresponding > > > textures > > > > > are being rendered to the screen? > > > > > How will the buffers be freed if the GPU process crashes without calling > > > > > REQBUFS(0)? > > > > > What happens when the bound textures are deleted, but the HW codec is > > still > > > > > using them? > > > > > > > > I guess I am missing something here. 
I did not understand "REQBUFS(0) is > > > called > > > > but corresponding textures are being rendered ?". Doesn't > > > DestroyOutputBuffers() > > > > call guarantee that buffers on CAPTURE plane are no longer used. > > > > > > The underlying memory can still be used as textures in the client of VDA > > class. > > > It only guarantees that they are not used anymore by the codec class as > > > v4l2_buffers. > > > > > > > I will confirm about the buffer freeing in gpu process crash scenario. > > > > > > Thanks. > > If the EGLimage is destroyed I think the texture becomes unbound. I was > > debugging some scenario and I get errors as "texture not bound or texture id > 0" > > kind of errors from gles2_cmd_decoder.cc. These I guess represent this crash > > scenario. So it is taken care of already while validating the texture before > > rendering ? > > If you are getting those errors, then there is definitely something wrong going > on. > > > And I observe similar kind of logs on Exynos too. > > That is even more worrying. Could you please submit a bug for Exynos with repro > steps? > > > Do you have a test case or steps of validating this ? Will killing gpu process > > while video playback validate this path ? > > It should. > > > > > > > > The last scenario (bound texture are deleted but HW codec is still using > > them) > > > > is taken care by the conversion step performed using the library. > > > > The texture is > > > > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 > has > > > the > > > > EglImage backed by a RGB buffer the conversion can happen. How can I test > > this > > > > scenario ? > > > > > > This is just a case where there is a bug in the code, but my point is that > the > > > ownership should be shared with the kernel as well, so if the userspace > > (Chrome) > > > dies, the kernel will properly clean up. > > > > > > > > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > I would like to understand the big picture here please. > > > > > > > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > > > > platform-independent standards where we can, like the sequence > above, > > > > > instead > > > > > > of > > > > > > > providing custom calls for each platform. Removing this from here > and > > > TVDA > > > > > is > > > > > > a > > > > > > > step into an opposite direction, and I would like to understand what > > > > > technical > > > > > > > difficulties force us to do this first. > > > > > > > > > > > > > > Binding textures to EGLImages also serves to keep track of > ownership. > > > > There > > > > > > are > > > > > > > multiple users of the shared buffer, the renderer, the GPU that > > renders > > > > the > > > > > > > textures, this class and the HW codec. How is ownership/destruction > > > > managed > > > > > > and > > > > > > > how is it ensured that the buffer is valid while any of the users > are > > > > still > > > > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > > > > textures? > > > > > > > What happens when this class is destroyed, but the texture is in the > > > > > renderer? > > > > > > > What happens when the whole Chrome crashes, but the HW codec is > using > > a > > > > > buffer > > > > > > > (i.e. kernel has ownership)? > > > > > > > > > > > > > > Could you please explain how is ownership managed for shared buffers > > on > > > > > Tegra? 
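For illustration of the point above about taking the plane count from G_FMT instead of hardcoding 2: a minimal sketch of that query, using only standard V4L2 multi-planar structures. The helper name and the use of a raw ioctl() are assumptions; real code would go through the device wrapper and handle the "format not yet known" case explicitly.

#include <linux/videodev2.h>
#include <string.h>
#include <sys/ioctl.h>

// Sketch only: ask the driver how many planes the negotiated CAPTURE format
// uses, rather than assuming 2 planes / a particular fourcc.
static bool GetCapturePlanesCount(int fd, unsigned int* num_planes) {
  struct v4l2_format format;
  memset(&format, 0, sizeof(format));
  format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  if (ioctl(fd, VIDIOC_G_FMT, &format) != 0)
    return false;  // Format not negotiated yet; caller should retry later.
  *num_planes = format.fmt.pix_mp.num_planes;
  return true;
}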
Incorporated review comments, please take a look.
https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) I think there is a bug here. Imagine a situation where AppendToInputFrame() fails because free_input_buffers_.empty() is true. This can very well happen if VIDIOC_G_FMT on the CAPTURE_PLANE has not succeeded before we end up draining the free input buffers. And once this condition is hit, we never try to dequeue nor check the format info again.
On 2014/02/19 21:23:58, shivdasp wrote: > Incorporated review comments, please take a look. Hi Shivdas, Could you please let us know about: 1. The status/possibility of implementing an eglCreateImageKHR extension that we discussed instead? 2. How could we go back to using texture binding in the EGLImage flow, which we would prefer keeping? Thanks.
https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) On 2014/02/20 08:51:02, shivdasp wrote: > I think there is a bug here. Imagine a situation where AppendToInputFrame() > fails because free_input_buffers_.empty() is true. > This could very well happen if the VIDIOC_G_FMT on CAPTURE_PLANE did not succeed > before we endup drying up the buffers. And once this condition is hit, we never > try to dequeue nor check formatinfo. If free_input_buffers_ is empty in ATIF(), then it will attempt to Dequeue() input buffers for reuse. If that fails, something is wrong with the driver. It should not keep input buffers if they are not needed for decode, and I don't see a situation when it would need to keep input buffers if we don't have format information yet. Could you provide a specific example and an example stream when this would happen? Does this reproduce on Exynos?
On 2014/02/21 05:32:43, Pawel Osciak wrote: > On 2014/02/19 21:23:58, shivdasp wrote: > > Incorporated review comments, please take a look. > > Hi Shivdas, > Could you please let us know about: > 1. The status/possibility of implementing an eglCreateImageKHR extension that we > discussed instead? > 2. How could we go back to using texture binding in the EGLImage flow, which we > would prefer keeping? > Thanks. Hi Pawel, 1. Creating EGLImage extension from the offset is very specific to libtegrav4l2 use-case hence it is not being supported & there is no plan as such to add such extension. 2. I can call glBindTexture(2D, texture_id) in the CreateEGLImage() but that seems unnecessary since the eglCreateImageKHR() already takes that arguement. Am I missing something ? Do you see anything functionally wrong in this way of creating EglImage() so that I can build up a strong case ? Thanks
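For reference, this is how a texture id is normally handed to eglCreateImageKHR as the EGLClientBuffer, which seems to be what "already takes that argument" refers to. A fragment of what such a CreateEGLImage() body could look like; the EGL_GL_TEXTURE_2D_KHR target and the attribute list are assumptions, since the thread does not name the exact extension used.

// Sketch only: create an EGLImage directly from a client texture. The texture
// object itself is passed as the EGLClientBuffer, so no separate
// glBindTexture() is needed at creation time.
const EGLint attrs[] = {
    EGL_GL_TEXTURE_LEVEL_KHR, 0,
    EGL_IMAGE_PRESERVED_KHR, EGL_FALSE,
    EGL_NONE,
};
EGLImageKHR egl_image = eglCreateImageKHR(
    egl_display,
    egl_context,
    EGL_GL_TEXTURE_2D_KHR,
    reinterpret_cast<EGLClientBuffer>(texture_id),
    attrs);
if (egl_image == EGL_NO_IMAGE_KHR)
  return EGL_NO_IMAGE_KHR;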
Ping on whether there is a need to change EglImage creation part. On 2014/02/21 07:06:53, shivdasp wrote: > On 2014/02/21 05:32:43, Pawel Osciak wrote: > > On 2014/02/19 21:23:58, shivdasp wrote: > > > Incorporated review comments, please take a look. > > > > Hi Shivdas, > > Could you please let us know about: > > 1. The status/possibility of implementing an eglCreateImageKHR extension that > we > > discussed instead? > > 2. How could we go back to using texture binding in the EGLImage flow, which > we > > would prefer keeping? > > Thanks. > Hi Pawel, > > 1. Creating EGLImage extension from the offset is very specific to libtegrav4l2 > use-case hence it is not being supported & there is no plan as such to add such > extension. > 2. I can call glBindTexture(2D, texture_id) in the CreateEGLImage() but that > seems unnecessary since the eglCreateImageKHR() already takes that arguement. Am > I missing something ? > > Do you see anything functionally wrong in this way of creating EglImage() so > that I can build up a strong case ? > > Thanks
https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) The issue is that we try to Dequeue() only once in ATIF() and if it fails we stall this thread. Dequeue() cannot be guaranteed to succeed and provide back a buffer immediately when the client expects. I can see this issue on Tegra in 1 out of 5 times, depending upon how the threads are scheduled. I think until we move out of the kInitialized into KDecoding, state we should try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of whether Dequeue() fails. My current fix is to go to GetFormatInfo() below which keeps the thread going. Have not tested on Exynos so can't say. On 2014/02/21 05:36:42, Pawel Osciak wrote: > On 2014/02/20 08:51:02, shivdasp wrote: > > I think there is a bug here. Imagine a situation where AppendToInputFrame() > > fails because free_input_buffers_.empty() is true. > > This could very well happen if the VIDIOC_G_FMT on CAPTURE_PLANE did not > succeed > > before we endup drying up the buffers. And once this condition is hit, we > never > > try to dequeue nor check formatinfo. > > > If free_input_buffers_ is empty in ATIF(), then it will attempt to Dequeue() > input buffers for reuse. If that fails, something is wrong with the driver. It > should not keep input buffers if they are not needed for decode, and I don't see > a situation when it would need to keep input buffers if we don't have format > information yet. > > Could you provide a specific example and an example stream when this would > happen? > Does this reproduce on Exynos?
On 2014/02/14 09:18:58, shivdasp wrote: > > Yes, although it's not "synchronized", but the ownership is managed and > > refcounted in the driver and the memory is freed when there is no more users. > > How does Tegra manage this? > I checked with the graphics team here. This is handled. Since the memory for > EglImage is refcounted and is backed-up by the graphics library itself, the > texture bound to a destroyed eglImage can still be rendered. The V4L2 REQBUFS(0) > shall de-allocate only the buffer allocated by the "actual decoder" which are > the YUV buffers and since there is no more conversion happening after REQBUFS(0) > this is handled too. They still have to be tied to each other and refcounted together. At which point will the conversion from the yuv decoder buffer be done into the texture? At render time or at decode time? If at render time, if REQBUFS(0) is called before the texture is rendered, then we have to retain the decoder buffer until it's converted and put into the texture before deleting it. If at decode time, what if renderer frees the texture and we are still decoding and try to convert into that deleted texture afterwards?
On 2014/02/21 07:06:53, shivdasp wrote: > On 2014/02/21 05:32:43, Pawel Osciak wrote: > > On 2014/02/19 21:23:58, shivdasp wrote: > > > Incorporated review comments, please take a look. > > > > Hi Shivdas, > > Could you please let us know about: > > 1. The status/possibility of implementing an eglCreateImageKHR extension that > we > > discussed instead? > > 2. How could we go back to using texture binding in the EGLImage flow, which > we > > would prefer keeping? > > Thanks. > Hi Pawel, > > 1. Creating EGLImage extension from the offset is very specific to libtegrav4l2 > use-case hence it is not being supported & there is no plan as such to add such > extension. > > 2. I can call glBindTexture(2D, texture_id) in the CreateEGLImage() but that > seems unnecessary since the eglCreateImageKHR() already takes that arguement. Am > I missing something ? > > Do you see anything functionally wrong in this way of creating EglImage() so > that I can build up a strong case ? You still need that TegraV4L2_UseEglImage instead of the extension. But if they are separate buffers, then we don't want to create an illusion that they are the same thing. So binding is not a good idea probably, you are right. The problem I'm seeing is in my other response that I just sent in the previous message, to the deallocation problem. Please respond to that so we could think more what to do. Thanks!
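A sketch of the TegraV4L2_UseEglImage handoff being discussed, continuing the fragment above. The signature follows the UseEglImage(buffer_index, egl_image) proposal earlier in the thread; whether the call also takes the device fd, and its return convention, are assumptions.

// Sketch only: after the EGLImage has been created for the texture, hand it
// to libtegrav4l2 so the library can associate it with the CAPTURE buffer at
// |buffer_index|.
if (TegraV4L2_UseEglImage(device_fd_, buffer_index, egl_image) != 0) {
  eglDestroyImageKHR(egl_display, egl_image);
  return EGL_NO_IMAGE_KHR;
}
return egl_image;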
On 2014/02/25 09:38:14, shivdasp wrote: > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > (!AppendToInputFrame(data, size)) > The issue is that we try to Dequeue() only once in ATIF() and if it fails we > stall this thread. Dequeue() cannot be guaranteed to succeed and provide back a > buffer immediately when the client expects. I can see this issue on Tegra in 1 > out of 5 times, depending upon how the threads are scheduled. > I think until we move out of the kInitialized into KDecoding, state we should > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > whether Dequeue() fails. Right, I see now. The problem is with Dequeue(). But I need to think a little bit longer how to solve this. We cannot move out to kDecoding, because we don't have buffers allocated yet and if we do, then we will break allocation, resolution change, resets and other things.
On 2014/02/26 10:57:59, Pawel Osciak wrote: > On 2014/02/14 09:18:58, shivdasp wrote: > > > > Yes, although it's not "synchronized", but the ownership is managed and > > > refcounted in the driver and the memory is freed when there is no more > users. > > > How does Tegra manage this? > > I checked with the graphics team here. This is handled. Since the memory for > > EglImage is refcounted and is backed-up by the graphics library itself, the > > texture bound to a destroyed eglImage can still be rendered. The V4L2 > REQBUFS(0) > > shall de-allocate only the buffer allocated by the "actual decoder" which are > > the YUV buffers and since there is no more conversion happening after > REQBUFS(0) > > this is handled too. > > They still have to be tied to each other and refcounted together. At which point > will the conversion from the yuv decoder buffer be done into the texture? At The YUV conversion happens at decode time into the EGLImage buffer when DQBUF is called. > render time or at decode time? If at render time, if REQBUFS(0) is called before > the texture is rendered, then we have to retain the decoder buffer until it's > converted and put into the texture before deleting it. If at decode time, what > if renderer frees the texture and we are still decoding and try to convert into > that deleted texture afterwards? If REQBUFS(0) is called, the eglImage is destroyed but since the texture is still valid & refcounts the eglimage memory it can still be rendered. If the renderer frees the texture, the eglImage is still valid (since it is refcounted) we can still decode and convert it into the eglimage. With this TVDA patch I have tested dash player and resolution change happens smoothly. I could also test resolution change through youtube player. Killing gpu-process while playback also works as in Exynos (atleast from the logs the behavior seems same). Is there any specific test you want me to run and verify this case ?
On 2014/02/26 11:06:14, Pawel Osciak wrote: > On 2014/02/25 09:38:14, shivdasp wrote: > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > (!AppendToInputFrame(data, size)) > > The issue is that we try to Dequeue() only once in ATIF() and if it fails we > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide back > a > > buffer immediately when the client expects. I can see this issue on Tegra in 1 > > out of 5 times, depending upon how the threads are scheduled. > > I think until we move out of the kInitialized into KDecoding, state we should > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > > whether Dequeue() fails. > > Right, I see now. The problem is with Dequeue(). But I need to think a little > bit > longer how to solve this. We cannot move out to kDecoding, because we don't have > buffers allocated yet and if we do, then we will break allocation, resolution > change, > resets and other things. I think we should restructure the DecodeBufferInitial() to atleast do ATIF() and GetFormatInfo(). Checking for GetFormatInfo() will put us into the kDecoding state which should have been generated in some later time if not immediately. My current fix (not so elegant) which is working for all cases: DecodeBufferInitial() { if (!ATIF()) { if (free_input_buffers_.empty()) goto chk_format_info; } Dequeue() .. .. .. chk_format_info: GetFormatInfo() .. .. }
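A goto-free sketch of the same proposed restructuring, as a body fragment of DecodeBufferInitial(). Control flow mirrors the sketch above; the GetFormatInfo() signature and the surrounding state handling are simplifications, not the actual fix that was landed.

// Sketch only: skip Dequeue() when the append failed because no free input
// buffer was available, but always fall through to the format query so the
// decoder can still move out of kInitialized.
const bool appended = AppendToInputFrame(data, size);
if (appended || !free_input_buffers_.empty())
  Dequeue();  // Recycle any completed input buffers.

struct v4l2_format format;
bool again = false;
if (!GetFormatInfo(&format, &again))
  return false;
// ... existing handling continues: retry later if |again| is set, otherwise
// allocate output buffers and switch to kDecoding ...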
On 2014/02/26 14:39:52, shivdasp wrote: > On 2014/02/26 10:57:59, Pawel Osciak wrote: > > On 2014/02/14 09:18:58, shivdasp wrote: > > > > > > Yes, although it's not "synchronized", but the ownership is managed and > > > > refcounted in the driver and the memory is freed when there is no more > > users. > > > > How does Tegra manage this? > > > I checked with the graphics team here. This is handled. Since the memory for > > > EglImage is refcounted and is backed-up by the graphics library itself, the > > > texture bound to a destroyed eglImage can still be rendered. The V4L2 > > REQBUFS(0) > > > shall de-allocate only the buffer allocated by the "actual decoder" which > are > > > the YUV buffers and since there is no more conversion happening after > > REQBUFS(0) > > > this is handled too. > > > > They still have to be tied to each other and refcounted together. At which > point > > will the conversion from the yuv decoder buffer be done into the texture? At > The YUV conversion happens at decode time into the EGLImage buffer when DQBUF is > called. > > render time or at decode time? If at render time, if REQBUFS(0) is called > before > > the texture is rendered, then we have to retain the decoder buffer until it's > > converted and put into the texture before deleting it. If at decode time, what > > if renderer frees the texture and we are still decoding and try to convert > into > > that deleted texture afterwards? > If REQBUFS(0) is called, the eglImage is destroyed but since the texture is > still valid > & refcounts the eglimage memory it can still be rendered. > If the renderer frees the texture, the eglImage is still valid (since it is > refcounted) we can still decode > and convert it into the eglimage. > With this TVDA patch I have tested dash player and resolution change happens > smoothly. > I could also test resolution change through youtube player. Killing gpu-process > while playback also works as in Exynos (atleast from the logs the behavior seems > same). > Is there any specific test you want me to run and verify this case ? Please also test for memory leaks, if you haven't done so, in these scenarios.
On 2014/02/26 14:45:26, shivdasp wrote: > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > (!AppendToInputFrame(data, size)) > > > The issue is that we try to Dequeue() only once in ATIF() and if it fails we > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > back > > a > > > buffer immediately when the client expects. I can see this issue on Tegra in > 1 > > > out of 5 times, depending upon how the threads are scheduled. > > > I think until we move out of the kInitialized into KDecoding, state we > should > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > > > whether Dequeue() fails. > > > > Right, I see now. The problem is with Dequeue(). But I need to think a little > > bit > > longer how to solve this. We cannot move out to kDecoding, because we don't > have > > buffers allocated yet and if we do, then we will break allocation, resolution > > change, > > resets and other things. > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() and > GetFormatInfo(). > Checking for GetFormatInfo() will put us into the kDecoding state which should > have been generated in some later time if not immediately. > > My current fix (not so elegant) which is working for all cases: > DecodeBufferInitial() { > if (!ATIF()) { > if (free_input_buffers_.empty()) > goto chk_format_info; > } > Dequeue() > .. > .. > .. > chk_format_info: > GetFormatInfo() > .. > .. > } Are you testing with vdatest after r249963? It should fail on this. The correct solution is to wait for the driver to return more input buffers. This should be fixed in V4L2VDA.
On 2014/03/03 05:10:48, Pawel Osciak wrote: > On 2014/02/26 14:39:52, shivdasp wrote: > > On 2014/02/26 10:57:59, Pawel Osciak wrote: > > > On 2014/02/14 09:18:58, shivdasp wrote: > > > > > > > > Yes, although it's not "synchronized", but the ownership is managed and > > > > > refcounted in the driver and the memory is freed when there is no more > > > users. > > > > > How does Tegra manage this? > > > > I checked with the graphics team here. This is handled. Since the memory > for > > > > EglImage is refcounted and is backed-up by the graphics library itself, > the > > > > texture bound to a destroyed eglImage can still be rendered. The V4L2 > > > REQBUFS(0) > > > > shall de-allocate only the buffer allocated by the "actual decoder" which > > are > > > > the YUV buffers and since there is no more conversion happening after > > > REQBUFS(0) > > > > this is handled too. > > > > > > They still have to be tied to each other and refcounted together. At which > > point > > > will the conversion from the yuv decoder buffer be done into the texture? At > > The YUV conversion happens at decode time into the EGLImage buffer when DQBUF > is > > called. > > > render time or at decode time? If at render time, if REQBUFS(0) is called > > before > > > the texture is rendered, then we have to retain the decoder buffer until > it's > > > converted and put into the texture before deleting it. If at decode time, > what > > > if renderer frees the texture and we are still decoding and try to convert > > into > > > that deleted texture afterwards? > > If REQBUFS(0) is called, the eglImage is destroyed but since the texture is > > still valid > > & refcounts the eglimage memory it can still be rendered. > > If the renderer frees the texture, the eglImage is still valid (since it is > > refcounted) we can still decode > > and convert it into the eglimage. > > With this TVDA patch I have tested dash player and resolution change happens > > smoothly. > > I could also test resolution change through youtube player. Killing > gpu-process > > while playback also works as in Exynos (atleast from the logs the behavior > seems > > same). > > Is there any specific test you want me to run and verify this case ? > > Please also test for memory leaks, if you haven't done so, in these scenarios. Yes I have been testing them. If there's anything major comment that you would like me to address here please let me know.
On 2014/03/03 05:16:34, Pawel Osciak wrote: > On 2014/02/26 14:45:26, shivdasp wrote: > > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > > (!AppendToInputFrame(data, size)) > > > > The issue is that we try to Dequeue() only once in ATIF() and if it fails > we > > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > > back > > > a > > > > buffer immediately when the client expects. I can see this issue on Tegra > in > > 1 > > > > out of 5 times, depending upon how the threads are scheduled. > > > > I think until we move out of the kInitialized into KDecoding, state we > > should > > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > > > > whether Dequeue() fails. > > > > > > Right, I see now. The problem is with Dequeue(). But I need to think a > little > > > bit > > > longer how to solve this. We cannot move out to kDecoding, because we don't > > have > > > buffers allocated yet and if we do, then we will break allocation, > resolution > > > change, > > > resets and other things. > > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() > and > > GetFormatInfo(). > > Checking for GetFormatInfo() will put us into the kDecoding state which should > > have been generated in some later time if not immediately. > > > > My current fix (not so elegant) which is working for all cases: > > DecodeBufferInitial() { > > if (!ATIF()) { > > if (free_input_buffers_.empty()) > > goto chk_format_info; > > } > > Dequeue() > > .. > > .. > > .. > > chk_format_info: > > GetFormatInfo() > > .. > > .. > > } > > Are you testing with vdatest after r249963? It should fail on this. > The correct solution is to wait for the driver to return more input buffers. > This should be fixed in V4L2VDA. I was on older code and yes I do sometimes see some issue with my fix. How about I file a partner bug for this issue since I imagine the fix may not be trivial and will be better to de-couple from this change of adding TegraVDA ?
On 2014/03/03 16:42:27, shivdasp wrote: > On 2014/03/03 05:16:34, Pawel Osciak wrote: > > On 2014/02/26 14:45:26, shivdasp wrote: > > > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > > > (!AppendToInputFrame(data, size)) > > > > > The issue is that we try to Dequeue() only once in ATIF() and if it > fails > > we > > > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > > > back > > > > a > > > > > buffer immediately when the client expects. I can see this issue on > Tegra > > in > > > 1 > > > > > out of 5 times, depending upon how the threads are scheduled. > > > > > I think until we move out of the kInitialized into KDecoding, state we > > > should > > > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective > of > > > > > whether Dequeue() fails. > > > > > > > > Right, I see now. The problem is with Dequeue(). But I need to think a > > little > > > > bit > > > > longer how to solve this. We cannot move out to kDecoding, because we > don't > > > have > > > > buffers allocated yet and if we do, then we will break allocation, > > resolution > > > > change, > > > > resets and other things. > > > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() > > and > > > GetFormatInfo(). > > > Checking for GetFormatInfo() will put us into the kDecoding state which > should > > > have been generated in some later time if not immediately. > > > > > > My current fix (not so elegant) which is working for all cases: > > > DecodeBufferInitial() { > > > if (!ATIF()) { > > > if (free_input_buffers_.empty()) > > > goto chk_format_info; > > > } > > > Dequeue() > > > .. > > > .. > > > .. > > > chk_format_info: > > > GetFormatInfo() > > > .. > > > .. > > > } > > > > Are you testing with vdatest after r249963? It should fail on this. > > The correct solution is to wait for the driver to return more input buffers. > > This should be fixed in V4L2VDA. > I was on older code and yes I do sometimes see some issue with my fix. > How about I file a partner bug for this issue since I imagine the fix may not be > trivial and will be better to de-couple from this change of adding TegraVDA ? By issue you mean the newest test doesn't pass?
On 2014/03/03 16:42:27, shivdasp wrote: > On 2014/03/03 05:16:34, Pawel Osciak wrote: > > On 2014/02/26 14:45:26, shivdasp wrote: > > > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > > > (!AppendToInputFrame(data, size)) > > > > > The issue is that we try to Dequeue() only once in ATIF() and if it > fails > > we > > > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > > > back > > > > a > > > > > buffer immediately when the client expects. I can see this issue on > Tegra > > in > > > 1 > > > > > out of 5 times, depending upon how the threads are scheduled. > > > > > I think until we move out of the kInitialized into KDecoding, state we > > > should > > > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective > of > > > > > whether Dequeue() fails. > > > > > > > > Right, I see now. The problem is with Dequeue(). But I need to think a > > little > > > > bit > > > > longer how to solve this. We cannot move out to kDecoding, because we > don't > > > have > > > > buffers allocated yet and if we do, then we will break allocation, > > resolution > > > > change, > > > > resets and other things. > > > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() > > and > > > GetFormatInfo(). > > > Checking for GetFormatInfo() will put us into the kDecoding state which > should > > > have been generated in some later time if not immediately. > > > > > > My current fix (not so elegant) which is working for all cases: > > > DecodeBufferInitial() { > > > if (!ATIF()) { > > > if (free_input_buffers_.empty()) > > > goto chk_format_info; > > > } > > > Dequeue() > > > .. > > > .. > > > .. > > > chk_format_info: > > > GetFormatInfo() > > > .. > > > .. > > > } > > > > Are you testing with vdatest after r249963? It should fail on this. > > The correct solution is to wait for the driver to return more input buffers. > > This should be fixed in V4L2VDA. > I was on older code and yes I do sometimes see some issue with my fix. > How about I file a partner bug for this issue since I imagine the fix may not be > trivial and will be better to de-couple from this change of adding TegraVDA ? I'm not against this, but if TVDA fails more often than not on this, then the fix should probably be submitted before TVDA.
https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.h:36: unsigned int GetTextureTarget() OVERRIDE; While we're at this -- shouldn't these all be explicitly declared virtual? It's not incorrect syntax as it is, just no quite up to style guidelines. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // Extra comment line here. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:62: return TegraV4L2_Ioctl(device_fd_, flags, arg); No HANDLE_EINTR wrapper needed for this? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:66: if (TegraV4L2_Poll(device_fd_, poll_device, event_pending) == -1) { No HANDLE_EINTR wrapper needed for this? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:120: TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(UseEglImage); Can this sort of thing be done once at startup time? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:140: (EGLClientBuffer)(texture_id), static_cast<>; we don't use C-style casts. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:149: return egl_image; So as I understand it: In the Exynos case, we export buffers from the video stack and import them to the graphics stack. In the Tegra case, we export buffers from the graphics stack and import them to the tegrav4l2 lib, which does with them what it wants. Responsibility for tracking ownership and making sure things don't leak then rests with the tegrav4l2 lib. Sound about right? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.h:39: unsigned int GetTextureTarget() OVERRIDE; Declare these explicitly "virtual" again. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) I think there's an actual bug here. The reason why DecodeBufferInitial() returns false in this case is to avoid immediately scheduling another decode task, since AppendToInputFrame() failed because there's no input buffer available, and trying again would just be a tight loop. The idea is that we should wait until another input buffer frees up. Unfortunately, device_poll_thread_ is not running and so we're not polling and we'll never inform the decoder_thread_ that a buffer frees up when it does. I think the solution is to call StartDevicePoll() earlier, perhaps at the end of Initialize(). 
All the DCHECKS on decoder_state_ and device_poll_thread_.IsRunning() would naturally have to be re-audited to make sure they fall in line with our new assumptions. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_device.h:64: EGLint attrib[], Can we make this a "const EGLint*" argument instead? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_device.h:65: unsigned int texture_id, texture_id should be GLuint https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_device.h:69: virtual unsigned int GetTextureTarget() = 0; GL texture targets are GLenum types. Also: would be nice to declare this as const.
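To make the HANDLE_EINTR and GL-type suggestions above concrete, a minimal sketch (the wrapper signatures approximate the device interface under review and may differ; HANDLE_EINTR comes from base's eintr_wrapper header).

// Sketch only: wrap the whole library call in HANDLE_EINTR, not the == -1
// comparison, and use GL types for texture-related values.
int TegraV4L2Device::Ioctl(int flags, void* arg) {
  return HANDLE_EINTR(TegraV4L2_Ioctl(device_fd_, flags, arg));
}

bool TegraV4L2Device::Poll(bool poll_device, bool* event_pending) {
  if (HANDLE_EINTR(TegraV4L2_Poll(device_fd_, poll_device, event_pending)) ==
      -1) {
    DLOG(ERROR) << "TegraV4L2_Poll() failed";
    return false;
  }
  return true;
}

// And in the header, explicitly virtual overrides with GL types:
//   virtual GLenum GetTextureTarget() OVERRIDE;
//   virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display,
//                                      EGLContext egl_context,
//                                      GLuint texture_id,
//                                      const EGLint* attrib,
//                                      ...) OVERRIDE;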
I did not notice sheu's comments while making patchset#7. Please have a look regarding the DecodeBufferInitial() bug that I have fixed here. Will address sheu's comment in subsequent patchset. Have tested VDAtest, all tests pass except the SlowRenderingTest which expects the decoded buffers as 250 but on TegraVDA the decoded frames are 240. This I think is because of how EOS is handled in VDA. In my understanding we send a 0 sized buffer in Flush() and then enqueue no more on OUTPUT PLANE. And when all the buffers are dequeued from OUTPUTPLANE we start EOS processing in VDA. This does not guarantee that all the buffers that were decoded and were ready to be dequeued from CAPTURE PLANE were indeed dequeued. This is what happens when SlowRenderingTest fails on Tegra. There are 10 frames ready to be dequeued and before that CAPTURE_STREAM_OFF happens. Thanks https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.h:36: unsigned int GetTextureTarget() OVERRIDE; On 2014/03/06 08:51:34, sheu wrote: > While we're at this -- shouldn't these all be explicitly declared virtual? > > It's not incorrect syntax as it is, just no quite up to style guidelines. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // On 2014/03/06 08:51:34, sheu wrote: > Extra comment line here. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:62: return TegraV4L2_Ioctl(device_fd_, flags, arg); On 2014/03/06 08:51:34, sheu wrote: > No HANDLE_EINTR wrapper needed for this? Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:66: if (TegraV4L2_Poll(device_fd_, poll_device, event_pending) == -1) { On 2014/03/06 08:51:34, sheu wrote: > No HANDLE_EINTR wrapper needed for this? Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:120: TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(UseEglImage); Initialize() is called when the V4L2Device is created which is the earliest. Else will have to do it in a static method and have it called from somewhere. On 2014/03/06 08:51:34, sheu wrote: > Can this sort of thing be done once at startup time? https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:140: (EGLClientBuffer)(texture_id), On 2014/03/06 08:51:34, sheu wrote: > static_cast<>; we don't use C-style casts. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:149: return egl_image; Yes that's right. tegrav4l2lib takes care of destroying them internally. On 2014/03/06 08:51:34, sheu wrote: > So as I understand it: > > In the Exynos case, we export buffers from the video stack and import them to > the graphics stack. > In the Tegra case, we export buffers from the graphics stack and import them to > the tegrav4l2 lib, which does with them what it wants. 
Responsibility for > tracking ownership and making sure things don't leak then rests with the > tegrav4l2 lib. > > Sound about right? https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.h:39: unsigned int GetTextureTarget() OVERRIDE; On 2014/03/06 08:51:34, sheu wrote: > Declare these explicitly "virtual" again. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) I have attempted a fix for this in patchset#7. Could you take a look at that ? I agree there's a tight loop but the message is posted back to the decoder_thread and with what I have tested, the thread scheduling allows the underlying decoder thread to "report" stream format and get into the kDecoding state. Please suggest how to go about this. On Exynos I believe this condition seldom hits. On 2014/03/06 08:51:34, sheu wrote: > I think there's an actual bug here. The reason why DecodeBufferInitial() > returns false in this case is to avoid immediately scheduling another decode > task, since AppendToInputFrame() failed because there's no input buffer > available, and trying again would just be a tight loop. The idea is that we > should wait until another input buffer frees up. > > Unfortunately, device_poll_thread_ is not running and so we're not polling and > we'll never inform the decoder_thread_ that a buffer frees up when it does. > > I think the solution is to call StartDevicePoll() earlier, perhaps at the end of > Initialize(). All the DCHECKS on decoder_state_ and > device_poll_thread_.IsRunning() would naturally have to be re-audited to make > sure they fall in line with our new assumptions. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:64: EGLint attrib[], On 2014/03/06 08:51:34, sheu wrote: > Can we make this a "const EGLint*" argument instead? Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:65: unsigned int texture_id, On 2014/03/06 08:51:34, sheu wrote: > texture_id should be GLuint Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:69: virtual unsigned int GetTextureTarget() = 0; On 2014/03/06 08:51:34, sheu wrote: > GL texture targets are GLenum types. Also: would be nice to declare this as > const. Done.
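To make the ownership model above concrete, here is a minimal sketch of the Tegra-side image creation being discussed, with the GL texture itself exported as the EGLClientBuffer and the resulting image handed to the Tegra library; the variable names are illustrative and the error handling is abbreviated:

  // On Tegra the texture is the client buffer (no dmabuf export step).
  EGLImageKHR egl_image = eglCreateImageKHR(
      egl_display,
      egl_context,
      EGL_GL_TEXTURE_2D_KHR,
      reinterpret_cast<EGLClientBuffer>(texture_id),
      NULL);
  if (egl_image == EGL_NO_IMAGE_KHR)
    return EGL_NO_IMAGE_KHR;
  // The image is then queued to libtegrav4l2, which keeps ownership and
  // destroys it internally, so no eglDestroyImageKHR on this path.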
https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:120: TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(UseEglImage); On 2014/03/06 11:10:09, shivdasp wrote: > Initialize() is called when the V4L2Device is created which is the earliest. > Else will have to do it in a static method and have it called from somewhere. > On 2014/03/06 08:51:34, sheu wrote: > > Can this sort of thing be done once at startup time? > We could do this once statically similar to how VaapiWrapper does it. https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; We seem to be allergic to "goto" statements in Chrome, for not entirely bad reasons. But also I don't see what jumping to GetFormatInfo would buy us. It seems to me that if we're out of buffers, whether we have format info or not does not correlate with whether we'll soon be getting buffers. https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/video_decode_accelerator_unittest.cc:1563: #endif VaapiWrapper does the dlopen once statically in Initialize(), which has the nice effect of making this not necessary. https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/s... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/s... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:226: // resetting errno since it is expected to fail on non-Tegra platforms. Better: "Resetting errno since platform-specific libraries will fail on other platforms."
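A hedged sketch of the "resolve the entry points once" idea mentioned above, along the lines of what VaapiWrapper does; the library path, symbol name, and function signature here are assumptions, not the actual Tegra API:

  #include <dlfcn.h>

  typedef int (*TegraV4L2OpenFunc)(const char* name, int flags);
  static TegraV4L2OpenFunc TegraV4L2_Open = NULL;

  // Called once, e.g. from a pre-sandbox or static initialization hook,
  // instead of re-resolving symbols every time a device is created.
  static bool ResolveTegraV4L2Symbols() {
    static void* handle = dlopen("libtegrav4l2.so", RTLD_NOW);
    if (!handle)
      return false;
    TegraV4L2_Open = reinterpret_cast<TegraV4L2OpenFunc>(
        dlsym(handle, "TegraV4L2_Open"));
    return TegraV4L2_Open != NULL;
  }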
I'm considering a fix for the stuck DecodeBufferInitial() in: http://chromiumcodereview.appspot.com/189993002/
https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; Jumping to GetFormatInfo() will move us out of the kInitialized state, which is what the bug really was: we did not try to check for format info if we were out of buffers while in the kInitialized state. I am not sure how starting the device poll earlier will solve this stuck state; I will test it on Monday and update. The problem I see is that if we do not allocate buffers on the CAPTURE plane (which only happens after the format info is known), the buffers on the OUTPUT plane will not get consumed and hence will not be dequeued, so even with device polling we cannot send more bitstream. We should have gotten the format info after processing the first buffer itself, but if detection is delayed we should keep checking for it while sending more bitstream buffers. On 2014/03/07 00:18:08, sheu wrote: > We seem to be allergic to "goto" statements in Chrome, for not entirely bad > reasons. > > But also I don't see what jumping to GetFormatInfo would buy us. It seems to me > that if we're out of buffers, whether we have format info or not does not > correlate with whether we'll soon be getting buffers.
https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; On 2014/03/07 17:31:40, shivdasp wrote: > Jumping to GetFormatInfo() will move us out of kInitialized state which is what > the bug really was. We did not try to check for format info if we were out of > buffers while in kInitialized state. > I am not sure how does starting of device poll earlier will solve this stuck up > problem. I will test it on my monday and update. > The problem I see is that if we do not allocate buffers on CAPTURE plane (which > will happen after formatinfo is set) the buffers on OUTPUT plane will not get > consumed and hence will not be dequeued. So even with device polling we cannot > send more bitstream. We should have got the FormatInfo() after processing the > first buffer itself but if the detection is getting delayed we should keep > checking for it while sending more bitstream buffers. > > > On 2014/03/07 00:18:08, sheu wrote: > > We seem to be allergic to "goto" statements in Chrome, for not entirely bad > > reasons. > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems to > me > > that if we're out of buffers, whether we have format info or not does not > > correlate with whether we'll soon be getting buffers. > Ah, i see how it is. So if I have this right: the problem is that we queue data to the OUTPUT queue for the decoder to initialize, but the decoder initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue may not succeed immediately; instead at some undefined point in the future the decoder initialization will finish and VIDIOC_G_FMT will start returning the proper format. This is a problem in TegraVDA since when we fail to get the format after each buffer we queue, we continue trying to queue buffers; the race is between the decoder initialization finishing and the VDA running out of buffers to continue to enqueue with. For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue is synchronous w.r.t. the decoder initialization; i.e. it blocks until we know for sure whether the decoder has initialized with the given input, or not. That brings to mind two possible solutions in the TegraVDA case: 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the Exynos case 2. Add a notification system for decoder initialization. I'd go with the V4L2_EVENT system, by adding a private event for the Tegra driver (see: V4L2_EVENT_PRIVATE_START). (1) would be a faster fix than (2). Tight-loop polling (by constant reposting of the task) is not the way to go.
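For option (2), a minimal sketch of the subscription side, assuming a private event along the lines suggested above and the V4L2Device::Ioctl() wrapper from this change; the event code is a placeholder, not a real Tegra definition:

  #include <linux/videodev2.h>
  #include <string.h>

  // Hypothetical private event signalling "decoder initialized".
  static const __u32 kDecoderInitializedEvent = V4L2_EVENT_PRIVATE_START + 1;

  bool SubscribeToInitEvent(V4L2Device* device) {
    struct v4l2_event_subscription sub;
    memset(&sub, 0, sizeof(sub));
    sub.type = kDecoderInitializedEvent;
    return device->Ioctl(VIDIOC_SUBSCRIBE_EVENT, &sub) == 0;
  }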
https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; On 2014/03/07 20:31:50, sheu wrote: > On 2014/03/07 17:31:40, shivdasp wrote: > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > what > > the bug really was. We did not try to check for format info if we were out of > > buffers while in kInitialized state. > > I am not sure how does starting of device poll earlier will solve this stuck > up > > problem. I will test it on my monday and update. > > The problem I see is that if we do not allocate buffers on CAPTURE plane > (which > > will happen after formatinfo is set) the buffers on OUTPUT plane will not get > > consumed and hence will not be dequeued. So even with device polling we cannot > > send more bitstream. We should have got the FormatInfo() after processing the > > first buffer itself but if the detection is getting delayed we should keep > > checking for it while sending more bitstream buffers. > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > We seem to be allergic to "goto" statements in Chrome, for not entirely bad > > > reasons. > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems > to > > me > > > that if we're out of buffers, whether we have format info or not does not > > > correlate with whether we'll soon be getting buffers. > > > > Ah, i see how it is. So if I have this right: the problem is that we queue data > to the OUTPUT queue for the decoder to initialize, but the decoder > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue may > not succeed immediately; instead at some undefined point in the future the > decoder initialization will finish and VIDIOC_G_FMT will start returning the > proper format. This is a problem in TegraVDA since when we fail to get the > format after each buffer we queue, we continue trying to queue buffers; the race > is between the decoder initialization finishing and the VDA running out of > buffers to continue to enqueue with. > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue is > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know for > sure whether the decoder has initialized with the given input, or not. That > brings to mind two possible solutions in the TegraVDA case: > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the Exynos > case > 2. Add a notification system for decoder initialization. I'd go with the > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > V4L2_EVENT_PRIVATE_START). > > (1) would be a faster fix than (2). Tight-loop polling (by constant reposting > of the task) is not the way to go. Random thought for posciak@: if we make the decoder init notify through the event system, we might even be able to unify this with the resolution change event handling. WDYT?
https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; Ohh that's why this never happens on Exynos. Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous looks the quicker solution and we wouldn't have to change a thing in VDA. I will make that change. Will re-post a patch removing this fix in VDA and addressing your comments too. Thanks. On 2014/03/07 21:36:29, sheu wrote: > On 2014/03/07 20:31:50, sheu wrote: > > On 2014/03/07 17:31:40, shivdasp wrote: > > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > > what > > > the bug really was. We did not try to check for format info if we were out > of > > > buffers while in kInitialized state. > > > I am not sure how does starting of device poll earlier will solve this stuck > > up > > > problem. I will test it on my monday and update. > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > (which > > > will happen after formatinfo is set) the buffers on OUTPUT plane will not > get > > > consumed and hence will not be dequeued. So even with device polling we > cannot > > > send more bitstream. We should have got the FormatInfo() after processing > the > > > first buffer itself but if the detection is getting delayed we should keep > > > checking for it while sending more bitstream buffers. > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > We seem to be allergic to "goto" statements in Chrome, for not entirely > bad > > > > reasons. > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems > > to > > > me > > > > that if we're out of buffers, whether we have format info or not does not > > > > correlate with whether we'll soon be getting buffers. > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we queue > data > > to the OUTPUT queue for the decoder to initialize, but the decoder > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue > may > > not succeed immediately; instead at some undefined point in the future the > > decoder initialization will finish and VIDIOC_G_FMT will start returning the > > proper format. This is a problem in TegraVDA since when we fail to get the > > format after each buffer we queue, we continue trying to queue buffers; the > race > > is between the decoder initialization finishing and the VDA running out of > > buffers to continue to enqueue with. > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue > is > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know > for > > sure whether the decoder has initialized with the given input, or not. That > > brings to mind two possible solutions in the TegraVDA case: > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > Exynos > > case > > 2. Add a notification system for decoder initialization. I'd go with the > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > V4L2_EVENT_PRIVATE_START). > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant reposting > > of the task) is not the way to go. 
> > Random thought for posciak@: if we make the decoder init notify through the > event system, we might even be able to unify this with the resolution change > event handling. WDYT?
On 2014/03/07 21:36:29, sheu wrote: > https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > chk_format_info; > On 2014/03/07 20:31:50, sheu wrote: > > On 2014/03/07 17:31:40, shivdasp wrote: > > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > > what > > > the bug really was. We did not try to check for format info if we were out > of > > > buffers while in kInitialized state. > > > I am not sure how does starting of device poll earlier will solve this stuck > > up > > > problem. I will test it on my monday and update. > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > (which > > > will happen after formatinfo is set) the buffers on OUTPUT plane will not > get > > > consumed and hence will not be dequeued. So even with device polling we > cannot > > > send more bitstream. We should have got the FormatInfo() after processing > the > > > first buffer itself but if the detection is getting delayed we should keep > > > checking for it while sending more bitstream buffers. > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > We seem to be allergic to "goto" statements in Chrome, for not entirely > bad > > > > reasons. > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems > > to > > > me > > > > that if we're out of buffers, whether we have format info or not does not > > > > correlate with whether we'll soon be getting buffers. > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we queue > data > > to the OUTPUT queue for the decoder to initialize, but the decoder > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue > may > > not succeed immediately; instead at some undefined point in the future the > > decoder initialization will finish and VIDIOC_G_FMT will start returning the > > proper format. This is a problem in TegraVDA since when we fail to get the > > format after each buffer we queue, we continue trying to queue buffers; the > race > > is between the decoder initialization finishing and the VDA running out of > > buffers to continue to enqueue with. > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue > is > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know > for > > sure whether the decoder has initialized with the given input, or not. That > > brings to mind two possible solutions in the TegraVDA case: > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > Exynos > > case > > 2. Add a notification system for decoder initialization. I'd go with the > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > V4L2_EVENT_PRIVATE_START). > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant reposting > > of the task) is not the way to go. > > Random thought for posciak@: if we make the decoder init notify through the > event system, we might even be able to unify this with the resolution change > event handling. WDYT? Hm, this is actually quite a neat idea... I like it. I discussed this with other V4L2 developers, but we haven't arrived at a consensus yet. I believe this is better than G_FMT though. 
So this should be a good solution at least for the short term. We can change the behavior in Exynos to use the event for initial G_FMT as well (you could do an ifdef on Exynos in the meantime). Shivdas: what do you think about this (using the resolution change event for initial G_FMT as well)?
On 2014/03/10 05:58:12, shivdasp wrote: > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > chk_format_info; > Ohh that's why this never happens on Exynos. > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > looks the quicker solution and we wouldn't have to change a thing in VDA. > I will make that change. > Will re-post a patch removing this fix in VDA and addressing your comments too. > Does that mean you'd require the first buffer that is queued by client to contain all the info required to make G_FMT work? > Thanks. > On 2014/03/07 21:36:29, sheu wrote: > > On 2014/03/07 20:31:50, sheu wrote: > > > On 2014/03/07 17:31:40, shivdasp wrote: > > > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > > > what > > > > the bug really was. We did not try to check for format info if we were out > > of > > > > buffers while in kInitialized state. > > > > I am not sure how does starting of device poll earlier will solve this > stuck > > > up > > > > problem. I will test it on my monday and update. > > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > > (which > > > > will happen after formatinfo is set) the buffers on OUTPUT plane will not > > get > > > > consumed and hence will not be dequeued. So even with device polling we > > cannot > > > > send more bitstream. We should have got the FormatInfo() after processing > > the > > > > first buffer itself but if the detection is getting delayed we should keep > > > > checking for it while sending more bitstream buffers. > > > > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > > We seem to be allergic to "goto" statements in Chrome, for not entirely > > bad > > > > > reasons. > > > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It > seems > > > to > > > > me > > > > > that if we're out of buffers, whether we have format info or not does > not > > > > > correlate with whether we'll soon be getting buffers. > > > > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we queue > > data > > > to the OUTPUT queue for the decoder to initialize, but the decoder > > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue > > may > > > not succeed immediately; instead at some undefined point in the future the > > > decoder initialization will finish and VIDIOC_G_FMT will start returning the > > > proper format. This is a problem in TegraVDA since when we fail to get the > > > format after each buffer we queue, we continue trying to queue buffers; the > > race > > > is between the decoder initialization finishing and the VDA running out of > > > buffers to continue to enqueue with. > > > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE > queue > > is > > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know > > for > > > sure whether the decoder has initialized with the given input, or not. That > > > brings to mind two possible solutions in the TegraVDA case: > > > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > > Exynos > > > case > > > 2. Add a notification system for decoder initialization. 
I'd go with the > > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > > V4L2_EVENT_PRIVATE_START). > > > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant > reposting > > > of the task) is not the way to go. > > > > Random thought for posciak@: if we make the decoder init notify through the > > event system, we might even be able to unify this with the resolution change > > event handling. WDYT?
On 2014/03/10 11:31:48, Pawel Osciak wrote: > Hm, this is actually quite a neat idea... I like it. > > I discussed this with other V4L2 developers, but we haven't arrived at a > consensus yet. I believe this is better than G_FMT though. So this should be a > good solution at least for the short term. > We can change the behavior in Exynos to use the event for initial G_FMT as well > (you could do an ifdef on Exynos in the meantime). > > Shivdas: what do you think about this (using the resolution change event for > initial G_FMT as well)? Is this a discussion taking place on linux-media? Or elsewhere?
On 2014/03/10 19:42:18, sheu wrote: > On 2014/03/10 11:31:48, Pawel Osciak wrote: > > Hm, this is actually quite a neat idea... I like it. > > > > I discussed this with other V4L2 developers, but we haven't arrived at a > > consensus yet. I believe this is better than G_FMT though. So this should be a > > good solution at least for the short term. > > We can change the behavior in Exynos to use the event for initial G_FMT as > well > > (you could do an ifdef on Exynos in the meantime). > > > > Shivdas: what do you think about this (using the resolution change event for > > initial G_FMT as well)? > > Is this a discussion taking place on linux-media? Or elsewhere? Currently in #v4l on freenode. But at some point I'll post an RFC to linux-media.
On 2014/03/10 11:32:54, Pawel Osciak wrote: > On 2014/03/10 05:58:12, shivdasp wrote: > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > chk_format_info; > > Ohh that's why this never happens on Exynos. > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > > looks the quicker solution and we wouldn't have to change a thing in VDA. > > I will make that change. > > Will re-post a patch removing this fix in VDA and addressing your comments > too. > > > > Does that mean you'd require the first buffer that is queued by client to > contain all the info required to make G_FMT work? > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a buffer on OUTPUT PLANE with all info required for the decode to initialize correctly. When I try to make it synchronous I sometimes see that the VDA might not have submitted any buffer in which case the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder thread to wait indefinitely. I can have timeouts but the timeouts may also race with VDA ending up with input buffers. How is the VIDIOC_F_FMT synchronous implemented in Exynos ? I would lean towards event based mechanism rather than synchronous behavior to avoid any deadlock issues like above. I can add event based mechanism but there is no compile-time flag to work this only for Tegra. Should we attempt to restructure the DecodeBufferInitial() to try and enqueue input buffers and try GetFormatInfo() too. This would help us get the VDA to work without any implementation deviations. Thanks > > Thanks. > > On 2014/03/07 21:36:29, sheu wrote: > > > On 2014/03/07 20:31:50, sheu wrote: > > > > On 2014/03/07 17:31:40, shivdasp wrote: > > > > > Jumping to GetFormatInfo() will move us out of kInitialized state which > is > > > > what > > > > > the bug really was. We did not try to check for format info if we were > out > > > of > > > > > buffers while in kInitialized state. > > > > > I am not sure how does starting of device poll earlier will solve this > > stuck > > > > up > > > > > problem. I will test it on my monday and update. > > > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > > > (which > > > > > will happen after formatinfo is set) the buffers on OUTPUT plane will > not > > > get > > > > > consumed and hence will not be dequeued. So even with device polling we > > > cannot > > > > > send more bitstream. We should have got the FormatInfo() after > processing > > > the > > > > > first buffer itself but if the detection is getting delayed we should > keep > > > > > checking for it while sending more bitstream buffers. > > > > > > > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > > > We seem to be allergic to "goto" statements in Chrome, for not > entirely > > > bad > > > > > > reasons. > > > > > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It > > seems > > > > to > > > > > me > > > > > > that if we're out of buffers, whether we have format info or not does > > not > > > > > > correlate with whether we'll soon be getting buffers. > > > > > > > > > > > > > Ah, i see how it is. 
So if I have this right: the problem is that we > queue > > > data > > > > to the OUTPUT queue for the decoder to initialize, but the decoder > > > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE > queue > > > may > > > > not succeed immediately; instead at some undefined point in the future the > > > > decoder initialization will finish and VIDIOC_G_FMT will start returning > the > > > > proper format. This is a problem in TegraVDA since when we fail to get > the > > > > format after each buffer we queue, we continue trying to queue buffers; > the > > > race > > > > is between the decoder initialization finishing and the VDA running out of > > > > buffers to continue to enqueue with. > > > > > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE > > queue > > > is > > > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we > know > > > for > > > > sure whether the decoder has initialized with the given input, or not. > That > > > > brings to mind two possible solutions in the TegraVDA case: > > > > > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > > > Exynos > > > > case > > > > 2. Add a notification system for decoder initialization. I'd go with the > > > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > > > V4L2_EVENT_PRIVATE_START). > > > > > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant > > reposting > > > > of the task) is not the way to go. > > > > > > Random thought for posciak@: if we make the decoder init notify through the > > > event system, we might even be able to unify this with the resolution change > > > event handling. WDYT?
On 2014/03/11 06:25:19, shivdasp wrote: > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > chk_format_info; > > > Ohh that's why this never happens on Exynos. > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > > > looks the quicker solution and we wouldn't have to change a thing in VDA. > > > I will make that change. > > > Will re-post a patch removing this fix in VDA and addressing your comments > > too. > > > > > > > Does that mean you'd require the first buffer that is queued by client to > > contain all the info required to make G_FMT work? > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a buffer > on OUTPUT PLANE with all info required > for the decode to initialize correctly. > When I try to make it synchronous I sometimes see that the VDA might not have > submitted any buffer in which case > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder > thread to wait indefinitely. > I can have timeouts but the timeouts may also race with VDA ending up with input > buffers. > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > It's not really, there is a silent assumption that it will work on the first buffer queued. > I would lean towards event based mechanism rather than synchronous behavior to > avoid any deadlock issues like above. > I can add event based mechanism but there is no compile-time flag to work this > only for Tegra. > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to make it work already? If you keep going (assuming you have https://codereview.chromium.org/189993002/), Dequeue() will get your event and trigger a resolution change. DestroyOutputBuffers should then not do anything apart from calling reqbufs(0), which is ok to call even if there are no buffers allocated from the API perspective. And then it will go on. So I think it should just work if you simply add the event to Tegra (and have https://codereview.chromium.org/189993002/) without making any changes to the class? > Should we attempt to restructure the DecodeBufferInitial() to try and enqueue > input buffers and try GetFormatInfo() too. > This would help us get the VDA to work without any implementation deviations. > > Thanks > > > > Thanks. > > > On 2014/03/07 21:36:29, sheu wrote: > > > > On 2014/03/07 20:31:50, sheu wrote: > > > > > On 2014/03/07 17:31:40, shivdasp wrote: > > > > > > Jumping to GetFormatInfo() will move us out of kInitialized state > which > > is > > > > > what > > > > > > the bug really was. We did not try to check for format info if we were > > out > > > > of > > > > > > buffers while in kInitialized state. > > > > > > I am not sure how does starting of device poll earlier will solve this > > > stuck > > > > > up > > > > > > problem. I will test it on my monday and update. > > > > > > The problem I see is that if we do not allocate buffers on CAPTURE > plane > > > > > (which > > > > > > will happen after formatinfo is set) the buffers on OUTPUT plane will > > not > > > > get > > > > > > consumed and hence will not be dequeued. 
So even with device polling > we > > > > cannot > > > > > > send more bitstream. We should have got the FormatInfo() after > > processing > > > > the > > > > > > first buffer itself but if the detection is getting delayed we should > > keep > > > > > > checking for it while sending more bitstream buffers. > > > > > > > > > > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > > > > We seem to be allergic to "goto" statements in Chrome, for not > > entirely > > > > bad > > > > > > > reasons. > > > > > > > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It > > > seems > > > > > to > > > > > > me > > > > > > > that if we're out of buffers, whether we have format info or not > does > > > not > > > > > > > correlate with whether we'll soon be getting buffers. > > > > > > > > > > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we > > queue > > > > data > > > > > to the OUTPUT queue for the decoder to initialize, but the decoder > > > > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE > > queue > > > > may > > > > > not succeed immediately; instead at some undefined point in the future > the > > > > > decoder initialization will finish and VIDIOC_G_FMT will start returning > > the > > > > > proper format. This is a problem in TegraVDA since when we fail to get > > the > > > > > format after each buffer we queue, we continue trying to queue buffers; > > the > > > > race > > > > > is between the decoder initialization finishing and the VDA running out > of > > > > > buffers to continue to enqueue with. > > > > > > > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE > > > queue > > > > is > > > > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we > > know > > > > for > > > > > sure whether the decoder has initialized with the given input, or not. > > That > > > > > brings to mind two possible solutions in the TegraVDA case: > > > > > > > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > > > > Exynos > > > > > case > > > > > 2. Add a notification system for decoder initialization. I'd go with > the > > > > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > > > > V4L2_EVENT_PRIVATE_START). > > > > > > > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant > > > reposting > > > > > of the task) is not the way to go. > > > > > > > > Random thought for posciak@: if we make the decoder init notify through > the > > > > event system, we might even be able to unify this with the resolution > change > > > > event handling. WDYT?
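On the reqbufs(0) point above, a small sketch of why calling it unconditionally is safe: VIDIOC_REQBUFS with count = 0 releases whatever CAPTURE buffers exist and is effectively a no-op if none were ever allocated. The device_ Ioctl wrapper and logging are assumptions matching the rest of this change:

  struct v4l2_requestbuffers reqbufs;
  memset(&reqbufs, 0, sizeof(reqbufs));
  reqbufs.count = 0;  // Free all CAPTURE buffers; harmless if none exist yet.
  reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  reqbufs.memory = V4L2_MEMORY_MMAP;
  if (device_->Ioctl(VIDIOC_REQBUFS, &reqbufs) != 0)
    DLOG(ERROR) << "DestroyOutputBuffers(): VIDIOC_REQBUFS failed";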
On 2014/03/12 06:30:33, Pawel Osciak wrote: > On 2014/03/11 06:25:19, shivdasp wrote: > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > chk_format_info; > > > > Ohh that's why this never happens on Exynos. > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > > > > looks the quicker solution and we wouldn't have to change a thing in VDA. > > > > I will make that change. > > > > Will re-post a patch removing this fix in VDA and addressing your comments > > > too. > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client to > > > contain all the info required to make G_FMT work? > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a > buffer > > on OUTPUT PLANE with all info required > > for the decode to initialize correctly. > > When I try to make it synchronous I sometimes see that the VDA might not have > > submitted any buffer in which case > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder > > thread to wait indefinitely. > > I can have timeouts but the timeouts may also race with VDA ending up with > input > > buffers. > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > It's not really, there is a silent assumption that it will work on the first > buffer queued. > > > I would lean towards event based mechanism rather than synchronous behavior to > > avoid any deadlock issues like above. > > I can add event based mechanism but there is no compile-time flag to work this > > only for Tegra. > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to > make it work already? If you keep going (assuming you have > https://codereview.chromium.org/189993002/), Dequeue() will get your event and > trigger a resolution change. DestroyOutputBuffers should then not do anything > apart from calling reqbufs(0), which is ok to call even if there are no buffers > allocated from the API perspective. And then it will go on. > > So I think it should just work if you simply add the event to Tegra (and have > https://codereview.chromium.org/189993002/) without making any changes to the > class? > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when the capture format is set for the first time. I thought we were going to introduce another event for this that the decoder is initialized. Let me take https://codereview.chromium.org/189993002/ and add the resolution change event and see how far can it go. Will update.
On 2014/03/12 06:46:15, shivdasp wrote: > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > On 2014/03/11 06:25:19, shivdasp wrote: > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > > chk_format_info; > > > > > Ohh that's why this never happens on Exynos. > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > synchronous > > > > > looks the quicker solution and we wouldn't have to change a thing in > VDA. > > > > > I will make that change. > > > > > Will re-post a patch removing this fix in VDA and addressing your > comments > > > > too. > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client to > > > > contain all the info required to make G_FMT work? > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a > > buffer > > > on OUTPUT PLANE with all info required > > > for the decode to initialize correctly. > > > When I try to make it synchronous I sometimes see that the VDA might not > have > > > submitted any buffer in which case > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder > > > thread to wait indefinitely. > > > I can have timeouts but the timeouts may also race with VDA ending up with > > input > > > buffers. > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > It's not really, there is a silent assumption that it will work on the first > > buffer queued. > > > > > I would lean towards event based mechanism rather than synchronous behavior > to > > > avoid any deadlock issues like above. > > > I can add event based mechanism but there is no compile-time flag to work > this > > > only for Tegra. > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to > > make it work already? If you keep going (assuming you have > > https://codereview.chromium.org/189993002/), Dequeue() will get your event and > > trigger a resolution change. DestroyOutputBuffers should then not do anything > > apart from calling reqbufs(0), which is ok to call even if there are no > buffers > > allocated from the API perspective. And then it will go on. > > > > So I think it should just work if you simply add the event to Tegra (and have > > https://codereview.chromium.org/189993002/) without making any changes to the > > class? > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when the > capture format is set for the first time. Yes exactly. Should just work. > I thought we were going to introduce another event for this that the decoder is > initialized. > Let me take https://codereview.chromium.org/189993002/ and add the resolution > change event and see how far can it go. > Will update.
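A hedged sketch of how the decoder thread could pick this up, assuming a resolution-change event type and a pending flag similar to what the referenced CL (189993002) introduces; the member and constant names are stand-ins, not the final code:

  void V4L2VideoDecodeAccelerator::DequeueEvents() {
    struct v4l2_event ev;
    memset(&ev, 0, sizeof(ev));
    // Drain all pending events; the initial-format case and a genuine
    // mid-stream resolution change both land on the same path.
    while (device_->Ioctl(VIDIOC_DQEVENT, &ev) == 0) {
      if (ev.type == V4L2_EVENT_RESOLUTION_CHANGE)
        resolution_change_pending_ = true;
      memset(&ev, 0, sizeof(ev));
    }
  }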
On 2014/03/12 06:49:57, Pawel Osciak wrote: > On 2014/03/12 06:46:15, shivdasp wrote: > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > > > chk_format_info; > > > > > > Ohh that's why this never happens on Exynos. > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > synchronous > > > > > > looks the quicker solution and we wouldn't have to change a thing in > > VDA. > > > > > > I will make that change. > > > > > > Will re-post a patch removing this fix in VDA and addressing your > > comments > > > > > too. > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client > to > > > > > contain all the info required to make G_FMT work? > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a > > > buffer > > > > on OUTPUT PLANE with all info required > > > > for the decode to initialize correctly. > > > > When I try to make it synchronous I sometimes see that the VDA might not > > have > > > > submitted any buffer in which case > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > decoder > > > > thread to wait indefinitely. > > > > I can have timeouts but the timeouts may also race with VDA ending up with > > > input > > > > buffers. > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > It's not really, there is a silent assumption that it will work on the first > > > buffer queued. > > > > > > > I would lean towards event based mechanism rather than synchronous > behavior > > to > > > > avoid any deadlock issues like above. > > > > I can add event based mechanism but there is no compile-time flag to work > > this > > > > only for Tegra. > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to > > > make it work already? If you keep going (assuming you have > > > https://codereview.chromium.org/189993002/), Dequeue() will get your event > and > > > trigger a resolution change. DestroyOutputBuffers should then not do > anything > > > apart from calling reqbufs(0), which is ok to call even if there are no > > buffers > > > allocated from the API perspective. And then it will go on. > > > > > > So I think it should just work if you simply add the event to Tegra (and > have > > > https://codereview.chromium.org/189993002/) without making any changes to > the > > > class? > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when the > > capture format is set for the first time. > > Yes exactly. Should just work. As I am making these changes it occurred to me that with this way of doing it, we might do one unnecessary resolution change (freeing and re-allocation of buffers) if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which will cause a slight jitter. > > > I thought we were going to introduce another event for this that the decoder > is > > initialized. 
> > Let me take https://codereview.chromium.org/189993002/ and add the resolution > > change event and see how far can it go. > > Will update.
On 2014/03/12 09:35:08, shivdasp wrote: > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > On 2014/03/12 06:46:15, shivdasp wrote: > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > > > > chk_format_info; > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > synchronous > > > > > > > looks the quicker solution and we wouldn't have to change a thing in > > > VDA. > > > > > > > I will make that change. > > > > > > > Will re-post a patch removing this fix in VDA and addressing your > > > comments > > > > > > too. > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client > > to > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued > a > > > > buffer > > > > > on OUTPUT PLANE with all info required > > > > > for the decode to initialize correctly. > > > > > When I try to make it synchronous I sometimes see that the VDA might not > > > have > > > > > submitted any buffer in which case > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > > decoder > > > > > thread to wait indefinitely. > > > > > I can have timeouts but the timeouts may also race with VDA ending up > with > > > > input > > > > > buffers. > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on the > first > > > > buffer queued. > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > behavior > > > to > > > > > avoid any deadlock issues like above. > > > > > I can add event based mechanism but there is no compile-time flag to > work > > > this > > > > > only for Tegra. > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able > to > > > > make it work already? If you keep going (assuming you have > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your event > > and > > > > trigger a resolution change. DestroyOutputBuffers should then not do > > anything > > > > apart from calling reqbufs(0), which is ok to call even if there are no > > > buffers > > > > allocated from the API perspective. And then it will go on. > > > > > > > > So I think it should just work if you simply add the event to Tegra (and > > have > > > > https://codereview.chromium.org/189993002/) without making any changes to > > the > > > > class? > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when > the > > > capture format is set for the first time. > > > > Yes exactly. Should just work. 
> As I am making these changes it occurred to me that with this way of doing it, > we might do one unnecessary resolution change (freeing and re-allocation of > buffers) > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which will > cause a slight jitter. True. But this is temporary until we implement it for Exynos and remove it from DBI() entirely. > > > > > > I thought we were going to introduce another event for this that the decoder > > is > > > initialized. > > > Let me take https://codereview.chromium.org/189993002/ and add the > resolution > > > change event and see how far can it go. > > > Will update.
On 2014/03/12 09:54:09, Pawel Osciak wrote: > On 2014/03/12 09:35:08, shivdasp wrote: > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > goto > > > > > > > > chk_format_info; > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > > synchronous > > > > > > > > looks the quicker solution and we wouldn't have to change a thing > in > > > > VDA. > > > > > > > > I will make that change. > > > > > > > > Will re-post a patch removing this fix in VDA and addressing your > > > > comments > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by > client > > > to > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > enqueued > > a > > > > > buffer > > > > > > on OUTPUT PLANE with all info required > > > > > > for the decode to initialize correctly. > > > > > > When I try to make it synchronous I sometimes see that the VDA might > not > > > > have > > > > > > submitted any buffer in which case > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > > > decoder > > > > > > thread to wait indefinitely. > > > > > > I can have timeouts but the timeouts may also race with VDA ending up > > with > > > > > input > > > > > > buffers. > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on the > > first > > > > > buffer queued. > > > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > > behavior > > > > to > > > > > > avoid any deadlock issues like above. > > > > > > I can add event based mechanism but there is no compile-time flag to > > work > > > > this > > > > > > only for Tegra. > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be > able > > to > > > > > make it work already? If you keep going (assuming you have > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your > event > > > and > > > > > trigger a resolution change. DestroyOutputBuffers should then not do > > > anything > > > > > apart from calling reqbufs(0), which is ok to call even if there are no > > > > buffers > > > > > allocated from the API perspective. And then it will go on. > > > > > > > > > > So I think it should just work if you simply add the event to Tegra (and > > > have > > > > > https://codereview.chromium.org/189993002/) without making any changes > to > > > the > > > > > class? > > > > > > > > > Ahh, I get it now. 
So you are saying use the RESOLUTION_CHANGE event when > > the > > > > capture format is set for the first time. > > > > > > Yes exactly. Should just work. > > As I am making these changes it occurred to me that with this way of doing it, > > we might do one unnecessary resolution change (freeing and re-allocation of > > buffers) > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which > will > > cause a slight jitter. > > True. But this is temporary until we implement it for Exynos and remove it from > DBI() entirely. Tried doing this changes and I see a couple of issues. There is a race condition between StartDevicePoll() and decoder_thread_ is created I have updated that in that CL. Secondly when I get past it, the check in rendering_helper.cc fails. https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... I will debug more on what exactly fails here since I spent most of the time in debugging the race condition. > > > > > > > > > > I thought we were going to introduce another event for this that the > decoder > > > is > > > > initialized. > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > resolution > > > > change event and see how far can it go. > > > > Will update.
On 2014/03/12 12:16:14, shivdasp wrote: > On 2014/03/12 09:54:09, Pawel Osciak wrote: > > On 2014/03/12 09:35:08, shivdasp wrote: > > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > > goto > > > > > > > > > chk_format_info; > > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > > > synchronous > > > > > > > > > looks the quicker solution and we wouldn't have to change a > thing > > in > > > > > VDA. > > > > > > > > > I will make that change. > > > > > > > > > Will re-post a patch removing this fix in VDA and addressing > your > > > > > comments > > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by > > client > > > > to > > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > > enqueued > > > a > > > > > > buffer > > > > > > > on OUTPUT PLANE with all info required > > > > > > > for the decode to initialize correctly. > > > > > > > When I try to make it synchronous I sometimes see that the VDA might > > not > > > > > have > > > > > > > submitted any buffer in which case > > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > > > > decoder > > > > > > > thread to wait indefinitely. > > > > > > > I can have timeouts but the timeouts may also race with VDA ending > up > > > with > > > > > > input > > > > > > > buffers. > > > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on the > > > first > > > > > > buffer queued. > > > > > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > > > behavior > > > > > to > > > > > > > avoid any deadlock issues like above. > > > > > > > I can add event based mechanism but there is no compile-time flag to > > > work > > > > > this > > > > > > > only for Tegra. > > > > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be > > able > > > to > > > > > > make it work already? If you keep going (assuming you have > > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your > > event > > > > and > > > > > > trigger a resolution change. DestroyOutputBuffers should then not do > > > > anything > > > > > > apart from calling reqbufs(0), which is ok to call even if there are > no > > > > > buffers > > > > > > allocated from the API perspective. And then it will go on. 
> > > > > > > > > > > > So I think it should just work if you simply add the event to Tegra > (and > > > > have > > > > > > https://codereview.chromium.org/189993002/) without making any changes > > to > > > > the > > > > > > class? > > > > > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event > when > > > the > > > > > capture format is set for the first time. > > > > > > > > Yes exactly. Should just work. > > > As I am making these changes it occurred to me that with this way of doing > it, > > > we might do one unnecessary resolution change (freeing and re-allocation of > > > buffers) > > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which > > will > > > cause a slight jitter. > > > > True. But this is temporary until we implement it for Exynos and remove it > from > > DBI() entirely. > Tried doing this changes and I see a couple of issues. > There is a race condition between StartDevicePoll() and decoder_thread_ is > created I have updated that in that CL. > > Secondly when I get past it, the check in rendering_helper.cc fails. > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > I will debug more on what exactly fails here since I spent most of the time in > debugging the race condition. > Okay so I investigated why the condition fails in rendering_helper.cc line #411. It fails when we do ProvidePictureBuffers() twice. Once when the G_FMT actually succeeds in DBI() and next when the DequeueEvents() finds the resolution change event. I guess the vdatest is not equipped to handle resolution change scenario ? Is it true ? This does not fail all the time but it does fail about 50% of the time. I am now testing this with browser to see if there are any issues with it. > > > > > > > > > > > > > > I thought we were going to introduce another event for this that the > > decoder > > > > is > > > > > initialized. > > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > > resolution > > > > > change event and see how far can it go. > > > > > Will update.
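For reference, a minimal sketch of the kind of ordering fix being described for the other CL, assuming decoder_thread_ is a base::Thread as in the existing V4L2 VDA; the parameter list and exact placement here are assumptions, not the actual patch:

  bool V4L2VideoDecodeAccelerator::Initialize(/* existing parameters */) {
    // Start the decoder thread before anything can post work to it; posting a
    // task that ends up in StartDevicePoll() before Start() has completed is
    // the race described above.
    if (!decoder_thread_.Start()) {
      DLOG(ERROR) << "Initialize(): decoder thread failed to start";
      return false;
    }
    // StartDevicePoll() returns bool, so drop the result when posting it.
    decoder_thread_.message_loop()->PostTask(
        FROM_HERE,
        base::Bind(base::IgnoreResult(
                       &V4L2VideoDecodeAccelerator::StartDevicePoll),
                   base::Unretained(this)));
    return true;
  }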
On 2014/03/13 07:14:56, shivdasp wrote: > On 2014/03/12 12:16:14, shivdasp wrote: > > On 2014/03/12 09:54:09, Pawel Osciak wrote: > > > On 2014/03/12 09:35:08, shivdasp wrote: > > > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > > > goto > > > > > > > > > > chk_format_info; > > > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > > > > synchronous > > > > > > > > > > looks the quicker solution and we wouldn't have to change a > > thing > > > in > > > > > > VDA. > > > > > > > > > > I will make that change. > > > > > > > > > > Will re-post a patch removing this fix in VDA and addressing > > your > > > > > > comments > > > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by > > > client > > > > > to > > > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > > > enqueued > > > > a > > > > > > > buffer > > > > > > > > on OUTPUT PLANE with all info required > > > > > > > > for the decode to initialize correctly. > > > > > > > > When I try to make it synchronous I sometimes see that the VDA > might > > > not > > > > > > have > > > > > > > > submitted any buffer in which case > > > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause > the > > > > > decoder > > > > > > > > thread to wait indefinitely. > > > > > > > > I can have timeouts but the timeouts may also race with VDA ending > > up > > > > with > > > > > > > input > > > > > > > > buffers. > > > > > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on > the > > > > first > > > > > > > buffer queued. > > > > > > > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > > > > behavior > > > > > > to > > > > > > > > avoid any deadlock issues like above. > > > > > > > > I can add event based mechanism but there is no compile-time flag > to > > > > work > > > > > > this > > > > > > > > only for Tegra. > > > > > > > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be > > > able > > > > to > > > > > > > make it work already? If you keep going (assuming you have > > > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your > > > event > > > > > and > > > > > > > trigger a resolution change. 
DestroyOutputBuffers should then not do > > > > > anything > > > > > > > apart from calling reqbufs(0), which is ok to call even if there are > > no > > > > > > buffers > > > > > > > allocated from the API perspective. And then it will go on. > > > > > > > > > > > > > > So I think it should just work if you simply add the event to Tegra > > (and > > > > > have > > > > > > > https://codereview.chromium.org/189993002/) without making any > changes > > > to > > > > > the > > > > > > > class? > > > > > > > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event > > when > > > > the > > > > > > capture format is set for the first time. > > > > > > > > > > Yes exactly. Should just work. > > > > As I am making these changes it occurred to me that with this way of doing > > it, > > > > we might do one unnecessary resolution change (freeing and re-allocation > of > > > > buffers) > > > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself > which > > > will > > > > cause a slight jitter. > > > > > > True. But this is temporary until we implement it for Exynos and remove it > > from > > > DBI() entirely. > > Tried doing this changes and I see a couple of issues. > > There is a race condition between StartDevicePoll() and decoder_thread_ is > > created I have updated that in that CL. > > > > Secondly when I get past it, the check in rendering_helper.cc fails. > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > I will debug more on what exactly fails here since I spent most of the time in > > debugging the race condition. > > > Okay so I investigated why the condition fails in rendering_helper.cc line #411. > It fails when we do ProvidePictureBuffers() twice. > Once when the G_FMT actually succeeds in DBI() and next when the DequeueEvents() > finds the resolution change event. > I guess the vdatest is not equipped to handle resolution change scenario ? Is it > true ? > This does not fail all the time but it does fail about 50% of the time. > I am now testing this with browser to see if there are any issues with it. > Okay so the browser works fine even in the case when G_FMT is successful before dequeuing the resolution change event. But the vdatest fails as described above. Pawel , how should be go about this now ? > > > > > > > > > > > > > > > > > > > I thought we were going to introduce another event for this that the > > > decoder > > > > > is > > > > > > initialized. > > > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > > > resolution > > > > > > change event and see how far can it go. > > > > > > Will update.
On 2014/03/13 10:48:08, shivdasp wrote: > On 2014/03/13 07:14:56, shivdasp wrote: > > On 2014/03/12 12:16:14, shivdasp wrote: > > > On 2014/03/12 09:54:09, Pawel Osciak wrote: > > > > On 2014/03/12 09:35:08, shivdasp wrote: > > > > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > > File > content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > > > > goto > > > > > > > > > > > chk_format_info; > > > > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE > plane > > > > > > > synchronous > > > > > > > > > > > looks the quicker solution and we wouldn't have to change a > > > thing > > > > in > > > > > > > VDA. > > > > > > > > > > > I will make that change. > > > > > > > > > > > Will re-post a patch removing this fix in VDA and addressing > > > your > > > > > > > comments > > > > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued > by > > > > client > > > > > > to > > > > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > > > > enqueued > > > > > a > > > > > > > > buffer > > > > > > > > > on OUTPUT PLANE with all info required > > > > > > > > > for the decode to initialize correctly. > > > > > > > > > When I try to make it synchronous I sometimes see that the VDA > > might > > > > not > > > > > > > have > > > > > > > > > submitted any buffer in which case > > > > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause > > the > > > > > > decoder > > > > > > > > > thread to wait indefinitely. > > > > > > > > > I can have timeouts but the timeouts may also race with VDA > ending > > > up > > > > > with > > > > > > > > input > > > > > > > > > buffers. > > > > > > > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on > > the > > > > > first > > > > > > > > buffer queued. > > > > > > > > > > > > > > > > > I would lean towards event based mechanism rather than > synchronous > > > > > > behavior > > > > > > > to > > > > > > > > > avoid any deadlock issues like above. > > > > > > > > > I can add event based mechanism but there is no compile-time > flag > > to > > > > > work > > > > > > > this > > > > > > > > > only for Tegra. > > > > > > > > > > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you > be > > > > able > > > > > to > > > > > > > > make it work already? 
If you keep going (assuming you have > > > > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get > your > > > > event > > > > > > and > > > > > > > > trigger a resolution change. DestroyOutputBuffers should then not > do > > > > > > anything > > > > > > > > apart from calling reqbufs(0), which is ok to call even if there > are > > > no > > > > > > > buffers > > > > > > > > allocated from the API perspective. And then it will go on. > > > > > > > > > > > > > > > > So I think it should just work if you simply add the event to > Tegra > > > (and > > > > > > have > > > > > > > > https://codereview.chromium.org/189993002/) without making any > > changes > > > > to > > > > > > the > > > > > > > > class? > > > > > > > > > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event > > > when > > > > > the > > > > > > > capture format is set for the first time. > > > > > > > > > > > > Yes exactly. Should just work. > > > > > As I am making these changes it occurred to me that with this way of > doing > > > it, > > > > > we might do one unnecessary resolution change (freeing and re-allocation > > of > > > > > buffers) > > > > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself > > which > > > > will > > > > > cause a slight jitter. > > > > > > > > True. But this is temporary until we implement it for Exynos and remove it > > > from > > > > DBI() entirely. > > > Tried doing this changes and I see a couple of issues. > > > There is a race condition between StartDevicePoll() and decoder_thread_ is > > > created I have updated that in that CL. > > > > > > Secondly when I get past it, the check in rendering_helper.cc fails. > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > I will debug more on what exactly fails here since I spent most of the time > in > > > debugging the race condition. > > > > > Okay so I investigated why the condition fails in rendering_helper.cc line > #411. > > It fails when we do ProvidePictureBuffers() twice. > > Once when the G_FMT actually succeeds in DBI() and next when the > DequeueEvents() > > finds the resolution change event. > > I guess the vdatest is not equipped to handle resolution change scenario ? Is > it > > true ? > > This does not fail all the time but it does fail about 50% of the time. > > I am now testing this with browser to see if there are any issues with it. > > > Okay so the browser works fine even in the case when G_FMT is successful before > dequeuing the resolution change event. > But the vdatest fails as described above. > Pawel , how should be go about this now ? Pawel, How do you suggest we go ahead now ? Having device poll thread start early and using the resolution change event seems breaking the vdatest. Could you re-consider the patchset#7 for now (will address sheu's comments) that has a not-so-good fix of reposting the task which will happen only on Tegra. We anyways have a plan to use the decode initialization event and that will be the neat way of doing this. If you have any other ideas I can try them as well. Thanks > > > > > > > > > > > > > > > > > > > > > > > > I thought we were going to introduce another event for this that the > > > > decoder > > > > > > is > > > > > > > initialized. > > > > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > > > > resolution > > > > > > > change event and see how far can it go. > > > > > > > Will update.
On 2014/03/14 04:48:28, shivdasp wrote: > Pawel, > How do you suggest we go ahead now ? > Having device poll thread start early and using the resolution change event > seems breaking the vdatest. > Could you re-consider the patchset#7 for now (will address sheu's comments) that > has a not-so-good fix of reposting the task which will happen only on Tegra. We > anyways have a plan to use the decode initialization event and that will be the > neat way of doing this. > If you have any other ideas I can try them as well. Shivdas, Why are we doing ProvidePictureBuffers() twice? We should be seeing only one resolution change (i.e. the initial change). We should rather fix vdatest than hack V4L2VDA, but I don't understand why this becomes a resolution change event with us calling PPB() twice. Could you explain? Thanks, P.
On 2014/03/18 05:49:59, Pawel Osciak wrote: > On 2014/03/14 04:48:28, shivdasp wrote: > > Pawel, > > How do you suggest we go ahead now ? > > Having device poll thread start early and using the resolution change event > > seems breaking the vdatest. > > Could you re-consider the patchset#7 for now (will address sheu's comments) > that > > has a not-so-good fix of reposting the task which will happen only on Tegra. > We > > anyways have a plan to use the decode initialization event and that will be > the > > neat way of doing this. > > If you have any other ideas I can try them as well. > > Shivdas, > Why are we doing ProvidePictureBuffers() twice? We should be seeing only one > resolution change (i.e. the initial change). > We should rather fix vdatest than hack V4L2VDA, but I don't understand why this > becomes a resolution change event with us calling PPB() twice. > Could you explain? The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture buffers are requested. The second PPB() happens because a RESOLUTION_CHANGE event is already enqueued underneath and it is dequeued() after picture buffers were already created. So there's no way to identify a real resolution change from the initial decoder init request. > Thanks, > P.
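To illustrate the ambiguity described here, one conceivable way to turn the second signal into a no-op is to compare the reported coded size against what is already allocated. This is only a sketch: output_buffer_map_, frame_buffer_size_ and the GetFormatInfo() signature are assumptions about the existing VDA, and it is not what the CL ends up doing.

  void V4L2VideoDecodeAccelerator::ResolutionChangeTask() {
    struct v4l2_format format;
    bool again = false;
    if (!GetFormatInfo(&format, &again) || again)
      return;  // CAPTURE format not known yet; wait for a later event.

    gfx::Size new_size(format.fmt.pix_mp.width, format.fmt.pix_mp.height);
    if (!output_buffer_map_.empty() && new_size == frame_buffer_size_) {
      // Buffers already allocated for this size: this was only the initial
      // decoder-init signal, not a real mid-stream resolution change.
      return;
    }
    // Otherwise run the real destroy/reallocate sequence for the new size.
  }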
On 2014/03/18 06:08:49, shivdasp wrote: > On 2014/03/18 05:49:59, Pawel Osciak wrote: > > On 2014/03/14 04:48:28, shivdasp wrote: > > > Pawel, > > > How do you suggest we go ahead now ? > > > Having device poll thread start early and using the resolution change event > > > seems breaking the vdatest. > > > Could you re-consider the patchset#7 for now (will address sheu's comments) > > that > > > has a not-so-good fix of reposting the task which will happen only on Tegra. > > We > > > anyways have a plan to use the decode initialization event and that will be > > the > > > neat way of doing this. > > > If you have any other ideas I can try them as well. > > > > Shivdas, > > Why are we doing ProvidePictureBuffers() twice? We should be seeing only one > > resolution change (i.e. the initial change). > > We should rather fix vdatest than hack V4L2VDA, but I don't understand why > this > > becomes a resolution change event with us calling PPB() twice. > > Could you explain? > The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture > buffers are requested. > The second PPB() happens because a RESOLUTION_CHANGE event is already enqueued > underneath and it is dequeued() after picture buffers were already created. So > there's no way to identify a real resolution change from the initial decoder > init request. Could we fix rendering helper to reallocate textures?
On 2014/03/18 06:15:28, Pawel Osciak wrote: > On 2014/03/18 06:08:49, shivdasp wrote: > > On 2014/03/18 05:49:59, Pawel Osciak wrote: > > > On 2014/03/14 04:48:28, shivdasp wrote: > > > > Pawel, > > > > How do you suggest we go ahead now ? > > > > Having device poll thread start early and using the resolution change > event > > > > seems breaking the vdatest. > > > > Could you re-consider the patchset#7 for now (will address sheu's > comments) > > > that > > > > has a not-so-good fix of reposting the task which will happen only on > Tegra. > > > We > > > > anyways have a plan to use the decode initialization event and that will > be > > > the > > > > neat way of doing this. > > > > If you have any other ideas I can try them as well. > > > > > > Shivdas, > > > Why are we doing ProvidePictureBuffers() twice? We should be seeing only one > > > resolution change (i.e. the initial change). > > > We should rather fix vdatest than hack V4L2VDA, but I don't understand why > > this > > > becomes a resolution change event with us calling PPB() twice. > > > Could you explain? > > The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture > > buffers are requested. > > The second PPB() happens because a RESOLUTION_CHANGE event is already enqueued > > underneath and it is dequeued() after picture buffers were already created. So > > there's no way to identify a real resolution change from the initial decoder > > init request. > > Could we fix rendering helper to reallocate textures? I fixed the rendering helper to reallocate the textures and now it can handle PPB() twice. However, there are two major issues that I see now: 1. Since we do a RES_CHANGE sequence, a few of the decoded-but-yet-to-be-rendered buffers may get lost, since we call StopDevicePoll(), which in turn calls STREAMOFF, which loses the buffers. Seven sub-tests fail on account of fewer frames being returned than expected. 2. Relatedly, since we DestroyOutputBuffers(), we might lose the reference frames, and as we are not actually starting the stream from an I frame there might be corruption. I think we might be better off with a separate event for signalling decoder initialization rather than using the resolution change sequence. Add a new method, DecoderInitTask(), which will be posted from DequeueEvents() when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name? We would also need to add another enum like RESOLUTION_CHANGE, with value 6?). DecoderInitTask() does nothing if decoder_state_ is already kDecoding; otherwise it calls GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. This should work with Exynos too, since there will not be any DECODER_INIT event dequeued on Exynos. Thoughts ? Thanks
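A rough sketch of the DecoderInitTask() proposed above. V4L2_EVENT_DECODER_INIT is hypothetical (it is not a mainline V4L2 event), and decoder_state_, kDecoding, GetFormatInfo() and CreateBuffersForFormat() are the members named in this thread, with assumed signatures and error handling:

  void V4L2VideoDecodeAccelerator::DecoderInitTask() {
    DCHECK_EQ(decoder_thread_.message_loop(), base::MessageLoop::current());
    if (decoder_state_ == kDecoding)
      return;  // G_FMT already succeeded earlier (e.g. in DBI()); no-op.

    struct v4l2_format format;
    bool again = false;
    if (!GetFormatInfo(&format, &again) || again) {
      LOG(ERROR) << "DecoderInitTask(): G_FMT failed after the init event";
      return;  // A real implementation would notify the client of the error.
    }
    if (!CreateBuffersForFormat(format))
      return;
    decoder_state_ = kDecoding;
  }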
On 2014/03/18 12:22:17, shivdasp wrote: > I fixed the rendering helper to reallocate the textures and now it can handle > PPB() twice. > However there are two major issues that I see now: > 1. Since we do a RES_CHANGE sequence, few of the decoded-but-yet-to-be-rendered > buffers may get lost since we call StopDevicePoll() which inturn calls STREAMOFF > which looses the buffers. 7 sub-tests fail on account of less than expected > frames returned. If the frame has been decoded and returned to the VDA client, then calling STREAMOFF on the queue may terminate the video decoder's access to the buffer, but it should not be destroyed until the 3D context has released it -- it should still be renderable. This is the point of the discussion we had above about the tegrav4l2 library needing to track buffer lifetimes correctly for both the 3D and the video decode stacks. > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > frames and as we are not actually starting the stream from a I frame there might > be corruption. > > I think we might be better off with having a separate event for signalling the > decoder initialization rather than using resolution change sequence. > Add a new method DecoderInitTask() , which will be posted from DequeueEvents() > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > This should work with Exynos too since there will not be any DECODER_INIT event > dequeued on Exynos. > Thoughts ? So my thought are that we have two concerns: (1) allowing for devices to signal decoder init through the events system, and also (2) not breaking the existing API for existing drivers and users. For the existing usecase in (2), clients would be expected to poll VIDIOC_G_FMT on the CAPTURE queue until it succeeds, at which point the client knows that the initialization has succeeded and decoding can commence. For (1), the new wrinkle would be clients don't have to poll; the arrival of the event just signals that the client can expect to succeed the next time VIDIOC_G_FMT is called. So if we do it this way, we can just add the behavior where the arrival of a DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be correct for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT is not implemented), as well as a future where Exynos does not block on VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does not block VIDIOC_G_FMT and it implements DECODER_INiT. The other question is whether we can reuse the RESOLUTION_CHANGE event instead of having to add a new DECODER_INIT change. I think these two approaches are both feasible: a dedicated DECODER_INIT event is the equivalent of a "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think of it though I'm less inclined to overload the meaning of the events. We can use the same handling codepaths in the VDA, but that doesn't mean we have to make the V4L2 API overload the meanings of these two.
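The equivalence described above ("DECODER_INIT is RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded") could be expressed by letting both events funnel into one handler in DequeueEvents(). A sketch only: V4L2_EVENT_RESOLUTION_CHANGE is the platform-specific event code this CL already relies on, V4L2_EVENT_DECODER_INIT and HandleFormatChange() are hypothetical, and device_->Ioctl() is the V4L2Device wrapper from this CL.

  void V4L2VideoDecodeAccelerator::DequeueEvents() {
    struct v4l2_event ev;
    memset(&ev, 0, sizeof(ev));
    while (device_->Ioctl(VIDIOC_DQEVENT, &ev) == 0) {
      switch (ev.type) {
        case V4L2_EVENT_RESOLUTION_CHANGE:  // Real mid-stream change.
        case V4L2_EVENT_DECODER_INIT:       // Hypothetical "init done" signal.
          // In both cases VIDIOC_G_FMT on CAPTURE is now expected to succeed,
          // so one code path can retry G_FMT and (re)allocate output buffers.
          HandleFormatChange();
          break;
        default:
          DVLOG(1) << "DequeueEvents(): unhandled event type " << ev.type;
          break;
      }
      memset(&ev, 0, sizeof(ev));
    }
  }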
Can we split this CL into two? I think the changes to the V4L2VideoDevice interface are fairly noncontroversial and we should be able to put that into another CL and land that first. I have some unittests for V4L2VideoDevice that I'd like to land so it would be nice if that came first.
On 2014/03/18 20:15:22, sheu wrote: > On 2014/03/18 12:22:17, shivdasp wrote: > > I fixed the rendering helper to reallocate the textures and now it can handle > > PPB() twice. > > However there are two major issues that I see now: > > 1. Since we do a RES_CHANGE sequence, few of the > decoded-but-yet-to-be-rendered > > buffers may get lost since we call StopDevicePoll() which inturn calls > STREAMOFF > > which looses the buffers. 7 sub-tests fail on account of less than expected > > frames returned. > > If the frame has been decoded and returned to the VDA client, then calling > STREAMOFF on the queue may terminate the video decoder's access to the buffer, > but it should not be destroyed until the 3D context has released it -- it should > still be renderable. This is the point of the discussion we had above about the > tegrav4l2 library needing to track buffer lifetimes correctly for both the 3D > and the video decode stacks. > Sorry if I wasn't clear, I was talking about the decoded but yet to be DQBUF'ed buffers. The buffers were ready to be dequeued but a STREAMOFF call would get them dropped. > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > > frames and as we are not actually starting the stream from a I frame there > might > > be corruption. > > This is a more serious problem (possible corruption) since the stream does not restart (STREAMOFF for output plane is not called) so the decoder believes it is starting from a I frame which may not be the case. > > I think we might be better off with having a separate event for signalling the > > decoder initialization rather than using resolution change sequence. > > Add a new method DecoderInitTask() , which will be posted from DequeueEvents() > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > This should work with Exynos too since there will not be any DECODER_INIT > event > > dequeued on Exynos. > > Thoughts ? > > So my thought are that we have two concerns: (1) allowing for devices to signal > decoder init through the events system, and also (2) not breaking the existing > API for existing drivers and users. > > For the existing usecase in (2), clients would be expected to poll VIDIOC_G_FMT > on the CAPTURE queue until it succeeds, at which point the client knows that the > initialization has succeeded and decoding can commence. > yes we will still keep the sequence in DBI() as is so Exynos would stay unaffected. And if on Tegra the G_FMT succeeds earlier this can also be protected by checking the decoder_state_ in DecoderInitTask() which will no-op if decoder_state_ is already kDecoding. > For (1), the new wrinkle would be clients don't have to poll; the arrival of the > event just signals that the client can expect to succeed the next time > VIDIOC_G_FMT is called. > > So if we do it this way, we can just add the behavior where the arrival of a > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be correct > for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT is > not implemented), as well as a future where Exynos does not block on > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does not > block VIDIOC_G_FMT and it implements DECODER_INiT. 
> > > > The other question is whether we can reuse the RESOLUTION_CHANGE event instead > of having to add a new DECODER_INIT change. I think these two approaches are > both feasible: a dedicated DECODER_INIT event is the equivalent of a > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think of > it though I'm less inclined to overload the meaning of the events. We can use > the same handling codepaths in the VDA, but that doesn't mean we have to make > the V4L2 API overload the meanings of these two. Using RESOLUTION_CHANGE can work if we are not doing the G_FMT check. The issue is when we do both: decoder initialization is triggered through G_FMT in DBI() and again after we get the RESOLUTION_CHANGE event. I have tried the DecoderInitTask() approach above and it works fine in my initial tests. Let me test it more and check the Exynos behavior too; I can then upload another patchset. Pawel, would you agree to this approach? Any thoughts?
On 2014/03/18 12:22:17, shivdasp wrote: > On 2014/03/18 06:15:28, Pawel Osciak wrote: > > On 2014/03/18 06:08:49, shivdasp wrote: > > > On 2014/03/18 05:49:59, Pawel Osciak wrote: > > > > On 2014/03/14 04:48:28, shivdasp wrote: > > > > > Pawel, > > > > > How do you suggest we go ahead now ? > > > > > Having device poll thread start early and using the resolution change > > event > > > > > seems breaking the vdatest. > > > > > Could you re-consider the patchset#7 for now (will address sheu's > > comments) > > > > that > > > > > has a not-so-good fix of reposting the task which will happen only on > > Tegra. > > > > We > > > > > anyways have a plan to use the decode initialization event and that will > > be > > > > the > > > > > neat way of doing this. > > > > > If you have any other ideas I can try them as well. > > > > > > > > Shivdas, > > > > Why are we doing ProvidePictureBuffers() twice? We should be seeing only > one > > > > resolution change (i.e. the initial change). > > > > We should rather fix vdatest than hack V4L2VDA, but I don't understand why > > > this > > > > becomes a resolution change event with us calling PPB() twice. > > > > Could you explain? > > > The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture > > > buffers are requested. > > > The second PPB() happens because a RESOLUTION_CHANGE event is already > enqueued > > > underneath and it is dequeued() after picture buffers were already created. > So > > > there's no way to identify a real resolution change from the initial decoder > > > init request. > > > > Could we fix rendering helper to reallocate textures? > > I fixed the rendering helper to reallocate the textures and now it can handle > PPB() twice. > However there are two major issues that I see now: > 1. Since we do a RES_CHANGE sequence, few of the decoded-but-yet-to-be-rendered > buffers may get lost since we call StopDevicePoll() which inturn calls STREAMOFF > which looses the buffers. 7 sub-tests fail on account of less than expected > frames returned. Resolution change event should only be sent after all the output buffers from before the change have been returned. And we immediately send PIctureReady for them after dequeuing them, so destroy cannot happen before that. So the point of DestroyOutputBuffers(), the buffers have been dequeued and sent to rendering, although they might have not been rendered yet. This is fine, as we don't have to maintain ownership of them, because as we discussed before, there is shared ownership of the frames between renderer and codec. The standard resolution change scenario involves freeing/destroying buffers by codec, while the renderer still keeps them and only destroys them after it's done rendering (we will probably be already decoding into the new buffers while it still finishes the old ones). So John is right here and this is one of the tricky parts I was mentioning before when we discussed ownership. Can your stack handle this? > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > frames and as we are not actually starting the stream from a I frame there might > be corruption. As mentioned above, the resolution change event should only be sent after all the output buffers from the previous resolution were returned to the userspace from the codec (this has nothing to do with their rendering though). Since a resolution change can only happen in SPS or in a keyframe for VP8, there should be no keyframes required for decoding after the change. 
Also, as a side note, I don't think you can use frames in one resolution as reference for frames in a different resolution? > I think we might be better off with having a separate event for signalling the > decoder initialization rather than using resolution change sequence. Given the above, I don't see a reason for one... > Add a new method DecoderInitTask() , which will be posted from DequeueEvents() > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > This should work with Exynos too since there will not be any DECODER_INIT event > dequeued on Exynos. > Thoughts ? > > Thanks
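The contract described above maps onto the usual V4L2 memory-to-memory sequence on the CAPTURE queue once the event has been seen and all pre-change buffers have been dequeued; the OUTPUT (bitstream) queue keeps streaming untouched. A self-contained sketch using plain V4L2 ioctls (the fd, buffer count and memory type are placeholders, not necessarily what the VDA uses):

  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/videodev2.h>

  bool ReallocateCaptureQueue(int fd, unsigned int buffer_count) {
    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    if (ioctl(fd, VIDIOC_STREAMOFF, &type) != 0)
      return false;  // Drops codec ownership of the old CAPTURE buffers.

    struct v4l2_requestbuffers reqbufs;
    memset(&reqbufs, 0, sizeof(reqbufs));
    reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    reqbufs.memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_REQBUFS, &reqbufs) != 0)  // count == 0 frees them.
      return false;

    struct v4l2_format format;
    memset(&format, 0, sizeof(format));
    format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    if (ioctl(fd, VIDIOC_G_FMT, &format) != 0)  // New coded size is known now.
      return false;

    reqbufs.count = buffer_count;                  // Allocate for the new size;
    if (ioctl(fd, VIDIOC_REQBUFS, &reqbufs) != 0)  // the client then QBUFs and
      return false;                                // streams on again.
    return ioctl(fd, VIDIOC_STREAMON, &type) == 0;
  }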
On 2014/03/19 04:23:20, shivdasp wrote: > On 2014/03/18 20:15:22, sheu wrote: > > On 2014/03/18 12:22:17, shivdasp wrote: > > > I fixed the rendering helper to reallocate the textures and now it can > handle > > > PPB() twice. > > > However there are two major issues that I see now: > > > 1. Since we do a RES_CHANGE sequence, few of the > > decoded-but-yet-to-be-rendered > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > STREAMOFF > > > which looses the buffers. 7 sub-tests fail on account of less than expected > > > frames returned. > > > > If the frame has been decoded and returned to the VDA client, then calling > > STREAMOFF on the queue may terminate the video decoder's access to the buffer, > > but it should not be destroyed until the 3D context has released it -- it > should > > still be renderable. This is the point of the discussion we had above about > the > > tegrav4l2 library needing to track buffer lifetimes correctly for both the 3D > > and the video decode stacks. > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be DQBUF'ed > buffers. > The buffers were ready to be dequeued but a STREAMOFF call would get them > dropped. John is right here. Streamoff has nothing to do with the ownership of the textures by the rendering part. STREAMOFF only removes the ownership of the codec over them, but renderer keeps the textures alive and finishes rendering them. Once it's done, it frees the textures and everything gets cleaned up. Does your stack work this way? > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > > > frames and as we are not actually starting the stream from a I frame there > > might > > > be corruption. > > > > This is a more serious problem (possible corruption) since the stream does not > restart (STREAMOFF for output plane is not called) > so the decoder believes it is starting from a I frame which may not be the case. > STREAMOFF is called at https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... I'm also not sure what do you mean by "stream restart"? The decoder has to be starting from an I frame, there can be no resolution change on a frame that is not a reference frame. When the decoder sends the resolution change event: - all the output buffers that were to be decoded before resolution change are ready and can be dequeued; once the client sees the event, it is supposed to dequeue and display all the output buffers and can be sure there will be no more until it reallocates the output queue; - the input queue is not to be touched and can operate in parallel without problems (i.e. it will keep already enqueued stream buffers and more can be queued at any time) - decoding is at an SPS for H264 (which means a new I-frame should follow it) or an I-frame for VP8 (the codec, when it sees an I-frame with a different resolution, is supposed to stop, return all the outputs from before that frame and only decode that I-frame with resolution change once it gets new buffers); further decoding will thus start from an I-frame always; > > > I think we might be better off with having a separate event for signalling > the > > > decoder initialization rather than using resolution change sequence. > > > Add a new method DecoderInitTask() , which will be posted from > DequeueEvents() > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? 
) > > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > This should work with Exynos too since there will not be any DECODER_INIT > > event > > > dequeued on Exynos. > > > Thoughts ? > > > > So my thought are that we have two concerns: (1) allowing for devices to > signal > > decoder init through the events system, and also (2) not breaking the existing > > API for existing drivers and users. > > > > For the existing usecase in (2), clients would be expected to poll > VIDIOC_G_FMT > > on the CAPTURE queue until it succeeds, at which point the client knows that > the > > initialization has succeeded and decoding can commence. > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > unaffected. > And if on Tegra the G_FMT succeeds earlier this can also be protected by > checking the decoder_state_ > in DecoderInitTask() which will no-op if decoder_state_ is already kDecoding. > > > For (1), the new wrinkle would be clients don't have to poll; the arrival of > the > > event just signals that the client can expect to succeed the next time > > VIDIOC_G_FMT is called. > > > > So if we do it this way, we can just add the behavior where the arrival of a > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > correct > > for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT > is > > not implemented), as well as a future where Exynos does not block on > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does not > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event instead > > of having to add a new DECODER_INIT change. I think these two approaches are > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think of > > it though I'm less inclined to overload the meaning of the events. We can use > > the same handling codepaths in the VDA, but that doesn't mean we have to make > > the V4L2 API overload the meanings of these two. > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The issue > is when > we do both. Decoder initialization through G_FMT trigger in DBI() and again > after we get > the RESOLUTION_CHANGE event. > I have tried this above DecoderInitTasks() and it works fine in my initial > tests. > Let me test it more and check for Exynos behavior too, I can then upload another > patchset. > Pawel, would you agree to this approach ? Any thoughts ?
On 2014/03/19 05:44:09, Pawel Osciak wrote: > On 2014/03/19 04:23:20, shivdasp wrote: > > On 2014/03/18 20:15:22, sheu wrote: > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > I fixed the rendering helper to reallocate the textures and now it can > > handle > > > > PPB() twice. > > > > However there are two major issues that I see now: > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > decoded-but-yet-to-be-rendered > > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > > STREAMOFF > > > > which looses the buffers. 7 sub-tests fail on account of less than > expected > > > > frames returned. > > > > > > If the frame has been decoded and returned to the VDA client, then calling > > > STREAMOFF on the queue may terminate the video decoder's access to the > buffer, > > > but it should not be destroyed until the 3D context has released it -- it > > should > > > still be renderable. This is the point of the discussion we had above about > > the > > > tegrav4l2 library needing to track buffer lifetimes correctly for both the > 3D > > > and the video decode stacks. > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > DQBUF'ed > > buffers. > > The buffers were ready to be dequeued but a STREAMOFF call would get them > > dropped. > > John is right here. Streamoff has nothing to do with the ownership of the > textures by the rendering part. > STREAMOFF only removes the ownership of the codec over them, but renderer keeps > the textures alive and finishes rendering them. > Once it's done, it frees the textures and everything gets cleaned up. > > Does your stack work this way? > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > reference > > > > frames and as we are not actually starting the stream from a I frame there > > > might > > > > be corruption. > > > > > > This is a more serious problem (possible corruption) since the stream does not > > restart (STREAMOFF for output plane is not called) > > so the decoder believes it is starting from a I frame which may not be the > case. > > > > STREAMOFF is called at > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > I'm also not sure what do you mean by "stream restart"? > The decoder has to be starting from an I frame, there can be no resolution > change on a frame that is not a reference frame. > > When the decoder sends the resolution change event: > - all the output buffers that were to be decoded before resolution change are > ready and can be dequeued; once the client sees the event, it is supposed to > dequeue and display all the output buffers and can be sure there will be no more > until it reallocates the output queue; > - the input queue is not to be touched and can operate in parallel without > problems (i.e. it will keep already enqueued stream buffers and more can be > queued at any time) > - decoding is at an SPS for H264 (which means a new I-frame should follow it) or > an I-frame for VP8 (the codec, when it sees an I-frame with a different > resolution, is supposed to stop, return all the outputs from before that frame > and only decode that I-frame with resolution change once it gets new buffers); > further decoding will thus start from an I-frame always; > > > > > I think we might be better off with having a separate event for signalling > > the > > > > decoder initialization rather than using resolution change sequence. 
> > > > Add a new method DecoderInitTask() , which will be posted from > > DequeueEvents() > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? > & > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > > > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, > else > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > This should work with Exynos too since there will not be any DECODER_INIT > > > event > > > > dequeued on Exynos. > > > > Thoughts ? > > > > > > So my thought are that we have two concerns: (1) allowing for devices to > > signal > > > decoder init through the events system, and also (2) not breaking the > existing > > > API for existing drivers and users. > > > > > > For the existing usecase in (2), clients would be expected to poll > > VIDIOC_G_FMT > > > on the CAPTURE queue until it succeeds, at which point the client knows that > > the > > > initialization has succeeded and decoding can commence. > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > unaffected. > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > checking the decoder_state_ > > in DecoderInitTask() which will no-op if decoder_state_ is already kDecoding. > > > > > For (1), the new wrinkle would be clients don't have to poll; the arrival of > > the > > > event just signals that the client can expect to succeed the next time > > > VIDIOC_G_FMT is called. > > > > > > So if we do it this way, we can just add the behavior where the arrival of a > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > > correct > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT > > is > > > not implemented), as well as a future where Exynos does not block on > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does > not > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > instead > > > of having to add a new DECODER_INIT change. I think these two approaches > are > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think > of > > > it though I'm less inclined to overload the meaning of the events. We can > use > > > the same handling codepaths in the VDA, but that doesn't mean we have to > make > > > the V4L2 API overload the meanings of these two. > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The > issue > > is when > > we do both. Decoder initialization through G_FMT trigger in DBI() and again > > after we get > > the RESOLUTION_CHANGE event. > > I have tried this above DecoderInitTasks() and it works fine in my initial > > tests. > > Let me test it more and check for Exynos behavior too, I can then upload > another > > patchset. > > Pawel, would you agree to this approach ? Any thoughts ? Hi Pawel, The handling of resolution change event is done exactly as you have described above in our stack. However my concern is about our plan of enqueuing the resolution change event to signal that the decoder initialization has happened. 
So in a "real" resolution change event everything will happen correctly, i.e after all the bitstream buffers are processed , the resolution change event is generated which then frees and reallocates the yuv (output) buffers correctly. However if we were already initialized (G_FMT happened to succeed in DBI() ) but since the underlying stack has enqueued a resolution change event (in case the client did not get G_FMT to succeed before emptying all the input buffers), the resolution change sequence will release the output buffers and since the bitstream is not going to be restarted on a SPS or keyframe the decoder will not have correct references. And that's why I am suggesting to not use resolution change event but rather a separate DECODER_INIT event. Would you like to have a conf call to discuss this so that we can close on it quickly ? Thanks,
On 2014/03/19 06:17:55, shivdasp wrote: > On 2014/03/19 05:44:09, Pawel Osciak wrote: > > On 2014/03/19 04:23:20, shivdasp wrote: > > > On 2014/03/18 20:15:22, sheu wrote: > > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > > I fixed the rendering helper to reallocate the textures and now it can > > > handle > > > > > PPB() twice. > > > > > However there are two major issues that I see now: > > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > > decoded-but-yet-to-be-rendered > > > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > > > STREAMOFF > > > > > which looses the buffers. 7 sub-tests fail on account of less than > > expected > > > > > frames returned. > > > > > > > > If the frame has been decoded and returned to the VDA client, then calling > > > > STREAMOFF on the queue may terminate the video decoder's access to the > > buffer, > > > > but it should not be destroyed until the 3D context has released it -- it > > > should > > > > still be renderable. This is the point of the discussion we had above > about > > > the > > > > tegrav4l2 library needing to track buffer lifetimes correctly for both the > > 3D > > > > and the video decode stacks. > > > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > > DQBUF'ed > > > buffers. > > > The buffers were ready to be dequeued but a STREAMOFF call would get them > > > dropped. > > > > John is right here. Streamoff has nothing to do with the ownership of the > > textures by the rendering part. > > STREAMOFF only removes the ownership of the codec over them, but renderer > keeps > > the textures alive and finishes rendering them. > > Once it's done, it frees the textures and everything gets cleaned up. > > > > Does your stack work this way? > > > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > > reference > > > > > frames and as we are not actually starting the stream from a I frame > there > > > > might > > > > > be corruption. > > > > > > > > This is a more serious problem (possible corruption) since the stream does > not > > > restart (STREAMOFF for output plane is not called) > > > so the decoder believes it is starting from a I frame which may not be the > > case. > > > > > > > STREAMOFF is called at > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > > > I'm also not sure what do you mean by "stream restart"? > > The decoder has to be starting from an I frame, there can be no resolution > > change on a frame that is not a reference frame. > > > > When the decoder sends the resolution change event: > > - all the output buffers that were to be decoded before resolution change are > > ready and can be dequeued; once the client sees the event, it is supposed to > > dequeue and display all the output buffers and can be sure there will be no > more > > until it reallocates the output queue; > > - the input queue is not to be touched and can operate in parallel without > > problems (i.e. 
it will keep already enqueued stream buffers and more can be > > queued at any time) > > - decoding is at an SPS for H264 (which means a new I-frame should follow it) > or > > an I-frame for VP8 (the codec, when it sees an I-frame with a different > > resolution, is supposed to stop, return all the outputs from before that frame > > and only decode that I-frame with resolution change once it gets new buffers); > > further decoding will thus start from an I-frame always; > > > > > > > I think we might be better off with having a separate event for > signalling > > > the > > > > > decoder initialization rather than using resolution change sequence. > > > > > Add a new method DecoderInitTask() , which will be posted from > > > DequeueEvents() > > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name > ?? > > & > > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > > > > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, > > else > > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > > > This should work with Exynos too since there will not be any > DECODER_INIT > > > > event > > > > > dequeued on Exynos. > > > > > Thoughts ? > > > > > > > > So my thought are that we have two concerns: (1) allowing for devices to > > > signal > > > > decoder init through the events system, and also (2) not breaking the > > existing > > > > API for existing drivers and users. > > > > > > > > For the existing usecase in (2), clients would be expected to poll > > > VIDIOC_G_FMT > > > > on the CAPTURE queue until it succeeds, at which point the client knows > that > > > the > > > > initialization has succeeded and decoding can commence. > > > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > > unaffected. > > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > > checking the decoder_state_ > > > in DecoderInitTask() which will no-op if decoder_state_ is already > kDecoding. > > > > > > > For (1), the new wrinkle would be clients don't have to poll; the arrival > of > > > the > > > > event just signals that the client can expect to succeed the next time > > > > VIDIOC_G_FMT is called. > > > > > > > > So if we do it this way, we can just add the behavior where the arrival of > a > > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > > > correct > > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and > DECODER_INIT > > > is > > > > not implemented), as well as a future where Exynos does not block on > > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does > > not > > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > > instead > > > > of having to add a new DECODER_INIT change. I think these two approaches > > are > > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I > think > > of > > > > it though I'm less inclined to overload the meaning of the events. We can > > use > > > > the same handling codepaths in the VDA, but that doesn't mean we have to > > make > > > > the V4L2 API overload the meanings of these two. > > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The > > issue > > > is when > > > we do both. 
Decoder initialization through G_FMT trigger in DBI() and again > > > after we get > > > the RESOLUTION_CHANGE event. > > > I have tried this above DecoderInitTasks() and it works fine in my initial > > > tests. > > > Let me test it more and check for Exynos behavior too, I can then upload > > another > > > patchset. > > > Pawel, would you agree to this approach ? Any thoughts ? > > Hi Pawel, > > The handling of resolution change event is done exactly as you have described > above in our stack. > However my concern is about our plan of enqueuing the resolution change event to > signal > that the decoder initialization has happened. > So in a "real" resolution change event everything will happen correctly, i.e > after all the bitstream buffers > are processed , the resolution change event is generated which then frees and > reallocates the yuv (output) buffers correctly. > However if we were already initialized (G_FMT happened to succeed in DBI() ) but > since the underlying stack has enqueued a > resolution change event (in case the client did not get G_FMT to succeed before > emptying all the input buffers), the > resolution change sequence will release the output buffers and since the > bitstream is not going to be restarted > on a SPS or keyframe the decoder will not have correct references. > And that's why I am suggesting to not use resolution change event but rather a > separate DECODER_INIT event. I think we have a misunderstanding here, we should not have a G_FMT call in DBI at all. I think this should solve the problem? > Would you like to have a conf call to discuss this so that we can close on it > quickly ? Sure.
On 2014/03/19 06:21:45, Pawel Osciak wrote: > On 2014/03/19 06:17:55, shivdasp wrote: > > On 2014/03/19 05:44:09, Pawel Osciak wrote: > > > On 2014/03/19 04:23:20, shivdasp wrote: > > > > On 2014/03/18 20:15:22, sheu wrote: > > > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > > > I fixed the rendering helper to reallocate the textures and now it can > > > > handle > > > > > > PPB() twice. > > > > > > However there are two major issues that I see now: > > > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > > > decoded-but-yet-to-be-rendered > > > > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > > > > STREAMOFF > > > > > > which looses the buffers. 7 sub-tests fail on account of less than > > > expected > > > > > > frames returned. > > > > > > > > > > If the frame has been decoded and returned to the VDA client, then > calling > > > > > STREAMOFF on the queue may terminate the video decoder's access to the > > > buffer, > > > > > but it should not be destroyed until the 3D context has released it -- > it > > > > should > > > > > still be renderable. This is the point of the discussion we had above > > about > > > > the > > > > > tegrav4l2 library needing to track buffer lifetimes correctly for both > the > > > 3D > > > > > and the video decode stacks. > > > > > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > > > DQBUF'ed > > > > buffers. > > > > The buffers were ready to be dequeued but a STREAMOFF call would get them > > > > dropped. > > > > > > John is right here. Streamoff has nothing to do with the ownership of the > > > textures by the rendering part. > > > STREAMOFF only removes the ownership of the codec over them, but renderer > > keeps > > > the textures alive and finishes rendering them. > > > Once it's done, it frees the textures and everything gets cleaned up. > > > > > > Does your stack work this way? > > > > > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > > > reference > > > > > > frames and as we are not actually starting the stream from a I frame > > there > > > > > might > > > > > > be corruption. > > > > > > > > > > This is a more serious problem (possible corruption) since the stream does > > not > > > > restart (STREAMOFF for output plane is not called) > > > > so the decoder believes it is starting from a I frame which may not be the > > > case. > > > > > > > > > > STREAMOFF is called at > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > > > > > I'm also not sure what do you mean by "stream restart"? > > > The decoder has to be starting from an I frame, there can be no resolution > > > change on a frame that is not a reference frame. > > > > > > When the decoder sends the resolution change event: > > > - all the output buffers that were to be decoded before resolution change > are > > > ready and can be dequeued; once the client sees the event, it is supposed to > > > dequeue and display all the output buffers and can be sure there will be no > > more > > > until it reallocates the output queue; > > > - the input queue is not to be touched and can operate in parallel without > > > problems (i.e. 
it will keep already enqueued stream buffers and more can be > > > queued at any time) > > > - decoding is at an SPS for H264 (which means a new I-frame should follow > it) > > or > > > an I-frame for VP8 (the codec, when it sees an I-frame with a different > > > resolution, is supposed to stop, return all the outputs from before that > frame > > > and only decode that I-frame with resolution change once it gets new > buffers); > > > further decoding will thus start from an I-frame always; > > > > > > > > > I think we might be better off with having a separate event for > > signalling > > > > the > > > > > > decoder initialization rather than using resolution change sequence. > > > > > > Add a new method DecoderInitTask() , which will be posted from > > > > DequeueEvents() > > > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better > name > > ?? > > > & > > > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? > ) > > > > > > DecoderInitTask(), does nothing if decoder_state_ is already > kDecoding, > > > else > > > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > > > > > This should work with Exynos too since there will not be any > > DECODER_INIT > > > > > event > > > > > > dequeued on Exynos. > > > > > > Thoughts ? > > > > > > > > > > So my thought are that we have two concerns: (1) allowing for devices to > > > > signal > > > > > decoder init through the events system, and also (2) not breaking the > > > existing > > > > > API for existing drivers and users. > > > > > > > > > > For the existing usecase in (2), clients would be expected to poll > > > > VIDIOC_G_FMT > > > > > on the CAPTURE queue until it succeeds, at which point the client knows > > that > > > > the > > > > > initialization has succeeded and decoding can commence. > > > > > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > > > unaffected. > > > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > > > checking the decoder_state_ > > > > in DecoderInitTask() which will no-op if decoder_state_ is already > > kDecoding. > > > > > > > > > For (1), the new wrinkle would be clients don't have to poll; the > arrival > > of > > > > the > > > > > event just signals that the client can expect to succeed the next time > > > > > VIDIOC_G_FMT is called. > > > > > > > > > > So if we do it this way, we can just add the behavior where the arrival > of > > a > > > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > > > > correct > > > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and > > DECODER_INIT > > > > is > > > > > not implemented), as well as a future where Exynos does not block on > > > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib > does > > > not > > > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > > > instead > > > > > of having to add a new DECODER_INIT change. I think these two > approaches > > > are > > > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I > > think > > > of > > > > > it though I'm less inclined to overload the meaning of the events. 
We > can > > > use > > > > > the same handling codepaths in the VDA, but that doesn't mean we have to > > > make > > > > > the V4L2 API overload the meanings of these two. > > > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The > > > issue > > > > is when > > > > we do both. Decoder initialization through G_FMT trigger in DBI() and > again > > > > after we get > > > > the RESOLUTION_CHANGE event. > > > > I have tried this above DecoderInitTasks() and it works fine in my initial > > > > tests. > > > > Let me test it more and check for Exynos behavior too, I can then upload > > > another > > > > patchset. > > > > Pawel, would you agree to this approach ? Any thoughts ? > > > > Hi Pawel, > > > > The handling of resolution change event is done exactly as you have described > > above in our stack. > > However my concern is about our plan of enqueuing the resolution change event > to > > signal > > that the decoder initialization has happened. > > So in a "real" resolution change event everything will happen correctly, i.e > > after all the bitstream buffers > > are processed , the resolution change event is generated which then frees and > > reallocates the yuv (output) buffers correctly. > > However if we were already initialized (G_FMT happened to succeed in DBI() ) > but > > since the underlying stack has enqueued a > > resolution change event (in case the client did not get G_FMT to succeed > before > > emptying all the input buffers), the > > resolution change sequence will release the output buffers and since the > > bitstream is not going to be restarted > > on a SPS or keyframe the decoder will not have correct references. > > And that's why I am suggesting to not use resolution change event but rather a > > separate DECODER_INIT event. > > I think we have a misunderstanding here, we should not have a G_FMT call in DBI > at all. I think this should solve the problem? Yes I think that solves the problem. As discussed in yesterday over the call, GetFormatInfo() and CreateBuffersForFormat() will be removed from DBI(). This will ensure that the decoder initialization will be triggered only when RESOLUTION_CHANGE event is dequeued. This will temporarily break Exynos until Exynos V4L2 driver also enqueues a RESOLUTION_CHANGE to signal decoder initialization. > > > Would you like to have a conf call to discuss this so that we can close on it > > quickly ? > > Sure.
On 2014/03/20 04:44:16, shivdasp wrote: > On 2014/03/19 06:21:45, Pawel Osciak wrote: > > On 2014/03/19 06:17:55, shivdasp wrote: > > > On 2014/03/19 05:44:09, Pawel Osciak wrote: > > > > On 2014/03/19 04:23:20, shivdasp wrote: > > > > > On 2014/03/18 20:15:22, sheu wrote: > > > > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > > > > I fixed the rendering helper to reallocate the textures and now it > can > > > > > handle > > > > > > > PPB() twice. > > > > > > > However there are two major issues that I see now: > > > > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > > > > decoded-but-yet-to-be-rendered > > > > > > > buffers may get lost since we call StopDevicePoll() which inturn > calls > > > > > > STREAMOFF > > > > > > > which looses the buffers. 7 sub-tests fail on account of less than > > > > expected > > > > > > > frames returned. > > > > > > > > > > > > If the frame has been decoded and returned to the VDA client, then > > calling > > > > > > STREAMOFF on the queue may terminate the video decoder's access to the > > > > buffer, > > > > > > but it should not be destroyed until the 3D context has released it -- > > it > > > > > should > > > > > > still be renderable. This is the point of the discussion we had above > > > about > > > > > the > > > > > > tegrav4l2 library needing to track buffer lifetimes correctly for both > > the > > > > 3D > > > > > > and the video decode stacks. > > > > > > > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > > > > DQBUF'ed > > > > > buffers. > > > > > The buffers were ready to be dequeued but a STREAMOFF call would get > them > > > > > dropped. > > > > > > > > John is right here. Streamoff has nothing to do with the ownership of the > > > > textures by the rendering part. > > > > STREAMOFF only removes the ownership of the codec over them, but renderer > > > keeps > > > > the textures alive and finishes rendering them. > > > > Once it's done, it frees the textures and everything gets cleaned up. > > > > > > > > Does your stack work this way? > > > > > > > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > > > > reference > > > > > > > frames and as we are not actually starting the stream from a I frame > > > there > > > > > > might > > > > > > > be corruption. > > > > > > > > > > > > This is a more serious problem (possible corruption) since the stream > does > > > not > > > > > restart (STREAMOFF for output plane is not called) > > > > > so the decoder believes it is starting from a I frame which may not be > the > > > > case. > > > > > > > > > > > > > STREAMOFF is called at > > > > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > > > > > > > I'm also not sure what do you mean by "stream restart"? > > > > The decoder has to be starting from an I frame, there can be no resolution > > > > change on a frame that is not a reference frame. > > > > > > > > When the decoder sends the resolution change event: > > > > - all the output buffers that were to be decoded before resolution change > > are > > > > ready and can be dequeued; once the client sees the event, it is supposed > to > > > > dequeue and display all the output buffers and can be sure there will be > no > > > more > > > > until it reallocates the output queue; > > > > - the input queue is not to be touched and can operate in parallel without > > > > problems (i.e. 
it will keep already enqueued stream buffers and more can > be > > > > queued at any time) > > > > - decoding is at an SPS for H264 (which means a new I-frame should follow > > it) > > > or > > > > an I-frame for VP8 (the codec, when it sees an I-frame with a different > > > > resolution, is supposed to stop, return all the outputs from before that > > frame > > > > and only decode that I-frame with resolution change once it gets new > > buffers); > > > > further decoding will thus start from an I-frame always; > > > > > > > > > > > I think we might be better off with having a separate event for > > > signalling > > > > > the > > > > > > > decoder initialization rather than using resolution change sequence. > > > > > > > Add a new method DecoderInitTask() , which will be posted from > > > > > DequeueEvents() > > > > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better > > name > > > ?? > > > > & > > > > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 > ?? > > ) > > > > > > > DecoderInitTask(), does nothing if decoder_state_ is already > > kDecoding, > > > > else > > > > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > > > > > > > This should work with Exynos too since there will not be any > > > DECODER_INIT > > > > > > event > > > > > > > dequeued on Exynos. > > > > > > > Thoughts ? > > > > > > > > > > > > So my thought are that we have two concerns: (1) allowing for devices > to > > > > > signal > > > > > > decoder init through the events system, and also (2) not breaking the > > > > existing > > > > > > API for existing drivers and users. > > > > > > > > > > > > For the existing usecase in (2), clients would be expected to poll > > > > > VIDIOC_G_FMT > > > > > > on the CAPTURE queue until it succeeds, at which point the client > knows > > > that > > > > > the > > > > > > initialization has succeeded and decoding can commence. > > > > > > > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > > > > unaffected. > > > > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > > > > checking the decoder_state_ > > > > > in DecoderInitTask() which will no-op if decoder_state_ is already > > > kDecoding. > > > > > > > > > > > For (1), the new wrinkle would be clients don't have to poll; the > > arrival > > > of > > > > > the > > > > > > event just signals that the client can expect to succeed the next time > > > > > > VIDIOC_G_FMT is called. > > > > > > > > > > > > So if we do it this way, we can just add the behavior where the > arrival > > of > > > a > > > > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would > be > > > > > correct > > > > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and > > > DECODER_INIT > > > > > is > > > > > > not implemented), as well as a future where Exynos does not block on > > > > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib > > does > > > > not > > > > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > > > > instead > > > > > > of having to add a new DECODER_INIT change. I think these two > > approaches > > > > are > > > > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. 
Now that I > > > think > > > > of > > > > > > it though I'm less inclined to overload the meaning of the events. We > > can > > > > use > > > > > > the same handling codepaths in the VDA, but that doesn't mean we have > to > > > > make > > > > > > the V4L2 API overload the meanings of these two. > > > > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. > The > > > > issue > > > > > is when > > > > > we do both. Decoder initialization through G_FMT trigger in DBI() and > > again > > > > > after we get > > > > > the RESOLUTION_CHANGE event. > > > > > I have tried this above DecoderInitTasks() and it works fine in my > initial > > > > > tests. > > > > > Let me test it more and check for Exynos behavior too, I can then upload > > > > another > > > > > patchset. > > > > > Pawel, would you agree to this approach ? Any thoughts ? > > > > > > Hi Pawel, > > > > > > The handling of resolution change event is done exactly as you have > described > > > above in our stack. > > > However my concern is about our plan of enqueuing the resolution change > event > > to > > > signal > > > that the decoder initialization has happened. > > > So in a "real" resolution change event everything will happen correctly, i.e > > > after all the bitstream buffers > > > are processed , the resolution change event is generated which then frees > and > > > reallocates the yuv (output) buffers correctly. > > > However if we were already initialized (G_FMT happened to succeed in DBI() ) > > but > > > since the underlying stack has enqueued a > > > resolution change event (in case the client did not get G_FMT to succeed > > before > > > emptying all the input buffers), the > > > resolution change sequence will release the output buffers and since the > > > bitstream is not going to be restarted > > > on a SPS or keyframe the decoder will not have correct references. > > > And that's why I am suggesting to not use resolution change event but rather > a > > > separate DECODER_INIT event. > > > > I think we have a misunderstanding here, we should not have a G_FMT call in > DBI > > at all. I think this should solve the problem? > Yes I think that solves the problem. As discussed in yesterday over the call, > GetFormatInfo() and CreateBuffersForFormat() > will be removed from DBI(). This will ensure that the decoder initialization > will be triggered only when > RESOLUTION_CHANGE event is dequeued. > This will temporarily break Exynos until Exynos V4L2 driver also enqueues a > RESOLUTION_CHANGE to signal decoder initialization. Great, thank you. Please let me know if the EOS event also solves the EOS problem and whether it works with Exynos.
On 2014/03/20 05:29:41, Pawel Osciak wrote: > Great, thank you. Please let me know if the EOS event also solves the EOS > problem and whether it works with Exynos. I had a chat with Pawel at what was apparently 4:30 am in his timezone about the delayed initialization situation. I'm a little concerned about removing G_FMT completely from the DBI() sequence, since other drivers which don't support the resolution change event might rely on the behavior of being able to block the G_FMT call until the decoder initializes. I proposed that we keep the G_FMT call, but allow it to return failure error without blocking. We would also query G_FMT after a resolution change event, and when the event arrives we will check the returned format against previously returned formats, if any. If the format is the same, we can assume that this is an initialization event and/or no buffer reallocations are necessary, and skip the reallocation. Otherwise we will execute the resolution change, and in the case that this is an initialization event, create the buffers as appropriate. This should not require any changes to TegraVDA from the plan discussed above. G_FMT can be expected to fail if decoder init has not been completed (which should be the case anyways). What do you think?
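To make the proposed flow concrete, here is a rough sketch of how the event handling could look, assuming helper names in the spirit of the existing V4L2VDA code (GetFormatInfo(), frame_buffer_size_, StartResolutionChange()); the function name and exact signatures are illustrative, not the actual patch:

  // Hypothetical handler invoked when a resolution change event is dequeued.
  void V4L2VideoDecodeAccelerator::HandleResolutionChangeEvent() {
    struct v4l2_format format;
    bool again = false;
    // G_FMT may legitimately fail (non-blocking) if decoder init is not done yet.
    if (!GetFormatInfo(&format, &again) || again)
      return;

    gfx::Size new_size(base::checked_cast<int>(format.fmt.pix_mp.width),
                       base::checked_cast<int>(format.fmt.pix_mp.height));
    if (frame_buffer_size_.IsEmpty() || frame_buffer_size_ != new_size) {
      // First initialization or a real resolution change: (re)allocate the
      // output buffers for the new coded size.
      frame_buffer_size_ = new_size;
      StartResolutionChange();
    } else {
      // Same format as before: treat the event as an init notification only
      // and skip the reallocation.
      DVLOG(3) << "Output format unchanged, skipping resolution change";
    }
  }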
On 2014/03/20 21:30:00, sheu wrote:
> On 2014/03/20 05:29:41, Pawel Osciak wrote:
> > Great, thank you. Please let me know if the EOS event also solves the EOS
> > problem and whether it works with Exynos.
>
> I had a chat with Pawel at what was apparently 4:30 am in his timezone about the
> delayed initialization situation. I'm a little concerned about removing G_FMT
> completely from the DBI() sequence, since other drivers which don't support the
> resolution change event might rely on the behavior of being able to block the
> G_FMT call until the decoder initializes.
>
> I proposed that we keep the G_FMT call, but allow it to return failure error
> without blocking. We would also query G_FMT after a resolution change event,
> and when the event arrives we will check the returned format against previously
> returned formats, if any. If the format is the same, we can assume that this is
> an initialization event and/or no buffer reallocations are necessary, and skip
> the reallocation. Otherwise we will execute the resolution change, and in the
> case that this is an initialization event, create the buffers as appropriate.
>
> This should not require any changes to TegraVDA from the plan discussed above.
> G_FMT can be expected to fail if decoder init has not been completed (which
> should be the case anyways). What do you think?
Hmmm. This should be fine. Would there be a case where the decoder detects a new SPS sequence but the format happens to be the same? I.e., not a real resolution change but a restart of the SPS (for example, concatenation of two different streams with the same stream parameters). I am not sure if this is a valid case, though.
Thanks
Addressed all the previous comments too, PTAL. Thanks https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:112: write_whitelist->push_back(kDevVicPath); This whitelisting is only temporary and a change to remove these altogether is up for review as part of another CL.
https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1031: } Pawel brought up the point that it might be a change to the number of reference frames/output buffers required. Can you also check V4L2_CID_MIN_BUFFERS_FOR_CAPTURE as well, and skip only if that is less than or equal to the current allocation count?
https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... content/common/gpu/media/video_decode_accelerator_unittest.cc:1549: errno = 0; VaapiVideoDecodeAccelerator seems to get by without explicitly dlopen()-ing here. Can you look into how they do it?
https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/s... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/s... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:112: write_whitelist->push_back(kDevVicPath); On 2014/03/21 10:53:45, shivdasp wrote:
> This whitelisting is only temporary and a change to remove these altogether is
> up for review as part of another CL.
I'd be inclined not to add whitelisting, even if temporary, if at all possible.
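For context, a minimal sketch of the suggested V4L2_CID_MIN_BUFFERS_FOR_CAPTURE check, assuming the device_->Ioctl() wrapper from this CL; the function name, the output_buffer_map_ member and the kExtraOutputBuffers margin are placeholders, not the actual patch:

  // Returns true if the CAPTURE (output) buffer pool should be reallocated.
  bool V4L2VideoDecodeAccelerator::OutputBuffersNeedReallocation() {
    struct v4l2_control ctrl;
    memset(&ctrl, 0, sizeof(ctrl));
    ctrl.id = V4L2_CID_MIN_BUFFERS_FOR_CAPTURE;
    if (device_->Ioctl(VIDIOC_G_CTRL, &ctrl) != 0)
      return true;  // Cannot query the minimum; err on the side of reallocating.
    // Keep the current pool only if it still covers the decoder's minimum plus
    // the extra pictures handed out to the client.
    return output_buffer_map_.size() <
           static_cast<size_t>(ctrl.value) + kExtraOutputBuffers;
  }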
https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:112: write_whitelist->push_back(kDevVicPath); On 2014/03/21 23:19:54, sheu wrote:
> On 2014/03/21 10:53:45, shivdasp wrote:
> > This whitelisting is only temporary and a change to remove these altogether is
> > up for review as part of another CL.
>
> I'd be inclined not to add whitelisting, even if temporary, if at all possible.
There's a race between this CL and another one which removes the Tegra whitelisting altogether. I will remove this since the other CL is mostly ready to go.
Addressed John's comments, PTAL.

https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1031: } Done, please refer to patchset #9.
On 2014/03/21 23:19:54, sheu wrote:
> Pawel brought up the point that it might be a change to the number of reference
> frames/output buffers required. Can you also check
> V4L2_CID_MIN_BUFFERS_FOR_CAPTURE as well, and skip only if that is less than or
> equal to the current allocation count?
https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1549: errno = 0; It seems VaapiVDA is using some auto-gen mechanism to avoid the dlopen(). I am trying to understand it more, but I am afraid I might run into x86-only scripts. Do you have any pointers on how this autogen stuff works? It might help me do the change quicker.
On 2014/03/21 23:19:54, sheu wrote:
> VaapiVideoDecodeAccelerator seems to get by without explicitly dlopen()-ing
> here. Can you look into how they do it?
>
> Great, thank you. Please let me know if the EOS event also solves the EOS
> problem and whether it works with Exynos.
If I subscribe to the EOS event, I do get it in DequeueEvents() on Exynos. However, I don't think we really need to make that change in V4L2VDA immediately. I have added a fix in our stack to match the behavior so all buffers are returned.
Thanks
This is tested with all the cases for resolution change events. PTAL
I'll defer reviewing the sandboxing/library loading stuff to Jorge. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:119: const EGLint* /*attrib*/, Ignoring attributes here and not in Exynos device, while doing the opposite with buffer_index argument is not a perfect solution. Also, how do you handle exportbuf ioctl? Do you ignore it? That's also not too great. Also, how does close() on those dmabuf fds work for you now? When you ignore the call, do you return -1? I guess that's what would make close() work. It's hard to come up with a good solution here, but we should at least try to minimize being confusing and. Always doing dmabuf export and ignoring fds, ignoring attributes, ignoring some arguments make it even harder to reason about things. We should instead abstract those operations and have them in V4L2Device, since they depend on the device anyway. But we should at least be explicit about this. So I'm thinking that we should pass only the buffer index to CreateEGLImage and move dmabuf exporting and handling in general to V4L2Device for Exynos. Then we'd also have to handle everything properly for destruction, but I feel that having a DestroyEGLImage() counterpart in V4L2Device is probably better than using V4L2Device::CreteEGLImage for creation, while destroying directly via eglDestroyImageKHR. Also keep in mind that CreateEGLImage is called on the ChildThread, instead of the decoder_thread_, but currently the decoder_thread_ is sleeping until AssignPictureBuffers is done, so we should be fine. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.h:49: bool InitializeLibrarySymbols(); Methods should come before members. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:270: format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_NV12M; I think we discussed before that this should be fixed please... https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:745: current_format_ = format; Please move this to the CreateBuffersForFormat call to avoid having it in multiple places. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1007: // Check if we already have current_format_ set or this is an event s/or this/or if this/ https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1009: if ((current_format_.fmt.pix_mp.width == 0) || Since you are using the format only for size, I think you can use frame_buffer_size_ instead and drop current_format_. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... 
content/common/gpu/media/v4l2_video_decode_accelerator.cc:1013: } else if (IsResolutionChangeNecessary()) { Also, IsResolutionChangeNecessary() should handle the initialization case as well (i.e. return true if frame_buffer_size_.IsEmpty()) and this if/else shouldn't be needed here. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1951: NOTIFY_ERROR(PLATFORM_FAILURE); This means we will send NOTIFY_ERROR twice in case G_FMT failed. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1957: current_format_ = format; If you are doing this here, then there is no need to do all the format getting again in FinishResolutionChange. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1960: DVLOG(3) << "IsResolutionChangeNecessary(): Dropping resolution change"; No need for the else clause, just move it out please. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.h:14: #include <linux/videodev2.h> Why did we not need this before? If this is for v4l2_format, we were already using it... https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:59: // This method is used to create the EglImage since each V4L2Device s/the EglImage/an EGLImage/ https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the s/format/may use a different method of acquiring one and associating it to the given texture/
https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:119: const EGLint* /*attrib*/, On 2014/03/25 08:21:08, Pawel Osciak wrote: > Ignoring attributes here and not in Exynos device, while doing the opposite with > buffer_index argument is not a perfect solution. Also, how do you handle > exportbuf ioctl? Do you ignore it? That's also not too great. This is primarily because of the eglImages are created. The attr is not required on Tegra. We cannot remove this argument altogether because on Exynos it is populated with certain fields which I will have to otherwise pass in to this function. Similarly buffer_index is ignored on Exynos but on Tegra we need to send it down to the library in UseEglImage() to associate with the correct picture buffer. > > Also, how does close() on those dmabuf fds work for you now? When you ignore the > call, do you return -1? I guess that's what would make close() work. EXPORTBUF ioctl actually does not populate anything since we do not support it. And yes you are right, we return -1 and that's how the close() is also handled in V4LVDA since it fd is closed only if it is not -1. > > It's hard to come up with a good solution here, but we should at least try to > minimize being confusing and. Always doing dmabuf export and ignoring fds, > ignoring attributes, ignoring some arguments make it even harder to reason about > things. > I think we need to add more interface functions in V4L2Device class to abstract this then. > We should instead abstract those operations and have them in V4L2Device, since > they depend on the device anyway. But we should at least be explicit about this. > > So I'm thinking that we should pass only the buffer index to CreateEGLImage and > move dmabuf exporting and handling in general to V4L2Device for Exynos. Then > we'd also have to handle everything properly for destruction, but I feel that > having a DestroyEGLImage() counterpart in V4L2Device is probably better than > using V4L2Device::CreteEGLImage for creation, while destroying directly via > eglDestroyImageKHR. Okay I will add more interface API to V4L2Device. > > Also keep in mind that CreateEGLImage is called on the ChildThread, instead of > the decoder_thread_, but currently the decoder_thread_ is sleeping until > AssignPictureBuffers is done, so we should be fine. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.h:49: bool InitializeLibrarySymbols(); On 2014/03/25 08:21:08, Pawel Osciak wrote: > Methods should come before members. Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1007: // Check if we already have current_format_ set or this is an event On 2014/03/25 08:21:08, Pawel Osciak wrote: > s/or this/or if this/ Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... 
content/common/gpu/media/v4l2_video_decode_accelerator.cc:1009: if ((current_format_.fmt.pix_mp.width == 0) || On 2014/03/25 08:21:08, Pawel Osciak wrote: > Since you are using the format only for size, I think you can use > frame_buffer_size_ instead and drop current_format_. Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1951: NOTIFY_ERROR(PLATFORM_FAILURE); Since GetFormatInfo() does NOTIFY_ERROR already I will remove it from here. On 2014/03/25 08:21:08, Pawel Osciak wrote: > This means we will send NOTIFY_ERROR twice in case G_FMT failed. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1960: DVLOG(3) << "IsResolutionChangeNecessary(): Dropping resolution change"; On 2014/03/25 08:21:08, Pawel Osciak wrote: > No need for the else clause, just move it out please. Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:14: #include <linux/videodev2.h> This was for v4l2_format but now if we use frame_buffer_size_ then current_format_ can go away and hence this too. On 2014/03/25 08:21:08, Pawel Osciak wrote: > Why did we not need this before? If this is for v4l2_format, we were already > using it... https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:59: // This method is used to create the EglImage since each V4L2Device On 2014/03/25 08:21:08, Pawel Osciak wrote: > s/the EglImage/an EGLImage/ Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the On 2014/03/25 08:21:08, Pawel Osciak wrote: > s/format/may use a different method of acquiring one and associating it to the > given texture/ Done.
https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:119: const EGLint* /*attrib*/, On 2014/03/25 10:36:40, shivdasp wrote: > On 2014/03/25 08:21:08, Pawel Osciak wrote: > > Ignoring attributes here and not in Exynos device, while doing the opposite > with > > buffer_index argument is not a perfect solution. Also, how do you handle > > exportbuf ioctl? Do you ignore it? That's also not too great. > This is primarily because of the eglImages are created. The attr is not required > on > Tegra. We cannot remove this argument altogether because on Exynos it is > populated with certain fields which I will have to otherwise pass in to this > function. Similarly buffer_index is ignored on Exynos but on Tegra we need to > send it down to the library in UseEglImage() to associate with the correct > picture buffer. This is precisely why I'm suggesting attrs shouldn't be in the abstract interface, because they are not used on all platforms. Neither should the dmabufs handling code, including expbufs be in the platform-agnostic V4L2VDA class. That's why I'm suggesting moving all the attrs, dmabufs, etc. related code to CreateEglImage. You should be able to expbufs, fill in the attrs on Exynos and create EGLImages in the ExynosV4L2Device::CreateEglImage. The Tegra implementation would not need it. That's why I'm suggesting using buffer index argument on both platforms, and also size and num planes I would think. I think we should have something like: V4L2Device::CreateEGLImage(EGLDisplay egl_display, int v4l2_buffer_index, GLuint texture_id, gfx::Size size, size_t num_planes); > > > > Also, how does close() on those dmabuf fds work for you now? When you ignore > the > > call, do you return -1? I guess that's what would make close() work. > EXPORTBUF ioctl actually does not populate anything since we do not support it. > And yes you are right, we return -1 and that's how the close() is also handled > in V4LVDA since it fd is closed only if it is not -1. > > > > It's hard to come up with a good solution here, but we should at least try to > > minimize being confusing and. Always doing dmabuf export and ignoring fds, > > ignoring attributes, ignoring some arguments make it even harder to reason > about > > things. > > > I think we need to add more interface functions in V4L2Device class to abstract > this then. I think the only additional method would be DestroyEGLImage(EGLImage image). This and the new CreateEGLImage(). Of course the related data such as fds, and their relation to egl_images should be handled internally by each device. > > We should instead abstract those operations and have them in V4L2Device, since > > they depend on the device anyway. But we should at least be explicit about > this. > > > > So I'm thinking that we should pass only the buffer index to CreateEGLImage > and > > move dmabuf exporting and handling in general to V4L2Device for Exynos. Then > > we'd also have to handle everything properly for destruction, but I feel that > > having a DestroyEGLImage() counterpart in V4L2Device is probably better than > > using V4L2Device::CreteEGLImage for creation, while destroying directly via > > eglDestroyImageKHR. > Okay I will add more interface API to V4L2Device. 
> > > > Also keep in mind that CreateEGLImage is called on the ChildThread, instead of > > the decoder_thread_, but currently the decoder_thread_ is sleeping until > > AssignPictureBuffers is done, so we should be fine. >
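To summarize the direction being proposed above in code form, a sketch of what the device interface could end up looking like; the method set and parameter names are still under discussion here, so treat this as illustrative only (other members such as Ioctl() omitted):

  class V4L2Device {
   public:
    virtual ~V4L2Device() {}

    // Creates an EGLImage for the CAPTURE buffer at |buffer_index| and
    // associates it with |texture_id|. Platform details (dmabuf export and fd
    // handling on Exynos, the vendor library call on Tegra) stay inside the
    // implementation.
    virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display,
                                       GLuint texture_id,
                                       gfx::Size frame_buffer_size,
                                       unsigned int buffer_index,
                                       size_t planes_count) = 0;

    // Counterpart of CreateEGLImage(), kept for symmetry as discussed.
    virtual EGLBoolean DestroyEGLImage(EGLDisplay egl_display,
                                       EGLImageKHR egl_image) = 0;
  };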
shivdasp@: please coordinate with davidung@ who's working on https://chromiumcodereview.appspot.com/179983006/ for the sandbox changes. Thanks! https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/s... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/s... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); Is this really needed?
Jorge,
Yes, either of us will have to rebase if the other CL lands first.
Shivdas

https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); Yes this is needed for the video decode acceleration on Tegra.
On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote:
> Is this really needed?
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/26 02:54:25, shivdasp wrote: > Yes this is needed for the video decode acceleration on Tegra. I apologize, I wasn't clear. I meant "is this needed here in the sandbox whitelist". If David managed to preload *all* the other Nvidia .so's in his CL, I'm surprised this one cannot be preloaded before enabling the sandbox. > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote: > > Is this really needed? >
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); I get your point. I think all the libraries that David's CL preloaded were part of the graphics stack, which got loaded through GL initialization. We need this lib loaded here since it is not part of the graphics stack and is a wrapper for the multimedia (MM) stack.
On 2014/03/26 03:15:48, Jorge Lucangeli Obes wrote:
> On 2014/03/26 02:54:25, shivdasp wrote:
> > Yes this is needed for the video decode acceleration on Tegra.
>
> I apologize, I wasn't clear. I meant "is this needed here in the sandbox
> whitelist". If David managed to preload *all* the other Nvidia .so's in his CL,
> I'm surprised this one cannot be preloaded before enabling the sandbox.
>
> > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote:
> > > Is this really needed?
> >
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/26 03:34:29, shivdasp wrote: > I get your point. > I think all the libraries that David's CL preloaded were part of graphics stack > which got loaded through some gl initialization. > We need this lib loaded here here since it is not part of graphics stack and is > a wrapper for MM stack. > Thanks for the explanation. Then my only suggestion would be to check that video decode still works after David's change (not just that the change still applies, but that functionality works). > On 2014/03/26 03:15:48, Jorge Lucangeli Obes wrote: > > On 2014/03/26 02:54:25, shivdasp wrote: > > > Yes this is needed for the video decode acceleration on Tegra. > > > > I apologize, I wasn't clear. I meant "is this needed here in the sandbox > > whitelist". If David managed to preload *all* the other Nvidia .so's in his > CL, > > I'm surprised this one cannot be preloaded before enabling the sandbox. > > > > > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote: > > > > Is this really needed? > > > > > >
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/26 16:47:24, Jorge Lucangeli Obes wrote: > On 2014/03/26 03:34:29, shivdasp wrote: > > I get your point. > > I think all the libraries that David's CL preloaded were part of graphics > stack > > which got loaded through some gl initialization. > > We need this lib loaded here here since it is not part of graphics stack and > is > > a wrapper for MM stack. > > > > Thanks for the explanation. Then my only suggestion would be to check that video > decode still works after David's change (not just that the change still applies, > but that functionality works). Yes I have tested with David's change too. Nevertheless I will have to rebase once his change lands so it will get tested once more. Thanks. > > > On 2014/03/26 03:15:48, Jorge Lucangeli Obes wrote: > > > On 2014/03/26 02:54:25, shivdasp wrote: > > > > Yes this is needed for the video decode acceleration on Tegra. > > > > > > I apologize, I wasn't clear. I meant "is this needed here in the sandbox > > > whitelist". If David managed to preload *all* the other Nvidia .so's in his > > CL, > > > I'm surprised this one cannot be preloaded before enabling the sandbox. > > > > > > > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote: > > > > > Is this really needed? > > > > > > > > > >
Hi Pawel,
I tried to address your proposal of moving the device-specific EGL allocation into the respective V4L2Device classes, to the best of my understanding. Please have a look.
I also added two interfaces, GetCapturePixelFormat() and GetNumberOfPlanes(), to V4L2Device since these are the "expected" values from the driver and they define how the VDA behaves. Using G_FMT will only indicate what the driver has set.
These methods should also help avoid breaking either of the devices should the pixel format or number of planes change in the future.
Thanks
https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:8: #include <libdrm/drm_fourcc.h> Header ordering -- this one above. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:194: device_output_buffer_map_.clear(); This is really brittle. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.h:49: }; Since we already have the output buffers being tracked in the V4L2VDA, I think we could dispense with tracking them here altogether. The CreateEGLImage() call only needs: * EGLDisplay egl_display * GLuint texture_id * gfx::Size frame_buffer_size * unsigned int buffer_index It can call EXPBUF on the buffer_index, create the EGLImage from the DMABUF fd, and then close the fd immediately. The EGLImage is returned to the V4L2VDA and can be closed with the normal eglDestroyImageKHR call. We don't have to hold on to the fd or the EGLImage. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:79: virtual uint8 GetNumberOfPlanes() = 0; GetCapturePlaneCount()? (Something with "Capture" in it)
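A minimal sketch of that approach on the Exynos side, using VIDIOC_EXPBUF and the EGL_EXT_image_dma_buf_import path (needs <libdrm/drm_fourcc.h>, EGL/eglext.h and GLES2/gl2ext.h); the two-plane NV12 attribute list, the pitches and the texture target are illustrative assumptions, not the actual patch:

  EGLImageKHR ExynosV4L2Device::CreateEGLImage(EGLDisplay egl_display,
                                               GLuint texture_id,
                                               gfx::Size frame_buffer_size,
                                               unsigned int buffer_index) {
    // Export each CAPTURE plane of this buffer as a dmabuf fd.
    int fds[2] = {-1, -1};
    for (int plane = 0; plane < 2; ++plane) {
      struct v4l2_exportbuffer expbuf;
      memset(&expbuf, 0, sizeof(expbuf));
      expbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
      expbuf.index = buffer_index;
      expbuf.plane = plane;
      if (Ioctl(VIDIOC_EXPBUF, &expbuf) != 0)
        return EGL_NO_IMAGE_KHR;  // Error handling kept minimal for the sketch.
      fds[plane] = expbuf.fd;
    }

    EGLint attrs[] = {
        EGL_WIDTH, frame_buffer_size.width(),
        EGL_HEIGHT, frame_buffer_size.height(),
        EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_NV12,
        EGL_DMA_BUF_PLANE0_FD_EXT, fds[0],
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
        EGL_DMA_BUF_PLANE0_PITCH_EXT, frame_buffer_size.width(),
        EGL_DMA_BUF_PLANE1_FD_EXT, fds[1],
        EGL_DMA_BUF_PLANE1_OFFSET_EXT, 0,
        EGL_DMA_BUF_PLANE1_PITCH_EXT, frame_buffer_size.width(),
        EGL_NONE};
    EGLImageKHR egl_image = eglCreateImageKHR(
        egl_display, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT, NULL, attrs);

    // The GL driver dup()s the fds and the VDA keeps its reqbufs reference,
    // so the exported fds can be closed right away.
    close(fds[0]);
    close(fds[1]);

    if (egl_image != EGL_NO_IMAGE_KHR) {
      glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture_id);
      glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image);
    }
    return egl_image;
  }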
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:194: device_output_buffer_map_.clear(); Seems this is not needed now after reading your previous comment.
On 2014/03/27 02:00:23, sheu wrote:
> This is really brittle.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.h:49: }; On 2014/03/27 02:00:23, sheu wrote:
> Since we already have the output buffers being tracked in the V4L2VDA, I think
> we could dispense with tracking them here altogether. The CreateEGLImage() call
> only needs:
>
> * EGLDisplay egl_display
> * GLuint texture_id
> * gfx::Size frame_buffer_size
> * unsigned int buffer_index
>
> It can call EXPBUF on the buffer_index, create the EGLImage from the DMABUF fd,
> and then close the fd immediately. The EGLImage is returned to the V4L2VDA and
> can be closed with the normal eglDestroyImageKHR call. We don't have to hold on
> to the fd or the EGLImage.
Ohh, I did not know that the fd can be closed immediately here. All the DeviceOutputRecord and map business was there to make sure the correct fd gets closed during destruction, but it looks like this is not needed at all. All the code around device_output_buffer_map_ will also be removed. Thanks.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:79: virtual uint8 GetNumberOfPlanes() = 0; Alright, will "Capture" this in my next patchset.
On 2014/03/27 02:00:23, sheu wrote:
> GetCapturePlaneCount()? (Something with "Capture" in it)
https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the On 2014/03/25 10:36:40, shivdasp wrote: > On 2014/03/25 08:21:08, Pawel Osciak wrote: > > s/format/may use a different method of acquiring one and associating it to the > > given texture/ > > Done. If I may suggest please, it is an agreed practice in this review tool to respond "done" when uploading a new patch set that actually fixes the issue. This helps with missing some of the comments like in this case. The documentation still needs updating please. Here and in other places please. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.h:49: }; John is right. To add more detail: the owner of buffers allocated with reqbufs(mmap) holds a reference to each buffer (this is not a dmabuf fd-based reference). Expbuf creating dmabufs adds additional reference on the buffers after producing the fds. Passing the fds to EGLImage adds a third reference to each buffer (this is a second reference on the file), because the GL driver calls dup() on the passed fds. Closing the fds here immediately after EGLImage is created is ok, because the VDA retains the first, non-fd-based reference from the initial reqbufs, while the GL driver keeps its dup()ed one. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:991: if (IsResolutionChangeNecessary()) { resolution_change_pending_ = IsResolutionChangeNecessary(); https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1913: if ((static_cast<int>(format.fmt.pix_mp.width) != gfx::Size new_size(base::checked_cast<int>(format.fmt.pix_mp.width), base::checked_cast<int>(format.fmt.pix_mp.height)); if (frame_buffer_size_ != new_size) { ... Also, check if new_size isn't Empty. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; Documentation please. This should also state who is the owner responsible for destroying the images. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; Why do we need this? G_FMT tells us this. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:79: virtual uint8 GetNumberOfPlanes() = 0; G_FMT can also tell us this.
https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the Apologies. I missed it. Will make a note of this. On 2014/03/27 05:18:06, Pawel Osciak wrote: > On 2014/03/25 10:36:40, shivdasp wrote: > > On 2014/03/25 08:21:08, Pawel Osciak wrote: > > > s/format/may use a different method of acquiring one and associating it to > the > > > given texture/ > > > > Done. > > If I may suggest please, it is an agreed practice in this review tool to respond > "done" when uploading a new patch set that actually fixes the issue. This helps > with missing some of the comments like in this case. > > The documentation still needs updating please. Here and in other places please. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:991: if (IsResolutionChangeNecessary()) { On 2014/03/27 05:18:06, Pawel Osciak wrote: > resolution_change_pending_ = IsResolutionChangeNecessary(); Done. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1913: if ((static_cast<int>(format.fmt.pix_mp.width) != On 2014/03/27 05:18:06, Pawel Osciak wrote: > gfx::Size new_size(base::checked_cast<int>(format.fmt.pix_mp.width), > base::checked_cast<int>(format.fmt.pix_mp.height)); > > if (frame_buffer_size_ != new_size) { Will do. > ... > > Also, check if new_size isn't Empty. new_size will be populated since GetFormatInfo is successful, so do we need to check if it is still empty ? https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; This method is not required, since we will close the dmabuf fds while in createEGLImage() and eglDestroyImageKHR() is common to both the devices. In next patchset this will be gone. On 2014/03/27 05:18:06, Pawel Osciak wrote: > Documentation please. This should also state who is the owner responsible for > destroying the images. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; My understanding is that we call S_FMT() in VDA with the pixelformat that we expect and is compatible with the device. This is checked when we do GetFormatInfo(). Similarly number of planes on the CAPTURE_PLANE are also device specific. I added these methods so that if Exynos or Tegra change their pixel format and number of planes in future, it should not break the other. On 2014/03/27 05:18:06, Pawel Osciak wrote: > Why do we need this? G_FMT tells us this.
https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; On 2014/03/27 05:40:37, shivdasp wrote:
> This method is not required, since we will close the dmabuf fds while in
> createEGLImage() and eglDestroyImageKHR() is common to both the devices.
> In next patchset this will be gone.
I still feel we should have it even if only to call eglDestroyImageKHR directly, for symmetry.
https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 05:40:37, shivdasp wrote:
> My understanding is that we call S_FMT() in VDA with the pixelformat that we
> expect and is compatible with the device. This is checked when we do
> GetFormatInfo().
This is because we prefer that format for performance reasons. This should be made more dynamic, but I'm ok with not doing it for now.
Please rename to PreferredOutputFormat() and make it static.
> Similarly number of planes on the CAPTURE_PLANE are also device specific.
> I added these methods so that if Exynos or Tegra change their pixel format and
> number of planes in future, it should not break the other.
>
> On 2014/03/27 05:18:06, Pawel Osciak wrote:
> > Why do we need this? G_FMT tells us this.
>
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 05:58:35, Pawel Osciak wrote: > On 2014/03/27 05:40:37, shivdasp wrote: > > My understanding is that we call S_FMT() in VDA with the pixelformat that we > > expect and is compatible with the device. This is checked when we do > > GetFormatInfo(). > > This is because we prefer that format for performance reasons. This should made > more dynamic, but I'm ok with not doing it for now. > > Please rename to PreferredOutputFormat() and make it static. Sorry I didn't get it. I thought this method would be extended by the device specific class since it can be different. Do you mean having a static data members that are initialized to pixel format and number of planes in the ExynosV4L2Device and TegraV4L2Device ? > > > Similarly number of planes on the CAPTURE_PLANE are also device specific. > > I added these methods so that if Exynos or Tegra change their pixel format and > > number of planes in future, it should not break the other. > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > Why do we need this? G_FMT tells us this. > > >
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 06:51:46, shivdasp wrote: > On 2014/03/27 05:58:35, Pawel Osciak wrote: > > On 2014/03/27 05:40:37, shivdasp wrote: > > > My understanding is that we call S_FMT() in VDA with the pixelformat that we > > > expect and is compatible with the device. This is checked when we do > > > GetFormatInfo(). > > > > This is because we prefer that format for performance reasons. This should > made > > more dynamic, but I'm ok with not doing it for now. > > > > Please rename to PreferredOutputFormat() and make it static. > Sorry I didn't get it. > I thought this method would be extended by the device specific class since it > can be different. Do you mean having a static data members that are initialized > to pixel format and number of planes in the ExynosV4L2Device and TegraV4L2Device > ? Yeah sorry please ignore the static part. As I mentioned though, you shouldn't need a plane number getter. G_FMT returns this. > > > > > Similarly number of planes on the CAPTURE_PLANE are also device specific. > > > I added these methods so that if Exynos or Tegra change their pixel format > and > > > number of planes in future, it should not break the other. > > > > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > > Why do we need this? G_FMT tells us this. > > > > > >
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; On 2014/03/27 05:58:35, Pawel Osciak wrote: > On 2014/03/27 05:40:37, shivdasp wrote: > > This method is not required, since we will close the dmabuf fds while in > > createEGLImage() and eglDestroyImageKHR() is common to both the devices. > > In next patchset this will be gone. > > I still feel we should have it even if only to call eglDestroyImageKHR directly, > for symmetry. IMO we could call it CreateEGLImageForBuffer(..., unsigned int index) and not have to shell out for the Destroy() part. For this particular bit I don't think symmetry is necessary. https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 06:59:56, Pawel Osciak wrote: > On 2014/03/27 06:51:46, shivdasp wrote: > > On 2014/03/27 05:58:35, Pawel Osciak wrote: > > > On 2014/03/27 05:40:37, shivdasp wrote: > > > > My understanding is that we call S_FMT() in VDA with the pixelformat that > we > > > > expect and is compatible with the device. This is checked when we do > > > > GetFormatInfo(). > > > > > > This is because we prefer that format for performance reasons. This should > > made > > > more dynamic, but I'm ok with not doing it for now. > > > > > > Please rename to PreferredOutputFormat() and make it static. > > Sorry I didn't get it. > > I thought this method would be extended by the device specific class since it > > can be different. Do you mean having a static data members that are > initialized > > to pixel format and number of planes in the ExynosV4L2Device and > TegraV4L2Device > > ? > > Yeah sorry please ignore the static part. > > As I mentioned though, you shouldn't need a plane number getter. G_FMT returns > this. > > > > > > > > Similarly number of planes on the CAPTURE_PLANE are also device specific. > > > > I added these methods so that if Exynos or Tegra change their pixel format > > and > > > > number of planes in future, it should not break the other. > > > > > > > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > > > Why do we need this? G_FMT tells us this. > > > > > > > > > > That's right actually. All we use the plane number getter is to DCHECK on it, and if we were DCHECKing on it we should rather be doing it in the device-specific V4L2VideoDevice implementation. And since we're going to hide the EGLImage creation logic inside the V4L2VideoDevice implementation, there's no reason to expose the plane count here.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 07:08:49, sheu wrote: > On 2014/03/27 06:59:56, Pawel Osciak wrote: > > On 2014/03/27 06:51:46, shivdasp wrote: > > > On 2014/03/27 05:58:35, Pawel Osciak wrote: > > > > On 2014/03/27 05:40:37, shivdasp wrote: > > > > > My understanding is that we call S_FMT() in VDA with the pixelformat > that > > we > > > > > expect and is compatible with the device. This is checked when we do > > > > > GetFormatInfo(). > > > > > > > > This is because we prefer that format for performance reasons. This should > > > made > > > > more dynamic, but I'm ok with not doing it for now. > > > > > > > > Please rename to PreferredOutputFormat() and make it static. > > > Sorry I didn't get it. > > > I thought this method would be extended by the device specific class since > it > > > can be different. Do you mean having a static data members that are > > initialized > > > to pixel format and number of planes in the ExynosV4L2Device and > > TegraV4L2Device > > > ? > > > > Yeah sorry please ignore the static part. > > > > As I mentioned though, you shouldn't need a plane number getter. G_FMT returns > > this. > > > > > > > > > > > Similarly number of planes on the CAPTURE_PLANE are also device > specific. > > > > > I added these methods so that if Exynos or Tegra change their pixel > format > > > and > > > > > number of planes in future, it should not break the other. > > > > > > > > > > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > > > > Why do we need this? G_FMT tells us this. > > > > > > > > > > > > > > > > That's right actually. All we use the plane number getter is to DCHECK on it, > and if we were DCHECKing on it we should rather be doing it in the > device-specific V4L2VideoDevice implementation. And since we're going to hide > the EGLImage creation logic inside the V4L2VideoDevice implementation, there's > no reason to expose the plane count here. Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line 1145 too. Though this plane count is actually coming from output_record.fds it is actually device implementation specific. We populate the v4l2 struct arguments based on the num_planes that we expect by the underlying driver. So I think having a getter() will protect any such changes in device implementations. If you agree, I have made some changes to remove the fds from OutputRecord since it is not really needed and it simplifies things a bit, I will upload it in few mins.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi...
File content/common/gpu/media/v4l2_video_device.h (right):

https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi...
content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0;
> Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line 1145
> too. Though this plane count is actually coming from output_record.fds it is
> actually device implementation specific. We populate the v4l2 struct arguments
> based on the num_planes that we expect by the underlying driver.
> So I think having a getter() will protect any such changes in device
> implementations.

The number of planes should come from v4l2_pix_format_mplane.num_planes on G_FMT.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 07:45:06, Pawel Osciak wrote: > > Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line 1145 > > too. Though this plane count is actually coming from output_record.fds it is > > actually device implementation specific. We populate the v4l2 struct arguments > > based on the num_planes that we expect by the underlying driver. > > So I think having a getter() will protect any such changes in device > > implementations. > > The number of planes should come from v4l2_pix_format_mplane.num_planes on > G_FMT. It does come from G_FMT but we have a CHECK_EQ in CreateBuffersForFormat() to compare it with constant 2. Instead of a constant 2 in VDA, I am trying to use a getter() from device specific implementation so that changes in either device should not break because of this CHECK. Please have a look at my next patchset.
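As a reference for the G_FMT approach discussed above, a minimal sketch of querying the CAPTURE plane count from the driver instead of hardcoding it. The raw ioctl() call and the helper name are placeholders; the real code would route this through the V4L2Device wrapper:

#include <linux/videodev2.h>
#include <string.h>
#include <sys/ioctl.h>

// Returns the number of CAPTURE planes reported by the driver via G_FMT, or
// 0 if the format is not available yet (e.g. the stream header has not been
// parsed and the caller should retry later). |device_fd| stands in for the
// already-open decoder fd.
size_t GetOutputPlanesCount(int device_fd) {
  struct v4l2_format format;
  memset(&format, 0, sizeof(format));
  format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  if (ioctl(device_fd, VIDIOC_G_FMT, &format) != 0)
    return 0;
  return format.fmt.pix_mp.num_planes;
}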
Addressed review comments, PTAL.
On 2014/03/27 07:54:54, shivdasp wrote: > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > File content/common/gpu/media/v4l2_video_device.h (right): > > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 > GetCapturePixelFormat() = 0; > On 2014/03/27 07:45:06, Pawel Osciak wrote: > > > Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line > 1145 > > > too. Though this plane count is actually coming from output_record.fds it is > > > actually device implementation specific. We populate the v4l2 struct > arguments > > > based on the num_planes that we expect by the underlying driver. > > > So I think having a getter() will protect any such changes in device > > > implementations. > > > > The number of planes should come from v4l2_pix_format_mplane.num_planes on > > G_FMT. > It does come from G_FMT but we have a CHECK_EQ in CreateBuffersForFormat() to > compare it with constant 2. Instead of a constant 2 in VDA, I am trying to use a > getter() from device specific implementation so that changes in either device > should not break because of this CHECK. Please have a look at my next patchset. I mentioned it before, but that DCHECK shouldn't be there and the class shouldn't have "2" hardcoded in multiple places. It should query and use what G_FMT returns.
On 2014/03/27 08:01:16, Pawel Osciak wrote: > On 2014/03/27 07:54:54, shivdasp wrote: > > > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > > File content/common/gpu/media/v4l2_video_device.h (right): > > > > > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > > content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 > > GetCapturePixelFormat() = 0; > > On 2014/03/27 07:45:06, Pawel Osciak wrote: > > > > Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line > > 1145 > > > > too. Though this plane count is actually coming from output_record.fds it > is > > > > actually device implementation specific. We populate the v4l2 struct > > arguments > > > > based on the num_planes that we expect by the underlying driver. > > > > So I think having a getter() will protect any such changes in device > > > > implementations. > > > > > > The number of planes should come from v4l2_pix_format_mplane.num_planes on > > > G_FMT. > > It does come from G_FMT but we have a CHECK_EQ in CreateBuffersForFormat() to > > compare it with constant 2. Instead of a constant 2 in VDA, I am trying to use > a > > getter() from device specific implementation so that changes in either device > > should not break because of this CHECK. Please have a look at my next > patchset. > > I mentioned it before, but that DCHECK shouldn't be there and the class > shouldn't have "2" hardcoded in multiple places. > It should query and use what G_FMT returns. Okay then I will remove the DCHECK. And to remove the hardcoding I guess will have to store the num_planes returned from G_FMT and use that instead.
Addressed review comments regarding num_planes.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; As I mentioned before, num_planes should be passed to this method and not hardcoded. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; If this fails on any other than the first fd, we will never close the ones we've already exported and leak. Please replace all close() calls in this method (including the ones on successful return) with a base::ScopedClosureRunner(): static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { for (size_t i = 0; i < dmabuf_fds->size(); ++i) close(dmabuf_fds->at(i)); } And in this method: linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, -1)); base::Closure dmabuf_fds_cb = base::Bind(&CloseDmabufFds, dmabuf_fds); base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); And here expbufs assigning to dmabuf_fds->at(i). https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; This shouldn't be hardcoded either actually... How about instead of num_planes, size, etc. we just pass v4l2_format to this method and extract all info from there? And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this DRM define for now.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; I think this is do-able quickly.Done. On 2014/03/27 09:09:50, Pawel Osciak wrote: > As I mentioned before, num_planes should be passed to this method and not > hardcoded. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 09:09:50, Pawel Osciak wrote: > If this fails on any other than the first fd, we will never close the ones we've > already exported and leak. > > Please replace all close() calls in this method (including the ones on > successful return) with a base::ScopedClosureRunner(): > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > close(dmabuf_fds->at(i)); > } > > And in this method: > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, -1)); > base::Closure dmabuf_fds_cb = > base::Bind(&CloseDmabufFds, dmabuf_fds); > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > And here expbufs assigning to dmabuf_fds->at(i). I need to understand the usage here. If I understand each of the fds are added in the linked list and then deleted on failure. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/27 09:09:50, Pawel Osciak wrote: > This shouldn't be hardcoded either actually... > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a G_FMT anywhere ? Since it is anyways in device implementation, it is confined for Exynos only. > How about instead of num_planes, size, etc. we just pass v4l2_format to this > method and extract all info from there? Hmm... I think sending the complete v4l2_format struct here may not help. Either we populate the v4l2_format before the call or in this class. I prefer having it in this class. > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this DRM > define for now.
For what it's worth -- I think we're really close. Just bits now. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/27 09:09:50, Pawel Osciak wrote: > As I mentioned before, num_planes should be passed to this method and not > hardcoded. I'd disagree here -- if we're defining a preferred output format then we should know the plane count here. I think that's the simplest way around the issue. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 09:09:50, Pawel Osciak wrote: > If this fails on any other than the first fd, we will never close the ones we've > already exported and leak. > > Please replace all close() calls in this method (including the ones on > successful return) with a base::ScopedClosureRunner(): > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > close(dmabuf_fds->at(i)); > } > > And in this method: > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, -1)); > base::Closure dmabuf_fds_cb = > base::Bind(&CloseDmabufFds, dmabuf_fds); > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > And here expbufs assigning to dmabuf_fds->at(i). See above about preferred format. In the case we go hard-coded on the number of planes, then dmabufds can be an array of ScopedFD and we get the same auto-destruction behavior when they go out of scope. i.e.: ScopedFD dma_bufs[2]; or if it's not hard-coded, even: scoped_ptr<ScopedFD[]> dma_bufs(new ScopedFd[num_planes]); https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/27 09:09:50, Pawel Osciak wrote: > This shouldn't be hardcoded either actually... > > How about instead of num_planes, size, etc. we just pass v4l2_format to this > method and extract all info from there? > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this DRM > define for now. See above about preferred format.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/27 09:40:41, sheu wrote: > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > As I mentioned before, num_planes should be passed to this method and not > > hardcoded. > > I'd disagree here -- if we're defining a preferred output format then we should > know the plane count here. I think that's the simplest way around > the issue. I think the num_planes depend upon the output format. So if we choose to use a different output format, we wouldn't need to change the device implementation if we use the parameterized num_planes here. I think that's what Pawel thinks. But I guess that's partially true since there is rest of the code which would need to change if num planes change. Filling up the attrs etc. I am okay either ways. Let me know. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 09:40:41, sheu wrote: > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > If this fails on any other than the first fd, we will never close the ones > we've > > already exported and leak. > > > > Please replace all close() calls in this method (including the ones on > > successful return) with a base::ScopedClosureRunner(): > > > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > > close(dmabuf_fds->at(i)); > > } > > > > And in this method: > > > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, > -1)); > > base::Closure dmabuf_fds_cb = > > base::Bind(&CloseDmabufFds, dmabuf_fds); > > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > > > And here expbufs assigning to dmabuf_fds->at(i). > > See above about preferred format. > > In the case we go hard-coded on the number of planes, then dmabufds can be an > array of ScopedFD and we get the same auto-destruction behavior when they go out > of scope. i.e.: > > ScopedFD dma_bufs[2]; > > or if it's not hard-coded, even: > > scoped_ptr<ScopedFD[]> dma_bufs(new ScopedFd[num_planes]); Using ScopedFD looks easier, let me make that change. Thanks.
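A minimal sketch of the ScopedFD-style cleanup agreed on above, assuming a hand-rolled RAII wrapper in place of the real ScopedFD type and a raw VIDIOC_EXPBUF loop; names and signatures are illustrative only:

#include <fcntl.h>
#include <linux/videodev2.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

// Stand-in for the ScopedFD type referenced above: owns a dmabuf fd and
// closes it on destruction, so an early return cannot leak exported fds.
class ScopedDmabufFd {
 public:
  ScopedDmabufFd() : fd_(-1) {}
  ~ScopedDmabufFd() {
    if (fd_ >= 0)
      close(fd_);
  }
  void reset(int fd) {
    if (fd_ >= 0)
      close(fd_);
    fd_ = fd;
  }
  int get() const { return fd_; }

 private:
  // Copying is intentionally not supported in this sketch.
  ScopedDmabufFd(const ScopedDmabufFd&);
  void operator=(const ScopedDmabufFd&);
  int fd_;
};

// Exports one dmabuf fd per plane of CAPTURE buffer |buffer_index|.
// |dmabuf_fds| must point to an array of |num_planes| entries owned by the
// caller; on failure, the entries already filled in clean themselves up when
// that array goes out of scope.
bool ExportDmabufFds(int device_fd,
                     unsigned int buffer_index,
                     size_t num_planes,
                     ScopedDmabufFd* dmabuf_fds) {
  for (size_t i = 0; i < num_planes; ++i) {
    struct v4l2_exportbuffer expbuf;
    memset(&expbuf, 0, sizeof(expbuf));
    expbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    expbuf.index = buffer_index;
    expbuf.plane = i;
    expbuf.flags = O_CLOEXEC;
    if (ioctl(device_fd, VIDIOC_EXPBUF, &expbuf) != 0)
      return false;
    dmabuf_fds[i].reset(expbuf.fd);
  }
  return true;
}

Per the discussion above, the exported fds only need to live until the EGLImage has been created inside CreateEGLImage(), so letting them close automatically on scope exit is sufficient.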
Used ScopedFD for the dmabuf fds. Waiting for consensus on the other comments. PTAL
On 2014/03/27 10:51:41, shivdasp wrote: Hi Pawel, just wanted to confirm whether we are now in a position to proceed with code approval. Thanks for all the help. Kaustubh
https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/27 10:06:46, shivdasp wrote: > On 2014/03/27 09:40:41, sheu wrote: > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > As I mentioned before, num_planes should be passed to this method and not > > > hardcoded. > > > > I'd disagree here -- if we're defining a preferred output format then we > should > > know the plane count here. I think that's the simplest way around > the > issue. > I think the num_planes depend upon the output format. So if we choose to use a > different output format, we wouldn't need to change the device implementation if > we use the parameterized num_planes here. I think that's what Pawel thinks. Yes this is exactly what I mean. There is no need to hardcode both, because the driver should return that value in G_FMT once we set the correct one. So there is no need for this class to have a method for getting number of planes. V4L2VDA can just call G_FMT to find out what it is. > But I guess that's partially true since there is rest of the code which would > need to change if num planes change. Filling up the attrs etc. > I am okay either ways. Let me know. We still need it, since V4L2VDA needs to know for other calls, like qbuf. For Exynos num planes would be 2, but I think you didn't change the Tegra's preferred format. You said before that it was not using 2-plane NV12M? That's why using G_FMT in V4L2VDA and taking num_planes from there is the most universal (and API-conformant), while simple enough way I would say. https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 10:06:46, shivdasp wrote: > On 2014/03/27 09:40:41, sheu wrote: > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > If this fails on any other than the first fd, we will never close the ones > > we've > > > already exported and leak. > > > > > > Please replace all close() calls in this method (including the ones on > > > successful return) with a base::ScopedClosureRunner(): > > > > > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > > > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > > > close(dmabuf_fds->at(i)); > > > } > > > > > > And in this method: > > > > > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, > > -1)); > > > base::Closure dmabuf_fds_cb = > > > base::Bind(&CloseDmabufFds, dmabuf_fds); > > > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > > > > > And here expbufs assigning to dmabuf_fds->at(i). > > > > See above about preferred format. > > > > In the case we go hard-coded on the number of planes, then dmabufds can be an > > array of ScopedFD and we get the same auto-destruction behavior when they go > out > > of scope. i.e.: > > > > ScopedFD dma_bufs[2]; > > > > or if it's not hard-coded, even: > > > > scoped_ptr<ScopedFD[]> dma_bufs(new ScopedFd[num_planes]); > > Using ScopedFD looks easier, let me make that change. Thanks. > Yes, sorry, I forgot we had ScopedFD https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... 
content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/27 09:33:22, shivdasp wrote: > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > This shouldn't be hardcoded either actually... > > > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a G_FMT > anywhere ? Since it is anyways in device implementation, it is confined for > Exynos only. > Ok, let's keep it for now and figure it out later. No need to block this CL on this. > > How about instead of num_planes, size, etc. we just pass v4l2_format to this > > method and extract all info from there? That's exactly what I hoped for. > Hmm... I think sending the complete v4l2_format struct here may not help. > Either we populate the v4l2_format before the call or in this class. I prefer > having it in this class. It's the driver who should populate it on G_FMT. > > > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this > DRM > > define for now. > Yes exactly, agreed.
By the way, for the sake of not holding this CL up too much longer (I really appreciate your patience), I'm fine with hardcoding in each device class. But let's just please not have hardcoding in V4L2VDA if possible. This should be achievable by using the preferred format acquired from the device class and getting num_planes from the G_FMT ioctl.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/28 05:33:53, Pawel Osciak wrote: > On 2014/03/27 10:06:46, shivdasp wrote: > > On 2014/03/27 09:40:41, sheu wrote: > > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > > As I mentioned before, num_planes should be passed to this method and not > > > > hardcoded. > > > > > > I'd disagree here -- if we're defining a preferred output format then we > > should > > > know the plane count here. I think that's the simplest way around > the > > issue. > > I think the num_planes depend upon the output format. So if we choose to use a > > different output format, we wouldn't need to change the device implementation > if > > we use the parameterized num_planes here. I think that's what Pawel thinks. > > Yes this is exactly what I mean. There is no need to hardcode both, because the > driver should return that value in G_FMT once we set the correct one. So there > is no need for this class to have a method for getting number of planes. V4L2VDA > can just call G_FMT to find out what it is. Okay I will make a change to send in the num_planes as an additional argument to CreateEGLImage(). That's the only change I hope. > > > But I guess that's partially true since there is rest of the code which would > > need to change if num planes change. Filling up the attrs etc. > > I am okay either ways. Let me know. > > We still need it, since V4L2VDA needs to know for other calls, like qbuf. For > Exynos num planes would be 2, but I think you didn't change the Tegra's > preferred format. You said before that it was not using 2-plane NV12M? > That's why using G_FMT in V4L2VDA and taking num_planes from there is the most > universal (and API-conformant), while simple enough way I would say. We made change in the buffer allocation to have Tegra's preferred format same as Exynos. Anyways since we now have moved code related to output formats into device specific changing either ways will not affect another. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/28 05:33:53, Pawel Osciak wrote: > On 2014/03/27 09:33:22, shivdasp wrote: > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > This shouldn't be hardcoded either actually... > > > > > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a > G_FMT > > anywhere ? Since it is anyways in device implementation, it is confined for > > Exynos only. > > > > Ok, let's keep it for now and figure it out later. No need to block this CL on > this. > > > > How about instead of num_planes, size, etc. we just pass v4l2_format to this > > > method and extract all info from there? > > That's exactly what I hoped for. > > > Hmm... I think sending the complete v4l2_format struct here may not help. > > Either we populate the v4l2_format before the call or in this class. I prefer > > having it in this class. > > It's the driver who should populate it on G_FMT. > > > > > > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this > > DRM > > > define for now. > > > > Yes exactly, agreed. I understand there is no more change required here now atleast in this CL.
On 2014/03/28 05:42:37, shivdasp wrote: > We made change in the buffer allocation to have Tegra's preferred format same as > Exynos. Anyways since we now have moved code related to output formats into > device specific changing either ways will not affect another. Are you now allocating two discontiguous memory buffers per each v4l2 buffer and using exactly this pixel format: http://linuxtv.org/downloads/v4l-dvb-apis/re30.html and your converter/GPU now handles this format? V4L2::Dequeue() still needs the hardcoding to be removed, but yes, apart from that I think that should be all for now. Let's not hold this up anymore. > https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... > content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = > DRM_FORMAT_NV12; > On 2014/03/28 05:33:53, Pawel Osciak wrote: > > On 2014/03/27 09:33:22, shivdasp wrote: > > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > > This shouldn't be hardcoded either actually... > > > > > > > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a > > G_FMT > > > anywhere ? Since it is anyways in device implementation, it is confined for > > > Exynos only. > > > > > > > Ok, let's keep it for now and figure it out later. No need to block this CL on > > this. > > > > > > How about instead of num_planes, size, etc. we just pass v4l2_format to > this > > > > method and extract all info from there? > > > > That's exactly what I hoped for. > > > > > Hmm... I think sending the complete v4l2_format struct here may not help. > > > Either we populate the v4l2_format before the call or in this class. I > prefer > > > having it in this class. > > > > It's the driver who should populate it on G_FMT. > > > > > > > > > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode > this > > > DRM > > > > define for now. > > > > > > > Yes exactly, agreed. > > I understand there is no more change required here now atleast in this CL. Yes, let's leave hardcoding removal in device classes for later.
https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1539: Let's not remove this please. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1005: struct v4l2_plane planes[2]; Remaining plane number hardcode. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1137: struct v4l2_plane qbuf_planes[output_planes_count_]; I don't think this is valid C++, it's an extension. scoped_ptr<struct v4l2_plane []> https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1783: DVLOG(1) << __func__ << " eglDestroyImageKHR failed."; Please update the comment. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) We need to verify that GetFormatInfo doesn't fail and returns something that makes sense. Please remove this, the if (frame_buffer_size_ != new_size) will handle this case as well. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; s/uint8/size_t/
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1539: On 2014/03/28 06:21:46, Pawel Osciak wrote: > Let's not remove this please. Done. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1005: struct v4l2_plane planes[2]; Yes I am removing this in a follow up patchset. Since Dequeue() thread starts earlier (and hence output_planes_count_ may not initialized) I will have to move this declaration within the while(output_buffer_queued_count_) loop and leave this one here with value 1 for OUTPUT plane. On 2014/03/28 06:21:46, Pawel Osciak wrote: > Remaining plane number hardcode. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1137: struct v4l2_plane qbuf_planes[output_planes_count_]; Will fix this in next patchset. On 2014/03/28 06:21:46, Pawel Osciak wrote: > I don't think this is valid C++, it's an extension. > scoped_ptr<struct v4l2_plane []> https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1783: DVLOG(1) << __func__ << " eglDestroyImageKHR failed."; On 2014/03/28 06:21:46, Pawel Osciak wrote: > Please update the comment. Will fix in next patchset. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) This basically is used to trigger Decoder initialization through RESOLUTION_CHANGE event. If we remove this, we need to modify GetFormatInfo() which would probably need a change in DBI() as well since it gets called from there as well. If GetFormatInfo() does not get new format, it set again to true. Should we keep this for now ? On 2014/03/28 06:21:46, Pawel Osciak wrote: > We need to verify that GetFormatInfo doesn't fail and returns something that > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) will > handle this case as well. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; On 2014/03/28 06:21:46, Pawel Osciak wrote: > s/uint8/size_t/ Done.
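For reference, a minimal sketch of the heap-allocated plane array suggested above in place of the variable-length array (a compiler extension, not valid standard C++). The surrounding EnqueueOutputRecord() context is assumed; the argument names mirror the members discussed in the thread and the include path is approximate:

#include <linux/videodev2.h>
#include <string.h>

#include "base/memory/scoped_ptr.h"  // Include path is approximate.

// Fragment of a hypothetical EnqueueOutputRecord().
void EnqueueOutputRecordSketch(int buffer_index, size_t output_planes_count) {
  // Heap-allocate the per-plane array instead of relying on the VLA extension,
  // sized by the plane count learned from G_FMT.
  scoped_ptr<struct v4l2_plane[]> qbuf_planes(
      new struct v4l2_plane[output_planes_count]);
  memset(qbuf_planes.get(), 0,
         sizeof(struct v4l2_plane) * output_planes_count);

  struct v4l2_buffer qbuf;
  memset(&qbuf, 0, sizeof(qbuf));
  qbuf.index = buffer_index;
  qbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  qbuf.memory = V4L2_MEMORY_MMAP;  // Memory type assumed from the MMAP-based code.
  qbuf.m.planes = qbuf_planes.get();
  qbuf.length = output_planes_count;
  // VIDIOC_QBUF with |qbuf| would follow here, as in the existing code.
}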
https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) On 2014/03/28 06:40:43, shivdasp wrote: > This basically is used to trigger Decoder initialization through > RESOLUTION_CHANGE event. Yes. > If we remove this, we need to modify GetFormatInfo() which would probably need a > change in DBI() as well since it gets called from there as well. > If GetFormatInfo() does not get new format, it set again to true. > Should we keep this for now ? Sorry, I don't understand. How would we need to modify GetFormatInfo()? If we get here, then it means we got an event from the driver. So GetFormatInfo cannot fail. Why would we want to set this to true if GetFormatInfo fails? > On 2014/03/28 06:21:46, Pawel Osciak wrote: > > We need to verify that GetFormatInfo doesn't fail and returns something that > > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) will > > handle this case as well. >
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) On 2014/03/28 06:52:34, Pawel Osciak wrote: > On 2014/03/28 06:40:43, shivdasp wrote: > > This basically is used to trigger Decoder initialization through > > RESOLUTION_CHANGE event. > > Yes. > > > If we remove this, we need to modify GetFormatInfo() which would probably need > a > > change in DBI() as well since it gets called from there as well. > > If GetFormatInfo() does not get new format, it set again to true. > > Should we keep this for now ? > > Sorry, I don't understand. How would we need to modify GetFormatInfo()? If we > get here, then it means we got an event from the driver. So GetFormatInfo cannot > fail. > > Why would we want to set this to true if GetFormatInfo fails? > GetFormatInfo() can fail if the asynchronos decoder initialization has not yet completed when we were in DBI(). And this function is triggered through the RESOLUTION_CHANGE event enqueued to trigger decoder initialization. I am sorry, I am not very clear what exactly you would like to change here. Could you please elaborate ? > > On 2014/03/28 06:21:46, Pawel Osciak wrote: > > > We need to verify that GetFormatInfo doesn't fail and returns something that > > > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) > will > > > handle this case as well. > > >
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) On 2014/03/28 07:00:19, shivdasp wrote: > On 2014/03/28 06:52:34, Pawel Osciak wrote: > > On 2014/03/28 06:40:43, shivdasp wrote: > > > This basically is used to trigger Decoder initialization through > > > RESOLUTION_CHANGE event. > > > > Yes. > > > > > If we remove this, we need to modify GetFormatInfo() which would probably > need > > a > > > change in DBI() as well since it gets called from there as well. > > > If GetFormatInfo() does not get new format, it set again to true. > > > Should we keep this for now ? > > > > Sorry, I don't understand. How would we need to modify GetFormatInfo()? If we > > get here, then it means we got an event from the driver. So GetFormatInfo > cannot > > fail. > > > > Why would we want to set this to true if GetFormatInfo fails? > > > > GetFormatInfo() can fail if the asynchronos decoder initialization has not yet > completed when we were in DBI(). And this function is triggered through the > RESOLUTION_CHANGE event enqueued to trigger decoder initialization. > But then the driver should not send the event if it's not ready to receive a G_FMT. The driver may only send the event if it's ready for a G_FMT. And we only get here after receiving the event... Am I missing something? > I am sorry, I am not very clear what exactly you would like to change here. > Could you please elaborate ? > > > > On 2014/03/28 06:21:46, Pawel Osciak wrote: > > > > We need to verify that GetFormatInfo doesn't fail and returns something > that > > > > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) > > will > > > > handle this case as well. > > > > > >
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) > But then the driver should not send the event if it's not ready to receive a > G_FMT. > The driver may only send the event if it's ready for a G_FMT. > And we only get here after receiving the event... > Am I missing something? > Yes you are right, this if frame_buffer_size_.IsEmpty() check is not really required. The ctrl.value will differ if the decoder initialization has not happened and will trigger resolution change sequence. I will remove this if check from there in the next patchset.
Made the number of planes parameterized and addressed other comments. PTAL
Do let me know if this looks alright.
https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; s/uint8/size_t/
What's the plan for sandbox changes? In any case, I think it's time to ask Ami for another look.
On 2014/03/28 10:19:12, Pawel Osciak wrote: > What's the plan for sandbox changes? I explained to Jorge the need to load the library in the sandbox file. BTW, is there a way to submit a CL with an anticipatory rebase? https://codereview.chromium.org/179983006/ has been in the CQ for some time and I will have to rebase after it lands. > > In any case, I think it's time to ask Ami for another look.
https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; On 2014/03/28 10:18:42, Pawel Osciak wrote: > s/uint8/size_t/ Done.
Mostly nits; I think this is really close to being landable! https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // drop this line https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:104: static bool functions_initialized = InitializeLibrarySymbols(); this is racy: http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:154: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ ## and # are operators (albeit a pre-processor one). Chromium style puts spaces around operators. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; Considering that the containing method is only called if the driver indicated a resolution change, shouldn't this default to true in both the again==true and !ret cases? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1920: return false; Last comment put another way: what can trigger V4L2_EVENT_RESOLUTION_CHANGE but still not want to be a resolution change? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:297: // is indeed required by returning true if either: s/if either/iff/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:298: // - width or height of the new format is different than previous format. s/./; or/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:300: // Returns false otherwise. drop this line if you take my suggestion at l.297 https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:435: // Stores the number of planes (i.e. separate memory buffers) for output. This is decoder-thread state so belongs above. I'd put it at l.397 and add it to the list of variables in the comment at l.373. (read & set on the decoder thread, read on the child thread but only in AssignPictureBuffers which has the comment at l.334 of the .cc file). https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:60: // This method is used to create an EglImage since each V4L2Device "This method is used to " is not adding value (ditto for other methods in this class whose comment starts with "This method " or "These methods are used to " etc.). 
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Function_Comments for details. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:62: // given texture. The texture_id is used to bind the texture to the created s/created/returned/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by s/eglImage/EGLImageKHR/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by s/created/returned/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by s/EglImage/EGLImageKHR/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:79: virtual uint32 PreferredOutputFormat() = 0; Both impls return the same (V4L2_PIX_FMT_NV12M). Is this future-proofing or is someone planning a change here? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on dlopen's refcounting nature to avoid sandbox violations in the sandboxed case, b/c the pre-sandbox code will already have dlopen'd this). https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); IDK what the TODO above this line is about, but if this dlopen falls under it, then probably it should move above the TODO, and if it does not, it probably warrants a newline between the TODO above and this dlopen, and maybe even a clarification of the TODO that it refers to the libs _preceding_ it. https://codereview.chromium.org/137023008/diff/900001/content/content_common.... File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/900001/content/content_common.... content/content_common.gypi:627: 'common/gpu/media/tegra_v4l2_video_device.h', OOC what is the impact of this on the size (bytes) of libcontent.a or libcontent_common.a (depending on whether you're building static or shared libs)?
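One common Chromium way to avoid the function-local static initialization race flagged in the review message above is base::LazyInstance, which performs thread-safe, one-time construction. A minimal sketch, with TegraFunctionSymbolFinder and its contents as placeholders rather than the actual patch:

#include "base/lazy_instance.h"

// Resolves the vendor library symbols exactly once, in a thread-safe way.
class TegraFunctionSymbolFinder {
 public:
  TegraFunctionSymbolFinder() : initialized_(false) {
    // dlopen()/dlsym() the library here, once, and record the result.
    initialized_ = true;  // Placeholder for the real symbol-resolution result.
  }
  bool initialized() const { return initialized_; }

 private:
  bool initialized_;
};

base::LazyInstance<TegraFunctionSymbolFinder> g_tegra_function_symbol_finder =
    LAZY_INSTANCE_INITIALIZER;

// In TegraV4L2Device::Initialize(), for example:
//   if (!g_tegra_function_symbol_finder.Get().initialized())
//     return false;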
Ami and John, could you take a look at this please? Thanks
On 2014/03/28 17:34:33, shivdasp wrote: > Ami and John could you take a look at this please. > Thanks I commented on PS#17 but don't see a PS#18. What do you want me to look at?
Patch incoming to address these comments. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // On 2014/03/28 17:10:01, Ami Fischman wrote: > drop this line Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:104: static bool functions_initialized = InitializeLibrarySymbols(); How was/is this done in vaapi_wrapper.c ? There is no real computation happening in InitializeLibrarySymbols() so should it be okay as is ? atleast for now. On 2014/03/28 17:10:01, Ami Fischman wrote: > this is racy: > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:154: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ On 2014/03/28 17:10:01, Ami Fischman wrote: > ## and # are operators (albeit a pre-processor one). Chromium style puts spaces > around operators. Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; See below response. On 2014/03/28 17:10:01, Ami Fischman wrote: > Considering that the containing method is only called if the driver indicated a > resolution change, shouldn't this default to true in both the again==true and > !ret cases? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1920: return false; V4L2_EVENT_RESOLUTION change is enqueued by the driver at the time of decoder initialization. The DBI() sequence also tries to check if format info is set in DBI() using GetFormatInfo(). The decoder initialization is asynchronous in Tegra. So if VDA did detect decoder initialization through GetFormatInfo() in DBI() luckily, we have to drop this V4L2_EVENT_RESOLUTION event since we do not want an un-necessary re-allocation of buffers and hence we return false if neither size nor CID_MIN_BUFFERS_FOR_CAPTURE match. On 2014/03/28 17:10:01, Ami Fischman wrote: > Last comment put another way: what can trigger V4L2_EVENT_RESOLUTION_CHANGE but > still not want to be a resolution change? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:297: // is indeed required by returning true if either: On 2014/03/28 17:10:01, Ami Fischman wrote: > s/if either/iff/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:298: // - width or height of the new format is different than previous format. On 2014/03/28 17:10:01, Ami Fischman wrote: > s/./; or/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:300: // Returns false otherwise. 
On 2014/03/28 17:10:01, Ami Fischman wrote: > drop this line if you take my suggestion at l.297 Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:435: // Stores the number of planes (i.e. separate memory buffers) for output. I will move this to l.397 but I don't think adding this variable in the comment is true since this member is actually modified in the decoder_thread_ context. On 2014/03/28 17:10:01, Ami Fischman wrote: > This is decoder-thread state so belongs above. I'd put it at l.397 and add it > to the list of variables in the comment at l.373. > (read & set on the decoder thread, read on the child thread but only in > AssignPictureBuffers which has the comment at l.334 of the .cc file). https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:60: // This method is used to create an EglImage since each V4L2Device On 2014/03/28 17:10:01, Ami Fischman wrote: > "This method is used to " is not adding value (ditto for other methods in this > class whose comment starts with "This method " or "These methods are used to " > etc.). > > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Function_Comments > for details. Done. I kept a few of "These methods" since the sentences seem alright. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:62: // given texture. The texture_id is used to bind the texture to the created On 2014/03/28 17:10:01, Ami Fischman wrote: > s/created/returned/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by On 2014/03/28 17:10:01, Ami Fischman wrote: > s/EglImage/EGLImageKHR/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:79: virtual uint32 PreferredOutputFormat() = 0; This is future-proofing as the formats are now same. On 2014/03/28 17:10:01, Ami Fischman wrote: > Both impls return the same (V4L2_PIX_FMT_NV12M). Is this future-proofing or is > someone planning a change here? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); I didn't get it. I think the sandbox is enabled here below so loading this library in InitializeLibrarySymbols() will not be allowed after sandbox is enabled. So I load the library here. On 2014/03/28 17:10:01, Ami Fischman wrote: > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on > dlopen's refcounting nature to avoid sandbox violations in the sandboxed case, > b/c the pre-sandbox code will already have dlopen'd this). https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... 
content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); David's CL https://codereview.chromium.org/179983006/ removes these preload libraries altogether. I see it is merged just a few hours ago. Will rebase and upload another patchset. On 2014/03/28 17:10:01, Ami Fischman wrote: > IDK what the TODO above this line is about, but if this dlopen falls under it, > then probably it should move above the TODO, and if it does not, it probably > warrants a newline between the TODO above and this dlopen, and maybe even a > clarification of the TODO that it refers to the libs _preceding_ it. https://codereview.chromium.org/137023008/diff/900001/content/content_common.... File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/900001/content/content_common.... content/content_common.gypi:627: 'common/gpu/media/tegra_v4l2_video_device.h', Hmm.. I need to check it by comparing with or without this CL. Do you need it now to confirm something ? Should not be much I reckon. On 2014/03/28 17:10:01, Ami Fischman wrote: > OOC what is the impact of this on the size (bytes) of libcontent.a or > libcontent_common.a (depending on whether you're building static or shared > libs)?
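To make the check being discussed concrete, here is a rough sketch of the intended logic: the event is treated as a real resolution change iff the coded size or the driver-reported V4L2_CID_MIN_BUFFERS_FOR_CAPTURE value changed; otherwise the spurious event queued at decoder initialization is dropped. The function and parameter names are illustrative, not the exact code in the patch.

#include <linux/videodev2.h>

bool IsResolutionChangeNecessary(const struct v4l2_format& new_format,
                                 const struct v4l2_format& prev_format,
                                 int new_min_buffers,
                                 int prev_min_buffers) {
  // A real resolution change: the coded width or height differs from the
  // previously negotiated format.
  if (new_format.fmt.pix_mp.width != prev_format.fmt.pix_mp.width ||
      new_format.fmt.pix_mp.height != prev_format.fmt.pix_mp.height)
    return true;
  // The driver now wants a different number of capture buffers
  // (V4L2_CID_MIN_BUFFERS_FOR_CAPTURE).
  if (new_min_buffers != prev_min_buffers)
    return true;
  // Otherwise this is the V4L2_EVENT_RESOLUTION_CHANGE the driver queued at
  // decoder initialization; ignore it to avoid re-allocating buffers.
  return false;
}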
Syncing my code (taking more time than usual) to rebase because of a change in bpf_cros_arm_gpu_policy_linux.cc. Meanwhile please take a look at this patchset. Thanks
https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); On 2014/03/28 18:54:11, shivdasp wrote: > I didn't get it. I think the sandbox is enabled here below so loading this > library in InitializeLibrarySymbols() will not be allowed after sandbox is > enabled. > So I load the library here. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on > > dlopen's refcounting nature to avoid sandbox violations in the sandboxed case, > > b/c the pre-sandbox code will already have dlopen'd this). > The sandbox never used to be triggered by this binary AFAIK. Are you sure it is ever engaged? https://codereview.chromium.org/137023008/diff/900001/content/content_common.... File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/900001/content/content_common.... content/content_common.gypi:627: 'common/gpu/media/tegra_v4l2_video_device.h', On 2014/03/28 18:54:11, shivdasp wrote: > Hmm.. I need to check it by comparing with or without this CL. Do you need it > now to confirm something ? Should not be much I reckon. > On 2014/03/28 17:10:01, Ami Fischman wrote: > > OOC what is the impact of this on the size (bytes) of libcontent.a or > > libcontent_common.a (depending on whether you're building static or shared > > libs)? > I also believe it will not be a significant increase but was hoping to get confirmation. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool functions_initialized = InitializeLibrarySymbols(); On 2014/03/28 18:54:11, shivdasp wrote: > How was/is this done in vaapi_wrapper.c ? > There is no real computation happening in InitializeLibrarySymbols() so should > it be okay as is ? atleast for now. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > this is racy: > > > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... > Why add a known race condition when the fix is so easy? https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:153: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ On 2014/03/28 18:54:11, shivdasp wrote: > On 2014/03/28 17:10:01, Ami Fischman wrote: > > ## and # are operators (albeit a pre-processor one). Chromium style puts > spaces > > around operators. > > Done. I don't see spaces. To be clear I mean that TegraV4L2_##name should be TegraV4L2_ ## name and "TegraV4L2_" #name should be "TegraV4L2_" # name and so on https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; On 2014/03/28 18:54:11, shivdasp wrote: > See below response. 
> On 2014/03/28 17:10:01, Ami Fischman wrote: > > Considering that the containing method is only called if the driver indicated > a > > resolution change, shouldn't this default to true in both the again==true and > > !ret cases? > Response below makes sense to me for why you return false at l.1920, but not why it makes sense to return true if GetFormatInfo fails. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:372: // while the Child thread manipulates them. On 2014/03/28 18:54:11, shivdasp wrote: > I will move this to l.397 but I don't think adding this variable in the comment > is true since this member is actually modified in the decoder_thread_ context. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > This is decoder-thread state so belongs above. I'd put it at l.397 and add > it > > to the list of variables in the comment at l.373. > > (read & set on the decoder thread, read on the child thread but only in > > AssignPictureBuffers which has the comment at l.334 of the .cc file). > This comment is talking about vars that are normally read/written on the decoder thread but which are accessed on the child thread during known-safe times. That seems to match output_planes_count_ to me.
https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > I didn't get it. I think the sandbox is enabled here below so loading this > > library in InitializeLibrarySymbols() will not be allowed after sandbox is > > enabled. > > So I load the library here. > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on > > > dlopen's refcounting nature to avoid sandbox violations in the sandboxed > case, > > > b/c the pre-sandbox code will already have dlopen'd this). > > > > The sandbox never used to be triggered by this binary AFAIK. > Are you sure it is ever engaged? What do you mean by binary here ? The preload of the library before sandbox helps us to acquire resources (pre-open the device nodes etc.) Without pre-loading it does not work. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool functions_initialized = InitializeLibrarySymbols(); I am not very familiar with LazyInstance, will this be correct way ? base::LazyInstance<bool>::Leaky g_functions_initialized = LAZY_INSTANCE_INITIALIZER; TVDA::Initialize() { if (!g_functions_initialized.Get()) { if (!InitializeLibrarySymbols()) { DLOG(ERROR) << "Unable to initialize functions "; return false; } g_functions_initialized.Get() = true; } } On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > How was/is this done in vaapi_wrapper.c ? > > There is no real computation happening in InitializeLibrarySymbols() so should > > it be okay as is ? atleast for now. > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > this is racy: > > > > > > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... > > > > Why add a known race condition when the fix is so easy? https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:153: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ Trust me , I really did these changes but for some reason they did not get uploaded. On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > ## and # are operators (albeit a pre-processor one). Chromium style puts > > spaces > > > around operators. > > > > Done. > > I don't see spaces. To be clear I mean that > TegraV4L2_##name > should be > TegraV4L2_ ## name > and > "TegraV4L2_" #name > should be > "TegraV4L2_" # name > > and so on https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; GetFormatInfo() on failure will call NOTIFY_PLATFORM() so VDA goes into error state anyways. 
On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > See below response. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > Considering that the containing method is only called if the driver > indicated > > a > > > resolution change, shouldn't this default to true in both the again==true > and > > > !ret cases? > > > > Response below makes sense to me for why you return false at l.1920, but not why > it makes sense to return true if GetFormatInfo fails. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:372: // while the Child thread manipulates them. On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > I will move this to l.397 but I don't think adding this variable in the > comment > > is true since this member is actually modified in the decoder_thread_ context. > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > This is decoder-thread state so belongs above. I'd put it at l.397 and add > > it > > > to the list of variables in the comment at l.373. > > > (read & set on the decoder thread, read on the child thread but only in > > > AssignPictureBuffers which has the comment at l.334 of the .cc file). > > > > > This comment is talking about vars that are normally read/written on the decoder > thread but which are accessed on the child thread during known-safe times. That > seems to match output_planes_count_ to me. Done.
On Fri, Mar 28, 2014 at 1:39 PM, <shivdasp@nvidia.com> wrote: > > https://codereview.chromium.org/137023008/diff/900001/ > content/common/gpu/media/video_decode_accelerator_unittest.cc > File content/common/gpu/media/video_decode_accelerator_unittest.cc > (right): > > https://codereview.chromium.org/137023008/diff/900001/ > content/common/gpu/media/video_decode_accelerator_unittest.cc#newcode1548 > content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: > dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | > RTLD_NODELETE); > On 2014/03/28 19:33:40, Ami Fischman wrote: > >> On 2014/03/28 18:54:11, shivdasp wrote: >> > I didn't get it. I think the sandbox is enabled here below so >> > loading this > >> > library in InitializeLibrarySymbols() will not be allowed after >> > sandbox is > >> > enabled. >> > So I load the library here. >> > >> > On 2014/03/28 17:10:01, Ami Fischman wrote: >> > > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() >> > (relying on > >> > > dlopen's refcounting nature to avoid sandbox violations in the >> > sandboxed > >> case, >> > > b/c the pre-sandbox code will already have dlopen'd this). >> > >> > > The sandbox never used to be triggered by this binary AFAIK. >> Are you sure it is ever engaged? >> > What do you mean by binary here ? > The preload of the library before sandbox helps us to acquire resources > (pre-open the device nodes etc.) Without pre-loading it does not work. By "binary" I meant that this is a standalone test program, not part of chrome. I don't believe the sandbox is used for this unittest. > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/tegra_v4l2_video_device.cc > File content/common/gpu/media/tegra_v4l2_video_device.cc (right): > > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/tegra_v4l2_video_device.cc#newcode103 > content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool > functions_initialized = InitializeLibrarySymbols(); > I am not very familiar with LazyInstance, will this be correct way ? > > base::LazyInstance<bool>::Leaky g_functions_initialized = > LAZY_INSTANCE_INITIALIZER; > > TVDA::Initialize() { > if (!g_functions_initialized.Get()) { > if (!InitializeLibrarySymbols()) { > DLOG(ERROR) << "Unable to initialize functions "; > return false; > } > g_functions_initialized.Get() = true; > > } > } > No. You need to make a helper class TegraFunctionSymbolFinder { public: TegraFunctionSymbolFinder() : initialized_(false) { ...do the work... initialized_ = true; } bool initialized() { return initialized_; } private: bool initailized_; }; And then instead of your function-static you do: if (!g_tegra_function_symbol_finder_.Get()->initialized()) return OOPS; (with a global LazyInstance<TegraFunctionSymbolFinder> g_tegra_function_symbol_finder_) > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/tegra_v4l2_video_device.cc#newcode153 > content/common/gpu/media/tegra_v4l2_video_device.cc:153: > TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ > Trust me , I really did these changes but for some reason they did not > get uploaded. 
lol > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.cc > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.cc#newcode1912 > content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return > false; > GetFormatInfo() on failure will call NOTIFY_PLATFORM() so VDA goes into > error state anyways. What about the *again=true; return true; path in GetFormatInfo? > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.h > File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): > > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.h#newcode372 > content/common/gpu/media/v4l2_video_decode_accelerator.h:372: // while > the Child thread manipulates them. > On 2014/03/28 19:33:40, Ami Fischman wrote: > >> On 2014/03/28 18:54:11, shivdasp wrote: >> > I will move this to l.397 but I don't think adding this variable in >> > the > >> comment >> > is true since this member is actually modified in the >> > decoder_thread_ context. > >> > >> > On 2014/03/28 17:10:01, Ami Fischman wrote: >> > > This is decoder-thread state so belongs above. I'd put it at >> > l.397 and add > >> > it >> > > to the list of variables in the comment at l.373. >> > > (read & set on the decoder thread, read on the child thread but >> > only in > >> > > AssignPictureBuffers which has the comment at l.334 of the .cc >> > file). > >> > >> > > > This comment is talking about vars that are normally read/written on >> > the decoder > >> thread but which are accessed on the child thread during known-safe >> > times. That > >> seems to match output_planes_count_ to me. >> > > Done. > I don't see a new patchset yet. > > https://codereview.chromium.org/137023008/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
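For reference, a minimal compilable version of the helper-class pattern sketched in the email above, assuming base::LazyInstance<T>::Get() returns a reference and reusing the library path and dlopen flags quoted elsewhere in this thread; the real InitializeLibrarySymbols() in the patch also resolves each TegraV4L2_* entry point with dlsym().

#include <dlfcn.h>

#include "base/lazy_instance.h"
#include "base/logging.h"

namespace {

// Stand-in for the patch's InitializeLibrarySymbols(); here it only opens
// the library, while the real one also dlsym()s every symbol it needs.
bool InitializeLibrarySymbols() {
  void* handle = dlopen("/usr/lib/libtegrav4l2.so",
                        RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE);
  return handle != NULL;
}

// Does the one-time lookup in its constructor; LazyInstance guarantees the
// constructor runs at most once, avoiding the racy function-local static.
class TegraFunctionSymbolFinder {
 public:
  TegraFunctionSymbolFinder() : initialized_(false) {
    if (!InitializeLibrarySymbols())
      return;
    initialized_ = true;
  }
  bool initialized() const { return initialized_; }

 private:
  bool initialized_;
};

base::LazyInstance<TegraFunctionSymbolFinder> g_tegra_function_symbol_finder_ =
    LAZY_INSTANCE_INITIALIZER;

}  // namespace

// A caller such as TegraV4L2Device::Initialize() would then simply do:
bool EnsureTegraV4L2SymbolsLoaded() {
  if (!g_tegra_function_symbol_finder_.Get().initialized()) {
    DLOG(ERROR) << "Unable to initialize TegraV4L2 function symbols";
    return false;
  }
  return true;
}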
Can't reply inline for some reason. Regarding GetFormatInfo(), it must succeed since we have received RESOLUTION_CHANGE event, hence returning false is correct. Another patch coming with LazyInstance fix. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool functions_initialized = InitializeLibrarySymbols(); On 2014/03/28 20:39:26, shivdasp wrote: > I am not very familiar with LazyInstance, will this be correct way ? > > base::LazyInstance<bool>::Leaky g_functions_initialized = > LAZY_INSTANCE_INITIALIZER; > > TVDA::Initialize() { > if (!g_functions_initialized.Get()) { > if (!InitializeLibrarySymbols()) { > DLOG(ERROR) << "Unable to initialize functions "; > return false; > } > g_functions_initialized.Get() = true; > } > } > > On 2014/03/28 19:33:40, Ami Fischman wrote: > > On 2014/03/28 18:54:11, shivdasp wrote: > > > How was/is this done in vaapi_wrapper.c ? > > > There is no real computation happening in InitializeLibrarySymbols() so > should > > > it be okay as is ? atleast for now. > > > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > > this is racy: > > > > > > > > > > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... > > > > > > > Why add a known race condition when the fix is so easy? > Making this change in next patchset. class declaration TegraSymbolFinder can be in tegra_v4l2_video_device.cc ?
git cl format does not like "spaces around ##" and that's what happened when I did it earlier :) Guess I will obey git cl format then ?
Yes, obey git cl format. Yes, TegraSymbolFinder can be in tegra_v4l2_video_device.cc. Let me know when a new patchset is up. On Fri, Mar 28, 2014 at 3:15 PM, <shivdasp@nvidia.com> wrote: > git cl format does not like "spaces around ##" and that's what happened > when I > did it earlier :) > Guess I will obey git cl format then ? > > https://codereview.chromium.org/137023008/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
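For illustration, the symbol-loading macro being discussed, in the shape git cl format leaves it (no spaces around ## / #). The macro name and the libtegrav4l2_handle variable are assumptions; the TegraV4L2_##name and "TegraV4L2_" #name pieces come from the snippet quoted earlier.

#define TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(name)               \
  do {                                                         \
    TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>(      \
        dlsym(libtegrav4l2_handle, "TegraV4L2_" #name));       \
    if (TegraV4L2_##name == NULL) {                            \
      DLOG(ERROR) << "Failed to dlsym TegraV4L2_" #name;       \
      return false;                                            \
    }                                                          \
  } while (0)

// Hypothetical usage inside InitializeLibrarySymbols(), e.g.:
//   TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(Open);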
Updated with rebased patchset, PTAL.
Hi Ami, Pawel, Hope there are no more comments here to be addressed. Thanks for patiently reviewing the changes.
Hi Ami, Pawel, I hope the final patchset which is rebased is good to go for R35. Thanks Kaustubh
Hi Shivdas, Could you please address my question from yesterday about the capture format (V4L2_PIX_FMT_NV12M)? Are you now allocating two discontiguous memory buffers per v4l2 buffer and using exactly this pixel format: http://linuxtv.org/downloads/v4l-dvb-apis/re30.html and does your converter/GPU now handle this format? Thanks.
On 2014/03/29 04:01:30, Pawel Osciak wrote: Hi Pawel, As per my understanding, the capture format seems to be set appropriately and the GPU is able to convert. Shivdas seems to have tested this, and performance-wise there seems to be no issue. If the suggestions are good to have, then we can plan to address them subsequently. Please let me know if you think otherwise. Thanks Kaustubh
LGTM % nits & posciak's say-so. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:81: } nit: newline after this https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:82: bool Initialized() { return initialized_; } nit: simple accessors are typically lowercased: bool initialized() { ... };
On 2014/03/29 06:51:04, kpurandare wrote: > On 2014/03/29 04:01:30, Pawel Osciak wrote: > > Hi Pawel, > > As per my understanding the capture format seems to be set appropriately and the > GPU is able to convert. Shivdas seems to be have tested and performance wise > there seems to be no issue. Hi Kaustubh, I asked about the output format that the Tegra codec uses and we agreed to change the V4L2_PIX_FMT_NV12M define in Tegra device class to the actual format that the codec produces. I'm asking for no more than simply changing this format macro to the one that the Tegra codec actually uses, and not the one that Exynos uses. We agreed with Shivdas to do this 5 weeks ago and since then I asked about it multiple times. Later though Shivdas mentioned that: "We made change in the buffer allocation to have Tegra's preferred format same as Exynos." So I am merely asking for clarification on this sentence and/or follow up on what we agreed to do. I hope you could please promise me that this will be addressed in a follow up CL. Thank you. > If the suggestions are good to have then we can plan to adddress it > subsequently. > > Please let me know if you think otherwise. > > Thanks > Kaustubh
LGTM % one nit, and assuming sandboxing owners approval and that the format issue is addressed as a follow up. Thank you. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:142: } Since we are hardcoding, DCHECK_EQ(planes_count, 2);
Patchset incoming for addressing nits. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:142: } On 2014/03/29 11:16:12, Pawel Osciak wrote: > Since we are hardcoding, > > DCHECK_EQ(planes_count, 2); Done. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:81: } On 2014/03/29 08:23:51, Ami Fischman wrote: > nit: newline after this Done. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:82: bool Initialized() { return initialized_; } On 2014/03/29 08:23:51, Ami Fischman wrote: > nit: simple accessors are typically lowercased: > bool initialized() { ... }; Done.
On 2014/03/29 11:14:20, Pawel Osciak wrote: > On 2014/03/29 06:51:04, kpurandare wrote: > > On 2014/03/29 04:01:30, Pawel Osciak wrote: > > > > Hi Pawel, > > > > As per my understanding the capture format seems to be set appropriately and > the > > GPU is able to convert. Shivdas seems to be have tested and performance wise > > there seems to be no issue. > > Hi Kaustubh, > I asked about the output format that the Tegra codec uses and we agreed to > change the V4L2_PIX_FMT_NV12M define in Tegra device class to the actual format > that the codec produces. I'm asking for no more than simply changing this format > macro to the one that the Tegra codec actually uses, and not the one that Exynos > uses. We agreed with Shivdas to do this 5 weeks ago and since then I asked about > it multiple times. > > Later though Shivdas mentioned that: > "We made change in the buffer allocation to have Tegra's preferred format same > as Exynos." > Hi Pawel, Yes, on Tegra we allocate two non-contiguous surfaces, and I confirmed that earlier. I agree that at some point in time (quite a while ago) we were not reporting the correct format, but we have since changed the buffer allocations for the codecs. From http://linuxtv.org/downloads/v4l-dvb-apis/re30.html: "This is a multi-planar, two-plane version of the YUV 4:2:0 format. The three components are separated into two sub-images or planes. V4L2_PIX_FMT_NV12M differs from V4L2_PIX_FMT_NV12 in that the two planes are non-contiguous in memory" So this format is fine, I believe. > So I am merely asking for clarification on this sentence and/or follow up on > what we agreed to do. I don't think this needs a follow-up CL now. > > I hope you could please promise me that this will be addressed in a follow up > CL. My sincere apologies if I did not make these things categorically clear earlier. Thanks. > > Thank you. > > > If the suggestions are good to have then we can plan to adddress it > > subsequently. > > > > Please let me know if you think otherwise. > > > > Thanks > > Kaustubh
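As a concrete illustration of the format being described, setting the capture queue to V4L2_PIX_FMT_NV12M through the V4L2 multi-planar API looks roughly like this. A sketch only: device_fd and the coded size are placeholders, and the actual VDA negotiates the format through its own format-setting path.

#include <cstring>

#include <linux/videodev2.h>
#include <sys/ioctl.h>

bool SetNV12MCaptureFormat(int device_fd, int coded_width, int coded_height) {
  struct v4l2_format format;
  memset(&format, 0, sizeof(format));
  format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  format.fmt.pix_mp.width = coded_width;
  format.fmt.pix_mp.height = coded_height;
  // Two-plane YUV 4:2:0: a Y plane plus an interleaved CbCr plane, backed by
  // two non-contiguous memory buffers per frame.
  format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_NV12M;
  format.fmt.pix_mp.num_planes = 2;
  return ioctl(device_fd, VIDIOC_S_FMT, &format) == 0;
}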
Addressed minor nits. PTAL
Jorge, May I request you to take a look at the sandboxing related changes in this CL. Thanks
On 2014/03/29 18:10:17, shivdasp wrote: > Jorge, > May I request you to take a look at the sandboxing related changes in this CL. > > Thanks All review comments are addressed, waiting for approval from OWNERS too to CQ this. Thanks.
Jorgelo@ and piman@ OWNERS please.
https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:183: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); Please add a comment above this line to decouple it from the Mali line above. Something like "// Preload the Tegra V4L2 (video decode acceleration) library."
Addressed comment from Jorgelo@. PTAL https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:183: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/31 18:07:25, Jorge Lucangeli Obes wrote: > Please add a comment above this line to decouple it from the Mali line above. > > Something like "// Preload the Tegra V4L2 (video decode acceleration) library." Done.
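For clarity, the pre-sandbox preload that the requested comment refers to amounts to the following. This is a sketch: the flags are the ones from the unittest snippet quoted earlier, while the sandbox policy itself passes its local dlopen_flag value.

#include <dlfcn.h>

void PreloadTegraV4L2Library() {
  // Preload the Tegra V4L2 (video decode acceleration) library.
  dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE);
}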
Sandbox lgtm.
The CQ bit was checked by shivdasp@nvidia.com
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/shivdasp@nvidia.com/137023008/1000001
lgtm
Message was sent while issue was closed.
Change committed as 260661