Created: 6 years, 10 months ago by shivdasp
Modified: 6 years, 8 months ago
Reviewers: vpagar, Ami GONE FROM CHROMIUM, sheu, piman, Jorge Lucangeli Obes, wuchengli, Pawel Osciak
CC: chromium-reviews, fischman+watch_chromium.org, jam, mcasas+watch_chromium.org, joi+watch-content_chromium.org, feature-media-reviews_chromium.org, darin-cc_chromium.org, piman+watch_chromium.org, wjia+watch_chromium.org, jln+watch_chromium.org
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Visibility: Public.
Description: Add support for Tegra V4L2 VDA

This change adds a TegraV4L2Device and extends V4L2 VDA support to the Tegra platform.

BUG=chromium-os-partner:23082
TEST=Run video playback

Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=260661
Patch Set 1
Patch Set 2
Patch Set 3
Patch Set 4
Patch Set 5 (Total comments: 90)
Patch Set 6 (Total comments: 28)
Patch Set 7 (Total comments: 7)
Patch Set 8 (Total comments: 7)
Patch Set 9
Patch Set 10: Fixed minor nit
Patch Set 11: fixed a small issue (Total comments: 31)
Patch Set 12 (Total comments: 26)
Patch Set 13
Patch Set 14 (Total comments: 16)
Patch Set 15: use scopedFD for dmabuf fds (Total comments: 16)
Patch Set 16: Addressed a few more comments (Total comments: 2)
Patch Set 17 (Total comments: 37)
Patch Set 18 (Total comments: 9)
Patch Set 19: LazyInstance related changes
Patch Set 20: rebased (Total comments: 6)
Patch Set 21: addressed nits (Total comments: 2)
Patch Set 22: minor nit

Messages
Total messages: 155 (0 generated)
Do not review yet. This is a first draft; some more changes are still needed.
This change adds Tegra V4L2Device support to the V4L2 VDA. Please have a look.
Has there been a change in gles2_cmd_decoder.cc recently? This CL used to work on code I synced about 1.5-2 weeks ago. After rebasing, I get errors from gles2_cmd_decoder.cc line #5967 ("Texture is not renderable") and from line #10043 ("glConsumeTextureCHROMIUM", "invalid mailbox name"). I thought I would check with you before I start debugging. Thanks,
On Thu, Feb 6, 2014 at 9:37 AM, <shivdasp@nvidia.com> wrote:
> Has there been a change in gles2_cmd_decoder.cc recently?

Always... http://src.chromium.org/viewvc/chrome/trunk/src/gpu/command_buffer/service/gl...
That was the first place I checked, but there have been 4-5 recent changes in that file, and since I am not very familiar with that part of the code I thought I would ask here whether there has been any decision to drop support for TEXTURE_2D. I will dig through. Thanks.
@posciak: as usual I rely on you to review the called platform code for calling correctness :) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/exynos_v4l2_video_device.cc:122: unsigned int buffer_index) { Please /* comment out */ unused params. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:1: #include <dlfcn.h> missing copyright header https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:62: bool TegraV4L2Device::SetDevicePollInterrupt(void) { drop "void" https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:63: if (HANDLE_EINTR(TegraV4L2SetDevicePollInterrupt(device_fd_) == -1)) { bug: HANDLE_EINTR should not be accepting the ==-1 as part of its arg. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:78: bool TegraV4L2Device::Initialize(void) { drop "void" https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:79: TegraV4L2Open = reinterpret_cast<TegraV4L2OpenFunc>( Please avoid the sort of code duplication below (see vaapi_wrapper.cc for example of dlsym'ing). https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:134: EGLint attrib[], EGLint[] /* attrib */ https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:142: (EGLClientBuffer)(texture_id), static_cast https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:153: memset(planes, 0, sizeof(planes)); swap w/ previous line to match declaration order https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); how wide is mem_offset? I'm worried about this cast on a 64-bit platform. If mem_offset is intptr_t-sized, please use that in the cast. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:165: return EGL_NO_IMAGE_KHR; leak? (no eglDestroyImageKHR needed?) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:9: class TegraV4L2Device : public V4L2Device { Class commentary please. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:11: TegraV4L2Device(EGLContext egl_context); explicit https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/tegra_v4l2_video_device.h:14: // Tries to create and initialize an TegraV4L2Device, returns s/an/a/ but this comment seems like leftover copy/pasta... Should instead just be // V4L2Device implementation. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:26: // Does all the initialization of device fds , returns true on success. precede by newline https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:29: EGLImageKHR CreateEGLImage(EGLDisplay egl_display, this and the next method belong above Initialize as part of the V4L2Device block https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.h:57: TegraV4L2ClearDevicePollInterruptFunc TegraV4L2ClearDevicePollInterrupt; l.43-57 could be file-static in the .cc file, right? (see how vaapi_wrapper.cc does it for an example) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:376: DLOG(ERROR) << "AssignPictureBuffers(): could not create EGLImageKHR"; already logged in the Device method; unnecessary https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1642: if (HANDLE_EINTR(device_->Ioctl(VIDIOC_G_FMT, format) != 0)) { This is a bug! https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1654: // Since the underlying library at the moment is not updated this hack This comment is opaque to me. Also, if something is a hack, usually there should be a TODO/crbug to go along with it. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.cc:21: exynos_device.reset(NULL); Can drop NULL (scoped_ptr::reset(NULL) is equiv to scoped_ptr::reset()). https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.cc:29: return exynos_device.PassAs<V4L2Device>(); l.19-29 would be clearer as: scoped_ptr<EV4L2D> e_d(...); if (e_d->Initialize()) return e_d.PassAs<V4L2D>(); DLOG(ERROR) << "Failed to open exynos v4l2 device."; scoped_ptr<TV4L2D> t_d(...); if (t_d->Initialize(e_c)) return t_d.PassAs<V4L2D>(); DLOG(ERROR) << "Failed to open tegra v4l2 device."; return scoped_ptr<V4L2D>(NULL); https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.h:56: virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display, Need to document these methods & parameters. I'm especially unclear at this point in the review what |buffer_index| is supposed to be (I see how it's used, but I don't understand what it means in the context of a generic V4L2Device). https://codereview.chromium.org/137023008/diff/80001/content/common/sandbox_l... 
File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/sandbox_l... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:223: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); I assume failure here is non-fatal? (should errno be unconditionally cleared after this?) https://codereview.chromium.org/137023008/diff/80001/content/content_common.gypi File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/80001/content/content_common.g... content/content_common.gypi:560: 'common/gpu/media/tegra_v4l2_video_device.h', please keep lists in alphabetical order.
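To make the HANDLE_EINTR comments above concrete, here is a minimal sketch (assuming Chromium's HANDLE_EINTR macro, which retries the wrapped expression while it returns -1 with errno set to EINTR; the function name follows the patch):

  // Broken: the comparison sits inside the macro, so the wrapped expression
  // evaluates to 0 or 1 and the EINTR retry never applies to the call itself.
  if (HANDLE_EINTR(TegraV4L2SetDevicePollInterrupt(device_fd_) == -1)) {
    DLOG(ERROR) << "SetDevicePollInterrupt(): failed";
    return false;
  }

  // Fixed: retry the call, then compare its return value.
  if (HANDLE_EINTR(TegraV4L2SetDevicePollInterrupt(device_fd_)) == -1) {
    DLOG(ERROR) << "SetDevicePollInterrupt(): failed";
    return false;
  }

Similarly, a hedged sketch of the table-driven dlsym() loading suggested in place of the per-symbol duplication (in the spirit of vaapi_wrapper.cc; the symbol names and loader shape are placeholders, not the actual libtegrav4l2 interface):

  void* handle = dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW);
  if (!handle) {
    DLOG(ERROR) << "Failed to dlopen libtegrav4l2";
    return false;
  }
  const struct {
    const char* name;
    void** ptr;
  } kStubs[] = {
    { "TegraV4L2_Open", reinterpret_cast<void**>(&TegraV4L2Open) },
    { "TegraV4L2_Close", reinterpret_cast<void**>(&TegraV4L2Close) },
    { "TegraV4L2_Ioctl", reinterpret_cast<void**>(&TegraV4L2Ioctl) },
    // ... remaining entry points ...
  };
  for (size_t i = 0; i < arraysize(kStubs); ++i) {
    *kStubs[i].ptr = dlsym(handle, kStubs[i].name);
    if (*kStubs[i].ptr == NULL) {
      DLOG(ERROR) << "Failed to dlsym " << kStubs[i].name;
      return false;
    }
  }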
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/exynos_v4l2_video_device.cc:123: EGLImageKHR egl_image = eglCreateImageKHR( We need to make GL context current to be able to call this. We should at least require this in the doc that it should be done before calling this method. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/exynos_v4l2_video_device.h:40: unsigned int GetTextureTarget(); Documentation for methods please. Also, texture target should be a GLenum I think... https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:4: #include <libdrm/drm_fourcc.h> Not needed? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:6: #include <poll.h> Not needed. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:7: #include <sys/eventfd.h> Not needed. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:9: #include <sys/mman.h> Not needed? Please clean up the headers in this file to a minimal set of required ones only (also applies to the Chrome headers below). https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; Is this is the codec device exposed by Tegra kernel driver? You can't assume it will be this on all configurations. Please use udev rules to create a codec specific device (see Exynos example at https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from CreateEGLImage to pass them for texture binding to the library? I'm guessing that this works, because for the former you call QUERYBUFS with OUTPUT buffer, while in the latter you call it with CAPTURE? Please don't do this. Suggested changes for EGLImages in my comments below. As for mmap, how (and from what) is the memory mapping actually acquired for the buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it managed and when it is destroyed? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. If so, how is unmapping handled then? What if we want to free the buffers and reallocate them? You cannot call REQBUFS(0) without unmapping the buffers first... https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:136: unsigned int buffer_index) { This method should take a v4l2_buffer instead. 
Depending on format and other circumstances format, memory type, etc. change. We shouldn't hardcode this in the device class. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:158: capture_buffer.length = 2; arraysize(planes) https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, there are two planes, but passing only one offset is a bit inconsistent. Although, why have two planes, if only one is used? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } After reading through it, this method feels unnecessary. I'm assuming querybufs implementation in the library calls a method in the GPU driver anyway? This is basically redefining querybufs to do something completely different than it normally does, turning it into a custom call. The offsets should be coming from the callee of QUERYBUFS, not the other way around, and there should be no custom side effects. A non-V4L2, library-specific custom call would be better than this. But why not implement eglCreateImageKHR that accepts dmabufs (or even offsets) that come from the v4l2 library to create EGL images, just like we do on Mali on Exynos? Would it be possible to have an extension for eglCreateImage like Exynos does instead please? It doesn't seem to be much of a difference, instead of calling querybufs with custom arguments and having the library call something in the driver, call eglCreateImage instead with custom arguments and have it do everything? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.h:41: EGLContext egl_context_; To be honest I'm not especially excited with EGLContext becoming a part of V4L2Device, since the two are unrelated. A V4L2Device should have no need for an EGL context. Please see my comments in .cc for details, I think CreateEGLImage should not be in this class. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.h:61: } Empty line above and // namespace content please. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); I would like to understand the big picture here please. We strive to stay as close as possible to using (and/or creating) platform-independent standards where we can, like the sequence above, instead of providing custom calls for each platform. Removing this from here and TVDA is a step into an opposite direction, and I would like to understand what technical difficulties force us to do this first. Binding textures to EGLImages also serves to keep track of ownership. There are multiple users of the shared buffer, the renderer, the GPU that renders the textures, this class and the HW codec. 
How is ownership/destruction managed and how is it ensured that the buffer is valid while any of the users are still referring to/using it (both in userspace and in kernel)? What happens if the renderer crashes and the codec is writing to the textures? What happens when this class is destroyed, but the texture is in the renderer? What happens when the whole Chrome crashes, but the HW codec is using a buffer (i.e. kernel has ownership)? Could you please explain how is ownership managed for shared buffers on Tegra? https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1654: // Since the underlying library at the moment is not updated this hack Could this hack be moved into the library itself then? This class should not have any awareness of device-specific issues. https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_device.cc:29: return exynos_device.PassAs<V4L2Device>(); On 2014/02/07 09:09:30, Ami Fischman wrote: > l.19-29 would be clearer as: > > scoped_ptr<EV4L2D> e_d(...); > if (e_d->Initialize()) > return e_d.PassAs<V4L2D>(); > DLOG(ERROR) << "Failed to open exynos v4l2 device."; > > scoped_ptr<TV4L2D> t_d(...); > if (t_d->Initialize(e_c)) > return t_d.PassAs<V4L2D>(); > DLOG(ERROR) << "Failed to open tegra v4l2 device."; > > return scoped_ptr<V4L2D>(NULL); +1 to this, but I don't think DLOGging failing to open a particular device is good here. It would always log it on Tegra, even though it wouldn't be an error. I feel we should only log an error if we return NULL and say no device could be opened.
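Combining the refactoring suggestion with the note on logging, V4L2Device::Create() could end up looking roughly like the sketch below (a sketch only; the exact signature, and whether the EGL context goes to the constructor or to Initialize(), varies across the patch sets):

  scoped_ptr<V4L2Device> V4L2Device::Create(EGLContext egl_context) {
    scoped_ptr<ExynosV4L2Device> exynos_device(new ExynosV4L2Device());
    if (exynos_device->Initialize())
      return exynos_device.PassAs<V4L2Device>();

    scoped_ptr<TegraV4L2Device> tegra_device(new TegraV4L2Device(egl_context));
    if (tegra_device->Initialize())
      return tegra_device.PassAs<V4L2Device>();

    // Log only once, when neither device could be opened.
    DLOG(ERROR) << "Could not create a V4L2Device";
    return scoped_ptr<V4L2Device>();
  }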
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; This is a v4l2 decoder device name which we use to initialize a decoder context within the libtegrav4l2 library. This can be anything really as long as decoder and encoder device names are different since we do not open a v4l2 video device underneath. Libtegrav4l2 is really a pseudo implementation. I can change it /dev/tegra-dec and /dev/tegra-enc for it to mean tegra specific. On 2014/02/10 06:36:17, Pawel Osciak wrote: > Is this is the codec device exposed by Tegra kernel driver? > > You can't assume it will be this on all configurations. > Please use udev rules to create a codec specific device (see Exynos example at > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; When REQBUFS is called for OUTPUT_PLANE, the library creates internal buffers which can be shared with the AVP for decoding. AVP is a video processor which runs the firmware that actually does all the decoding. While creating this buffers, they are already mmapped to get a virtual address. The library returns this address in QUERYBUF API. Hence there is not real need for mmap. When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally unmapped and destroyed. I will explain the need for CreateEGLImage in later comments. On 2014/02/10 06:36:17, Pawel Osciak wrote: > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > CreateEGLImage to pass them for texture binding to the library? I'm guessing > that this works, because for the former you call QUERYBUFS with OUTPUT buffer, > while in the latter you call it with CAPTURE? > > Please don't do this. Suggested changes for EGLImages in my comments below. > > As for mmap, how (and from what) is the memory mapping actually acquired for the > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it > managed and when it is destroyed? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. Buffers are unmapped in REQBUFS(0) call and destroyed. Since there is no real need for mmap and munmap, we did not implement it in the library. So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the buffer whereas REQBUF(0) unmaps and destroys them. On 2014/02/10 06:36:17, Pawel Osciak wrote: > If so, how is unmapping handled then? What if we want to free the buffers and > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > first... https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:79: TegraV4L2Open = reinterpret_cast<TegraV4L2OpenFunc>( Yes I will do this. That looks very clean. On 2014/02/07 09:09:30, Ami Fischman wrote: > Please avoid the sort of code duplication below (see vaapi_wrapper.cc for > example of dlsym'ing). https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/tegra_v4l2_video_device.cc:136: unsigned int buffer_index) { Since ExynosV4L2Device does not need the v4l2_buffer like TegraV4L2Device, I thought it would add less noise to the V4L2VDA code. Also the fields within the v4l2_buffer (the eglimage handle) is created within this method so can't fully initialize the structure here. On 2014/02/10 06:36:17, Pawel Osciak wrote: > This method should take a v4l2_buffer instead. > Depending on format and other circumstances format, memory type, etc. change. We > shouldn't hardcode this in the device class. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); We are really passing in the EglImage handle here to the library. The library associates this with the corresponding v4l2_buffer on the CAPTURE plane and use the underlying conversion APIs to transform the decoder's yuv output into the egl image. We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the number of planes are checked with 2 (line #1660 of V4L2VDA). On 2014/02/10 06:36:17, Pawel Osciak wrote: > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, > there are two planes, but passing only one offset is a bit inconsistent. > Although, why have two planes, if only one is used? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:165: return EGL_NO_IMAGE_KHR; Yes will fix this. On 2014/02/07 09:09:30, Ami Fischman wrote: > leak? (no eglDestroyImageKHR needed?) https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } Since we started with implementing this as a V4L2-Like library we have tried to follow V4L2 syntax to provide the input & output buffers. QUERYBUF can be made into a custom call since it is doing very custom thing here. If introducing another API is acceptable I can do that. We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with the our Graphics team but I don't there is any such plan to implement such extension. On 2014/02/10 06:36:17, Pawel Osciak wrote: > After reading through it, this method feels unnecessary. > > I'm assuming querybufs implementation in the library calls a method in the GPU > driver anyway? > > This is basically redefining querybufs to do something completely different than > it normally does, turning it into a custom call. The offsets should be coming > from the callee of QUERYBUFS, not the other way around, and there should be no > custom side effects. > A non-V4L2, library-specific custom call would be better than this. > > But why not implement eglCreateImageKHR that accepts dmabufs (or even offsets) > that come from the v4l2 library to create EGL images, just like we do on Mali on > Exynos? > > Would it be possible to have an extension for eglCreateImage like Exynos does > instead please? It doesn't seem to be much of a difference, instead of calling > querybufs with custom arguments and having the library call something in the > driver, call eglCreateImage instead with custom arguments and have it do > everything? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); The decoder's output buffers are created when REQBUFS(x) is called on CAPTURE_PLANE. These buffers are hardware buffers which can be shared with the AVP processor for decoder to write into. Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created and sent back in AssignPictureBuffers(). Now V4L2VDA creates EglImages from these textures and sends each EglImage handle to library using the QUERYBUF (but can use a custom call too). The tegrav4l2 library cannot create EglImages from DMABUFS like in Exynos since there is no such extension. We create EglImage from this texture itself so there is a binding between texture and eglImage. Now when this EglImage is sent to libtegrav4l2, it is mapped with the corresponding decoder buffer created in REQBUF() call. This way there is one map of EglImage, texture and decoder buffer. When any buffer is enqueued in QBUF, the library sends it down to the decoder. Once the decoder buffer is ready, the library uses graphics apis to populate the corresponding EglImage with the RGB data and then pushes into a queue thereby making it available for DQBUF after which this buffer can be used only when it is back in QBUF call. This way the buffer ownership is managed. So in summary the library uses queues and does all the buffer management between decoder and the graphics stack for conversion. On 2014/02/10 06:36:17, Pawel Osciak wrote: > I would like to understand the big picture here please. > > We strive to stay as close as possible to using (and/or creating) > platform-independent standards where we can, like the sequence above, instead of > providing custom calls for each platform. Removing this from here and TVDA is a > step into an opposite direction, and I would like to understand what technical > difficulties force us to do this first. > > Binding textures to EGLImages also serves to keep track of ownership. There are > multiple users of the shared buffer, the renderer, the GPU that renders the > textures, this class and the HW codec. How is ownership/destruction managed and > how is it ensured that the buffer is valid while any of the users are still > referring to/using it (both in userspace and in kernel)? > > What happens if the renderer crashes and the codec is writing to the textures? > What happens when this class is destroyed, but the texture is in the renderer? > What happens when the whole Chrome crashes, but the HW codec is using a buffer > (i.e. kernel has ownership)? > > Could you please explain how is ownership managed for shared buffers on Tegra? https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1654: // Since the underlying library at the moment is not updated this hack Apologies. This code was not meant to be here. Will remove it. On 2014/02/10 06:36:17, Pawel Osciak wrote: > Could this hack be moved into the library itself then? > This class should not have any awareness of device-specific issues. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/v4l2_video_device.cc:21: exynos_device.reset(NULL); On 2014/02/07 09:09:30, Ami Fischman wrote: > Can drop NULL (scoped_ptr::reset(NULL) is equiv to scoped_ptr::reset()). Done. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.cc:29: return exynos_device.PassAs<V4L2Device>(); Agreed. Will make this change in next patchset. On 2014/02/10 06:36:17, Pawel Osciak wrote: > On 2014/02/07 09:09:30, Ami Fischman wrote: > > l.19-29 would be clearer as: > > > > scoped_ptr<EV4L2D> e_d(...); > > if (e_d->Initialize()) > > return e_d.PassAs<V4L2D>(); > > DLOG(ERROR) << "Failed to open exynos v4l2 device."; > > > > scoped_ptr<TV4L2D> t_d(...); > > if (t_d->Initialize(e_c)) > > return t_d.PassAs<V4L2D>(); > > DLOG(ERROR) << "Failed to open tegra v4l2 device."; > > > > return scoped_ptr<V4L2D>(NULL); > > +1 to this, but I don't think DLOGging failing to open a particular device is > good here. It would always log it on Tegra, even though it wouldn't be an error. > I feel we should only log an error if we return NULL and say no device could be > opened. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_device.h:56: virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display, This method exists because of the difference in how Exynos and Tegra create the EglImages. See previous comments. On 2014/02/07 09:09:30, Ami Fischman wrote: > Need to document these methods & parameters. > I'm especially unclear at this point in the review what |buffer_index| is > supposed to be (I see how it's used, but I don't understand what it means in the > context of a generic V4L2Device). https://codereview.chromium.org/137023008/diff/80001/content/content_common.gypi File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/80001/content/content_common.g... content/content_common.gypi:560: 'common/gpu/media/tegra_v4l2_video_device.h', On 2014/02/07 09:09:30, Ami Fischman wrote: > please keep lists in alphabetical order. Done.
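For readers following the EGLImage discussion, the difference between the two paths can be sketched as follows (attribute lists abbreviated; the Exynos path relies on the EGL_EXT_image_dma_buf_import extension, while the Tegra path described in this thread wraps the client texture itself; variable names are illustrative):

  // Exynos: import the decoder buffer's dmabuf as an EGLImage, then bind it
  // to the picture-buffer texture.
  EGLint attrs[] = {
    EGL_WIDTH, width, EGL_HEIGHT, height,
    EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_NV12,
    EGL_DMA_BUF_PLANE0_FD_EXT, dmabuf_fd,
    // ... plane offsets and pitches ...
    EGL_NONE };
  EGLImageKHR exynos_image = eglCreateImageKHR(
      egl_display, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT, NULL, attrs);
  glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture_id);
  glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, exynos_image);

  // Tegra (as described above): no dmabuf-import extension is available, so
  // the EGLImage is created from the GL texture and the resulting handle is
  // handed to libtegrav4l2, which fills it with converted RGB output.
  EGLImageKHR tegra_image = eglCreateImageKHR(
      egl_display, egl_context, EGL_GL_TEXTURE_2D_KHR,
      reinterpret_cast<EGLClientBuffer>(texture_id), NULL);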
Hi Pawel, could you check my comments, especially those regarding the EGLImage section? Your views on it will help me make changes more quickly in the underlying library and in my next patch set. Thanks.
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; On 2014/02/10 13:31:17, shivdasp wrote: > This is a v4l2 decoder device name which we use to initialize a decoder context > within the libtegrav4l2 library. > This can be anything really as long as decoder and encoder device names are > different since we do not open a v4l2 video device underneath. Libtegrav4l2 is > really a pseudo implementation. I can change it /dev/tegra-dec and > /dev/tegra-enc for it to mean tegra specific. Which device is actually being used? Does the library just talk to DRM driver via custom ioctls? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > Is this is the codec device exposed by Tegra kernel driver? > > > > You can't assume it will be this on all configurations. > > Please use udev rules to create a codec specific device (see Exynos example at > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; On 2014/02/10 13:31:17, shivdasp wrote: > When REQBUFS is called for OUTPUT_PLANE, the library creates internal buffers > which can be shared with the AVP for decoding. AVP is a video processor which > runs the firmware that actually does all the decoding. While creating this > buffers, they are already mmapped to get a virtual address. The library returns > this address in QUERYBUF API. Hence there is not real need for mmap. > When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally > unmapped and destroyed. The library must be using some kind of an mmap call to get a mapping though. Would it be possible to move it to be done on this call, as expected? Also, how will this work for VEA, where CAPTURE buffers need to be mapped instead? > I will explain the need for CreateEGLImage in later comments. > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > > CreateEGLImage to pass them for texture binding to the library? I'm guessing > > that this works, because for the former you call QUERYBUFS with OUTPUT buffer, > > while in the latter you call it with CAPTURE? > > > > Please don't do this. Suggested changes for EGLImages in my comments below. > > > > As for mmap, how (and from what) is the memory mapping actually acquired for > the > > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it > > managed and when it is destroyed? > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/10 13:31:17, shivdasp wrote: > Buffers are unmapped in REQBUFS(0) call and destroyed. > Since there is no real need for mmap and munmap, we did not implement it in the > library. > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the > buffer whereas REQBUF(0) unmaps and destroys them. We should not rely on V4L2VDA to be the only place where the underlying memory will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer process may still be keeping ownership of the textures bound to them. 
Is this taken into account? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > If so, how is unmapping handled then? What if we want to free the buffers and > > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > > first... > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/10 13:31:17, shivdasp wrote: > We are really passing in the EglImage handle here to the library. EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an u32 variable. The whole idea behind offsets is that they are usually not really offsets, but sort of platform-independent 32bit IDs. They are acquired from the V4L2 driver (or library) via QUERYBUFS, and can be passed back to other calls to uniquely identify the buffers (e.g. to mmap). The client is not supposed to generate them by itself and pass them to QUERYBUFS. > The library > associates this with the corresponding v4l2_buffer on the CAPTURE plane and use > the underlying conversion APIs to transform the decoder's yuv output into the > egl image. > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the number > of planes are checked with 2 (line #1660 of V4L2VDA). You mean https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... ? This is an overassumption from the time where there was only one format supported. The number of planes to be used should be taken from the v4l2_format struct, returned from G_FMT. This assumption should be fixed. From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? Which fourcc format does it use? Are the planes separate memory buffers? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, > > there are two planes, but passing only one offset is a bit inconsistent. > > Although, why have two planes, if only one is used? > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } On 2014/02/10 13:31:17, shivdasp wrote: > Since we started with implementing this as a V4L2-Like library we have tried to > follow V4L2 syntax to provide the input & output buffers. > QUERYBUF can be made into a custom call since it is doing very custom thing > here. Please understand that: 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A call documented to work in the same way regardless of buffer type passed should not do otherwise, if it can be prevented. If it's expected to return values, it shouldn't be accepting them instead. And so on. Of course, this is an adapter class, so the actual inner workings may be different and it's not always possible to do exactly the same thing, but from the point of view of the client the effects should be as close to what each call is documented to do, as possible. Otherwise this whole exercise of using V4L2 API is doing us more bad than good. Please understand that the V4L2VDA and V4L2VEA classes will live on and will work with multiple platforms. There will be many changes to them. People working on them will expect behavior as documented in V4L2 API. Otherwise things will break (and not only for other platforms, but Tegra too) and it will be very difficult to reason why. 
So it's very important for Tegra V4L2Device to behave like V4L2 API specifies and not be tailored to how things are laid out in V4L2VDA currently. 2. This V4L2Device class should work with V4L2VEA class as well. I don't think we can make it work if this hack on QUERYBUFS is here. > If introducing another API is acceptable I can do that. > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with the > our Graphics team but I don't there is any such plan to implement such > extension. > That's why I gave the option of using offsets. If you prefer not to use dmabufs, could we please: - provide offsets via querybufs from the driver/library - pass those offsets to a new eglCreateImage extension and move it back to V4L2VDA - keep using texture binding API? This should eliminate the need for this method as well. > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > After reading through it, this method feels unnecessary. > > > > I'm assuming querybufs implementation in the library calls a method in the GPU > > driver anyway? > > > > This is basically redefining querybufs to do something completely different > than > > it normally does, turning it into a custom call. The offsets should be coming > > from the callee of QUERYBUFS, not the other way around, and there should be no > > custom side effects. > > A non-V4L2, library-specific custom call would be better than this. > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even offsets) > > that come from the v4l2 library to create EGL images, just like we do on Mali > on > > Exynos? > > > > Would it be possible to have an extension for eglCreateImage like Exynos does > > instead please? It doesn't seem to be much of a difference, instead of calling > > querybufs with custom arguments and having the library call something in the > > driver, call eglCreateImage instead with custom arguments and have it do > > everything? > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/10 13:31:17, shivdasp wrote: > The decoder's output buffers are created when REQBUFS(x) is called on > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with the > AVP processor for decoder to write into. By decoder do you mean V4L2VDA class? > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created and > sent back in AssignPictureBuffers(). > Now V4L2VDA creates EglImages from these textures and sends each EglImage handle > to library using the QUERYBUF (but can use a custom call too). The tegrav4l2 > library cannot create EglImages from DMABUFS like in Exynos since there is no > such extension. We create EglImage from this texture itself so there is a > binding between texture and eglImage. Sounds like the eglCreateImage extension taking offsets I described in the comment in tegra_v4l2_video_device.cc could work for this? > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > corresponding decoder buffer created in REQBUF() call. > This way there is one map of EglImage, texture and decoder buffer. My understanding is you mean the buffer is bound to a texture? If so, then it also seems like we could use the current bind texture to eglimage calls? 
> When any buffer is enqueued in QBUF, the library sends it down to the decoder. > Once the decoder buffer is ready, the library uses graphics apis to populate the > corresponding EglImage with the RGB data and then pushes into a queue thereby > making it available for DQBUF after which this buffer can be used only when it > is back in QBUF call. > This way the buffer ownership is managed. > So in summary the library uses queues and does all the buffer management between > decoder and the graphics stack for conversion. What happens when this class calls REQBUFS(0), but the corresponding textures are being rendered to the screen? How will the buffers be freed if the GPU process crashes without calling REQBUFS(0)? What happens when the bound textures are deleted, but the HW codec is still using them? > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > I would like to understand the big picture here please. > > > > We strive to stay as close as possible to using (and/or creating) > > platform-independent standards where we can, like the sequence above, instead > of > > providing custom calls for each platform. Removing this from here and TVDA is > a > > step into an opposite direction, and I would like to understand what technical > > difficulties force us to do this first. > > > > Binding textures to EGLImages also serves to keep track of ownership. There > are > > multiple users of the shared buffer, the renderer, the GPU that renders the > > textures, this class and the HW codec. How is ownership/destruction managed > and > > how is it ensured that the buffer is valid while any of the users are still > > referring to/using it (both in userspace and in kernel)? > > > > What happens if the renderer crashes and the codec is writing to the textures? > > What happens when this class is destroyed, but the texture is in the renderer? > > What happens when the whole Chrome crashes, but the HW codec is using a buffer > > (i.e. kernel has ownership)? > > > > Could you please explain how is ownership managed for shared buffers on Tegra? >
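As background for the point above about offsets, the standard V4L2 multi-planar MMAP flow looks roughly like this (plain V4L2 usage, not Tegra-specific; the Mmap() wrapper signature here mirrors mmap() and is an assumption about the V4L2Device interface):

  // VIDIOC_QUERYBUF fills in mem_offset for each plane; the client treats it
  // as an opaque ID handed out by the driver and passes it back to mmap().
  struct v4l2_plane planes[VIDEO_MAX_PLANES];
  struct v4l2_buffer buffer;
  memset(&buffer, 0, sizeof(buffer));
  memset(planes, 0, sizeof(planes));
  buffer.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
  buffer.memory = V4L2_MEMORY_MMAP;
  buffer.index = index;
  buffer.m.planes = planes;
  buffer.length = arraysize(planes);
  if (device->Ioctl(VIDIOC_QUERYBUF, &buffer) != 0)
    return false;
  void* address = device->Mmap(NULL,
                               buffer.m.planes[0].length,
                               PROT_READ | PROT_WRITE,
                               MAP_SHARED,
                               buffer.m.planes[0].m.mem_offset);
  if (address == MAP_FAILED)
    return false;
  // The client never invents mem_offset values itself; they always originate
  // from the driver (or, here, the library) via QUERYBUF.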
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; This library internally talks to MM layer which talks to the device (/dev/tegra_avpchannel) which is the nvavp driver. On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > This is a v4l2 decoder device name which we use to initialize a decoder > context > > within the libtegrav4l2 library. > > This can be anything really as long as decoder and encoder device names are > > different since we do not open a v4l2 video device underneath. Libtegrav4l2 is > > really a pseudo implementation. I can change it /dev/tegra-dec and > > /dev/tegra-enc for it to mean tegra specific. > > Which device is actually being used? Does the library just talk to DRM driver > via custom ioctls? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > You can't assume it will be this on all configurations. > > > Please use udev rules to create a codec specific device (see Exynos example > at > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; Okay I will add Mmap and Munmap calls to the library and have it return the appropriate value internally. On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > When REQBUFS is called for OUTPUT_PLANE, the library creates internal buffers > > which can be shared with the AVP for decoding. AVP is a video processor which > > runs the firmware that actually does all the decoding. While creating this > > buffers, they are already mmapped to get a virtual address. The library > returns > > this address in QUERYBUF API. Hence there is not real need for mmap. > > When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally > > unmapped and destroyed. > > The library must be using some kind of an mmap call to get a mapping though. > Would it be possible to move it to be done on this call, as expected? > > Also, how will this work for VEA, where CAPTURE buffers need to be mapped > instead? > > > I will explain the need for CreateEGLImage in later comments. > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > > > CreateEGLImage to pass them for texture binding to the library? I'm guessing > > > that this works, because for the former you call QUERYBUFS with OUTPUT > buffer, > > > while in the latter you call it with CAPTURE? > > > > > > Please don't do this. Suggested changes for EGLImages in my comments below. > > > > > > As for mmap, how (and from what) is the memory mapping actually acquired for > > the > > > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is it > > > managed and when it is destroyed? > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. If not in REQBUFS(0) then what will be the appropriate place to destroy the buffers ? 
V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it there. How does the renderer then inform the ownership of textures ? On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > Since there is no real need for mmap and munmap, we did not implement it in > the > > library. > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the > > buffer whereas REQBUF(0) unmaps and destroys them. > > We should not rely on V4L2VDA to be the only place where the underlying memory > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer process > may still be keeping ownership of the textures bound to them. Is this taken into > account? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > If so, how is unmapping handled then? What if we want to free the buffers > and > > > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > > > first... > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); The output is YUV420 planar. I think rather than using the QUERYBUF to pass the EglImage handles and stuffing the required information I would rather introduce a custom API (UseEglImage ?). I hope that is fine. On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > We are really passing in the EglImage handle here to the library. > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an u32 > variable. > > The whole idea behind offsets is that they are usually not really offsets, but > sort of platform-independent 32bit IDs. They are acquired from the V4L2 driver > (or library) via QUERYBUFS, and can be passed back to other calls to uniquely > identify the buffers (e.g. to mmap). > > The client is not supposed to generate them by itself and pass them to > QUERYBUFS. > > > The library > > associates this with the corresponding v4l2_buffer on the CAPTURE plane and > use > > the underlying conversion APIs to transform the decoder's yuv output into the > > egl image. > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the > number > > of planes are checked with 2 (line #1660 of V4L2VDA). > > You mean > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > ? > > This is an overassumption from the time where there was only one format > supported. > The number of planes to be used should be taken from the v4l2_format struct, > returned from G_FMT. This assumption should be fixed. > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? Which > fourcc format does it use? Are the planes separate memory buffers? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. Also, > > > there are two planes, but passing only one offset is a bit inconsistent. > > > Although, why have two planes, if only one is used? > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } There is no extension to create EglImages from dmabufs or the offsets at the moment unfortunately. I agree using the QUERYBUF for sending the EglImage can be misleading and I will change it. 
V4L2VEA should not affected since we use the QUERYBUF for providing the actual offsets. This hack was only for sending the EglImages in case of CAPTURE PLANE of decoder. As I said earlier will adding a custom API (for now) to send the EglImages be okay ? On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > Since we started with implementing this as a V4L2-Like library we have tried > to > > follow V4L2 syntax to provide the input & output buffers. > > QUERYBUF can be made into a custom call since it is doing very custom thing > > here. > > Please understand that: > 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A call > documented to work in the same way regardless of buffer type passed should not > do otherwise, if it can be prevented. If it's expected to return values, it > shouldn't be accepting them instead. And so on. > Of course, this is an adapter class, so the actual inner workings may be > different and it's not always possible to do exactly the same thing, but from > the point of view of the client the effects should be as close to what each call > is documented to do, as possible. > > Otherwise this whole exercise of using V4L2 API is doing us more bad than good. > > Please understand that the V4L2VDA and V4L2VEA classes will live on and will > work with multiple platforms. There will be many changes to them. People working > on them will expect behavior as documented in V4L2 API. Otherwise things will > break (and not only for other platforms, but Tegra too) and it will be very > difficult to reason why. > > So it's very important for Tegra V4L2Device to behave like V4L2 API specifies > and not be tailored to how things are laid out in V4L2VDA currently. > > 2. This V4L2Device class should work with V4L2VEA class as well. I don't think > we can make it work if this hack on QUERYBUFS is here. > > > > If introducing another API is acceptable I can do that. > > > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with > the > > our Graphics team but I don't there is any such plan to implement such > > extension. > > > > That's why I gave the option of using offsets. If you prefer not to use dmabufs, > could we please: > > - provide offsets via querybufs from the driver/library > - pass those offsets to a new eglCreateImage extension and move it back to > V4L2VDA > - keep using texture binding API? > > This should eliminate the need for this method as well. > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > After reading through it, this method feels unnecessary. > > > > > > I'm assuming querybufs implementation in the library calls a method in the > GPU > > > driver anyway? > > > > > > This is basically redefining querybufs to do something completely different > > than > > > it normally does, turning it into a custom call. The offsets should be > coming > > > from the callee of QUERYBUFS, not the other way around, and there should be > no > > > custom side effects. > > > A non-V4L2, library-specific custom call would be better than this. > > > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even > offsets) > > > that come from the v4l2 library to create EGL images, just like we do on > Mali > > on > > > Exynos? > > > > > > Would it be possible to have an extension for eglCreateImage like Exynos > does > > > instead please? 
It doesn't seem to be much of a difference, instead of > calling > > > querybufs with custom arguments and having the library call something in the > > > driver, call eglCreateImage instead with custom arguments and have it do > > > everything? > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/12 09:15:13, Pawel Osciak wrote: > On 2014/02/10 13:31:17, shivdasp wrote: > > The decoder's output buffers are created when REQBUFS(x) is called on > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with the > > AVP processor for decoder to write into. > > By decoder do you mean V4L2VDA class? No I meant the decoder entity within the library. > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created > and > > sent back in AssignPictureBuffers(). > > Now V4L2VDA creates EglImages from these textures and sends each EglImage > handle > > to library using the QUERYBUF (but can use a custom call too). The tegrav4l2 > > library cannot create EglImages from DMABUFS like in Exynos since there is no > > such extension. We create EglImage from this texture itself so there is a > > binding between texture and eglImage. > > Sounds like the eglCreateImage extension taking offsets I described in the > comment in tegra_v4l2_video_device.cc could work for this? Unfortunately there is no such extension today. > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > corresponding decoder buffer created in REQBUF() call. > > This way there is one map of EglImage, texture and decoder buffer. > > My understanding is you mean the buffer is bound to a texture? If so, then it > also seems like we could use the current bind texture to eglimage calls? The libtegrav4l2 talks to another internal library which actually creates the YUV buffer. This is what is given to the AVP and where the decoded output is actually filled. There is a corresponding RGB buffer created when the EGLImage is called, this is owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, there is a conversion performed to do YUV to RGB. > > > When any buffer is enqueued in QBUF, the library sends it down to the decoder. > > Once the decoder buffer is ready, the library uses graphics apis to populate > the > > corresponding EglImage with the RGB data and then pushes into a queue thereby > > making it available for DQBUF after which this buffer can be used only when it > > is back in QBUF call. > > This way the buffer ownership is managed. > > So in summary the library uses queues and does all the buffer management > between > > decoder and the graphics stack for conversion. > > What happens when this class calls REQBUFS(0), but the corresponding textures > are being rendered to the screen? > How will the buffers be freed if the GPU process crashes without calling > REQBUFS(0)? > What happens when the bound textures are deleted, but the HW codec is still > using them? I guess I am missing something here. I did not understand "REQBUFS(0) is called but corresponding textures are being rendered ?". Doesn't DestroyOutputBuffers() call guarantee that buffers on CAPTURE plane are no longer used. 
I will confirm about the buffer freeing in the gpu process crash scenario. The last scenario (bound textures are deleted but the HW codec is still using them) is taken care of by the conversion step performed using the library. The texture is bound to the EGLImage, so that binding will fail. Since libtegrav4l2 has the EGLImage backed by an RGB buffer, the conversion can happen. How can I test this scenario ? > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > I would like to understand the big picture here please. > > > > > > We strive to stay as close as possible to using (and/or creating) > > > platform-independent standards where we can, like the sequence above, > instead > > of > > > providing custom calls for each platform. Removing this from here and TVDA > is > > a > > > step into an opposite direction, and I would like to understand what > technical > > > difficulties force us to do this first. > > > > > > Binding textures to EGLImages also serves to keep track of ownership. There > > are > > > multiple users of the shared buffer, the renderer, the GPU that renders the > > > textures, this class and the HW codec. How is ownership/destruction managed > > and > > > how is it ensured that the buffer is valid while any of the users are still > > > referring to/using it (both in userspace and in kernel)? > > > > > > What happens if the renderer crashes and the codec is writing to the > textures? > > > What happens when this class is destroyed, but the texture is in the > renderer? > > > What happens when the whole Chrome crashes, but the HW codec is using a > buffer > > > (i.e. kernel has ownership)? > > > > > > Could you please explain how is ownership managed for shared buffers on > Tegra? > > >
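(For reference, a minimal sketch of the dmabuf-import path mentioned above as the Exynos/Mali approach, assuming the EGL implementation exposes EGL_EXT_image_dma_buf_import. The helper name ImportDmabufAndBind, the dmabuf_fd/stride parameters and the single-plane ARGB8888 format are illustrative only, not the actual Exynos code; a two-plane format such as NV12 would also need the EGL_DMA_BUF_PLANE1_* attributes.)

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#include <drm_fourcc.h>

// Import a dmabuf exported by the V4L2 driver and bind it to the texture that
// was handed over in AssignPictureBuffers(). The GL context must be current.
EGLImageKHR ImportDmabufAndBind(EGLDisplay display, int dmabuf_fd, int width,
                                int height, int stride, GLuint texture_id) {
  PFNEGLCREATEIMAGEKHRPROC create_image =
      reinterpret_cast<PFNEGLCREATEIMAGEKHRPROC>(
          eglGetProcAddress("eglCreateImageKHR"));
  PFNGLEGLIMAGETARGETTEXTURE2DOESPROC bind_image =
      reinterpret_cast<PFNGLEGLIMAGETARGETTEXTURE2DOESPROC>(
          eglGetProcAddress("glEGLImageTargetTexture2DOES"));
  const EGLint attrs[] = {
      EGL_WIDTH,                     width,
      EGL_HEIGHT,                    height,
      EGL_LINUX_DRM_FOURCC_EXT,      DRM_FORMAT_ARGB8888,
      EGL_DMA_BUF_PLANE0_FD_EXT,     dmabuf_fd,
      EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
      EGL_DMA_BUF_PLANE0_PITCH_EXT,  stride,
      EGL_NONE,
  };
  EGLImageKHR image = create_image(display, EGL_NO_CONTEXT,
                                   EGL_LINUX_DMA_BUF_EXT,
                                   /* buffer */ NULL, attrs);
  if (image == EGL_NO_IMAGE_KHR)
    return EGL_NO_IMAGE_KHR;
  // Binding shares the underlying memory with the texture; destroying the
  // EGLImage later does not free that memory while the texture still holds it.
  glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture_id);
  bind_image(GL_TEXTURE_EXTERNAL_OES, image);
  return image;
}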
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; On 2014/02/12 10:11:55, shivdasp wrote: > This library internally talks to MM layer which talks to the device > (/dev/tegra_avpchannel) which is the nvavp driver. This means you will have to add it to sandbox rules in Chrome, right? So the library should actually use the device path string provided from Chrome to Open() and not have the string hardcoded in the library please. > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > This is a v4l2 decoder device name which we use to initialize a decoder > > context > > > within the libtegrav4l2 library. > > > This can be anything really as long as decoder and encoder device names are > > > different since we do not open a v4l2 video device underneath. Libtegrav4l2 > is > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > /dev/tegra-enc for it to mean tegra specific. > > > > Which device is actually being used? Does the library just talk to DRM driver > > via custom ioctls? > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > You can't assume it will be this on all configurations. > > > > Please use udev rules to create a codec specific device (see Exynos > example > > at > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:54: return (void*)offset; On 2014/02/12 10:11:55, shivdasp wrote: > Okay I will add Mmap and Munmap calls to the library and have it return the > appropriate value internally. > Great. Thank you! > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > When REQBUFS is called for OUTPUT_PLANE, the library creates internal > buffers > > > which can be shared with the AVP for decoding. AVP is a video processor > which > > > runs the firmware that actually does all the decoding. While creating this > > > buffers, they are already mmapped to get a virtual address. The library > > returns > > > this address in QUERYBUF API. Hence there is not real need for mmap. > > > When REQBUFS with 0 is called on OUTPUT_PLANE, these buffers are internally > > > unmapped and destroyed. > > > > The library must be using some kind of an mmap call to get a mapping though. > > Would it be possible to move it to be done on this call, as expected? > > > > Also, how will this work for VEA, where CAPTURE buffers need to be mapped > > instead? > > > > > I will explain the need for CreateEGLImage in later comments. > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > QUERYBUFS from V4L2VDA is used to acquire offsets for mmap, but from > > > > CreateEGLImage to pass them for texture binding to the library? I'm > guessing > > > > that this works, because for the former you call QUERYBUFS with OUTPUT > > buffer, > > > > while in the latter you call it with CAPTURE? > > > > > > > > Please don't do this. Suggested changes for EGLImages in my comments > below. 
> > > > > > > > As for mmap, how (and from what) is the memory mapping actually acquired > for > > > the > > > > buffer in the library? Why do this on QUERYBUFS and not on Mmap()? How is > it > > > > managed and when it is destroyed? > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/12 10:11:55, shivdasp wrote: > If not in REQBUFS(0) then what will be the appropriate place to destroy the > buffers ? Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I mean the underlying memory. REQBUFS(0) may be called, but the actual memory that backed the v4l2_buffers may have to live on if it's still tied to the textures. This will be a common case actually, because we don't explicitly destroy textures first unless they are dismissed. The memory should be then freed when the textures are deleted, not on REQBUFS(0). I'm wondering if the library/driver take this into account. Of course, it's still possible for REQUBFS(0) to have to trigger destruction of underlying memory, in case the textures get unbound and deleted before REQBUFS(0) is called. > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it there. > How does the renderer then inform the ownership of textures ? glDeleteTextures(). So the textures and the underlying memory may have to outlive REQBUFS(0). > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > Since there is no real need for mmap and munmap, we did not implement it in > > the > > > library. > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps the > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > We should not rely on V4L2VDA to be the only place where the underlying memory > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > process > > may still be keeping ownership of the textures bound to them. Is this taken > into > > account? > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > If so, how is unmapping handled then? What if we want to free the buffers > > and > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the buffers > > > > first... > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/12 10:11:55, shivdasp wrote: > The output is YUV420 planar. Are all planes non-interleaved and contiguous in memory? If so, then you need to use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), please see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. > I think rather than using the QUERYBUF to pass the > EglImage handles and stuffing the required information I would rather introduce > a custom API (UseEglImage ?). > I hope that is fine. Yes, it's preferable over using QUERYBUF for this. But let's agree on the shape of it. What would UseEglImage do? Could we instead pass the offsets to eglCreateImageKHR? Will we be able to also retain texture binding in V4L2VDA then? > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > We are really passing in the EglImage handle here to the library. 
> > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an > u32 > > variable. > > > > The whole idea behind offsets is that they are usually not really offsets, but > > sort of platform-independent 32bit IDs. They are acquired from the V4L2 driver > > (or library) via QUERYBUFS, and can be passed back to other calls to uniquely > > identify the buffers (e.g. to mmap). > > > > The client is not supposed to generate them by itself and pass them to > > QUERYBUFS. > > > > > The library > > > associates this with the corresponding v4l2_buffer on the CAPTURE plane and > > use > > > the underlying conversion APIs to transform the decoder's yuv output into > the > > > egl image. > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the > > number > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > You mean > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > ? > > > > This is an overassumption from the time where there was only one format > > supported. > > The number of planes to be used should be taken from the v4l2_format struct, > > returned from G_FMT. This assumption should be fixed. > > > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? > Which > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. > Also, > > > > there are two planes, but passing only one offset is a bit inconsistent. > > > > Although, why have two planes, if only one is used? > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } On 2014/02/12 10:11:55, shivdasp wrote: > There is no extension to create EglImages from dmabufs or the offsets at the > moment unfortunately. > I agree using the QUERYBUF for sending the EglImage can be misleading and I will > change it. V4L2VEA should not affected since we use the QUERYBUF for providing > the actual offsets. This hack was only for sending the EglImages in case of > CAPTURE PLANE of decoder. Could you explain why is it not affected? VEA calls QUERYBUF on CAPTURE buffers. > As I said earlier will adding a custom API (for now) to send the EglImages be > okay ? > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > Since we started with implementing this as a V4L2-Like library we have tried > > to > > > follow V4L2 syntax to provide the input & output buffers. > > > QUERYBUF can be made into a custom call since it is doing very custom thing > > > here. > > > > Please understand that: > > 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A > call > > documented to work in the same way regardless of buffer type passed should not > > do otherwise, if it can be prevented. If it's expected to return values, it > > shouldn't be accepting them instead. And so on. > > Of course, this is an adapter class, so the actual inner workings may be > > different and it's not always possible to do exactly the same thing, but from > > the point of view of the client the effects should be as close to what each > call > > is documented to do, as possible. > > > > Otherwise this whole exercise of using V4L2 API is doing us more bad than > good. > > > > Please understand that the V4L2VDA and V4L2VEA classes will live on and will > > work with multiple platforms. 
There will be many changes to them. People > working > > on them will expect behavior as documented in V4L2 API. Otherwise things will > > break (and not only for other platforms, but Tegra too) and it will be very > > difficult to reason why. > > > > So it's very important for Tegra V4L2Device to behave like V4L2 API specifies > > and not be tailored to how things are laid out in V4L2VDA currently. > > > > 2. This V4L2Device class should work with V4L2VEA class as well. I don't think > > we can make it work if this hack on QUERYBUFS is here. > > > > > > > If introducing another API is acceptable I can do that. > > > > > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check with > > the > > > our Graphics team but I don't there is any such plan to implement such > > > extension. > > > > > > > That's why I gave the option of using offsets. If you prefer not to use > dmabufs, > > could we please: > > > > - provide offsets via querybufs from the driver/library > > - pass those offsets to a new eglCreateImage extension and move it back to > > V4L2VDA > > - keep using texture binding API? > > > > This should eliminate the need for this method as well. > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > After reading through it, this method feels unnecessary. > > > > > > > > I'm assuming querybufs implementation in the library calls a method in the > > GPU > > > > driver anyway? > > > > > > > > This is basically redefining querybufs to do something completely > different > > > than > > > > it normally does, turning it into a custom call. The offsets should be > > coming > > > > from the callee of QUERYBUFS, not the other way around, and there should > be > > no > > > > custom side effects. > > > > A non-V4L2, library-specific custom call would be better than this. > > > > > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even > > offsets) > > > > that come from the v4l2 library to create EGL images, just like we do on > > Mali > > > on > > > > Exynos? > > > > > > > > Would it be possible to have an extension for eglCreateImage like Exynos > > does > > > > instead please? It doesn't seem to be much of a difference, instead of > > calling > > > > querybufs with custom arguments and having the library call something in > the > > > > driver, call eglCreateImage instead with custom arguments and have it do > > > > everything? > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/12 10:11:55, shivdasp wrote: > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > On 2014/02/10 13:31:17, shivdasp wrote: > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with > the > > > AVP processor for decoder to write into. > > > > By decoder do you mean V4L2VDA class? > > No I meant the decoder entity within the library. > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are created > > and > > > sent back in AssignPictureBuffers(). > > > Now V4L2VDA creates EglImages from these textures and sends each EglImage > > handle > > > to library using the QUERYBUF (but can use a custom call too). 
The tegrav4l2 > > > library cannot create EglImages from DMABUFS like in Exynos since there is > no > > > such extension. We create EglImage from this texture itself so there is a > > > binding between texture and eglImage. > > > > Sounds like the eglCreateImage extension taking offsets I described in the > > comment in tegra_v4l2_video_device.cc could work for this? > Unfortunately there is no such extension today. > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > corresponding decoder buffer created in REQBUF() call. > > > This way there is one map of EglImage, texture and decoder buffer. > > > > My understanding is you mean the buffer is bound to a texture? If so, then it > > also seems like we could use the current bind texture to eglimage calls? > The libtegrav4l2 talks to another internal library which actually creates the > YUV buffer. This is what is given to the AVP and where the decoded output is > actually filled. > There is a corresponding RGB buffer created when the EGLImage is called, this is > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, there > is a conversion performed to do YUV to RGB. So the YUV buffers are tied to the textures somehow? > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > decoder. > > > Once the decoder buffer is ready, the library uses graphics apis to populate > > the > > > corresponding EglImage with the RGB data and then pushes into a queue > thereby > > > making it available for DQBUF after which this buffer can be used only when > it > > > is back in QBUF call. > > > This way the buffer ownership is managed. > > > So in summary the library uses queues and does all the buffer management > > between > > > decoder and the graphics stack for conversion. > > > > What happens when this class calls REQBUFS(0), but the corresponding textures > > are being rendered to the screen? > > How will the buffers be freed if the GPU process crashes without calling > > REQBUFS(0)? > > What happens when the bound textures are deleted, but the HW codec is still > > using them? > > I guess I am missing something here. I did not understand "REQBUFS(0) is called > but corresponding textures are being rendered ?". Doesn't DestroyOutputBuffers() > call guarantee that buffers on CAPTURE plane are no longer used. The underlying memory can still be used as textures in the client of VDA class. It only guarantees that they are not used anymore by the codec class as v4l2_buffers. > I will confirm about the buffer freeing in gpu process crash scenario. Thanks. > The last scenario (bound texture are deleted but HW codec is still using them) > is taken care by the conversion step performed using the library. > The texture is > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 has the > EglImage backed by a RGB buffer the conversion can happen. How can I test this > scenario ? This is just a case where there is a bug in the code, but my point is that the ownership should be shared with the kernel as well, so if the userspace (Chrome) dies, the kernel will properly clean up. > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > I would like to understand the big picture here please. > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > platform-independent standards where we can, like the sequence above, > > instead > > > of > > > > providing custom calls for each platform. 
Removing this from here and TVDA > > is > > > a > > > > step into an opposite direction, and I would like to understand what > > technical > > > > difficulties force us to do this first. > > > > > > > > Binding textures to EGLImages also serves to keep track of ownership. > There > > > are > > > > multiple users of the shared buffer, the renderer, the GPU that renders > the > > > > textures, this class and the HW codec. How is ownership/destruction > managed > > > and > > > > how is it ensured that the buffer is valid while any of the users are > still > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > textures? > > > > What happens when this class is destroyed, but the texture is in the > > renderer? > > > > What happens when the whole Chrome crashes, but the HW codec is using a > > buffer > > > > (i.e. kernel has ownership)? > > > > > > > > Could you please explain how is ownership managed for shared buffers on > > Tegra? > > > > > >
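(For reference, a minimal sketch of the QUERYBUF-then-mmap sequence the V4L2 multi-planar API documents, which is the behaviour being requested from the adapter's Mmap()/Munmap(). The helper name MapOutputBuffer and the single-plane assumption are illustrative.)

#include <cstddef>
#include <cstring>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev2.h>

// Query the driver-assigned offset for OUTPUT buffer |index| and map it.
// The offset is an opaque cookie produced by the driver/library; the client
// never generates it, it only passes it back to mmap().
void* MapOutputBuffer(int device_fd, unsigned int index, size_t* length) {
  struct v4l2_plane planes[VIDEO_MAX_PLANES];
  struct v4l2_buffer buffer;
  memset(&buffer, 0, sizeof(buffer));
  memset(planes, 0, sizeof(planes));
  buffer.index = index;
  buffer.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
  buffer.memory = V4L2_MEMORY_MMAP;
  buffer.m.planes = planes;
  buffer.length = 1;  // Single-plane format assumed for brevity.
  if (ioctl(device_fd, VIDIOC_QUERYBUF, &buffer) != 0)
    return MAP_FAILED;
  *length = planes[0].length;
  return mmap(NULL, planes[0].length, PROT_READ | PROT_WRITE, MAP_SHARED,
              device_fd, planes[0].m.mem_offset);
}
// Unmapping is the mirror image: munmap(address, length) before REQBUFS(0).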
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/exynos_v4l2_video_device.cc:123: EGLImageKHR egl_image = eglCreateImageKHR( make_context_current_ is already done before calling this function. That should be sufficient I believe. Or it has to be done again here ? Also relatedly , how do I get the egl_context in which the textures are created ? I am using eglGetCurrentContext() in the TegraV4L2Device. On 2014/02/10 06:36:17, Pawel Osciak wrote: > We need to make GL context current to be able to call this. > We should at least require this in the doc that it should be done before calling > this method. https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; Yes /dev/tegra_avpchannel will be added in the sandbox whitelist for now. Work is in progress to make this sandbox friendly in our MM stack. Most probably by the time we get this change merged we should have completed it so even whitelisting may not be required. In worst case we will whitelist it sandbox code. As I said earlier this "device name" sent to the library is just dummy for the libtegrav4l2 to create a decoder instance. It just has to be different than the encoder "device name". Shall I just keep it "tegra_dec" and "tegra_enc" to not mislead it for something like a true v4l2 device name ? On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > This library internally talks to MM layer which talks to the device > > (/dev/tegra_avpchannel) which is the nvavp driver. > > This means you will have to add it to sandbox rules in Chrome, right? So the > library should actually use the device path string provided from Chrome to > Open() and not have the string hardcoded in the library please. > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > This is a v4l2 decoder device name which we use to initialize a decoder > > > context > > > > within the libtegrav4l2 library. > > > > This can be anything really as long as decoder and encoder device names > are > > > > different since we do not open a v4l2 video device underneath. > Libtegrav4l2 > > is > > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > > /dev/tegra-enc for it to mean tegra specific. > > > > > > Which device is actually being used? Does the library just talk to DRM > driver > > > via custom ioctls? > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > > > You can't assume it will be this on all configurations. > > > > > Please use udev rules to create a codec specific device (see Exynos > > example > > > at > > > > > > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. Understood. 
I believe the dmabuf export mechanism takes care of this in Exynos since the buffer backed memory is in kernel and hence the deletion is kind of synchronized. I see in DestroyOutputbuffers(), before calling DismissPicture(), the eglDestroyImageKHR() is called thereby the textures are unbound or perhaps the deletion there is also delayed ? How is the texture being rendered if the eglImage it is bound to is also destroyed ? On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > If not in REQBUFS(0) then what will be the appropriate place to destroy the > > buffers ? > > Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I mean > the underlying memory. REQBUFS(0) may be called, but the actual memory that > backed the v4l2_buffers may have to live on if it's still tied to the textures. > This will be a common case actually, because we don't explicitly destroy > textures first unless they are dismissed. The memory should be then freed when > the textures are deleted, not on REQBUFS(0). I'm wondering if the library/driver > take this into account. > Of course, it's still possible for REQUBFS(0) to have to trigger destruction of > underlying memory, in case the textures get unbound and deleted before > REQBUFS(0) is called. > > > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it > there. > > How does the renderer then inform the ownership of textures ? > > glDeleteTextures(). So the textures and the underlying memory may have to > outlive REQBUFS(0). > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > > Since there is no real need for mmap and munmap, we did not implement it > in > > > the > > > > library. > > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps > the > > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > > > We should not rely on V4L2VDA to be the only place where the underlying > memory > > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > > process > > > may still be keeping ownership of the textures bound to them. Is this taken > > into > > > account? > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > If so, how is unmapping handled then? What if we want to free the > buffers > > > and > > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the > buffers > > > > > first... > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > The output is YUV420 planar. > > Are all planes non-interleaved and contiguous in memory? If so, then you need to > use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), please > see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. > > Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. Okay I will change the pixel format. However there are some DHECK_EQ() code in V4L2VDA to check against V4L2_PIX_FMT_NV12M. They also exist for num_planes. I will have to introduce private member functions for ExynosV4L2Device and TegraV4L2Device to check against them rather than hardcoded values in V4L2VDA. Will that be fine ? 
> > > I think rather than using the QUERYBUF to pass the > > EglImage handles and stuffing the required information I would rather > introduce > > a custom API (UseEglImage ?). > > I hope that is fine. > > Yes, it's preferable over using QUERYBUF for this. But let's agree on the shape > of it. What would UseEglImage do? > Could we instead pass the offsets to eglCreateImageKHR? > Will we be able to also retain texture binding in V4L2VDA then? I was thinking of int UseEglImage(int buffer_index, EGLImageKHR egl_image); We basically need to send the EglImage created for a particular buffer_index so the library can convert YUV into its respective EglImage. We cannot send offsets to eglCreateImageKHR unless we have extension. However the buffer_index internally is the mapping for identifying the eglImage so in a way that will work like an offset. > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > We are really passing in the EglImage handle here to the library. > > > > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in an > > u32 > > > variable. > > > > > > The whole idea behind offsets is that they are usually not really offsets, > but > > > sort of platform-independent 32bit IDs. They are acquired from the V4L2 > driver > > > (or library) via QUERYBUFS, and can be passed back to other calls to > uniquely > > > identify the buffers (e.g. to mmap). > > > > > > The client is not supposed to generate them by itself and pass them to > > > QUERYBUFS. > > > > > > > The library > > > > associates this with the corresponding v4l2_buffer on the CAPTURE plane > and > > > use > > > > the underlying conversion APIs to transform the decoder's yuv output into > > the > > > > egl image. > > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where the > > > number > > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > > > You mean > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > ? > > > > > > This is an overassumption from the time where there was only one format > > > supported. > > > The number of planes to be used should be taken from the v4l2_format struct, > > > returned from G_FMT. This assumption should be fixed. > > > > > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? > > Which > > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. > > Also, > > > > > there are two planes, but passing only one offset is a bit inconsistent. > > > > > Although, why have two planes, if only one is used? > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:168: } The fd returned by TegraV4L2Open() knows whether it was for a decoder instance or encoder instance. Hence QUERYBUF on CAPTURE_PLANE for an encoder will behave as per the V4L2 specification. An alternate hacky behavior was needed for decoder instance to send EglImages which we are now sending in custom call. On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > There is no extension to create EglImages from dmabufs or the offsets at the > > moment unfortunately. > > I agree using the QUERYBUF for sending the EglImage can be misleading and I > will > > change it. 
V4L2VEA should not affected since we use the QUERYBUF for providing > > the actual offsets. This hack was only for sending the EglImages in case of > > CAPTURE PLANE of decoder. > > Could you explain why is it not affected? VEA calls QUERYBUF on CAPTURE buffers. > > > As I said earlier will adding a custom API (for now) to send the EglImages be > > okay ? > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > Since we started with implementing this as a V4L2-Like library we have > tried > > > to > > > > follow V4L2 syntax to provide the input & output buffers. > > > > QUERYBUF can be made into a custom call since it is doing very custom > thing > > > > here. > > > > > > Please understand that: > > > 1. We are asking for a V4L2-like interface that conforms to the V4L2 API. A > > call > > > documented to work in the same way regardless of buffer type passed should > not > > > do otherwise, if it can be prevented. If it's expected to return values, it > > > shouldn't be accepting them instead. And so on. > > > Of course, this is an adapter class, so the actual inner workings may be > > > different and it's not always possible to do exactly the same thing, but > from > > > the point of view of the client the effects should be as close to what each > > call > > > is documented to do, as possible. > > > > > > Otherwise this whole exercise of using V4L2 API is doing us more bad than > > good. > > > > > > Please understand that the V4L2VDA and V4L2VEA classes will live on and will > > > work with multiple platforms. There will be many changes to them. People > > working > > > on them will expect behavior as documented in V4L2 API. Otherwise things > will > > > break (and not only for other platforms, but Tegra too) and it will be very > > > difficult to reason why. > > > > > > So it's very important for Tegra V4L2Device to behave like V4L2 API > specifies > > > and not be tailored to how things are laid out in V4L2VDA currently. > > > > > > 2. This V4L2Device class should work with V4L2VEA class as well. I don't > think > > > we can make it work if this hack on QUERYBUFS is here. > > > > > > > > > > If introducing another API is acceptable I can do that. > > > > > > > > We do not yet have eglCreateImageKHR that accepts dmabufs. I can check > with > > > the > > > > our Graphics team but I don't there is any such plan to implement such > > > > extension. > > > > > > > > > > That's why I gave the option of using offsets. If you prefer not to use > > dmabufs, > > > could we please: > > > > > > - provide offsets via querybufs from the driver/library > > > - pass those offsets to a new eglCreateImage extension and move it back to > > > V4L2VDA > > > - keep using texture binding API? > > > > > > This should eliminate the need for this method as well. > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > After reading through it, this method feels unnecessary. > > > > > > > > > > I'm assuming querybufs implementation in the library calls a method in > the > > > GPU > > > > > driver anyway? > > > > > > > > > > This is basically redefining querybufs to do something completely > > different > > > > than > > > > > it normally does, turning it into a custom call. The offsets should be > > > coming > > > > > from the callee of QUERYBUFS, not the other way around, and there should > > be > > > no > > > > > custom side effects. > > > > > A non-V4L2, library-specific custom call would be better than this. 
> > > > > > > > > > But why not implement eglCreateImageKHR that accepts dmabufs (or even > > > offsets) > > > > > that come from the v4l2 library to create EGL images, just like we do on > > > Mali > > > > on > > > > > Exynos? > > > > > > > > > > Would it be possible to have an extension for eglCreateImage like Exynos > > > does > > > > > instead please? It doesn't seem to be much of a difference, instead of > > > calling > > > > > querybufs with custom arguments and having the library call something in > > the > > > > > driver, call eglCreateImage instead with custom arguments and have it do > > > > > everything? > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/13 10:42:54, Pawel Osciak wrote: > On 2014/02/12 10:11:55, shivdasp wrote: > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared with > > the > > > > AVP processor for decoder to write into. > > > > > > By decoder do you mean V4L2VDA class? > > > > No I meant the decoder entity within the library. > > > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are > created > > > and > > > > sent back in AssignPictureBuffers(). > > > > Now V4L2VDA creates EglImages from these textures and sends each EglImage > > > handle > > > > to library using the QUERYBUF (but can use a custom call too). The > tegrav4l2 > > > > library cannot create EglImages from DMABUFS like in Exynos since there is > > no > > > > such extension. We create EglImage from this texture itself so there is a > > > > binding between texture and eglImage. > > > > > > Sounds like the eglCreateImage extension taking offsets I described in the > > > comment in tegra_v4l2_video_device.cc could work for this? > > Unfortunately there is no such extension today. > > > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > > corresponding decoder buffer created in REQBUF() call. > > > > This way there is one map of EglImage, texture and decoder buffer. > > > > > > My understanding is you mean the buffer is bound to a texture? If so, then > it > > > also seems like we could use the current bind texture to eglimage calls? > > The libtegrav4l2 talks to another internal library which actually creates the > > YUV buffer. This is what is given to the AVP and where the decoded output is > > actually filled. > > There is a corresponding RGB buffer created when the EGLImage is called, this > is > > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, > there > > is a conversion performed to do YUV to RGB. > > So the YUV buffers are tied to the textures somehow? We send texture_id to eglCreateImageKHR and bind it there. And eglImage is sent to the library which maps it to its YUV buffer. My subsequent patch will probably make this clearer. > > > > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > > decoder. 
> > > > Once the decoder buffer is ready, the library uses graphics apis to > populate > > > the > > > > corresponding EglImage with the RGB data and then pushes into a queue > > thereby > > > > making it available for DQBUF after which this buffer can be used only > when > > it > > > > is back in QBUF call. > > > > This way the buffer ownership is managed. > > > > So in summary the library uses queues and does all the buffer management > > > between > > > > decoder and the graphics stack for conversion. > > > > > > What happens when this class calls REQBUFS(0), but the corresponding > textures > > > are being rendered to the screen? > > > How will the buffers be freed if the GPU process crashes without calling > > > REQBUFS(0)? > > > What happens when the bound textures are deleted, but the HW codec is still > > > using them? > > > > I guess I am missing something here. I did not understand "REQBUFS(0) is > called > > but corresponding textures are being rendered ?". Doesn't > DestroyOutputBuffers() > > call guarantee that buffers on CAPTURE plane are no longer used. > > The underlying memory can still be used as textures in the client of VDA class. > It only guarantees that they are not used anymore by the codec class as > v4l2_buffers. > > > I will confirm about the buffer freeing in gpu process crash scenario. > > Thanks. If the EGLimage is destroyed I think the texture becomes unbound. I was debugging some scenario and I get errors as "texture not bound or texture id 0" kind of errors from gles2_cmd_decoder.cc. These I guess represent this crash scenario. So it is taken care of already while validating the texture before rendering ? And I observe similar kind of logs on Exynos too. Do you have a test case or steps of validating this ? Will killing gpu process while video playback validate this path ? > > > The last scenario (bound texture are deleted but HW codec is still using them) > > is taken care by the conversion step performed using the library. > > The texture is > > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 has > the > > EglImage backed by a RGB buffer the conversion can happen. How can I test this > > scenario ? > > This is just a case where there is a bug in the code, but my point is that the > ownership should be shared with the kernel as well, so if the userspace (Chrome) > dies, the kernel will properly clean up. > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > I would like to understand the big picture here please. > > > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > > platform-independent standards where we can, like the sequence above, > > > instead > > > > of > > > > > providing custom calls for each platform. Removing this from here and > TVDA > > > is > > > > a > > > > > step into an opposite direction, and I would like to understand what > > > technical > > > > > difficulties force us to do this first. > > > > > > > > > > Binding textures to EGLImages also serves to keep track of ownership. > > There > > > > are > > > > > multiple users of the shared buffer, the renderer, the GPU that renders > > the > > > > > textures, this class and the HW codec. How is ownership/destruction > > managed > > > > and > > > > > how is it ensured that the buffer is valid while any of the users are > > still > > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > > textures? 
> > > > > What happens when this class is destroyed, but the texture is in the > > > renderer? > > > > > What happens when the whole Chrome crashes, but the HW codec is using a > > > buffer > > > > > (i.e. kernel has ownership)? > > > > > > > > > > Could you please explain how is ownership managed for shared buffers on > > > Tegra? > > > > > > > > > >
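(To make the UseEglImage proposal above concrete, a rough sketch of the shape being discussed. The library entry point name TegraV4L2_UseEglImage and the return convention are assumptions for illustration, not a final API.)

#include <EGL/egl.h>
#include <EGL/eglext.h>

// Hypothetical libtegrav4l2 entry point: associates an already-created
// EGLImage with the CAPTURE buffer at |buffer_index| so the library can
// convert the decoded YUV output into it before DQBUF.
extern "C" int TegraV4L2_UseEglImage(int fd, unsigned int buffer_index,
                                     EGLImageKHR egl_image);

// Device-adapter wrapper, roughly int UseEglImage(buffer_index, egl_image) as
// proposed above. The caller is expected to have created the EGLImage from the
// texture received in AssignPictureBuffers(), with the GL context current.
int UseEglImage(int device_fd, unsigned int buffer_index,
                EGLImageKHR egl_image) {
  return TegraV4L2_UseEglImage(device_fd, buffer_index, egl_image);
}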
https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/exynos_v4l2_video_device.cc:123: EGLImageKHR egl_image = eglCreateImageKHR( On 2014/02/14 03:06:45, shivdasp wrote: > make_context_current_ is already done before calling this function. That should > be sufficient I believe. Or it has to be done again here ? Yes, I'm just saying this method should explicitly state so in its documentation at least. But it's not really relevant anymore, this method should be no longer needed. > Also relatedly , how do I get the egl_context in which the textures are created > ? > I am using eglGetCurrentContext() in the TegraV4L2Device. The context in which textures are created is the one you get as an argument to the device right now, you restored that argument recently. > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > We need to make GL context current to be able to call this. > > We should at least require this in the doc that it should be done before > calling > > this method. > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; On 2014/02/14 03:06:45, shivdasp wrote: > Yes /dev/tegra_avpchannel will be added in the sandbox whitelist for now. > Work is in progress to make this sandbox friendly in our MM stack. Most probably > by the time we get this change merged we should have completed it so even > whitelisting may not be required. In worst case we will whitelist it sandbox > code. Why it may not be required? How else would it be accessible if it's not whitelisted? > As I said earlier this "device name" sent to the library is just dummy for the > libtegrav4l2 to create a decoder instance. It just has to be different than the > encoder "device name". > Shall I just keep it "tegra_dec" and "tegra_enc" to not mislead it for something > like a true v4l2 device name ? Please, the library has to open the device with the same name as this method provides to it. > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > This library internally talks to MM layer which talks to the device > > > (/dev/tegra_avpchannel) which is the nvavp driver. > > > > This means you will have to add it to sandbox rules in Chrome, right? So the > > library should actually use the device path string provided from Chrome to > > Open() and not have the string hardcoded in the library please. > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > This is a v4l2 decoder device name which we use to initialize a decoder > > > > context > > > > > within the libtegrav4l2 library. > > > > > This can be anything really as long as decoder and encoder device names > > are > > > > > different since we do not open a v4l2 video device underneath. > > Libtegrav4l2 > > > is > > > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > > > /dev/tegra-enc for it to mean tegra specific. > > > > > > > > Which device is actually being used? Does the library just talk to DRM > > driver > > > > via custom ioctls? 
> > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > > > > > You can't assume it will be this on all configurations. > > > > > > Please use udev rules to create a codec specific device (see Exynos > > > example > > > > at > > > > > > > > > > > > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > > > > > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/14 03:06:45, shivdasp wrote: > Understood. I believe the dmabuf export mechanism takes care of this in Exynos > since the buffer backed memory is in kernel and hence the deletion is kind of > synchronized. Yes, although it's not "synchronized", but the ownership is managed and refcounted in the driver and the memory is freed when there is no more users. How does Tegra manage this? > > I see in DestroyOutputbuffers(), before calling DismissPicture(), the > eglDestroyImageKHR() is called thereby the textures are unbound or perhaps the > deletion there is also delayed ? Destroying the image doesn't destroy the textures, but it does unbind them. > How is the texture being rendered if the eglImage it is bound to is also > destroyed ? Texture is a separate entity from the eglImage, they share the underlying memory after binding, but after eglImage is destroyed, the memory is not freed and it lives on in the texture. > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > If not in REQBUFS(0) then what will be the appropriate place to destroy the > > > buffers ? > > > > Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I mean > > the underlying memory. REQBUFS(0) may be called, but the actual memory that > > backed the v4l2_buffers may have to live on if it's still tied to the > textures. > > This will be a common case actually, because we don't explicitly destroy > > textures first unless they are dismissed. The memory should be then freed when > > the textures are deleted, not on REQBUFS(0). I'm wondering if the > library/driver > > take this into account. > > Of course, it's still possible for REQUBFS(0) to have to trigger destruction > of > > underlying memory, in case the textures get unbound and deleted before > > REQBUFS(0) is called. > > > > > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it > > there. > > > How does the renderer then inform the ownership of textures ? > > > > glDeleteTextures(). So the textures and the underlying memory may have to > > outlive REQBUFS(0). > > > > > > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > > > Since there is no real need for mmap and munmap, we did not implement it > > in > > > > the > > > > > library. > > > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and mmaps > > the > > > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > > > > > We should not rely on V4L2VDA to be the only place where the underlying > > memory > > > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > > > process > > > > may still be keeping ownership of the textures bound to them. Is this > taken > > > into > > > > account? 
> > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > If so, how is unmapping handled then? What if we want to free the > > buffers > > > > and > > > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the > > buffers > > > > > > first... > > > > > > > > > > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/14 03:06:45, shivdasp wrote: > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > The output is YUV420 planar. > > > > Are all planes non-interleaved and contiguous in memory? If so, then you need > to > > use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), > please > > see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. > > > > Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. > Okay I will change the pixel format. However there are some DHECK_EQ() code in > V4L2VDA to check against V4L2_PIX_FMT_NV12M. > They also exist for num_planes. I will have to introduce private member > functions for ExynosV4L2Device and TegraV4L2Device to check against them rather > than hardcoded values in V4L2VDA. Will that be fine ? As I said, those checks and should be fixed to use the actual format given by the device. There is no need to have private member methods for devices. V4L2 API gives you methods to query and get formats as well as information how many planes each format uses. Please see documentation for v4l2_pix_format{,_mplane} in http://linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-fmt.html. > > > > > I think rather than using the QUERYBUF to pass the > > > EglImage handles and stuffing the required information I would rather > > introduce > > > a custom API (UseEglImage ?). > > > I hope that is fine. > > > > Yes, it's preferable over using QUERYBUF for this. But let's agree on the > shape > > of it. What would UseEglImage do? > > Could we instead pass the offsets to eglCreateImageKHR? > > Will we be able to also retain texture binding in V4L2VDA then? > I was thinking of int UseEglImage(int buffer_index, EGLImageKHR egl_image); > We basically need to send the EglImage created for a particular buffer_index so > the library can convert YUV into its respective EglImage. > We cannot send offsets to eglCreateImageKHR unless we have extension. However > the buffer_index internally is the mapping for identifying the eglImage so in a > way that will work like an offset. I really don't see why it should be an issue to create such an extension with very minimal effort. It should be a trivial wrapper around eglCreateImageKHR. Your library has to call some function in the driver anyway to do this, so why not extract that code and move it to a special case in eglCreateImageKHR instead? The code is already written I assume, since you use it, it just needs to be moved to a different location (i.e. eglCreateImageKHR implementation). > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > We are really passing in the EglImage handle here to the library. > > > > > > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed in > an > > > u32 > > > > variable. > > > > > > > > The whole idea behind offsets is that they are usually not really offsets, > > but > > > > sort of platform-independent 32bit IDs. 
They are acquired from the V4L2 > > driver > > > > (or library) via QUERYBUFS, and can be passed back to other calls to > > uniquely > > > > identify the buffers (e.g. to mmap). > > > > > > > > The client is not supposed to generate them by itself and pass them to > > > > QUERYBUFS. > > > > > > > > > The library > > > > > associates this with the corresponding v4l2_buffer on the CAPTURE plane > > and > > > > use > > > > > the underlying conversion APIs to transform the decoder's yuv output > into > > > the > > > > > egl image. > > > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where > the > > > > number > > > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > > > > > You mean > > > > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > > ? > > > > > > > > This is an overassumption from the time where there was only one format > > > > supported. > > > > The number of planes to be used should be taken from the v4l2_format > struct, > > > > returned from G_FMT. This assumption should be fixed. > > > > > > > > From what I'm seeing here, your HW doesn't really use V4L2_PIX_FMT_NV12M? > > > Which > > > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to it. > > > Also, > > > > > > there are two planes, but passing only one offset is a bit > inconsistent. > > > > > > Although, why have two planes, if only one is used? > > > > > > > > > > > > > > > https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/80001/content/common/gp... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/14 03:06:45, shivdasp wrote: > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > On 2014/02/12 10:11:55, shivdasp wrote: > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared > with > > > the > > > > > AVP processor for decoder to write into. > > > > > > > > By decoder do you mean V4L2VDA class? > > > > > > No I meant the decoder entity within the library. > > > > > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are > > created > > > > and > > > > > sent back in AssignPictureBuffers(). > > > > > Now V4L2VDA creates EglImages from these textures and sends each > EglImage > > > > handle > > > > > to library using the QUERYBUF (but can use a custom call too). The > > tegrav4l2 > > > > > library cannot create EglImages from DMABUFS like in Exynos since there > is > > > no > > > > > such extension. We create EglImage from this texture itself so there is > a > > > > > binding between texture and eglImage. > > > > > > > > Sounds like the eglCreateImage extension taking offsets I described in the > > > > comment in tegra_v4l2_video_device.cc could work for this? > > > Unfortunately there is no such extension today. > > > > > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > > > corresponding decoder buffer created in REQBUF() call. 
> > > > > This way there is one map of EglImage, texture and decoder buffer. > > > > > > > > My understanding is you mean the buffer is bound to a texture? If so, then > > it > > > > also seems like we could use the current bind texture to eglimage calls? > > > The libtegrav4l2 talks to another internal library which actually creates > the > > > YUV buffer. This is what is given to the AVP and where the decoded output is > > > actually filled. > > > There is a corresponding RGB buffer created when the EGLImage is called, > this > > is > > > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, > > there > > > is a conversion performed to do YUV to RGB. > > > > So the YUV buffers are tied to the textures somehow? > We send texture_id to eglCreateImageKHR and bind it there. And eglImage is sent > to the library which maps it to its YUV buffer. > My subsequent patch will probably make this clearer. Wait, where do you send texture_id to eglCreateImageKHR? I don't see that in the code above. Do you have an extension for eglCreateImageKHR to also accept texture ids and bind during creation? Why not do this in the standard way, i.e. by using GL_OES_EGL_image_external? I would expect your EGL implementation already has it for other things (and it's an extension created by NVIDIA too)... Or did you mean some other function? > > > > > > > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > > > decoder. > > > > > Once the decoder buffer is ready, the library uses graphics apis to > > populate > > > > the > > > > > corresponding EglImage with the RGB data and then pushes into a queue > > > thereby > > > > > making it available for DQBUF after which this buffer can be used only > > when > > > it > > > > > is back in QBUF call. > > > > > This way the buffer ownership is managed. > > > > > So in summary the library uses queues and does all the buffer management > > > > between > > > > > decoder and the graphics stack for conversion. > > > > > > > > What happens when this class calls REQBUFS(0), but the corresponding > > textures > > > > are being rendered to the screen? > > > > How will the buffers be freed if the GPU process crashes without calling > > > > REQBUFS(0)? > > > > What happens when the bound textures are deleted, but the HW codec is > still > > > > using them? > > > > > > I guess I am missing something here. I did not understand "REQBUFS(0) is > > called > > > but corresponding textures are being rendered ?". Doesn't > > DestroyOutputBuffers() > > > call guarantee that buffers on CAPTURE plane are no longer used. > > > > The underlying memory can still be used as textures in the client of VDA > class. > > It only guarantees that they are not used anymore by the codec class as > > v4l2_buffers. > > > > > I will confirm about the buffer freeing in gpu process crash scenario. > > > > Thanks. > If the EGLimage is destroyed I think the texture becomes unbound. I was > debugging some scenario and I get errors as "texture not bound or texture id 0" > kind of errors from gles2_cmd_decoder.cc. These I guess represent this crash > scenario. So it is taken care of already while validating the texture before > rendering ? If you are getting those errors, then there is definitely something wrong going on. > And I observe similar kind of logs on Exynos too. That is even more worrying. Could you please submit a bug for Exynos with repro steps? > Do you have a test case or steps of validating this ? 
Will killing gpu process > while video playback validate this path ? It should. > > > > > The last scenario (bound texture are deleted but HW codec is still using > them) > > > is taken care by the conversion step performed using the library. > > > The texture is > > > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 has > > the > > > EglImage backed by a RGB buffer the conversion can happen. How can I test > this > > > scenario ? > > > > This is just a case where there is a bug in the code, but my point is that the > > ownership should be shared with the kernel as well, so if the userspace > (Chrome) > > dies, the kernel will properly clean up. > > > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > I would like to understand the big picture here please. > > > > > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > > > platform-independent standards where we can, like the sequence above, > > > > instead > > > > > of > > > > > > providing custom calls for each platform. Removing this from here and > > TVDA > > > > is > > > > > a > > > > > > step into an opposite direction, and I would like to understand what > > > > technical > > > > > > difficulties force us to do this first. > > > > > > > > > > > > Binding textures to EGLImages also serves to keep track of ownership. > > > There > > > > > are > > > > > > multiple users of the shared buffer, the renderer, the GPU that > renders > > > the > > > > > > textures, this class and the HW codec. How is ownership/destruction > > > managed > > > > > and > > > > > > how is it ensured that the buffer is valid while any of the users are > > > still > > > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > > > textures? > > > > > > What happens when this class is destroyed, but the texture is in the > > > > renderer? > > > > > > What happens when the whole Chrome crashes, but the HW codec is using > a > > > > buffer > > > > > > (i.e. kernel has ownership)? > > > > > > > > > > > > Could you please explain how is ownership managed for shared buffers > on > > > > Tegra? > > > > > > > > > > > > > > >
https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:23: const char kDevice[] = "/dev/video0"; Whitelisted will not be required because of pre-acquire the resources (including open() for /dev/tegra_avpchannel) in the libtegrav4l2.so which we load before the sandbox will kick in. Hence the only place we will have this device name will be in TegraV4L2Device which will be used to open the decoder instance. (The pre-opened fd will be used). Okay I will keep the same device name (/dev/tegra_avpchannel). On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > Yes /dev/tegra_avpchannel will be added in the sandbox whitelist for now. > > Work is in progress to make this sandbox friendly in our MM stack. Most > probably > > by the time we get this change merged we should have completed it so even > > whitelisting may not be required. In worst case we will whitelist it sandbox > > code. > > Why it may not be required? How else would it be accessible if it's not > whitelisted? > > > As I said earlier this "device name" sent to the library is just dummy for the > > libtegrav4l2 to create a decoder instance. It just has to be different than > the > > encoder "device name". > > Shall I just keep it "tegra_dec" and "tegra_enc" to not mislead it for > something > > like a true v4l2 device name ? > > Please, the library has to open the device with the same name as this method > provides to it. > > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > This library internally talks to MM layer which talks to the device > > > > (/dev/tegra_avpchannel) which is the nvavp driver. > > > > > > This means you will have to add it to sandbox rules in Chrome, right? So the > > > library should actually use the device path string provided from Chrome to > > > Open() and not have the string hardcoded in the library please. > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > This is a v4l2 decoder device name which we use to initialize a > decoder > > > > > context > > > > > > within the libtegrav4l2 library. > > > > > > This can be anything really as long as decoder and encoder device > names > > > are > > > > > > different since we do not open a v4l2 video device underneath. > > > Libtegrav4l2 > > > > is > > > > > > really a pseudo implementation. I can change it /dev/tegra-dec and > > > > > > /dev/tegra-enc for it to mean tegra specific. > > > > > > > > > > Which device is actually being used? Does the library just talk to DRM > > > driver > > > > > via custom ioctls? > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > Is this is the codec device exposed by Tegra kernel driver? > > > > > > > > > > > > > > You can't assume it will be this on all configurations. > > > > > > > Please use udev rules to create a codec specific device (see Exynos > > > > example > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/HEAD/o...) > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... 
content/common/gpu/media/tegra_v4l2_video_device.cc:58: // No real munmap for tegrav4l2 device. On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > Understood. I believe the dmabuf export mechanism takes care of this in Exynos > > since the buffer backed memory is in kernel and hence the deletion is kind of > > synchronized. > > Yes, although it's not "synchronized", but the ownership is managed and > refcounted in the driver and the memory is freed when there is no more users. > How does Tegra manage this? I checked with the graphics team here. This is handled. Since the memory for EglImage is refcounted and is backed-up by the graphics library itself, the texture bound to a destroyed eglImage can still be rendered. The V4L2 REQBUFS(0) shall de-allocate only the buffer allocated by the "actual decoder" which are the YUV buffers and since there is no more conversion happening after REQBUFS(0) this is handled too. > > > > > I see in DestroyOutputbuffers(), before calling DismissPicture(), the > > eglDestroyImageKHR() is called thereby the textures are unbound or perhaps the > > deletion there is also delayed ? > > Destroying the image doesn't destroy the textures, but it does unbind them. > > > How is the texture being rendered if the eglImage it is bound to is also > > destroyed ? > > Texture is a separate entity from the eglImage, they share the underlying memory > after binding, but after eglImage is destroyed, the memory is not freed and it > lives on in the texture. > > > > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > If not in REQBUFS(0) then what will be the appropriate place to destroy > the > > > > buffers ? > > > > > > Sorry, perhaps I wasn't clear, the term "buffers" is a bit overloaded. I > mean > > > the underlying memory. REQBUFS(0) may be called, but the actual memory that > > > backed the v4l2_buffers may have to live on if it's still tied to the > > textures. > > > This will be a common case actually, because we don't explicitly destroy > > > textures first unless they are dismissed. The memory should be then freed > when > > > the textures are deleted, not on REQBUFS(0). I'm wondering if the > > library/driver > > > take this into account. > > > Of course, it's still possible for REQUBFS(0) to have to trigger destruction > > of > > > underlying memory, in case the textures get unbound and deleted before > > > REQBUFS(0) is called. > > > > > > > V4L2VDA::DestroyOutputBuffers() calls REQBUFS(0) and hence we destroy it > > > there. > > > > How does the renderer then inform the ownership of textures ? > > > > > > glDeleteTextures(). So the textures and the underlying memory may have to > > > outlive REQBUFS(0). > > > > > > > > > > > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > Buffers are unmapped in REQBUFS(0) call and destroyed. > > > > > > Since there is no real need for mmap and munmap, we did not implement > it > > > in > > > > > the > > > > > > library. > > > > > > So our implementation for REQBUF(x) on OUTPUT_PLANE allocates and > mmaps > > > the > > > > > > buffer whereas REQBUF(0) unmaps and destroys them. > > > > > > > > > > We should not rely on V4L2VDA to be the only place where the underlying > > > memory > > > > > will be destroyed. Even if we REQBUFS(0) on these buffers, the renderer > > > > process > > > > > may still be keeping ownership of the textures bound to them. 
Is this > > taken > > > > into > > > > > account? > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > If so, how is unmapping handled then? What if we want to free the > > > buffers > > > > > and > > > > > > > reallocate them? You cannot call REQBUFS(0) without unmapping the > > > buffers > > > > > > > first... > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/tegra_v4l2_video_device.cc:162: planes[0].m.mem_offset = reinterpret_cast<unsigned int>(egl_image); On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > The output is YUV420 planar. > > > > > > Are all planes non-interleaved and contiguous in memory? If so, then you > need > > to > > > use either V4L2_PIX_FMT_YVU420 ('YV12') or V4L2_PIX_FMT_YUV420 ('YU12'), > > please > > > see http://linuxtv.org/downloads/v4l-dvb-apis/re23.html. > > > > > > Please don't use V4L2_PIX_FMT_NV12M if this is not what codec produces. > > Okay I will change the pixel format. However there are some DHECK_EQ() code in > > V4L2VDA to check against V4L2_PIX_FMT_NV12M. > > They also exist for num_planes. I will have to introduce private member > > functions for ExynosV4L2Device and TegraV4L2Device to check against them > rather > > than hardcoded values in V4L2VDA. Will that be fine ? > > As I said, those checks and should be fixed to use the actual format given by > the device. > There is no need to have private member methods for devices. V4L2 API gives you > methods to query and get formats as well as information how many planes each > format uses. Please see documentation for v4l2_pix_format{,_mplane} > in http://linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-fmt.html. Okay will use the V4L2 API. > > > > > > > > I think rather than using the QUERYBUF to pass the > > > > EglImage handles and stuffing the required information I would rather > > > introduce > > > > a custom API (UseEglImage ?). > > > > I hope that is fine. > > > > > > Yes, it's preferable over using QUERYBUF for this. But let's agree on the > > shape > > > of it. What would UseEglImage do? > > > Could we instead pass the offsets to eglCreateImageKHR? > > > Will we be able to also retain texture binding in V4L2VDA then? > > I was thinking of int UseEglImage(int buffer_index, EGLImageKHR egl_image); > > We basically need to send the EglImage created for a particular buffer_index > so > > the library can convert YUV into its respective EglImage. > > We cannot send offsets to eglCreateImageKHR unless we have extension. However > > the buffer_index internally is the mapping for identifying the eglImage so in > a > > way that will work like an offset. > > I really don't see why it should be an issue to create such an extension with > very minimal effort. > It should be a trivial wrapper around eglCreateImageKHR. Your library has to > call some function in the driver anyway to do this, so why not extract that code > and move it to a special case in eglCreateImageKHR instead? The code is already > written I assume, since you use it, it just needs to be moved to a different > location (i.e. eglCreateImageKHR implementation). The graphics stack is owned by a separate team so I don't really understand the implementation issues if any. 
I will check if there is such plan in the meanwhile let me address all these review comments and send out second patchset. > > > > > > > > > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > We are really passing in the EglImage handle here to the library. > > > > > > > > > > EGLImageKHR is typedef void*, which can be 64 bit. It cannot be passed > in > > an > > > > u32 > > > > > variable. > > > > > > > > > > The whole idea behind offsets is that they are usually not really > offsets, > > > but > > > > > sort of platform-independent 32bit IDs. They are acquired from the V4L2 > > > driver > > > > > (or library) via QUERYBUFS, and can be passed back to other calls to > > > uniquely > > > > > identify the buffers (e.g. to mmap). > > > > > > > > > > The client is not supposed to generate them by itself and pass them to > > > > > QUERYBUFS. > > > > > > > > > > > The library > > > > > > associates this with the corresponding v4l2_buffer on the CAPTURE > plane > > > and > > > > > use > > > > > > the underlying conversion APIs to transform the decoder's yuv output > > into > > > > the > > > > > > egl image. > > > > > > We have two planes on CAPTURE_PLANE to comply with V4L2VDA code where > > the > > > > > number > > > > > > of planes are checked with 2 (line #1660 of V4L2VDA). > > > > > > > > > > You mean > > > > > > > > > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > > > ? > > > > > > > > > > This is an overassumption from the time where there was only one format > > > > > supported. > > > > > The number of planes to be used should be taken from the v4l2_format > > struct, > > > > > returned from G_FMT. This assumption should be fixed. > > > > > > > > > > From what I'm seeing here, your HW doesn't really use > V4L2_PIX_FMT_NV12M? > > > > Which > > > > > fourcc format does it use? Are the planes separate memory buffers? > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > mem_offset is u32 always, an unsigned int shouldn't be assigned to > it. > > > > Also, > > > > > > > there are two planes, but passing only one offset is a bit > > inconsistent. > > > > > > > Although, why have two planes, if only one is used? > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/80001/content/common/gpu/media... content/common/gpu/media/v4l2_video_decode_accelerator.cc:382: glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image); On 2014/02/14 07:36:10, Pawel Osciak wrote: > On 2014/02/14 03:06:45, shivdasp wrote: > > On 2014/02/13 10:42:54, Pawel Osciak wrote: > > > On 2014/02/12 10:11:55, shivdasp wrote: > > > > On 2014/02/12 09:15:13, Pawel Osciak wrote: > > > > > On 2014/02/10 13:31:17, shivdasp wrote: > > > > > > The decoder's output buffers are created when REQBUFS(x) is called on > > > > > > CAPTURE_PLANE. These buffers are hardware buffers which can be shared > > with > > > > the > > > > > > AVP processor for decoder to write into. > > > > > > > > > > By decoder do you mean V4L2VDA class? > > > > > > > > No I meant the decoder entity within the library. > > > > > > > > > > > Then V4L2VDA triggers ProvidePictureBuffers() on which textures are > > > created > > > > > and > > > > > > sent back in AssignPictureBuffers(). 
> > > > > > Now V4L2VDA creates EglImages from these textures and sends each > > EglImage > > > > > handle > > > > > > to library using the QUERYBUF (but can use a custom call too). The > > > tegrav4l2 > > > > > > library cannot create EglImages from DMABUFS like in Exynos since > there > > is > > > > no > > > > > > such extension. We create EglImage from this texture itself so there > is > > a > > > > > > binding between texture and eglImage. > > > > > > > > > > Sounds like the eglCreateImage extension taking offsets I described in > the > > > > > comment in tegra_v4l2_video_device.cc could work for this? > > > > Unfortunately there is no such extension today. > > > > > > > > > > > Now when this EglImage is sent to libtegrav4l2, it is mapped with the > > > > > > corresponding decoder buffer created in REQBUF() call. > > > > > > This way there is one map of EglImage, texture and decoder buffer. > > > > > > > > > > My understanding is you mean the buffer is bound to a texture? If so, > then > > > it > > > > > also seems like we could use the current bind texture to eglimage calls? > > > > The libtegrav4l2 talks to another internal library which actually creates > > the > > > > YUV buffer. This is what is given to the AVP and where the decoded output > is > > > > actually filled. > > > > There is a corresponding RGB buffer created when the EGLImage is called, > > this > > > is > > > > owned by the graphics library. While enqueuing buffers for CAPTURE PLANE, > > > there > > > > is a conversion performed to do YUV to RGB. > > > > > > So the YUV buffers are tied to the textures somehow? > > We send texture_id to eglCreateImageKHR and bind it there. And eglImage is > sent > > to the library which maps it to its YUV buffer. > > My subsequent patch will probably make this clearer. > > Wait, where do you send texture_id to eglCreateImageKHR? I don't see that in the > code above. > Do you have an extension for eglCreateImageKHR to also accept texture ids and > bind during creation? Why not do this in the standard way, i.e. by using > GL_OES_EGL_image_external? I would expect your EGL implementation already has it > for other things (and it's an extension created by NVIDIA too)... > Or did you mean some other function? The texture_id is sent in eglCreateImageKHR parameter. See TegraV4L2Device implementation of CreateEGLImage(). I will submit a bug with my findings and repro steps. > > > > > > > > > > > > > > > When any buffer is enqueued in QBUF, the library sends it down to the > > > > decoder. > > > > > > Once the decoder buffer is ready, the library uses graphics apis to > > > populate > > > > > the > > > > > > corresponding EglImage with the RGB data and then pushes into a queue > > > > thereby > > > > > > making it available for DQBUF after which this buffer can be used only > > > when > > > > it > > > > > > is back in QBUF call. > > > > > > This way the buffer ownership is managed. > > > > > > So in summary the library uses queues and does all the buffer > management > > > > > between > > > > > > decoder and the graphics stack for conversion. > > > > > > > > > > What happens when this class calls REQBUFS(0), but the corresponding > > > textures > > > > > are being rendered to the screen? > > > > > How will the buffers be freed if the GPU process crashes without calling > > > > > REQBUFS(0)? > > > > > What happens when the bound textures are deleted, but the HW codec is > > still > > > > > using them? > > > > > > > > I guess I am missing something here. 
I did not understand "REQBUFS(0) is > > > called > > > > but corresponding textures are being rendered ?". Doesn't > > > DestroyOutputBuffers() > > > > call guarantee that buffers on CAPTURE plane are no longer used. > > > > > > The underlying memory can still be used as textures in the client of VDA > > class. > > > It only guarantees that they are not used anymore by the codec class as > > > v4l2_buffers. > > > > > > > I will confirm about the buffer freeing in gpu process crash scenario. > > > > > > Thanks. > > If the EGLimage is destroyed I think the texture becomes unbound. I was > > debugging some scenario and I get errors as "texture not bound or texture id > 0" > > kind of errors from gles2_cmd_decoder.cc. These I guess represent this crash > > scenario. So it is taken care of already while validating the texture before > > rendering ? > > If you are getting those errors, then there is definitely something wrong going > on. > > > And I observe similar kind of logs on Exynos too. > > That is even more worrying. Could you please submit a bug for Exynos with repro > steps? > > > Do you have a test case or steps of validating this ? Will killing gpu process > > while video playback validate this path ? > > It should. > > > > > > > > The last scenario (bound texture are deleted but HW codec is still using > > them) > > > > is taken care by the conversion step performed using the library. > > > > The texture is > > > > bound to the EGlImage. So that binding will fail. Since the libtegrav4l2 > has > > > the > > > > EglImage backed by a RGB buffer the conversion can happen. How can I test > > this > > > > scenario ? > > > > > > This is just a case where there is a bug in the code, but my point is that > the > > > ownership should be shared with the kernel as well, so if the userspace > > (Chrome) > > > dies, the kernel will properly clean up. > > > > > > > > > > > > > > > > > > > > > > > > On 2014/02/10 06:36:17, Pawel Osciak wrote: > > > > > > > I would like to understand the big picture here please. > > > > > > > > > > > > > > We strive to stay as close as possible to using (and/or creating) > > > > > > > platform-independent standards where we can, like the sequence > above, > > > > > instead > > > > > > of > > > > > > > providing custom calls for each platform. Removing this from here > and > > > TVDA > > > > > is > > > > > > a > > > > > > > step into an opposite direction, and I would like to understand what > > > > > technical > > > > > > > difficulties force us to do this first. > > > > > > > > > > > > > > Binding textures to EGLImages also serves to keep track of > ownership. > > > > There > > > > > > are > > > > > > > multiple users of the shared buffer, the renderer, the GPU that > > renders > > > > the > > > > > > > textures, this class and the HW codec. How is ownership/destruction > > > > managed > > > > > > and > > > > > > > how is it ensured that the buffer is valid while any of the users > are > > > > still > > > > > > > referring to/using it (both in userspace and in kernel)? > > > > > > > > > > > > > > What happens if the renderer crashes and the codec is writing to the > > > > > textures? > > > > > > > What happens when this class is destroyed, but the texture is in the > > > > > renderer? > > > > > > > What happens when the whole Chrome crashes, but the HW codec is > using > > a > > > > > buffer > > > > > > > (i.e. kernel has ownership)? > > > > > > > > > > > > > > Could you please explain how is ownership managed for shared buffers > > on > > > > > Tegra? 
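For illustration of the point above about taking the plane count from G_FMT instead of hardcoding 2: a minimal sketch of that query, using only standard V4L2 multi-planar structures. The helper name and the use of a raw ioctl() are assumptions; real code would go through the device wrapper and handle the "format not yet known" case explicitly.

#include <linux/videodev2.h>
#include <string.h>
#include <sys/ioctl.h>

// Sketch only: ask the driver how many planes the negotiated CAPTURE format
// uses, rather than assuming 2 planes / a particular fourcc.
static bool GetCapturePlanesCount(int fd, unsigned int* num_planes) {
  struct v4l2_format format;
  memset(&format, 0, sizeof(format));
  format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  if (ioctl(fd, VIDIOC_G_FMT, &format) != 0)
    return false;  // Format not negotiated yet; caller should retry later.
  *num_planes = format.fmt.pix_mp.num_planes;
  return true;
}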
Incorporated review comments, please take a look.
https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) I think there is a bug here. Imagine a situation where AppendToInputFrame() fails because free_input_buffers_.empty() is true. This can very well happen if VIDIOC_G_FMT on the CAPTURE_PLANE has not succeeded before we end up draining the free input buffers. And once this condition is hit, we never try to dequeue nor check the format info again.
On 2014/02/19 21:23:58, shivdasp wrote: > Incorporated review comments, please take a look. Hi Shivdas, Could you please let us know about: 1. The status/possibility of implementing an eglCreateImageKHR extension that we discussed instead? 2. How could we go back to using texture binding in the EGLImage flow, which we would prefer keeping? Thanks.
https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) On 2014/02/20 08:51:02, shivdasp wrote: > I think there is a bug here. Imagine a situation where AppendToInputFrame() > fails because free_input_buffers_.empty() is true. > This could very well happen if the VIDIOC_G_FMT on CAPTURE_PLANE did not succeed > before we endup drying up the buffers. And once this condition is hit, we never > try to dequeue nor check formatinfo. If free_input_buffers_ is empty in ATIF(), then it will attempt to Dequeue() input buffers for reuse. If that fails, something is wrong with the driver. It should not keep input buffers if they are not needed for decode, and I don't see a situation when it would need to keep input buffers if we don't have format information yet. Could you provide a specific example and an example stream when this would happen? Does this reproduce on Exynos?
On 2014/02/21 05:32:43, Pawel Osciak wrote: > On 2014/02/19 21:23:58, shivdasp wrote: > > Incorporated review comments, please take a look. > > Hi Shivdas, > Could you please let us know about: > 1. The status/possibility of implementing an eglCreateImageKHR extension that we > discussed instead? > 2. How could we go back to using texture binding in the EGLImage flow, which we > would prefer keeping? > Thanks. Hi Pawel, 1. Creating EGLImage extension from the offset is very specific to libtegrav4l2 use-case hence it is not being supported & there is no plan as such to add such extension. 2. I can call glBindTexture(2D, texture_id) in the CreateEGLImage() but that seems unnecessary since the eglCreateImageKHR() already takes that arguement. Am I missing something ? Do you see anything functionally wrong in this way of creating EglImage() so that I can build up a strong case ? Thanks
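For reference, this is how a texture id is normally handed to eglCreateImageKHR as the EGLClientBuffer, which seems to be what "already takes that argument" refers to. A fragment of what such a CreateEGLImage() body could look like; the EGL_GL_TEXTURE_2D_KHR target and the attribute list are assumptions, since the thread does not name the exact extension used.

// Sketch only: create an EGLImage directly from a client texture. The texture
// object itself is passed as the EGLClientBuffer, so no separate
// glBindTexture() is needed at creation time.
const EGLint attrs[] = {
    EGL_GL_TEXTURE_LEVEL_KHR, 0,
    EGL_IMAGE_PRESERVED_KHR, EGL_FALSE,
    EGL_NONE,
};
EGLImageKHR egl_image = eglCreateImageKHR(
    egl_display,
    egl_context,
    EGL_GL_TEXTURE_2D_KHR,
    reinterpret_cast<EGLClientBuffer>(texture_id),
    attrs);
if (egl_image == EGL_NO_IMAGE_KHR)
  return EGL_NO_IMAGE_KHR;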
Ping on whether there is a need to change EglImage creation part. On 2014/02/21 07:06:53, shivdasp wrote: > On 2014/02/21 05:32:43, Pawel Osciak wrote: > > On 2014/02/19 21:23:58, shivdasp wrote: > > > Incorporated review comments, please take a look. > > > > Hi Shivdas, > > Could you please let us know about: > > 1. The status/possibility of implementing an eglCreateImageKHR extension that > we > > discussed instead? > > 2. How could we go back to using texture binding in the EGLImage flow, which > we > > would prefer keeping? > > Thanks. > Hi Pawel, > > 1. Creating EGLImage extension from the offset is very specific to libtegrav4l2 > use-case hence it is not being supported & there is no plan as such to add such > extension. > 2. I can call glBindTexture(2D, texture_id) in the CreateEGLImage() but that > seems unnecessary since the eglCreateImageKHR() already takes that arguement. Am > I missing something ? > > Do you see anything functionally wrong in this way of creating EglImage() so > that I can build up a strong case ? > > Thanks
https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) The issue is that we try to Dequeue() only once in ATIF() and if it fails we stall this thread. Dequeue() cannot be guaranteed to succeed and provide back a buffer immediately when the client expects. I can see this issue on Tegra in 1 out of 5 times, depending upon how the threads are scheduled. I think until we move out of the kInitialized into KDecoding, state we should try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of whether Dequeue() fails. My current fix is to go to GetFormatInfo() below which keeps the thread going. Have not tested on Exynos so can't say. On 2014/02/21 05:36:42, Pawel Osciak wrote: > On 2014/02/20 08:51:02, shivdasp wrote: > > I think there is a bug here. Imagine a situation where AppendToInputFrame() > > fails because free_input_buffers_.empty() is true. > > This could very well happen if the VIDIOC_G_FMT on CAPTURE_PLANE did not > succeed > > before we endup drying up the buffers. And once this condition is hit, we > never > > try to dequeue nor check formatinfo. > > > If free_input_buffers_ is empty in ATIF(), then it will attempt to Dequeue() > input buffers for reuse. If that fails, something is wrong with the driver. It > should not keep input buffers if they are not needed for decode, and I don't see > a situation when it would need to keep input buffers if we don't have format > information yet. > > Could you provide a specific example and an example stream when this would > happen? > Does this reproduce on Exynos?
On 2014/02/14 09:18:58, shivdasp wrote: > > Yes, although it's not "synchronized", but the ownership is managed and > > refcounted in the driver and the memory is freed when there is no more users. > > How does Tegra manage this? > I checked with the graphics team here. This is handled. Since the memory for > EglImage is refcounted and is backed-up by the graphics library itself, the > texture bound to a destroyed eglImage can still be rendered. The V4L2 REQBUFS(0) > shall de-allocate only the buffer allocated by the "actual decoder" which are > the YUV buffers and since there is no more conversion happening after REQBUFS(0) > this is handled too. They still have to be tied to each other and refcounted together. At which point will the conversion from the yuv decoder buffer be done into the texture? At render time or at decode time? If at render time, if REQBUFS(0) is called before the texture is rendered, then we have to retain the decoder buffer until it's converted and put into the texture before deleting it. If at decode time, what if renderer frees the texture and we are still decoding and try to convert into that deleted texture afterwards?
On 2014/02/21 07:06:53, shivdasp wrote: > On 2014/02/21 05:32:43, Pawel Osciak wrote: > > On 2014/02/19 21:23:58, shivdasp wrote: > > > Incorporated review comments, please take a look. > > > > Hi Shivdas, > > Could you please let us know about: > > 1. The status/possibility of implementing an eglCreateImageKHR extension that > we > > discussed instead? > > 2. How could we go back to using texture binding in the EGLImage flow, which > we > > would prefer keeping? > > Thanks. > Hi Pawel, > > 1. Creating EGLImage extension from the offset is very specific to libtegrav4l2 > use-case hence it is not being supported & there is no plan as such to add such > extension. > > 2. I can call glBindTexture(2D, texture_id) in the CreateEGLImage() but that > seems unnecessary since the eglCreateImageKHR() already takes that arguement. Am > I missing something ? > > Do you see anything functionally wrong in this way of creating EglImage() so > that I can build up a strong case ? You still need that TegraV4L2_UseEglImage instead of the extension. But if they are separate buffers, then we don't want to create an illusion that they are the same thing. So binding is not a good idea probably, you are right. The problem I'm seeing is in my other response that I just sent in the previous message, to the deallocation problem. Please respond to that so we could think more what to do. Thanks!
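A sketch of the TegraV4L2_UseEglImage handoff being discussed, continuing the fragment above. The signature follows the UseEglImage(buffer_index, egl_image) proposal earlier in the thread; whether the call also takes the device fd, and its return convention, are assumptions.

// Sketch only: after the EGLImage has been created for the texture, hand it
// to libtegrav4l2 so the library can associate it with the CAPTURE buffer at
// |buffer_index|.
if (TegraV4L2_UseEglImage(device_fd_, buffer_index, egl_image) != 0) {
  eglDestroyImageKHR(egl_display, egl_image);
  return EGL_NO_IMAGE_KHR;
}
return egl_image;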
On 2014/02/25 09:38:14, shivdasp wrote: > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > (!AppendToInputFrame(data, size)) > The issue is that we try to Dequeue() only once in ATIF() and if it fails we > stall this thread. Dequeue() cannot be guaranteed to succeed and provide back a > buffer immediately when the client expects. I can see this issue on Tegra in 1 > out of 5 times, depending upon how the threads are scheduled. > I think until we move out of the kInitialized into KDecoding, state we should > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > whether Dequeue() fails. Right, I see now. The problem is with Dequeue(). But I need to think a little bit longer how to solve this. We cannot move out to kDecoding, because we don't have buffers allocated yet and if we do, then we will break allocation, resolution change, resets and other things.
On 2014/02/26 10:57:59, Pawel Osciak wrote: > On 2014/02/14 09:18:58, shivdasp wrote: > > > > Yes, although it's not "synchronized", but the ownership is managed and > > > refcounted in the driver and the memory is freed when there is no more > users. > > > How does Tegra manage this? > > I checked with the graphics team here. This is handled. Since the memory for > > EglImage is refcounted and is backed-up by the graphics library itself, the > > texture bound to a destroyed eglImage can still be rendered. The V4L2 > REQBUFS(0) > > shall de-allocate only the buffer allocated by the "actual decoder" which are > > the YUV buffers and since there is no more conversion happening after > REQBUFS(0) > > this is handled too. > > They still have to be tied to each other and refcounted together. At which point > will the conversion from the yuv decoder buffer be done into the texture? At The YUV conversion happens at decode time into the EGLImage buffer when DQBUF is called. > render time or at decode time? If at render time, if REQBUFS(0) is called before > the texture is rendered, then we have to retain the decoder buffer until it's > converted and put into the texture before deleting it. If at decode time, what > if renderer frees the texture and we are still decoding and try to convert into > that deleted texture afterwards? If REQBUFS(0) is called, the eglImage is destroyed but since the texture is still valid & refcounts the eglimage memory it can still be rendered. If the renderer frees the texture, the eglImage is still valid (since it is refcounted) we can still decode and convert it into the eglimage. With this TVDA patch I have tested dash player and resolution change happens smoothly. I could also test resolution change through youtube player. Killing gpu-process while playback also works as in Exynos (atleast from the logs the behavior seems same). Is there any specific test you want me to run and verify this case ?
On 2014/02/26 11:06:14, Pawel Osciak wrote: > On 2014/02/25 09:38:14, shivdasp wrote: > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > (!AppendToInputFrame(data, size)) > > The issue is that we try to Dequeue() only once in ATIF() and if it fails we > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide back > a > > buffer immediately when the client expects. I can see this issue on Tegra in 1 > > out of 5 times, depending upon how the threads are scheduled. > > I think until we move out of the kInitialized into KDecoding, state we should > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > > whether Dequeue() fails. > > Right, I see now. The problem is with Dequeue(). But I need to think a little > bit > longer how to solve this. We cannot move out to kDecoding, because we don't have > buffers allocated yet and if we do, then we will break allocation, resolution > change, > resets and other things. I think we should restructure the DecodeBufferInitial() to atleast do ATIF() and GetFormatInfo(). Checking for GetFormatInfo() will put us into the kDecoding state which should have been generated in some later time if not immediately. My current fix (not so elegant) which is working for all cases: DecodeBufferInitial() { if (!ATIF()) { if (free_input_buffers_.empty()) goto chk_format_info; } Dequeue() .. .. .. chk_format_info: GetFormatInfo() .. .. }
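A goto-free sketch of the same proposed restructuring, as a body fragment of DecodeBufferInitial(). Control flow mirrors the sketch above; the GetFormatInfo() signature and the surrounding state handling are simplifications, not the actual fix that was landed.

// Sketch only: skip Dequeue() when the append failed because no free input
// buffer was available, but always fall through to the format query so the
// decoder can still move out of kInitialized.
const bool appended = AppendToInputFrame(data, size);
if (appended || !free_input_buffers_.empty())
  Dequeue();  // Recycle any completed input buffers.

struct v4l2_format format;
bool again = false;
if (!GetFormatInfo(&format, &again))
  return false;
// ... existing handling continues: retry later if |again| is set, otherwise
// allocate output buffers and switch to kDecoding ...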
On 2014/02/26 14:39:52, shivdasp wrote: > On 2014/02/26 10:57:59, Pawel Osciak wrote: > > On 2014/02/14 09:18:58, shivdasp wrote: > > > > > > Yes, although it's not "synchronized", but the ownership is managed and > > > > refcounted in the driver and the memory is freed when there is no more > > users. > > > > How does Tegra manage this? > > > I checked with the graphics team here. This is handled. Since the memory for > > > EglImage is refcounted and is backed-up by the graphics library itself, the > > > texture bound to a destroyed eglImage can still be rendered. The V4L2 > > REQBUFS(0) > > > shall de-allocate only the buffer allocated by the "actual decoder" which > are > > > the YUV buffers and since there is no more conversion happening after > > REQBUFS(0) > > > this is handled too. > > > > They still have to be tied to each other and refcounted together. At which > point > > will the conversion from the yuv decoder buffer be done into the texture? At > The YUV conversion happens at decode time into the EGLImage buffer when DQBUF is > called. > > render time or at decode time? If at render time, if REQBUFS(0) is called > before > > the texture is rendered, then we have to retain the decoder buffer until it's > > converted and put into the texture before deleting it. If at decode time, what > > if renderer frees the texture and we are still decoding and try to convert > into > > that deleted texture afterwards? > If REQBUFS(0) is called, the eglImage is destroyed but since the texture is > still valid > & refcounts the eglimage memory it can still be rendered. > If the renderer frees the texture, the eglImage is still valid (since it is > refcounted) we can still decode > and convert it into the eglimage. > With this TVDA patch I have tested dash player and resolution change happens > smoothly. > I could also test resolution change through youtube player. Killing gpu-process > while playback also works as in Exynos (atleast from the logs the behavior seems > same). > Is there any specific test you want me to run and verify this case ? Please also test for memory leaks, if you haven't done so, in these scenarios.
On 2014/02/26 14:45:26, shivdasp wrote: > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > (!AppendToInputFrame(data, size)) > > > The issue is that we try to Dequeue() only once in ATIF() and if it fails we > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > back > > a > > > buffer immediately when the client expects. I can see this issue on Tegra in > 1 > > > out of 5 times, depending upon how the threads are scheduled. > > > I think until we move out of the kInitialized into KDecoding, state we > should > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > > > whether Dequeue() fails. > > > > Right, I see now. The problem is with Dequeue(). But I need to think a little > > bit > > longer how to solve this. We cannot move out to kDecoding, because we don't > have > > buffers allocated yet and if we do, then we will break allocation, resolution > > change, > > resets and other things. > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() and > GetFormatInfo(). > Checking for GetFormatInfo() will put us into the kDecoding state which should > have been generated in some later time if not immediately. > > My current fix (not so elegant) which is working for all cases: > DecodeBufferInitial() { > if (!ATIF()) { > if (free_input_buffers_.empty()) > goto chk_format_info; > } > Dequeue() > .. > .. > .. > chk_format_info: > GetFormatInfo() > .. > .. > } Are you testing with vdatest after r249963? It should fail on this. The correct solution is to wait for the driver to return more input buffers. This should be fixed in V4L2VDA.
On 2014/03/03 05:10:48, Pawel Osciak wrote: > On 2014/02/26 14:39:52, shivdasp wrote: > > On 2014/02/26 10:57:59, Pawel Osciak wrote: > > > On 2014/02/14 09:18:58, shivdasp wrote: > > > > > > > > Yes, although it's not "synchronized", but the ownership is managed and > > > > > refcounted in the driver and the memory is freed when there is no more > > > users. > > > > > How does Tegra manage this? > > > > I checked with the graphics team here. This is handled. Since the memory > for > > > > EglImage is refcounted and is backed-up by the graphics library itself, > the > > > > texture bound to a destroyed eglImage can still be rendered. The V4L2 > > > REQBUFS(0) > > > > shall de-allocate only the buffer allocated by the "actual decoder" which > > are > > > > the YUV buffers and since there is no more conversion happening after > > > REQBUFS(0) > > > > this is handled too. > > > > > > They still have to be tied to each other and refcounted together. At which > > point > > > will the conversion from the yuv decoder buffer be done into the texture? At > > The YUV conversion happens at decode time into the EGLImage buffer when DQBUF > is > > called. > > > render time or at decode time? If at render time, if REQBUFS(0) is called > > before > > > the texture is rendered, then we have to retain the decoder buffer until > it's > > > converted and put into the texture before deleting it. If at decode time, > what > > > if renderer frees the texture and we are still decoding and try to convert > > into > > > that deleted texture afterwards? > > If REQBUFS(0) is called, the eglImage is destroyed but since the texture is > > still valid > > & refcounts the eglimage memory it can still be rendered. > > If the renderer frees the texture, the eglImage is still valid (since it is > > refcounted) we can still decode > > and convert it into the eglimage. > > With this TVDA patch I have tested dash player and resolution change happens > > smoothly. > > I could also test resolution change through youtube player. Killing > gpu-process > > while playback also works as in Exynos (atleast from the logs the behavior > seems > > same). > > Is there any specific test you want me to run and verify this case ? > > Please also test for memory leaks, if you haven't done so, in these scenarios. Yes I have been testing them. If there's anything major comment that you would like me to address here please let me know.
On 2014/03/03 05:16:34, Pawel Osciak wrote: > On 2014/02/26 14:45:26, shivdasp wrote: > > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > > (!AppendToInputFrame(data, size)) > > > > The issue is that we try to Dequeue() only once in ATIF() and if it fails > we > > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > > back > > > a > > > > buffer immediately when the client expects. I can see this issue on Tegra > in > > 1 > > > > out of 5 times, depending upon how the threads are scheduled. > > > > I think until we move out of the kInitialized into KDecoding, state we > > should > > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective of > > > > whether Dequeue() fails. > > > > > > Right, I see now. The problem is with Dequeue(). But I need to think a > little > > > bit > > > longer how to solve this. We cannot move out to kDecoding, because we don't > > have > > > buffers allocated yet and if we do, then we will break allocation, > resolution > > > change, > > > resets and other things. > > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() > and > > GetFormatInfo(). > > Checking for GetFormatInfo() will put us into the kDecoding state which should > > have been generated in some later time if not immediately. > > > > My current fix (not so elegant) which is working for all cases: > > DecodeBufferInitial() { > > if (!ATIF()) { > > if (free_input_buffers_.empty()) > > goto chk_format_info; > > } > > Dequeue() > > .. > > .. > > .. > > chk_format_info: > > GetFormatInfo() > > .. > > .. > > } > > Are you testing with vdatest after r249963? It should fail on this. > The correct solution is to wait for the driver to return more input buffers. > This should be fixed in V4L2VDA. I was on older code and yes I do sometimes see some issue with my fix. How about I file a partner bug for this issue since I imagine the fix may not be trivial and will be better to de-couple from this change of adding TegraVDA ?
On 2014/03/03 16:42:27, shivdasp wrote: > On 2014/03/03 05:16:34, Pawel Osciak wrote: > > On 2014/02/26 14:45:26, shivdasp wrote: > > > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > > > (!AppendToInputFrame(data, size)) > > > > > The issue is that we try to Dequeue() only once in ATIF() and if it > fails > > we > > > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > > > back > > > > a > > > > > buffer immediately when the client expects. I can see this issue on > Tegra > > in > > > 1 > > > > > out of 5 times, depending upon how the threads are scheduled. > > > > > I think until we move out of the kInitialized into KDecoding, state we > > > should > > > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective > of > > > > > whether Dequeue() fails. > > > > > > > > Right, I see now. The problem is with Dequeue(). But I need to think a > > little > > > > bit > > > > longer how to solve this. We cannot move out to kDecoding, because we > don't > > > have > > > > buffers allocated yet and if we do, then we will break allocation, > > resolution > > > > change, > > > > resets and other things. > > > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() > > and > > > GetFormatInfo(). > > > Checking for GetFormatInfo() will put us into the kDecoding state which > should > > > have been generated in some later time if not immediately. > > > > > > My current fix (not so elegant) which is working for all cases: > > > DecodeBufferInitial() { > > > if (!ATIF()) { > > > if (free_input_buffers_.empty()) > > > goto chk_format_info; > > > } > > > Dequeue() > > > .. > > > .. > > > .. > > > chk_format_info: > > > GetFormatInfo() > > > .. > > > .. > > > } > > > > Are you testing with vdatest after r249963? It should fail on this. > > The correct solution is to wait for the driver to return more input buffers. > > This should be fixed in V4L2VDA. > I was on older code and yes I do sometimes see some issue with my fix. > How about I file a partner bug for this issue since I imagine the fix may not be > trivial and will be better to de-couple from this change of adding TegraVDA ? By issue you mean the newest test doesn't pass?
On 2014/03/03 16:42:27, shivdasp wrote: > On 2014/03/03 05:16:34, Pawel Osciak wrote: > > On 2014/02/26 14:45:26, shivdasp wrote: > > > On 2014/02/26 11:06:14, Pawel Osciak wrote: > > > > On 2014/02/25 09:38:14, shivdasp wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if > > > > > (!AppendToInputFrame(data, size)) > > > > > The issue is that we try to Dequeue() only once in ATIF() and if it > fails > > we > > > > > stall this thread. Dequeue() cannot be guaranteed to succeed and provide > > > back > > > > a > > > > > buffer immediately when the client expects. I can see this issue on > Tegra > > in > > > 1 > > > > > out of 5 times, depending upon how the threads are scheduled. > > > > > I think until we move out of the kInitialized into KDecoding, state we > > > should > > > > > try and scheduleDecodeBufferTask and call GetFormatInfo() irrespective > of > > > > > whether Dequeue() fails. > > > > > > > > Right, I see now. The problem is with Dequeue(). But I need to think a > > little > > > > bit > > > > longer how to solve this. We cannot move out to kDecoding, because we > don't > > > have > > > > buffers allocated yet and if we do, then we will break allocation, > > resolution > > > > change, > > > > resets and other things. > > > I think we should restructure the DecodeBufferInitial() to atleast do ATIF() > > and > > > GetFormatInfo(). > > > Checking for GetFormatInfo() will put us into the kDecoding state which > should > > > have been generated in some later time if not immediately. > > > > > > My current fix (not so elegant) which is working for all cases: > > > DecodeBufferInitial() { > > > if (!ATIF()) { > > > if (free_input_buffers_.empty()) > > > goto chk_format_info; > > > } > > > Dequeue() > > > .. > > > .. > > > .. > > > chk_format_info: > > > GetFormatInfo() > > > .. > > > .. > > > } > > > > Are you testing with vdatest after r249963? It should fail on this. > > The correct solution is to wait for the driver to return more input buffers. > > This should be fixed in V4L2VDA. > I was on older code and yes I do sometimes see some issue with my fix. > How about I file a partner bug for this issue since I imagine the fix may not be > trivial and will be better to de-couple from this change of adding TegraVDA ? I'm not against this, but if TVDA fails more often than not on this, then the fix should probably be submitted before TVDA.
https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.h:36: unsigned int GetTextureTarget() OVERRIDE; While we're at this -- shouldn't these all be explicitly declared virtual? It's not incorrect syntax as it is, just no quite up to style guidelines. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // Extra comment line here. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:62: return TegraV4L2_Ioctl(device_fd_, flags, arg); No HANDLE_EINTR wrapper needed for this? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:66: if (TegraV4L2_Poll(device_fd_, poll_device, event_pending) == -1) { No HANDLE_EINTR wrapper needed for this? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:120: TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(UseEglImage); Can this sort of thing be done once at startup time? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:140: (EGLClientBuffer)(texture_id), static_cast<>; we don't use C-style casts. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:149: return egl_image; So as I understand it: In the Exynos case, we export buffers from the video stack and import them to the graphics stack. In the Tegra case, we export buffers from the graphics stack and import them to the tegrav4l2 lib, which does with them what it wants. Responsibility for tracking ownership and making sure things don't leak then rests with the tegrav4l2 lib. Sound about right? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.h:39: unsigned int GetTextureTarget() OVERRIDE; Declare these explicitly "virtual" again. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) I think there's an actual bug here. The reason why DecodeBufferInitial() returns false in this case is to avoid immediately scheduling another decode task, since AppendToInputFrame() failed because there's no input buffer available, and trying again would just be a tight loop. The idea is that we should wait until another input buffer frees up. Unfortunately, device_poll_thread_ is not running and so we're not polling and we'll never inform the decoder_thread_ that a buffer frees up when it does. I think the solution is to call StartDevicePoll() earlier, perhaps at the end of Initialize(). 
All the DCHECKS on decoder_state_ and device_poll_thread_.IsRunning() would naturally have to be re-audited to make sure they fall in line with our new assumptions. https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_device.h:64: EGLint attrib[], Can we make this a "const EGLint*" argument instead? https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_device.h:65: unsigned int texture_id, texture_id should be GLuint https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/v4l2_video_device.h:69: virtual unsigned int GetTextureTarget() = 0; GL texture targets are GLenum types. Also: would be nice to declare this as const.
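To make the HANDLE_EINTR and GL-type suggestions above concrete, a minimal sketch (the wrapper signatures approximate the device interface under review and may differ; HANDLE_EINTR comes from base's eintr_wrapper header).

// Sketch only: wrap the whole library call in HANDLE_EINTR, not the == -1
// comparison, and use GL types for texture-related values.
int TegraV4L2Device::Ioctl(int flags, void* arg) {
  return HANDLE_EINTR(TegraV4L2_Ioctl(device_fd_, flags, arg));
}

bool TegraV4L2Device::Poll(bool poll_device, bool* event_pending) {
  if (HANDLE_EINTR(TegraV4L2_Poll(device_fd_, poll_device, event_pending)) ==
      -1) {
    DLOG(ERROR) << "TegraV4L2_Poll() failed";
    return false;
  }
  return true;
}

// And in the header, explicitly virtual overrides with GL types:
//   virtual GLenum GetTextureTarget() OVERRIDE;
//   virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display,
//                                      EGLContext egl_context,
//                                      GLuint texture_id,
//                                      const EGLint* attrib,
//                                      ...) OVERRIDE;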
I did not notice sheu's comments while making patchset#7. Please have a look regarding the DecodeBufferInitial() bug that I have fixed here. Will address sheu's comment in subsequent patchset. Have tested VDAtest, all tests pass except the SlowRenderingTest which expects the decoded buffers as 250 but on TegraVDA the decoded frames are 240. This I think is because of how EOS is handled in VDA. In my understanding we send a 0 sized buffer in Flush() and then enqueue no more on OUTPUT PLANE. And when all the buffers are dequeued from OUTPUTPLANE we start EOS processing in VDA. This does not guarantee that all the buffers that were decoded and were ready to be dequeued from CAPTURE PLANE were indeed dequeued. This is what happens when SlowRenderingTest fails on Tegra. There are 10 frames ready to be dequeued and before that CAPTURE_STREAM_OFF happens. Thanks https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.h:36: unsigned int GetTextureTarget() OVERRIDE; On 2014/03/06 08:51:34, sheu wrote: > While we're at this -- shouldn't these all be explicitly declared virtual? > > It's not incorrect syntax as it is, just no quite up to style guidelines. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // On 2014/03/06 08:51:34, sheu wrote: > Extra comment line here. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:62: return TegraV4L2_Ioctl(device_fd_, flags, arg); On 2014/03/06 08:51:34, sheu wrote: > No HANDLE_EINTR wrapper needed for this? Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:66: if (TegraV4L2_Poll(device_fd_, poll_device, event_pending) == -1) { On 2014/03/06 08:51:34, sheu wrote: > No HANDLE_EINTR wrapper needed for this? Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:120: TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(UseEglImage); Initialize() is called when the V4L2Device is created which is the earliest. Else will have to do it in a static method and have it called from somewhere. On 2014/03/06 08:51:34, sheu wrote: > Can this sort of thing be done once at startup time? https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:140: (EGLClientBuffer)(texture_id), On 2014/03/06 08:51:34, sheu wrote: > static_cast<>; we don't use C-style casts. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:149: return egl_image; Yes that's right. tegrav4l2lib takes care of destroying them internally. On 2014/03/06 08:51:34, sheu wrote: > So as I understand it: > > In the Exynos case, we export buffers from the video stack and import them to > the graphics stack. > In the Tegra case, we export buffers from the graphics stack and import them to > the tegrav4l2 lib, which does with them what it wants. 
Responsibility for > tracking ownership and making sure things don't leak then rests with the > tegrav4l2 lib. > > Sound about right? https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.h:39: unsigned int GetTextureTarget() OVERRIDE; On 2014/03/06 08:51:34, sheu wrote: > Declare these explicitly "virtual" again. Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:713: if (!AppendToInputFrame(data, size)) I have attempted a fix for this in patchset#7. Could you take a look at that ? I agree there's a tight loop but the message is posted back to the decoder_thread and with what I have tested, the thread scheduling allows the underlying decoder thread to "report" stream format and get into the kDecoding state. Please suggest how to go about this. On Exynos I believe this condition seldom hits. On 2014/03/06 08:51:34, sheu wrote: > I think there's an actual bug here. The reason why DecodeBufferInitial() > returns false in this case is to avoid immediately scheduling another decode > task, since AppendToInputFrame() failed because there's no input buffer > available, and trying again would just be a tight loop. The idea is that we > should wait until another input buffer frees up. > > Unfortunately, device_poll_thread_ is not running and so we're not polling and > we'll never inform the decoder_thread_ that a buffer frees up when it does. > > I think the solution is to call StartDevicePoll() earlier, perhaps at the end of > Initialize(). All the DCHECKS on decoder_state_ and > device_poll_thread_.IsRunning() would naturally have to be re-audited to make > sure they fall in line with our new assumptions. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:64: EGLint attrib[], On 2014/03/06 08:51:34, sheu wrote: > Can we make this a "const EGLint*" argument instead? Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:65: unsigned int texture_id, On 2014/03/06 08:51:34, sheu wrote: > texture_id should be GLuint Done. https://codereview.chromium.org/137023008/diff/510001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:69: virtual unsigned int GetTextureTarget() = 0; On 2014/03/06 08:51:34, sheu wrote: > GL texture targets are GLenum types. Also: would be nice to declare this as > const. Done.
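To make the ownership model above concrete, here is a minimal sketch of the Tegra-side image creation being discussed, with the GL texture itself exported as the EGLClientBuffer and the resulting image handed to the Tegra library; the variable names are illustrative and the error handling is abbreviated:

  // On Tegra the texture is the client buffer (no dmabuf export step).
  EGLImageKHR egl_image = eglCreateImageKHR(
      egl_display,
      egl_context,
      EGL_GL_TEXTURE_2D_KHR,
      reinterpret_cast<EGLClientBuffer>(texture_id),
      NULL);
  if (egl_image == EGL_NO_IMAGE_KHR)
    return EGL_NO_IMAGE_KHR;
  // The image is then queued to libtegrav4l2, which keeps ownership and
  // destroys it internally, so no eglDestroyImageKHR on this path.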
https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/510001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:120: TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(UseEglImage); On 2014/03/06 11:10:09, shivdasp wrote: > Initialize() is called when the V4L2Device is created which is the earliest. > Else will have to do it in a static method and have it called from somewhere. > On 2014/03/06 08:51:34, sheu wrote: > > Can this sort of thing be done once at startup time? > We could do this once statically similar to how VaapiWrapper does it. https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; We seem to be allergic to "goto" statements in Chrome, for not entirely bad reasons. But also I don't see what jumping to GetFormatInfo would buy us. It seems to me that if we're out of buffers, whether we have format info or not does not correlate with whether we'll soon be getting buffers. https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/video_decode_accelerator_unittest.cc:1563: #endif VaapiWrapper does the dlopen once statically in Initialize(), which has the nice effect of making this not necessary. https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/s... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/s... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:226: // resetting errno since it is expected to fail on non-Tegra platforms. Better: "Resetting errno since platform-specific libraries will fail on other platforms."
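A hedged sketch of the "resolve the entry points once" idea mentioned above, along the lines of what VaapiWrapper does; the library path, symbol name, and function signature here are assumptions, not the actual Tegra API:

  #include <dlfcn.h>

  typedef int (*TegraV4L2OpenFunc)(const char* name, int flags);
  static TegraV4L2OpenFunc TegraV4L2_Open = NULL;

  // Called once, e.g. from a pre-sandbox or static initialization hook,
  // instead of re-resolving symbols every time a device is created.
  static bool ResolveTegraV4L2Symbols() {
    static void* handle = dlopen("libtegrav4l2.so", RTLD_NOW);
    if (!handle)
      return false;
    TegraV4L2_Open = reinterpret_cast<TegraV4L2OpenFunc>(
        dlsym(handle, "TegraV4L2_Open"));
    return TegraV4L2_Open != NULL;
  }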
I'm considering a fix for the stuck DecodeBufferInitial() in: http://chromiumcodereview.appspot.com/189993002/
https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; Jumping to GetFormatInfo() will move us out of the kInitialized state, which is what the bug really was: we did not try to check for format info if we were out of buffers while in the kInitialized state. I am not sure how starting the device poll earlier will solve this stuck state; I will test it on Monday and update. The problem I see is that if we do not allocate buffers on the CAPTURE plane (which only happens after the format info is known), the buffers on the OUTPUT plane will not get consumed and hence will not be dequeued, so even with device polling we cannot send more bitstream. We should have gotten the format info after processing the first buffer itself, but if detection is delayed we should keep checking for it while sending more bitstream buffers. On 2014/03/07 00:18:08, sheu wrote: > We seem to be allergic to "goto" statements in Chrome, for not entirely bad > reasons. > > But also I don't see what jumping to GetFormatInfo would buy us. It seems to me > that if we're out of buffers, whether we have format info or not does not > correlate with whether we'll soon be getting buffers.
https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; On 2014/03/07 17:31:40, shivdasp wrote: > Jumping to GetFormatInfo() will move us out of kInitialized state which is what > the bug really was. We did not try to check for format info if we were out of > buffers while in kInitialized state. > I am not sure how does starting of device poll earlier will solve this stuck up > problem. I will test it on my monday and update. > The problem I see is that if we do not allocate buffers on CAPTURE plane (which > will happen after formatinfo is set) the buffers on OUTPUT plane will not get > consumed and hence will not be dequeued. So even with device polling we cannot > send more bitstream. We should have got the FormatInfo() after processing the > first buffer itself but if the detection is getting delayed we should keep > checking for it while sending more bitstream buffers. > > > On 2014/03/07 00:18:08, sheu wrote: > > We seem to be allergic to "goto" statements in Chrome, for not entirely bad > > reasons. > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems to > me > > that if we're out of buffers, whether we have format info or not does not > > correlate with whether we'll soon be getting buffers. > Ah, i see how it is. So if I have this right: the problem is that we queue data to the OUTPUT queue for the decoder to initialize, but the decoder initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue may not succeed immediately; instead at some undefined point in the future the decoder initialization will finish and VIDIOC_G_FMT will start returning the proper format. This is a problem in TegraVDA since when we fail to get the format after each buffer we queue, we continue trying to queue buffers; the race is between the decoder initialization finishing and the VDA running out of buffers to continue to enqueue with. For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue is synchronous w.r.t. the decoder initialization; i.e. it blocks until we know for sure whether the decoder has initialized with the given input, or not. That brings to mind two possible solutions in the TegraVDA case: 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the Exynos case 2. Add a notification system for decoder initialization. I'd go with the V4L2_EVENT system, by adding a private event for the Tegra driver (see: V4L2_EVENT_PRIVATE_START). (1) would be a faster fix than (2). Tight-loop polling (by constant reposting of the task) is not the way to go.
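For option (2), a minimal sketch of the subscription side, assuming a private event along the lines suggested above and the V4L2Device::Ioctl() wrapper from this change; the event code is a placeholder, not a real Tegra definition:

  #include <linux/videodev2.h>
  #include <string.h>

  // Hypothetical private event signalling "decoder initialized".
  static const __u32 kDecoderInitializedEvent = V4L2_EVENT_PRIVATE_START + 1;

  bool SubscribeToInitEvent(V4L2Device* device) {
    struct v4l2_event_subscription sub;
    memset(&sub, 0, sizeof(sub));
    sub.type = kDecoderInitializedEvent;
    return device->Ioctl(VIDIOC_SUBSCRIBE_EVENT, &sub) == 0;
  }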
https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; On 2014/03/07 20:31:50, sheu wrote: > On 2014/03/07 17:31:40, shivdasp wrote: > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > what > > the bug really was. We did not try to check for format info if we were out of > > buffers while in kInitialized state. > > I am not sure how does starting of device poll earlier will solve this stuck > up > > problem. I will test it on my monday and update. > > The problem I see is that if we do not allocate buffers on CAPTURE plane > (which > > will happen after formatinfo is set) the buffers on OUTPUT plane will not get > > consumed and hence will not be dequeued. So even with device polling we cannot > > send more bitstream. We should have got the FormatInfo() after processing the > > first buffer itself but if the detection is getting delayed we should keep > > checking for it while sending more bitstream buffers. > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > We seem to be allergic to "goto" statements in Chrome, for not entirely bad > > > reasons. > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems > to > > me > > > that if we're out of buffers, whether we have format info or not does not > > > correlate with whether we'll soon be getting buffers. > > > > Ah, i see how it is. So if I have this right: the problem is that we queue data > to the OUTPUT queue for the decoder to initialize, but the decoder > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue may > not succeed immediately; instead at some undefined point in the future the > decoder initialization will finish and VIDIOC_G_FMT will start returning the > proper format. This is a problem in TegraVDA since when we fail to get the > format after each buffer we queue, we continue trying to queue buffers; the race > is between the decoder initialization finishing and the VDA running out of > buffers to continue to enqueue with. > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue is > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know for > sure whether the decoder has initialized with the given input, or not. That > brings to mind two possible solutions in the TegraVDA case: > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the Exynos > case > 2. Add a notification system for decoder initialization. I'd go with the > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > V4L2_EVENT_PRIVATE_START). > > (1) would be a faster fix than (2). Tight-loop polling (by constant reposting > of the task) is not the way to go. Random thought for posciak@: if we make the decoder init notify through the event system, we might even be able to unify this with the resolution change event handling. WDYT?
https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto chk_format_info; Ohh that's why this never happens on Exynos. Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous looks the quicker solution and we wouldn't have to change a thing in VDA. I will make that change. Will re-post a patch removing this fix in VDA and addressing your comments too. Thanks. On 2014/03/07 21:36:29, sheu wrote: > On 2014/03/07 20:31:50, sheu wrote: > > On 2014/03/07 17:31:40, shivdasp wrote: > > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > > what > > > the bug really was. We did not try to check for format info if we were out > of > > > buffers while in kInitialized state. > > > I am not sure how does starting of device poll earlier will solve this stuck > > up > > > problem. I will test it on my monday and update. > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > (which > > > will happen after formatinfo is set) the buffers on OUTPUT plane will not > get > > > consumed and hence will not be dequeued. So even with device polling we > cannot > > > send more bitstream. We should have got the FormatInfo() after processing > the > > > first buffer itself but if the detection is getting delayed we should keep > > > checking for it while sending more bitstream buffers. > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > We seem to be allergic to "goto" statements in Chrome, for not entirely > bad > > > > reasons. > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems > > to > > > me > > > > that if we're out of buffers, whether we have format info or not does not > > > > correlate with whether we'll soon be getting buffers. > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we queue > data > > to the OUTPUT queue for the decoder to initialize, but the decoder > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue > may > > not succeed immediately; instead at some undefined point in the future the > > decoder initialization will finish and VIDIOC_G_FMT will start returning the > > proper format. This is a problem in TegraVDA since when we fail to get the > > format after each buffer we queue, we continue trying to queue buffers; the > race > > is between the decoder initialization finishing and the VDA running out of > > buffers to continue to enqueue with. > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue > is > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know > for > > sure whether the decoder has initialized with the given input, or not. That > > brings to mind two possible solutions in the TegraVDA case: > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > Exynos > > case > > 2. Add a notification system for decoder initialization. I'd go with the > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > V4L2_EVENT_PRIVATE_START). > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant reposting > > of the task) is not the way to go. 
> > Random thought for posciak@: if we make the decoder init notify through the > event system, we might even be able to unify this with the resolution change > event handling. WDYT?
On 2014/03/07 21:36:29, sheu wrote: > https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://chromiumcodereview.appspot.com/137023008/diff/660001/content/common/g... > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > chk_format_info; > On 2014/03/07 20:31:50, sheu wrote: > > On 2014/03/07 17:31:40, shivdasp wrote: > > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > > what > > > the bug really was. We did not try to check for format info if we were out > of > > > buffers while in kInitialized state. > > > I am not sure how does starting of device poll earlier will solve this stuck > > up > > > problem. I will test it on my monday and update. > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > (which > > > will happen after formatinfo is set) the buffers on OUTPUT plane will not > get > > > consumed and hence will not be dequeued. So even with device polling we > cannot > > > send more bitstream. We should have got the FormatInfo() after processing > the > > > first buffer itself but if the detection is getting delayed we should keep > > > checking for it while sending more bitstream buffers. > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > We seem to be allergic to "goto" statements in Chrome, for not entirely > bad > > > > reasons. > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It seems > > to > > > me > > > > that if we're out of buffers, whether we have format info or not does not > > > > correlate with whether we'll soon be getting buffers. > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we queue > data > > to the OUTPUT queue for the decoder to initialize, but the decoder > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue > may > > not succeed immediately; instead at some undefined point in the future the > > decoder initialization will finish and VIDIOC_G_FMT will start returning the > > proper format. This is a problem in TegraVDA since when we fail to get the > > format after each buffer we queue, we continue trying to queue buffers; the > race > > is between the decoder initialization finishing and the VDA running out of > > buffers to continue to enqueue with. > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE queue > is > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know > for > > sure whether the decoder has initialized with the given input, or not. That > > brings to mind two possible solutions in the TegraVDA case: > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > Exynos > > case > > 2. Add a notification system for decoder initialization. I'd go with the > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > V4L2_EVENT_PRIVATE_START). > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant reposting > > of the task) is not the way to go. > > Random thought for posciak@: if we make the decoder init notify through the > event system, we might even be able to unify this with the resolution change > event handling. WDYT? Hm, this is actually quite a neat idea... I like it. I discussed this with other V4L2 developers, but we haven't arrived at a consensus yet. I believe this is better than G_FMT though. 
So this should be a good solution at least for the short term. We can change the behavior in Exynos to use the event for initial G_FMT as well (you could do an ifdef on Exynos in the meantime). Shivdas: what do you think about this (using the resolution change event for initial G_FMT as well)?
On 2014/03/10 05:58:12, shivdasp wrote: > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > chk_format_info; > Ohh that's why this never happens on Exynos. > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > looks the quicker solution and we wouldn't have to change a thing in VDA. > I will make that change. > Will re-post a patch removing this fix in VDA and addressing your comments too. > Does that mean you'd require the first buffer that is queued by client to contain all the info required to make G_FMT work? > Thanks. > On 2014/03/07 21:36:29, sheu wrote: > > On 2014/03/07 20:31:50, sheu wrote: > > > On 2014/03/07 17:31:40, shivdasp wrote: > > > > Jumping to GetFormatInfo() will move us out of kInitialized state which is > > > what > > > > the bug really was. We did not try to check for format info if we were out > > of > > > > buffers while in kInitialized state. > > > > I am not sure how does starting of device poll earlier will solve this > stuck > > > up > > > > problem. I will test it on my monday and update. > > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > > (which > > > > will happen after formatinfo is set) the buffers on OUTPUT plane will not > > get > > > > consumed and hence will not be dequeued. So even with device polling we > > cannot > > > > send more bitstream. We should have got the FormatInfo() after processing > > the > > > > first buffer itself but if the detection is getting delayed we should keep > > > > checking for it while sending more bitstream buffers. > > > > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > > We seem to be allergic to "goto" statements in Chrome, for not entirely > > bad > > > > > reasons. > > > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It > seems > > > to > > > > me > > > > > that if we're out of buffers, whether we have format info or not does > not > > > > > correlate with whether we'll soon be getting buffers. > > > > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we queue > > data > > > to the OUTPUT queue for the decoder to initialize, but the decoder > > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE queue > > may > > > not succeed immediately; instead at some undefined point in the future the > > > decoder initialization will finish and VIDIOC_G_FMT will start returning the > > > proper format. This is a problem in TegraVDA since when we fail to get the > > > format after each buffer we queue, we continue trying to queue buffers; the > > race > > > is between the decoder initialization finishing and the VDA running out of > > > buffers to continue to enqueue with. > > > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE > queue > > is > > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we know > > for > > > sure whether the decoder has initialized with the given input, or not. That > > > brings to mind two possible solutions in the TegraVDA case: > > > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > > Exynos > > > case > > > 2. Add a notification system for decoder initialization. 
I'd go with the > > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > > V4L2_EVENT_PRIVATE_START). > > > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant > reposting > > > of the task) is not the way to go. > > > > Random thought for posciak@: if we make the decoder init notify through the > > event system, we might even be able to unify this with the resolution change > > event handling. WDYT?
On 2014/03/10 11:31:48, Pawel Osciak wrote: > Hm, this is actually quite a neat idea... I like it. > > I discussed this with other V4L2 developers, but we haven't arrived at a > consensus yet. I believe this is better than G_FMT though. So this should be a > good solution at least for the short term. > We can change the behavior in Exynos to use the event for initial G_FMT as well > (you could do an ifdef on Exynos in the meantime). > > Shivdas: what do you think about this (using the resolution change event for > initial G_FMT as well)? Is this a discussion taking place on linux-media? Or elsewhere?
On 2014/03/10 19:42:18, sheu wrote: > On 2014/03/10 11:31:48, Pawel Osciak wrote: > > Hm, this is actually quite a neat idea... I like it. > > > > I discussed this with other V4L2 developers, but we haven't arrived at a > > consensus yet. I believe this is better than G_FMT though. So this should be a > > good solution at least for the short term. > > We can change the behavior in Exynos to use the event for initial G_FMT as > well > > (you could do an ifdef on Exynos in the meantime). > > > > Shivdas: what do you think about this (using the resolution change event for > > initial G_FMT as well)? > > Is this a discussion taking place on linux-media? Or elsewhere? Currently in #v4l on freenode. But at some point I'll post an RFC to linux-media.
On 2014/03/10 11:32:54, Pawel Osciak wrote: > On 2014/03/10 05:58:12, shivdasp wrote: > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > chk_format_info; > > Ohh that's why this never happens on Exynos. > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > > looks the quicker solution and we wouldn't have to change a thing in VDA. > > I will make that change. > > Will re-post a patch removing this fix in VDA and addressing your comments > too. > > > > Does that mean you'd require the first buffer that is queued by client to > contain all the info required to make G_FMT work? > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a buffer on OUTPUT PLANE with all info required for the decode to initialize correctly. When I try to make it synchronous I sometimes see that the VDA might not have submitted any buffer in which case the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder thread to wait indefinitely. I can have timeouts but the timeouts may also race with VDA ending up with input buffers. How is the VIDIOC_F_FMT synchronous implemented in Exynos ? I would lean towards event based mechanism rather than synchronous behavior to avoid any deadlock issues like above. I can add event based mechanism but there is no compile-time flag to work this only for Tegra. Should we attempt to restructure the DecodeBufferInitial() to try and enqueue input buffers and try GetFormatInfo() too. This would help us get the VDA to work without any implementation deviations. Thanks > > Thanks. > > On 2014/03/07 21:36:29, sheu wrote: > > > On 2014/03/07 20:31:50, sheu wrote: > > > > On 2014/03/07 17:31:40, shivdasp wrote: > > > > > Jumping to GetFormatInfo() will move us out of kInitialized state which > is > > > > what > > > > > the bug really was. We did not try to check for format info if we were > out > > > of > > > > > buffers while in kInitialized state. > > > > > I am not sure how does starting of device poll earlier will solve this > > stuck > > > > up > > > > > problem. I will test it on my monday and update. > > > > > The problem I see is that if we do not allocate buffers on CAPTURE plane > > > > (which > > > > > will happen after formatinfo is set) the buffers on OUTPUT plane will > not > > > get > > > > > consumed and hence will not be dequeued. So even with device polling we > > > cannot > > > > > send more bitstream. We should have got the FormatInfo() after > processing > > > the > > > > > first buffer itself but if the detection is getting delayed we should > keep > > > > > checking for it while sending more bitstream buffers. > > > > > > > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > > > We seem to be allergic to "goto" statements in Chrome, for not > entirely > > > bad > > > > > > reasons. > > > > > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It > > seems > > > > to > > > > > me > > > > > > that if we're out of buffers, whether we have format info or not does > > not > > > > > > correlate with whether we'll soon be getting buffers. > > > > > > > > > > > > > Ah, i see how it is. 
So if I have this right: the problem is that we > queue > > > data > > > > to the OUTPUT queue for the decoder to initialize, but the decoder > > > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE > queue > > > may > > > > not succeed immediately; instead at some undefined point in the future the > > > > decoder initialization will finish and VIDIOC_G_FMT will start returning > the > > > > proper format. This is a problem in TegraVDA since when we fail to get > the > > > > format after each buffer we queue, we continue trying to queue buffers; > the > > > race > > > > is between the decoder initialization finishing and the VDA running out of > > > > buffers to continue to enqueue with. > > > > > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE > > queue > > > is > > > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we > know > > > for > > > > sure whether the decoder has initialized with the given input, or not. > That > > > > brings to mind two possible solutions in the TegraVDA case: > > > > > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > > > Exynos > > > > case > > > > 2. Add a notification system for decoder initialization. I'd go with the > > > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > > > V4L2_EVENT_PRIVATE_START). > > > > > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant > > reposting > > > > of the task) is not the way to go. > > > > > > Random thought for posciak@: if we make the decoder init notify through the > > > event system, we might even be able to unify this with the resolution change > > > event handling. WDYT?
On 2014/03/11 06:25:19, shivdasp wrote: > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > chk_format_info; > > > Ohh that's why this never happens on Exynos. > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > > > looks the quicker solution and we wouldn't have to change a thing in VDA. > > > I will make that change. > > > Will re-post a patch removing this fix in VDA and addressing your comments > > too. > > > > > > > Does that mean you'd require the first buffer that is queued by client to > > contain all the info required to make G_FMT work? > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a buffer > on OUTPUT PLANE with all info required > for the decode to initialize correctly. > When I try to make it synchronous I sometimes see that the VDA might not have > submitted any buffer in which case > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder > thread to wait indefinitely. > I can have timeouts but the timeouts may also race with VDA ending up with input > buffers. > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > It's not really, there is a silent assumption that it will work on the first buffer queued. > I would lean towards event based mechanism rather than synchronous behavior to > avoid any deadlock issues like above. > I can add event based mechanism but there is no compile-time flag to work this > only for Tegra. > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to make it work already? If you keep going (assuming you have https://codereview.chromium.org/189993002/), Dequeue() will get your event and trigger a resolution change. DestroyOutputBuffers should then not do anything apart from calling reqbufs(0), which is ok to call even if there are no buffers allocated from the API perspective. And then it will go on. So I think it should just work if you simply add the event to Tegra (and have https://codereview.chromium.org/189993002/) without making any changes to the class? > Should we attempt to restructure the DecodeBufferInitial() to try and enqueue > input buffers and try GetFormatInfo() too. > This would help us get the VDA to work without any implementation deviations. > > Thanks > > > > Thanks. > > > On 2014/03/07 21:36:29, sheu wrote: > > > > On 2014/03/07 20:31:50, sheu wrote: > > > > > On 2014/03/07 17:31:40, shivdasp wrote: > > > > > > Jumping to GetFormatInfo() will move us out of kInitialized state > which > > is > > > > > what > > > > > > the bug really was. We did not try to check for format info if we were > > out > > > > of > > > > > > buffers while in kInitialized state. > > > > > > I am not sure how does starting of device poll earlier will solve this > > > stuck > > > > > up > > > > > > problem. I will test it on my monday and update. > > > > > > The problem I see is that if we do not allocate buffers on CAPTURE > plane > > > > > (which > > > > > > will happen after formatinfo is set) the buffers on OUTPUT plane will > > not > > > > get > > > > > > consumed and hence will not be dequeued. 
So even with device polling > we > > > > cannot > > > > > > send more bitstream. We should have got the FormatInfo() after > > processing > > > > the > > > > > > first buffer itself but if the detection is getting delayed we should > > keep > > > > > > checking for it while sending more bitstream buffers. > > > > > > > > > > > > > > > > > > On 2014/03/07 00:18:08, sheu wrote: > > > > > > > We seem to be allergic to "goto" statements in Chrome, for not > > entirely > > > > bad > > > > > > > reasons. > > > > > > > > > > > > > > But also I don't see what jumping to GetFormatInfo would buy us. It > > > seems > > > > > to > > > > > > me > > > > > > > that if we're out of buffers, whether we have format info or not > does > > > not > > > > > > > correlate with whether we'll soon be getting buffers. > > > > > > > > > > > > > > > > Ah, i see how it is. So if I have this right: the problem is that we > > queue > > > > data > > > > > to the OUTPUT queue for the decoder to initialize, but the decoder > > > > > initialization happens asynchronously. So VIDIOC_G_FMT on the CAPTURE > > queue > > > > may > > > > > not succeed immediately; instead at some undefined point in the future > the > > > > > decoder initialization will finish and VIDIOC_G_FMT will start returning > > the > > > > > proper format. This is a problem in TegraVDA since when we fail to get > > the > > > > > format after each buffer we queue, we continue trying to queue buffers; > > the > > > > race > > > > > is between the decoder initialization finishing and the VDA running out > of > > > > > buffers to continue to enqueue with. > > > > > > > > > > For Exynos we get around this problem since VIDIOC_G_FMT on the CAPTURE > > > queue > > > > is > > > > > synchronous w.r.t. the decoder initialization; i.e. it blocks until we > > know > > > > for > > > > > sure whether the decoder has initialized with the given input, or not. > > That > > > > > brings to mind two possible solutions in the TegraVDA case: > > > > > > > > > > 1. Either make VIDIOC_G_FMT on the CAPTURE queue synchronous, as in the > > > > Exynos > > > > > case > > > > > 2. Add a notification system for decoder initialization. I'd go with > the > > > > > V4L2_EVENT system, by adding a private event for the Tegra driver (see: > > > > > V4L2_EVENT_PRIVATE_START). > > > > > > > > > > (1) would be a faster fix than (2). Tight-loop polling (by constant > > > reposting > > > > > of the task) is not the way to go. > > > > > > > > Random thought for posciak@: if we make the decoder init notify through > the > > > > event system, we might even be able to unify this with the resolution > change > > > > event handling. WDYT?
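On the reqbufs(0) point above, a small sketch of why calling it unconditionally is safe: VIDIOC_REQBUFS with count = 0 releases whatever CAPTURE buffers exist and is effectively a no-op if none were ever allocated. The device_ Ioctl wrapper and logging are assumptions matching the rest of this change:

  struct v4l2_requestbuffers reqbufs;
  memset(&reqbufs, 0, sizeof(reqbufs));
  reqbufs.count = 0;  // Free all CAPTURE buffers; harmless if none exist yet.
  reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  reqbufs.memory = V4L2_MEMORY_MMAP;
  if (device_->Ioctl(VIDIOC_REQBUFS, &reqbufs) != 0)
    DLOG(ERROR) << "DestroyOutputBuffers(): VIDIOC_REQBUFS failed";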
On 2014/03/12 06:30:33, Pawel Osciak wrote: > On 2014/03/11 06:25:19, shivdasp wrote: > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > chk_format_info; > > > > Ohh that's why this never happens on Exynos. > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane synchronous > > > > looks the quicker solution and we wouldn't have to change a thing in VDA. > > > > I will make that change. > > > > Will re-post a patch removing this fix in VDA and addressing your comments > > > too. > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client to > > > contain all the info required to make G_FMT work? > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a > buffer > > on OUTPUT PLANE with all info required > > for the decode to initialize correctly. > > When I try to make it synchronous I sometimes see that the VDA might not have > > submitted any buffer in which case > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder > > thread to wait indefinitely. > > I can have timeouts but the timeouts may also race with VDA ending up with > input > > buffers. > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > It's not really, there is a silent assumption that it will work on the first > buffer queued. > > > I would lean towards event based mechanism rather than synchronous behavior to > > avoid any deadlock issues like above. > > I can add event based mechanism but there is no compile-time flag to work this > > only for Tegra. > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to > make it work already? If you keep going (assuming you have > https://codereview.chromium.org/189993002/), Dequeue() will get your event and > trigger a resolution change. DestroyOutputBuffers should then not do anything > apart from calling reqbufs(0), which is ok to call even if there are no buffers > allocated from the API perspective. And then it will go on. > > So I think it should just work if you simply add the event to Tegra (and have > https://codereview.chromium.org/189993002/) without making any changes to the > class? > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when the capture format is set for the first time. I thought we were going to introduce another event for this that the decoder is initialized. Let me take https://codereview.chromium.org/189993002/ and add the resolution change event and see how far can it go. Will update.
On 2014/03/12 06:46:15, shivdasp wrote: > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > On 2014/03/11 06:25:19, shivdasp wrote: > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > > chk_format_info; > > > > > Ohh that's why this never happens on Exynos. > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > synchronous > > > > > looks the quicker solution and we wouldn't have to change a thing in > VDA. > > > > > I will make that change. > > > > > Will re-post a patch removing this fix in VDA and addressing your > comments > > > > too. > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client to > > > > contain all the info required to make G_FMT work? > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a > > buffer > > > on OUTPUT PLANE with all info required > > > for the decode to initialize correctly. > > > When I try to make it synchronous I sometimes see that the VDA might not > have > > > submitted any buffer in which case > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the decoder > > > thread to wait indefinitely. > > > I can have timeouts but the timeouts may also race with VDA ending up with > > input > > > buffers. > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > It's not really, there is a silent assumption that it will work on the first > > buffer queued. > > > > > I would lean towards event based mechanism rather than synchronous behavior > to > > > avoid any deadlock issues like above. > > > I can add event based mechanism but there is no compile-time flag to work > this > > > only for Tegra. > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to > > make it work already? If you keep going (assuming you have > > https://codereview.chromium.org/189993002/), Dequeue() will get your event and > > trigger a resolution change. DestroyOutputBuffers should then not do anything > > apart from calling reqbufs(0), which is ok to call even if there are no > buffers > > allocated from the API perspective. And then it will go on. > > > > So I think it should just work if you simply add the event to Tegra (and have > > https://codereview.chromium.org/189993002/) without making any changes to the > > class? > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when the > capture format is set for the first time. Yes exactly. Should just work. > I thought we were going to introduce another event for this that the decoder is > initialized. > Let me take https://codereview.chromium.org/189993002/ and add the resolution > change event and see how far can it go. > Will update.
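A hedged sketch of how the decoder thread could pick this up, assuming a resolution-change event type and a pending flag similar to what the referenced CL (189993002) introduces; the member and constant names are stand-ins, not the final code:

  void V4L2VideoDecodeAccelerator::DequeueEvents() {
    struct v4l2_event ev;
    memset(&ev, 0, sizeof(ev));
    // Drain all pending events; the initial-format case and a genuine
    // mid-stream resolution change both land on the same path.
    while (device_->Ioctl(VIDIOC_DQEVENT, &ev) == 0) {
      if (ev.type == V4L2_EVENT_RESOLUTION_CHANGE)
        resolution_change_pending_ = true;
      memset(&ev, 0, sizeof(ev));
    }
  }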
On 2014/03/12 06:49:57, Pawel Osciak wrote: > On 2014/03/12 06:46:15, shivdasp wrote: > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > > > chk_format_info; > > > > > > Ohh that's why this never happens on Exynos. > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > synchronous > > > > > > looks the quicker solution and we wouldn't have to change a thing in > > VDA. > > > > > > I will make that change. > > > > > > Will re-post a patch removing this fix in VDA and addressing your > > comments > > > > > too. > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client > to > > > > > contain all the info required to make G_FMT work? > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued a > > > buffer > > > > on OUTPUT PLANE with all info required > > > > for the decode to initialize correctly. > > > > When I try to make it synchronous I sometimes see that the VDA might not > > have > > > > submitted any buffer in which case > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > decoder > > > > thread to wait indefinitely. > > > > I can have timeouts but the timeouts may also race with VDA ending up with > > > input > > > > buffers. > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > It's not really, there is a silent assumption that it will work on the first > > > buffer queued. > > > > > > > I would lean towards event based mechanism rather than synchronous > behavior > > to > > > > avoid any deadlock issues like above. > > > > I can add event based mechanism but there is no compile-time flag to work > > this > > > > only for Tegra. > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able to > > > make it work already? If you keep going (assuming you have > > > https://codereview.chromium.org/189993002/), Dequeue() will get your event > and > > > trigger a resolution change. DestroyOutputBuffers should then not do > anything > > > apart from calling reqbufs(0), which is ok to call even if there are no > > buffers > > > allocated from the API perspective. And then it will go on. > > > > > > So I think it should just work if you simply add the event to Tegra (and > have > > > https://codereview.chromium.org/189993002/) without making any changes to > the > > > class? > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when the > > capture format is set for the first time. > > Yes exactly. Should just work. As I am making these changes it occurred to me that with this way of doing it, we might do one unnecessary resolution change (freeing and re-allocation of buffers) if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which will cause a slight jitter. > > > I thought we were going to introduce another event for this that the decoder > is > > initialized. 
> > Let me take https://codereview.chromium.org/189993002/ and add the resolution > > change event and see how far can it go. > > Will update.
On 2014/03/12 09:35:08, shivdasp wrote: > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > On 2014/03/12 06:46:15, shivdasp wrote: > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: goto > > > > > > > chk_format_info; > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > synchronous > > > > > > > looks the quicker solution and we wouldn't have to change a thing in > > > VDA. > > > > > > > I will make that change. > > > > > > > Will re-post a patch removing this fix in VDA and addressing your > > > comments > > > > > > too. > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by client > > to > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already enqueued > a > > > > buffer > > > > > on OUTPUT PLANE with all info required > > > > > for the decode to initialize correctly. > > > > > When I try to make it synchronous I sometimes see that the VDA might not > > > have > > > > > submitted any buffer in which case > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > > decoder > > > > > thread to wait indefinitely. > > > > > I can have timeouts but the timeouts may also race with VDA ending up > with > > > > input > > > > > buffers. > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on the > first > > > > buffer queued. > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > behavior > > > to > > > > > avoid any deadlock issues like above. > > > > > I can add event based mechanism but there is no compile-time flag to > work > > > this > > > > > only for Tegra. > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be able > to > > > > make it work already? If you keep going (assuming you have > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your event > > and > > > > trigger a resolution change. DestroyOutputBuffers should then not do > > anything > > > > apart from calling reqbufs(0), which is ok to call even if there are no > > > buffers > > > > allocated from the API perspective. And then it will go on. > > > > > > > > So I think it should just work if you simply add the event to Tegra (and > > have > > > > https://codereview.chromium.org/189993002/) without making any changes to > > the > > > > class? > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event when > the > > > capture format is set for the first time. > > > > Yes exactly. Should just work. 
> As I am making these changes it occurred to me that with this way of doing it, > we might do one unnecessary resolution change (freeing and re-allocation of > buffers) > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which will > cause a slight jitter. True. But this is temporary until we implement it for Exynos and remove it from DBI() entirely. > > > > > > I thought we were going to introduce another event for this that the decoder > > is > > > initialized. > > > Let me take https://codereview.chromium.org/189993002/ and add the > resolution > > > change event and see how far can it go. > > > Will update.
On 2014/03/12 09:54:09, Pawel Osciak wrote: > On 2014/03/12 09:35:08, shivdasp wrote: > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > goto > > > > > > > > chk_format_info; > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > > synchronous > > > > > > > > looks the quicker solution and we wouldn't have to change a thing > in > > > > VDA. > > > > > > > > I will make that change. > > > > > > > > Will re-post a patch removing this fix in VDA and addressing your > > > > comments > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by > client > > > to > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > enqueued > > a > > > > > buffer > > > > > > on OUTPUT PLANE with all info required > > > > > > for the decode to initialize correctly. > > > > > > When I try to make it synchronous I sometimes see that the VDA might > not > > > > have > > > > > > submitted any buffer in which case > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > > > decoder > > > > > > thread to wait indefinitely. > > > > > > I can have timeouts but the timeouts may also race with VDA ending up > > with > > > > > input > > > > > > buffers. > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on the > > first > > > > > buffer queued. > > > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > > behavior > > > > to > > > > > > avoid any deadlock issues like above. > > > > > > I can add event based mechanism but there is no compile-time flag to > > work > > > > this > > > > > > only for Tegra. > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be > able > > to > > > > > make it work already? If you keep going (assuming you have > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your > event > > > and > > > > > trigger a resolution change. DestroyOutputBuffers should then not do > > > anything > > > > > apart from calling reqbufs(0), which is ok to call even if there are no > > > > buffers > > > > > allocated from the API perspective. And then it will go on. > > > > > > > > > > So I think it should just work if you simply add the event to Tegra (and > > > have > > > > > https://codereview.chromium.org/189993002/) without making any changes > to > > > the > > > > > class? > > > > > > > > > Ahh, I get it now. 
So you are saying use the RESOLUTION_CHANGE event when > > the > > > > capture format is set for the first time. > > > > > > Yes exactly. Should just work. > > As I am making these changes it occurred to me that with this way of doing it, > > we might do one unnecessary resolution change (freeing and re-allocation of > > buffers) > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which > will > > cause a slight jitter. > > True. But this is temporary until we implement it for Exynos and remove it from > DBI() entirely. Tried doing this changes and I see a couple of issues. There is a race condition between StartDevicePoll() and decoder_thread_ is created I have updated that in that CL. Secondly when I get past it, the check in rendering_helper.cc fails. https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... I will debug more on what exactly fails here since I spent most of the time in debugging the race condition. > > > > > > > > > > I thought we were going to introduce another event for this that the > decoder > > > is > > > > initialized. > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > resolution > > > > change event and see how far can it go. > > > > Will update.
On 2014/03/12 12:16:14, shivdasp wrote: > On 2014/03/12 09:54:09, Pawel Osciak wrote: > > On 2014/03/12 09:35:08, shivdasp wrote: > > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > > goto > > > > > > > > > chk_format_info; > > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > > > synchronous > > > > > > > > > looks the quicker solution and we wouldn't have to change a > thing > > in > > > > > VDA. > > > > > > > > > I will make that change. > > > > > > > > > Will re-post a patch removing this fix in VDA and addressing > your > > > > > comments > > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by > > client > > > > to > > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > > enqueued > > > a > > > > > > buffer > > > > > > > on OUTPUT PLANE with all info required > > > > > > > for the decode to initialize correctly. > > > > > > > When I try to make it synchronous I sometimes see that the VDA might > > not > > > > > have > > > > > > > submitted any buffer in which case > > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause the > > > > decoder > > > > > > > thread to wait indefinitely. > > > > > > > I can have timeouts but the timeouts may also race with VDA ending > up > > > with > > > > > > input > > > > > > > buffers. > > > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on the > > > first > > > > > > buffer queued. > > > > > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > > > behavior > > > > > to > > > > > > > avoid any deadlock issues like above. > > > > > > > I can add event based mechanism but there is no compile-time flag to > > > work > > > > > this > > > > > > > only for Tegra. > > > > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be > > able > > > to > > > > > > make it work already? If you keep going (assuming you have > > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your > > event > > > > and > > > > > > trigger a resolution change. DestroyOutputBuffers should then not do > > > > anything > > > > > > apart from calling reqbufs(0), which is ok to call even if there are > no > > > > > buffers > > > > > > allocated from the API perspective. And then it will go on. 
> > > > > > > > > > > > So I think it should just work if you simply add the event to Tegra > (and > > > > have > > > > > > https://codereview.chromium.org/189993002/) without making any changes > > to > > > > the > > > > > > class? > > > > > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event > when > > > the > > > > > capture format is set for the first time. > > > > > > > > Yes exactly. Should just work. > > > As I am making these changes it occurred to me that with this way of doing > it, > > > we might do one unnecessary resolution change (freeing and re-allocation of > > > buffers) > > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself which > > will > > > cause a slight jitter. > > > > True. But this is temporary until we implement it for Exynos and remove it > from > > DBI() entirely. > Tried doing this changes and I see a couple of issues. > There is a race condition between StartDevicePoll() and decoder_thread_ is > created I have updated that in that CL. > > Secondly when I get past it, the check in rendering_helper.cc fails. > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > I will debug more on what exactly fails here since I spent most of the time in > debugging the race condition. > Okay so I investigated why the condition fails in rendering_helper.cc line #411. It fails when we do ProvidePictureBuffers() twice. Once when the G_FMT actually succeeds in DBI() and next when the DequeueEvents() finds the resolution change event. I guess the vdatest is not equipped to handle resolution change scenario ? Is it true ? This does not fail all the time but it does fail about 50% of the time. I am now testing this with browser to see if there are any issues with it. > > > > > > > > > > > > > > I thought we were going to introduce another event for this that the > > decoder > > > > is > > > > > initialized. > > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > > resolution > > > > > change event and see how far can it go. > > > > > Will update.
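For reference, a minimal sketch of the kind of ordering fix being described for the other CL, assuming decoder_thread_ is a base::Thread as in the existing V4L2 VDA; the parameter list and exact placement here are assumptions, not the actual patch:

  bool V4L2VideoDecodeAccelerator::Initialize(/* existing parameters */) {
    // Start the decoder thread before anything can post work to it; posting a
    // task that ends up in StartDevicePoll() before Start() has completed is
    // the race described above.
    if (!decoder_thread_.Start()) {
      DLOG(ERROR) << "Initialize(): decoder thread failed to start";
      return false;
    }
    // StartDevicePoll() returns bool, so drop the result when posting it.
    decoder_thread_.message_loop()->PostTask(
        FROM_HERE,
        base::Bind(base::IgnoreResult(
                       &V4L2VideoDecodeAccelerator::StartDevicePoll),
                   base::Unretained(this)));
    return true;
  }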
On 2014/03/13 07:14:56, shivdasp wrote: > On 2014/03/12 12:16:14, shivdasp wrote: > > On 2014/03/12 09:54:09, Pawel Osciak wrote: > > > On 2014/03/12 09:35:08, shivdasp wrote: > > > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > File content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > > > goto > > > > > > > > > > chk_format_info; > > > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE plane > > > > > > synchronous > > > > > > > > > > looks the quicker solution and we wouldn't have to change a > > thing > > > in > > > > > > VDA. > > > > > > > > > > I will make that change. > > > > > > > > > > Will re-post a patch removing this fix in VDA and addressing > > your > > > > > > comments > > > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued by > > > client > > > > > to > > > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > > > enqueued > > > > a > > > > > > > buffer > > > > > > > > on OUTPUT PLANE with all info required > > > > > > > > for the decode to initialize correctly. > > > > > > > > When I try to make it synchronous I sometimes see that the VDA > might > > > not > > > > > > have > > > > > > > > submitted any buffer in which case > > > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause > the > > > > > decoder > > > > > > > > thread to wait indefinitely. > > > > > > > > I can have timeouts but the timeouts may also race with VDA ending > > up > > > > with > > > > > > > input > > > > > > > > buffers. > > > > > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on > the > > > > first > > > > > > > buffer queued. > > > > > > > > > > > > > > > I would lean towards event based mechanism rather than synchronous > > > > > behavior > > > > > > to > > > > > > > > avoid any deadlock issues like above. > > > > > > > > I can add event based mechanism but there is no compile-time flag > to > > > > work > > > > > > this > > > > > > > > only for Tegra. > > > > > > > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you be > > > able > > > > to > > > > > > > make it work already? If you keep going (assuming you have > > > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get your > > > event > > > > > and > > > > > > > trigger a resolution change. 
DestroyOutputBuffers should then not do > > > > > anything > > > > > > > apart from calling reqbufs(0), which is ok to call even if there are > > no > > > > > > buffers > > > > > > > allocated from the API perspective. And then it will go on. > > > > > > > > > > > > > > So I think it should just work if you simply add the event to Tegra > > (and > > > > > have > > > > > > > https://codereview.chromium.org/189993002/) without making any > changes > > > to > > > > > the > > > > > > > class? > > > > > > > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event > > when > > > > the > > > > > > capture format is set for the first time. > > > > > > > > > > Yes exactly. Should just work. > > > > As I am making these changes it occurred to me that with this way of doing > > it, > > > > we might do one unnecessary resolution change (freeing and re-allocation > of > > > > buffers) > > > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself > which > > > will > > > > cause a slight jitter. > > > > > > True. But this is temporary until we implement it for Exynos and remove it > > from > > > DBI() entirely. > > Tried doing this changes and I see a couple of issues. > > There is a race condition between StartDevicePoll() and decoder_thread_ is > > created I have updated that in that CL. > > > > Secondly when I get past it, the check in rendering_helper.cc fails. > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > I will debug more on what exactly fails here since I spent most of the time in > > debugging the race condition. > > > Okay so I investigated why the condition fails in rendering_helper.cc line #411. > It fails when we do ProvidePictureBuffers() twice. > Once when the G_FMT actually succeeds in DBI() and next when the DequeueEvents() > finds the resolution change event. > I guess the vdatest is not equipped to handle resolution change scenario ? Is it > true ? > This does not fail all the time but it does fail about 50% of the time. > I am now testing this with browser to see if there are any issues with it. > Okay so the browser works fine even in the case when G_FMT is successful before dequeuing the resolution change event. But the vdatest fails as described above. Pawel , how should be go about this now ? > > > > > > > > > > > > > > > > > > > I thought we were going to introduce another event for this that the > > > decoder > > > > > is > > > > > > initialized. > > > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > > > resolution > > > > > > change event and see how far can it go. > > > > > > Will update.
On 2014/03/13 10:48:08, shivdasp wrote: > On 2014/03/13 07:14:56, shivdasp wrote: > > On 2014/03/12 12:16:14, shivdasp wrote: > > > On 2014/03/12 09:54:09, Pawel Osciak wrote: > > > > On 2014/03/12 09:35:08, shivdasp wrote: > > > > > On 2014/03/12 06:49:57, Pawel Osciak wrote: > > > > > > On 2014/03/12 06:46:15, shivdasp wrote: > > > > > > > On 2014/03/12 06:30:33, Pawel Osciak wrote: > > > > > > > > On 2014/03/11 06:25:19, shivdasp wrote: > > > > > > > > > On 2014/03/10 11:32:54, Pawel Osciak wrote: > > > > > > > > > > On 2014/03/10 05:58:12, shivdasp wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > > File > content/common/gpu/media/v4l2_video_decode_accelerator.cc > > > > > > (right): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/137023008/diff/660001/content/common/gpu/medi... > > > > > > > > > > > > content/common/gpu/media/v4l2_video_decode_accelerator.cc:719: > > > > goto > > > > > > > > > > > chk_format_info; > > > > > > > > > > > Ohh that's why this never happens on Exynos. > > > > > > > > > > > Alright, option (1) to make the VIDIOC_G_FMT on CAPTURE > plane > > > > > > > synchronous > > > > > > > > > > > looks the quicker solution and we wouldn't have to change a > > > thing > > > > in > > > > > > > VDA. > > > > > > > > > > > I will make that change. > > > > > > > > > > > Will re-post a patch removing this fix in VDA and addressing > > > your > > > > > > > comments > > > > > > > > > > too. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does that mean you'd require the first buffer that is queued > by > > > > client > > > > > > to > > > > > > > > > > contain all the info required to make G_FMT work? > > > > > > > > > > > > > > > > > > > To make this VIDIOC_G_FMT synchronous, VDA should have already > > > > enqueued > > > > > a > > > > > > > > buffer > > > > > > > > > on OUTPUT PLANE with all info required > > > > > > > > > for the decode to initialize correctly. > > > > > > > > > When I try to make it synchronous I sometimes see that the VDA > > might > > > > not > > > > > > > have > > > > > > > > > submitted any buffer in which case > > > > > > > > > the synchronous call of VIDIOC_G_FMT will wait and thereby cause > > the > > > > > > decoder > > > > > > > > > thread to wait indefinitely. > > > > > > > > > I can have timeouts but the timeouts may also race with VDA > ending > > > up > > > > > with > > > > > > > > input > > > > > > > > > buffers. > > > > > > > > > > > > > > > > > > How is the VIDIOC_F_FMT synchronous implemented in Exynos ? > > > > > > > > > > > > > > > > > > > > > > > > > It's not really, there is a silent assumption that it will work on > > the > > > > > first > > > > > > > > buffer queued. > > > > > > > > > > > > > > > > > I would lean towards event based mechanism rather than > synchronous > > > > > > behavior > > > > > > > to > > > > > > > > > avoid any deadlock issues like above. > > > > > > > > > I can add event based mechanism but there is no compile-time > flag > > to > > > > > work > > > > > > > this > > > > > > > > > only for Tegra. > > > > > > > > > > > > > > > > > > > > > > > > > I'd agree. I'll add that event to Exynos later, but shouldn't you > be > > > > able > > > > > to > > > > > > > > make it work already? 
If you keep going (assuming you have > > > > > > > > https://codereview.chromium.org/189993002/), Dequeue() will get > your > > > > event > > > > > > and > > > > > > > > trigger a resolution change. DestroyOutputBuffers should then not > do > > > > > > anything > > > > > > > > apart from calling reqbufs(0), which is ok to call even if there > are > > > no > > > > > > > buffers > > > > > > > > allocated from the API perspective. And then it will go on. > > > > > > > > > > > > > > > > So I think it should just work if you simply add the event to > Tegra > > > (and > > > > > > have > > > > > > > > https://codereview.chromium.org/189993002/) without making any > > changes > > > > to > > > > > > the > > > > > > > > class? > > > > > > > > > > > > > > > Ahh, I get it now. So you are saying use the RESOLUTION_CHANGE event > > > when > > > > > the > > > > > > > capture format is set for the first time. > > > > > > > > > > > > Yes exactly. Should just work. > > > > > As I am making these changes it occurred to me that with this way of > doing > > > it, > > > > > we might do one unnecessary resolution change (freeing and re-allocation > > of > > > > > buffers) > > > > > if VIDIOC_G_FMT happen to succeed during DecodeBufferInitial() itself > > which > > > > will > > > > > cause a slight jitter. > > > > > > > > True. But this is temporary until we implement it for Exynos and remove it > > > from > > > > DBI() entirely. > > > Tried doing this changes and I see a couple of issues. > > > There is a race condition between StartDevicePoll() and decoder_thread_ is > > > created I have updated that in that CL. > > > > > > Secondly when I get past it, the check in rendering_helper.cc fails. > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu... > > > I will debug more on what exactly fails here since I spent most of the time > in > > > debugging the race condition. > > > > > Okay so I investigated why the condition fails in rendering_helper.cc line > #411. > > It fails when we do ProvidePictureBuffers() twice. > > Once when the G_FMT actually succeeds in DBI() and next when the > DequeueEvents() > > finds the resolution change event. > > I guess the vdatest is not equipped to handle resolution change scenario ? Is > it > > true ? > > This does not fail all the time but it does fail about 50% of the time. > > I am now testing this with browser to see if there are any issues with it. > > > Okay so the browser works fine even in the case when G_FMT is successful before > dequeuing the resolution change event. > But the vdatest fails as described above. > Pawel , how should be go about this now ? Pawel, How do you suggest we go ahead now ? Having device poll thread start early and using the resolution change event seems breaking the vdatest. Could you re-consider the patchset#7 for now (will address sheu's comments) that has a not-so-good fix of reposting the task which will happen only on Tegra. We anyways have a plan to use the decode initialization event and that will be the neat way of doing this. If you have any other ideas I can try them as well. Thanks > > > > > > > > > > > > > > > > > > > > > > > > I thought we were going to introduce another event for this that the > > > > decoder > > > > > > is > > > > > > > initialized. > > > > > > > Let me take https://codereview.chromium.org/189993002/ and add the > > > > > resolution > > > > > > > change event and see how far can it go. > > > > > > > Will update.
On 2014/03/14 04:48:28, shivdasp wrote: > Pawel, > How do you suggest we go ahead now ? > Having device poll thread start early and using the resolution change event > seems breaking the vdatest. > Could you re-consider the patchset#7 for now (will address sheu's comments) that > has a not-so-good fix of reposting the task which will happen only on Tegra. We > anyways have a plan to use the decode initialization event and that will be the > neat way of doing this. > If you have any other ideas I can try them as well. Shivdas, Why are we doing ProvidePictureBuffers() twice? We should be seeing only one resolution change (i.e. the initial change). We should rather fix vdatest than hack V4L2VDA, but I don't understand why this becomes a resolution change event with us calling PPB() twice. Could you explain? Thanks, P.
On 2014/03/18 05:49:59, Pawel Osciak wrote: > On 2014/03/14 04:48:28, shivdasp wrote: > > Pawel, > > How do you suggest we go ahead now ? > > Having device poll thread start early and using the resolution change event > > seems breaking the vdatest. > > Could you re-consider the patchset#7 for now (will address sheu's comments) > that > > has a not-so-good fix of reposting the task which will happen only on Tegra. > We > > anyways have a plan to use the decode initialization event and that will be > the > > neat way of doing this. > > If you have any other ideas I can try them as well. > > Shivdas, > Why are we doing ProvidePictureBuffers() twice? We should be seeing only one > resolution change (i.e. the initial change). > We should rather fix vdatest than hack V4L2VDA, but I don't understand why this > becomes a resolution change event with us calling PPB() twice. > Could you explain? The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture buffers are requested. The second PPB() happens because a RESOLUTION_CHANGE event is already enqueued underneath and it is dequeued() after picture buffers were already created. So there's no way to identify a real resolution change from the initial decoder init request. > Thanks, > P.
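To illustrate the ambiguity described here, one conceivable way to turn the second signal into a no-op is to compare the reported coded size against what is already allocated. This is only a sketch: output_buffer_map_, frame_buffer_size_ and the GetFormatInfo() signature are assumptions about the existing VDA, and it is not what the CL ends up doing.

  void V4L2VideoDecodeAccelerator::ResolutionChangeTask() {
    struct v4l2_format format;
    bool again = false;
    if (!GetFormatInfo(&format, &again) || again)
      return;  // CAPTURE format not known yet; wait for a later event.

    gfx::Size new_size(format.fmt.pix_mp.width, format.fmt.pix_mp.height);
    if (!output_buffer_map_.empty() && new_size == frame_buffer_size_) {
      // Buffers already allocated for this size: this was only the initial
      // decoder-init signal, not a real mid-stream resolution change.
      return;
    }
    // Otherwise run the real destroy/reallocate sequence for the new size.
  }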
On 2014/03/18 06:08:49, shivdasp wrote: > On 2014/03/18 05:49:59, Pawel Osciak wrote: > > On 2014/03/14 04:48:28, shivdasp wrote: > > > Pawel, > > > How do you suggest we go ahead now ? > > > Having device poll thread start early and using the resolution change event > > > seems breaking the vdatest. > > > Could you re-consider the patchset#7 for now (will address sheu's comments) > > that > > > has a not-so-good fix of reposting the task which will happen only on Tegra. > > We > > > anyways have a plan to use the decode initialization event and that will be > > the > > > neat way of doing this. > > > If you have any other ideas I can try them as well. > > > > Shivdas, > > Why are we doing ProvidePictureBuffers() twice? We should be seeing only one > > resolution change (i.e. the initial change). > > We should rather fix vdatest than hack V4L2VDA, but I don't understand why > this > > becomes a resolution change event with us calling PPB() twice. > > Could you explain? > The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture > buffers are requested. > The second PPB() happens because a RESOLUTION_CHANGE event is already enqueued > underneath and it is dequeued() after picture buffers were already created. So > there's no way to identify a real resolution change from the initial decoder > init request. Could we fix rendering helper to reallocate textures?
On 2014/03/18 06:15:28, Pawel Osciak wrote: > On 2014/03/18 06:08:49, shivdasp wrote: > > On 2014/03/18 05:49:59, Pawel Osciak wrote: > > > On 2014/03/14 04:48:28, shivdasp wrote: > > > > Pawel, > > > > How do you suggest we go ahead now ? > > > > Having device poll thread start early and using the resolution change > event > > > > seems breaking the vdatest. > > > > Could you re-consider the patchset#7 for now (will address sheu's > comments) > > > that > > > > has a not-so-good fix of reposting the task which will happen only on > Tegra. > > > We > > > > anyways have a plan to use the decode initialization event and that will > be > > > the > > > > neat way of doing this. > > > > If you have any other ideas I can try them as well. > > > > > > Shivdas, > > > Why are we doing ProvidePictureBuffers() twice? We should be seeing only one > > > resolution change (i.e. the initial change). > > > We should rather fix vdatest than hack V4L2VDA, but I don't understand why > > this > > > becomes a resolution change event with us calling PPB() twice. > > > Could you explain? > > The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture > > buffers are requested. > > The second PPB() happens because a RESOLUTION_CHANGE event is already enqueued > > underneath and it is dequeued() after picture buffers were already created. So > > there's no way to identify a real resolution change from the initial decoder > > init request. > > Could we fix rendering helper to reallocate textures? I fixed the rendering helper to reallocate the textures and now it can handle PPB() twice. However, there are two major issues that I see now: 1. Since we do a RES_CHANGE sequence, a few of the decoded-but-yet-to-be-rendered buffers may get lost, since we call StopDevicePoll(), which in turn calls STREAMOFF, which loses the buffers. Seven sub-tests fail on account of fewer frames being returned than expected. 2. Relatedly, since we DestroyOutputBuffers(), we might lose the reference frames, and as we are not actually starting the stream from an I frame there might be corruption. I think we might be better off with a separate event for signalling decoder initialization rather than using the resolution change sequence. Add a new method, DecoderInitTask(), which will be posted from DequeueEvents() when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name? We would also need to add another enum like RESOLUTION_CHANGE, with value 6?). DecoderInitTask() does nothing if decoder_state_ is already kDecoding; otherwise it calls GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. This should work with Exynos too, since there will not be any DECODER_INIT event dequeued on Exynos. Thoughts ? Thanks
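A rough sketch of the DecoderInitTask() proposed above. V4L2_EVENT_DECODER_INIT is hypothetical (it is not a mainline V4L2 event), and decoder_state_, kDecoding, GetFormatInfo() and CreateBuffersForFormat() are the members named in this thread, with assumed signatures and error handling:

  void V4L2VideoDecodeAccelerator::DecoderInitTask() {
    DCHECK_EQ(decoder_thread_.message_loop(), base::MessageLoop::current());
    if (decoder_state_ == kDecoding)
      return;  // G_FMT already succeeded earlier (e.g. in DBI()); no-op.

    struct v4l2_format format;
    bool again = false;
    if (!GetFormatInfo(&format, &again) || again) {
      LOG(ERROR) << "DecoderInitTask(): G_FMT failed after the init event";
      return;  // A real implementation would notify the client of the error.
    }
    if (!CreateBuffersForFormat(format))
      return;
    decoder_state_ = kDecoding;
  }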
On 2014/03/18 12:22:17, shivdasp wrote: > I fixed the rendering helper to reallocate the textures and now it can handle > PPB() twice. > However there are two major issues that I see now: > 1. Since we do a RES_CHANGE sequence, few of the decoded-but-yet-to-be-rendered > buffers may get lost since we call StopDevicePoll() which inturn calls STREAMOFF > which looses the buffers. 7 sub-tests fail on account of less than expected > frames returned. If the frame has been decoded and returned to the VDA client, then calling STREAMOFF on the queue may terminate the video decoder's access to the buffer, but it should not be destroyed until the 3D context has released it -- it should still be renderable. This is the point of the discussion we had above about the tegrav4l2 library needing to track buffer lifetimes correctly for both the 3D and the video decode stacks. > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > frames and as we are not actually starting the stream from a I frame there might > be corruption. > > I think we might be better off with having a separate event for signalling the > decoder initialization rather than using resolution change sequence. > Add a new method DecoderInitTask() , which will be posted from DequeueEvents() > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > This should work with Exynos too since there will not be any DECODER_INIT event > dequeued on Exynos. > Thoughts ? So my thought are that we have two concerns: (1) allowing for devices to signal decoder init through the events system, and also (2) not breaking the existing API for existing drivers and users. For the existing usecase in (2), clients would be expected to poll VIDIOC_G_FMT on the CAPTURE queue until it succeeds, at which point the client knows that the initialization has succeeded and decoding can commence. For (1), the new wrinkle would be clients don't have to poll; the arrival of the event just signals that the client can expect to succeed the next time VIDIOC_G_FMT is called. So if we do it this way, we can just add the behavior where the arrival of a DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be correct for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT is not implemented), as well as a future where Exynos does not block on VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does not block VIDIOC_G_FMT and it implements DECODER_INiT. The other question is whether we can reuse the RESOLUTION_CHANGE event instead of having to add a new DECODER_INIT change. I think these two approaches are both feasible: a dedicated DECODER_INIT event is the equivalent of a "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think of it though I'm less inclined to overload the meaning of the events. We can use the same handling codepaths in the VDA, but that doesn't mean we have to make the V4L2 API overload the meanings of these two.
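The equivalence described above ("DECODER_INIT is RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded") could be expressed by letting both events funnel into one handler in DequeueEvents(). A sketch only: V4L2_EVENT_RESOLUTION_CHANGE is the platform-specific event code this CL already relies on, V4L2_EVENT_DECODER_INIT and HandleFormatChange() are hypothetical, and device_->Ioctl() is the V4L2Device wrapper from this CL.

  void V4L2VideoDecodeAccelerator::DequeueEvents() {
    struct v4l2_event ev;
    memset(&ev, 0, sizeof(ev));
    while (device_->Ioctl(VIDIOC_DQEVENT, &ev) == 0) {
      switch (ev.type) {
        case V4L2_EVENT_RESOLUTION_CHANGE:  // Real mid-stream change.
        case V4L2_EVENT_DECODER_INIT:       // Hypothetical "init done" signal.
          // In both cases VIDIOC_G_FMT on CAPTURE is now expected to succeed,
          // so one code path can retry G_FMT and (re)allocate output buffers.
          HandleFormatChange();
          break;
        default:
          DVLOG(1) << "DequeueEvents(): unhandled event type " << ev.type;
          break;
      }
      memset(&ev, 0, sizeof(ev));
    }
  }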
Can we split this CL into two? I think the changes to the V4L2VideoDevice interface are fairly noncontroversial and we should be able to put that into another CL and land that first. I have some unittests for V4L2VideoDevice that I'd like to land so it would be nice if that came first.
On 2014/03/18 20:15:22, sheu wrote: > On 2014/03/18 12:22:17, shivdasp wrote: > > I fixed the rendering helper to reallocate the textures and now it can handle > > PPB() twice. > > However there are two major issues that I see now: > > 1. Since we do a RES_CHANGE sequence, few of the > decoded-but-yet-to-be-rendered > > buffers may get lost since we call StopDevicePoll() which inturn calls > STREAMOFF > > which looses the buffers. 7 sub-tests fail on account of less than expected > > frames returned. > > If the frame has been decoded and returned to the VDA client, then calling > STREAMOFF on the queue may terminate the video decoder's access to the buffer, > but it should not be destroyed until the 3D context has released it -- it should > still be renderable. This is the point of the discussion we had above about the > tegrav4l2 library needing to track buffer lifetimes correctly for both the 3D > and the video decode stacks. > Sorry if I wasn't clear, I was talking about the decoded but yet to be DQBUF'ed buffers. The buffers were ready to be dequeued but a STREAMOFF call would get them dropped. > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > > frames and as we are not actually starting the stream from a I frame there > might > > be corruption. > > This is a more serious problem (possible corruption) since the stream does not restart (STREAMOFF for output plane is not called) so the decoder believes it is starting from a I frame which may not be the case. > > I think we might be better off with having a separate event for signalling the > > decoder initialization rather than using resolution change sequence. > > Add a new method DecoderInitTask() , which will be posted from DequeueEvents() > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > This should work with Exynos too since there will not be any DECODER_INIT > event > > dequeued on Exynos. > > Thoughts ? > > So my thought are that we have two concerns: (1) allowing for devices to signal > decoder init through the events system, and also (2) not breaking the existing > API for existing drivers and users. > > For the existing usecase in (2), clients would be expected to poll VIDIOC_G_FMT > on the CAPTURE queue until it succeeds, at which point the client knows that the > initialization has succeeded and decoding can commence. > yes we will still keep the sequence in DBI() as is so Exynos would stay unaffected. And if on Tegra the G_FMT succeeds earlier this can also be protected by checking the decoder_state_ in DecoderInitTask() which will no-op if decoder_state_ is already kDecoding. > For (1), the new wrinkle would be clients don't have to poll; the arrival of the > event just signals that the client can expect to succeed the next time > VIDIOC_G_FMT is called. > > So if we do it this way, we can just add the behavior where the arrival of a > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be correct > for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT is > not implemented), as well as a future where Exynos does not block on > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does not > block VIDIOC_G_FMT and it implements DECODER_INiT. 
> > > > The other question is whether we can reuse the RESOLUTION_CHANGE event instead > of having to add a new DECODER_INIT change. I think these two approaches are > both feasible: a dedicated DECODER_INIT event is the equivalent of a > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think of > it though I'm less inclined to overload the meaning of the events. We can use > the same handling codepaths in the VDA, but that doesn't mean we have to make > the V4L2 API overload the meanings of these two. Using RESOLUTION_CHANGE can work if we are not doing the G_FMT check. The issue is when we do both: decoder initialization is triggered through G_FMT in DBI() and again after we get the RESOLUTION_CHANGE event. I have tried the DecoderInitTask() approach above and it works fine in my initial tests. Let me test it more and check the Exynos behavior too; I can then upload another patchset. Pawel, would you agree to this approach? Any thoughts?
On 2014/03/18 12:22:17, shivdasp wrote: > On 2014/03/18 06:15:28, Pawel Osciak wrote: > > On 2014/03/18 06:08:49, shivdasp wrote: > > > On 2014/03/18 05:49:59, Pawel Osciak wrote: > > > > On 2014/03/14 04:48:28, shivdasp wrote: > > > > > Pawel, > > > > > How do you suggest we go ahead now ? > > > > > Having device poll thread start early and using the resolution change > > event > > > > > seems breaking the vdatest. > > > > > Could you re-consider the patchset#7 for now (will address sheu's > > comments) > > > > that > > > > > has a not-so-good fix of reposting the task which will happen only on > > Tegra. > > > > We > > > > > anyways have a plan to use the decode initialization event and that will > > be > > > > the > > > > > neat way of doing this. > > > > > If you have any other ideas I can try them as well. > > > > > > > > Shivdas, > > > > Why are we doing ProvidePictureBuffers() twice? We should be seeing only > one > > > > resolution change (i.e. the initial change). > > > > We should rather fix vdatest than hack V4L2VDA, but I don't understand why > > > this > > > > becomes a resolution change event with us calling PPB() twice. > > > > Could you explain? > > > The first PPB() happens when the G_FMT succeeds in DBI() and thereby picture > > > buffers are requested. > > > The second PPB() happens because a RESOLUTION_CHANGE event is already > enqueued > > > underneath and it is dequeued() after picture buffers were already created. > So > > > there's no way to identify a real resolution change from the initial decoder > > > init request. > > > > Could we fix rendering helper to reallocate textures? > > I fixed the rendering helper to reallocate the textures and now it can handle > PPB() twice. > However there are two major issues that I see now: > 1. Since we do a RES_CHANGE sequence, few of the decoded-but-yet-to-be-rendered > buffers may get lost since we call StopDevicePoll() which inturn calls STREAMOFF > which looses the buffers. 7 sub-tests fail on account of less than expected > frames returned. Resolution change event should only be sent after all the output buffers from before the change have been returned. And we immediately send PIctureReady for them after dequeuing them, so destroy cannot happen before that. So the point of DestroyOutputBuffers(), the buffers have been dequeued and sent to rendering, although they might have not been rendered yet. This is fine, as we don't have to maintain ownership of them, because as we discussed before, there is shared ownership of the frames between renderer and codec. The standard resolution change scenario involves freeing/destroying buffers by codec, while the renderer still keeps them and only destroys them after it's done rendering (we will probably be already decoding into the new buffers while it still finishes the old ones). So John is right here and this is one of the tricky parts I was mentioning before when we discussed ownership. Can your stack handle this? > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > frames and as we are not actually starting the stream from a I frame there might > be corruption. As mentioned above, the resolution change event should only be sent after all the output buffers from the previous resolution were returned to the userspace from the codec (this has nothing to do with their rendering though). Since a resolution change can only happen in SPS or in a keyframe for VP8, there should be no keyframes required for decoding after the change. 
Also, as a side note, I don't think you can use frames in one resolution as reference for frames in a different resolution? > I think we might be better off with having a separate event for signalling the > decoder initialization rather than using resolution change sequence. Given the above, I don't see a reason for one... > Add a new method DecoderInitTask() , which will be posted from DequeueEvents() > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > This should work with Exynos too since there will not be any DECODER_INIT event > dequeued on Exynos. > Thoughts ? > > Thanks
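The contract described above maps onto the usual V4L2 memory-to-memory sequence on the CAPTURE queue once the event has been seen and all pre-change buffers have been dequeued; the OUTPUT (bitstream) queue keeps streaming untouched. A self-contained sketch using plain V4L2 ioctls (the fd, buffer count and memory type are placeholders, not necessarily what the VDA uses):

  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/videodev2.h>

  bool ReallocateCaptureQueue(int fd, unsigned int buffer_count) {
    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    if (ioctl(fd, VIDIOC_STREAMOFF, &type) != 0)
      return false;  // Drops codec ownership of the old CAPTURE buffers.

    struct v4l2_requestbuffers reqbufs;
    memset(&reqbufs, 0, sizeof(reqbufs));
    reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    reqbufs.memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_REQBUFS, &reqbufs) != 0)  // count == 0 frees them.
      return false;

    struct v4l2_format format;
    memset(&format, 0, sizeof(format));
    format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    if (ioctl(fd, VIDIOC_G_FMT, &format) != 0)  // New coded size is known now.
      return false;

    reqbufs.count = buffer_count;                  // Allocate for the new size;
    if (ioctl(fd, VIDIOC_REQBUFS, &reqbufs) != 0)  // the client then QBUFs and
      return false;                                // streams on again.
    return ioctl(fd, VIDIOC_STREAMON, &type) == 0;
  }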
On 2014/03/19 04:23:20, shivdasp wrote: > On 2014/03/18 20:15:22, sheu wrote: > > On 2014/03/18 12:22:17, shivdasp wrote: > > > I fixed the rendering helper to reallocate the textures and now it can > handle > > > PPB() twice. > > > However there are two major issues that I see now: > > > 1. Since we do a RES_CHANGE sequence, few of the > > decoded-but-yet-to-be-rendered > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > STREAMOFF > > > which looses the buffers. 7 sub-tests fail on account of less than expected > > > frames returned. > > > > If the frame has been decoded and returned to the VDA client, then calling > > STREAMOFF on the queue may terminate the video decoder's access to the buffer, > > but it should not be destroyed until the 3D context has released it -- it > should > > still be renderable. This is the point of the discussion we had above about > the > > tegrav4l2 library needing to track buffer lifetimes correctly for both the 3D > > and the video decode stacks. > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be DQBUF'ed > buffers. > The buffers were ready to be dequeued but a STREAMOFF call would get them > dropped. John is right here. Streamoff has nothing to do with the ownership of the textures by the rendering part. STREAMOFF only removes the ownership of the codec over them, but renderer keeps the textures alive and finishes rendering them. Once it's done, it frees the textures and everything gets cleaned up. Does your stack work this way? > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the reference > > > frames and as we are not actually starting the stream from a I frame there > > might > > > be corruption. > > > > This is a more serious problem (possible corruption) since the stream does not > restart (STREAMOFF for output plane is not called) > so the decoder believes it is starting from a I frame which may not be the case. > STREAMOFF is called at https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... I'm also not sure what do you mean by "stream restart"? The decoder has to be starting from an I frame, there can be no resolution change on a frame that is not a reference frame. When the decoder sends the resolution change event: - all the output buffers that were to be decoded before resolution change are ready and can be dequeued; once the client sees the event, it is supposed to dequeue and display all the output buffers and can be sure there will be no more until it reallocates the output queue; - the input queue is not to be touched and can operate in parallel without problems (i.e. it will keep already enqueued stream buffers and more can be queued at any time) - decoding is at an SPS for H264 (which means a new I-frame should follow it) or an I-frame for VP8 (the codec, when it sees an I-frame with a different resolution, is supposed to stop, return all the outputs from before that frame and only decode that I-frame with resolution change once it gets new buffers); further decoding will thus start from an I-frame always; > > > I think we might be better off with having a separate event for signalling > the > > > decoder initialization rather than using resolution change sequence. > > > Add a new method DecoderInitTask() , which will be posted from > DequeueEvents() > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? & > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? 
) > > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, else > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > This should work with Exynos too since there will not be any DECODER_INIT > > event > > > dequeued on Exynos. > > > Thoughts ? > > > > So my thought are that we have two concerns: (1) allowing for devices to > signal > > decoder init through the events system, and also (2) not breaking the existing > > API for existing drivers and users. > > > > For the existing usecase in (2), clients would be expected to poll > VIDIOC_G_FMT > > on the CAPTURE queue until it succeeds, at which point the client knows that > the > > initialization has succeeded and decoding can commence. > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > unaffected. > And if on Tegra the G_FMT succeeds earlier this can also be protected by > checking the decoder_state_ > in DecoderInitTask() which will no-op if decoder_state_ is already kDecoding. > > > For (1), the new wrinkle would be clients don't have to poll; the arrival of > the > > event just signals that the client can expect to succeed the next time > > VIDIOC_G_FMT is called. > > > > So if we do it this way, we can just add the behavior where the arrival of a > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > correct > > for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT > is > > not implemented), as well as a future where Exynos does not block on > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does not > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event instead > > of having to add a new DECODER_INIT change. I think these two approaches are > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think of > > it though I'm less inclined to overload the meaning of the events. We can use > > the same handling codepaths in the VDA, but that doesn't mean we have to make > > the V4L2 API overload the meanings of these two. > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The issue > is when > we do both. Decoder initialization through G_FMT trigger in DBI() and again > after we get > the RESOLUTION_CHANGE event. > I have tried this above DecoderInitTasks() and it works fine in my initial > tests. > Let me test it more and check for Exynos behavior too, I can then upload another > patchset. > Pawel, would you agree to this approach ? Any thoughts ?
On 2014/03/19 05:44:09, Pawel Osciak wrote: > On 2014/03/19 04:23:20, shivdasp wrote: > > On 2014/03/18 20:15:22, sheu wrote: > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > I fixed the rendering helper to reallocate the textures and now it can > > handle > > > > PPB() twice. > > > > However there are two major issues that I see now: > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > decoded-but-yet-to-be-rendered > > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > > STREAMOFF > > > > which looses the buffers. 7 sub-tests fail on account of less than > expected > > > > frames returned. > > > > > > If the frame has been decoded and returned to the VDA client, then calling > > > STREAMOFF on the queue may terminate the video decoder's access to the > buffer, > > > but it should not be destroyed until the 3D context has released it -- it > > should > > > still be renderable. This is the point of the discussion we had above about > > the > > > tegrav4l2 library needing to track buffer lifetimes correctly for both the > 3D > > > and the video decode stacks. > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > DQBUF'ed > > buffers. > > The buffers were ready to be dequeued but a STREAMOFF call would get them > > dropped. > > John is right here. Streamoff has nothing to do with the ownership of the > textures by the rendering part. > STREAMOFF only removes the ownership of the codec over them, but renderer keeps > the textures alive and finishes rendering them. > Once it's done, it frees the textures and everything gets cleaned up. > > Does your stack work this way? > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > reference > > > > frames and as we are not actually starting the stream from a I frame there > > > might > > > > be corruption. > > > > > > This is a more serious problem (possible corruption) since the stream does not > > restart (STREAMOFF for output plane is not called) > > so the decoder believes it is starting from a I frame which may not be the > case. > > > > STREAMOFF is called at > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > I'm also not sure what do you mean by "stream restart"? > The decoder has to be starting from an I frame, there can be no resolution > change on a frame that is not a reference frame. > > When the decoder sends the resolution change event: > - all the output buffers that were to be decoded before resolution change are > ready and can be dequeued; once the client sees the event, it is supposed to > dequeue and display all the output buffers and can be sure there will be no more > until it reallocates the output queue; > - the input queue is not to be touched and can operate in parallel without > problems (i.e. it will keep already enqueued stream buffers and more can be > queued at any time) > - decoding is at an SPS for H264 (which means a new I-frame should follow it) or > an I-frame for VP8 (the codec, when it sees an I-frame with a different > resolution, is supposed to stop, return all the outputs from before that frame > and only decode that I-frame with resolution change once it gets new buffers); > further decoding will thus start from an I-frame always; > > > > > I think we might be better off with having a separate event for signalling > > the > > > > decoder initialization rather than using resolution change sequence. 
> > > > Add a new method DecoderInitTask() , which will be posted from > > DequeueEvents() > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name ?? > & > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > > > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, > else > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > This should work with Exynos too since there will not be any DECODER_INIT > > > event > > > > dequeued on Exynos. > > > > Thoughts ? > > > > > > So my thought are that we have two concerns: (1) allowing for devices to > > signal > > > decoder init through the events system, and also (2) not breaking the > existing > > > API for existing drivers and users. > > > > > > For the existing usecase in (2), clients would be expected to poll > > VIDIOC_G_FMT > > > on the CAPTURE queue until it succeeds, at which point the client knows that > > the > > > initialization has succeeded and decoding can commence. > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > unaffected. > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > checking the decoder_state_ > > in DecoderInitTask() which will no-op if decoder_state_ is already kDecoding. > > > > > For (1), the new wrinkle would be clients don't have to poll; the arrival of > > the > > > event just signals that the client can expect to succeed the next time > > > VIDIOC_G_FMT is called. > > > > > > So if we do it this way, we can just add the behavior where the arrival of a > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > > correct > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and DECODER_INIT > > is > > > not implemented), as well as a future where Exynos does not block on > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does > not > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > instead > > > of having to add a new DECODER_INIT change. I think these two approaches > are > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I think > of > > > it though I'm less inclined to overload the meaning of the events. We can > use > > > the same handling codepaths in the VDA, but that doesn't mean we have to > make > > > the V4L2 API overload the meanings of these two. > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The > issue > > is when > > we do both. Decoder initialization through G_FMT trigger in DBI() and again > > after we get > > the RESOLUTION_CHANGE event. > > I have tried this above DecoderInitTasks() and it works fine in my initial > > tests. > > Let me test it more and check for Exynos behavior too, I can then upload > another > > patchset. > > Pawel, would you agree to this approach ? Any thoughts ? Hi Pawel, The handling of resolution change event is done exactly as you have described above in our stack. However my concern is about our plan of enqueuing the resolution change event to signal that the decoder initialization has happened. 
So in a "real" resolution change event everything will happen correctly, i.e after all the bitstream buffers are processed , the resolution change event is generated which then frees and reallocates the yuv (output) buffers correctly. However if we were already initialized (G_FMT happened to succeed in DBI() ) but since the underlying stack has enqueued a resolution change event (in case the client did not get G_FMT to succeed before emptying all the input buffers), the resolution change sequence will release the output buffers and since the bitstream is not going to be restarted on a SPS or keyframe the decoder will not have correct references. And that's why I am suggesting to not use resolution change event but rather a separate DECODER_INIT event. Would you like to have a conf call to discuss this so that we can close on it quickly ? Thanks,
On 2014/03/19 06:17:55, shivdasp wrote: > On 2014/03/19 05:44:09, Pawel Osciak wrote: > > On 2014/03/19 04:23:20, shivdasp wrote: > > > On 2014/03/18 20:15:22, sheu wrote: > > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > > I fixed the rendering helper to reallocate the textures and now it can > > > handle > > > > > PPB() twice. > > > > > However there are two major issues that I see now: > > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > > decoded-but-yet-to-be-rendered > > > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > > > STREAMOFF > > > > > which looses the buffers. 7 sub-tests fail on account of less than > > expected > > > > > frames returned. > > > > > > > > If the frame has been decoded and returned to the VDA client, then calling > > > > STREAMOFF on the queue may terminate the video decoder's access to the > > buffer, > > > > but it should not be destroyed until the 3D context has released it -- it > > > should > > > > still be renderable. This is the point of the discussion we had above > about > > > the > > > > tegrav4l2 library needing to track buffer lifetimes correctly for both the > > 3D > > > > and the video decode stacks. > > > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > > DQBUF'ed > > > buffers. > > > The buffers were ready to be dequeued but a STREAMOFF call would get them > > > dropped. > > > > John is right here. Streamoff has nothing to do with the ownership of the > > textures by the rendering part. > > STREAMOFF only removes the ownership of the codec over them, but renderer > keeps > > the textures alive and finishes rendering them. > > Once it's done, it frees the textures and everything gets cleaned up. > > > > Does your stack work this way? > > > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > > reference > > > > > frames and as we are not actually starting the stream from a I frame > there > > > > might > > > > > be corruption. > > > > > > > > This is a more serious problem (possible corruption) since the stream does > not > > > restart (STREAMOFF for output plane is not called) > > > so the decoder believes it is starting from a I frame which may not be the > > case. > > > > > > > STREAMOFF is called at > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > > > I'm also not sure what do you mean by "stream restart"? > > The decoder has to be starting from an I frame, there can be no resolution > > change on a frame that is not a reference frame. > > > > When the decoder sends the resolution change event: > > - all the output buffers that were to be decoded before resolution change are > > ready and can be dequeued; once the client sees the event, it is supposed to > > dequeue and display all the output buffers and can be sure there will be no > more > > until it reallocates the output queue; > > - the input queue is not to be touched and can operate in parallel without > > problems (i.e. 
it will keep already enqueued stream buffers and more can be > > queued at any time) > > - decoding is at an SPS for H264 (which means a new I-frame should follow it) > or > > an I-frame for VP8 (the codec, when it sees an I-frame with a different > > resolution, is supposed to stop, return all the outputs from before that frame > > and only decode that I-frame with resolution change once it gets new buffers); > > further decoding will thus start from an I-frame always; > > > > > > > I think we might be better off with having a separate event for > signalling > > > the > > > > > decoder initialization rather than using resolution change sequence. > > > > > Add a new method DecoderInitTask() , which will be posted from > > > DequeueEvents() > > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better name > ?? > > & > > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? ) > > > > > DecoderInitTask(), does nothing if decoder_state_ is already kDecoding, > > else > > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > > > This should work with Exynos too since there will not be any > DECODER_INIT > > > > event > > > > > dequeued on Exynos. > > > > > Thoughts ? > > > > > > > > So my thought are that we have two concerns: (1) allowing for devices to > > > signal > > > > decoder init through the events system, and also (2) not breaking the > > existing > > > > API for existing drivers and users. > > > > > > > > For the existing usecase in (2), clients would be expected to poll > > > VIDIOC_G_FMT > > > > on the CAPTURE queue until it succeeds, at which point the client knows > that > > > the > > > > initialization has succeeded and decoding can commence. > > > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > > unaffected. > > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > > checking the decoder_state_ > > > in DecoderInitTask() which will no-op if decoder_state_ is already > kDecoding. > > > > > > > For (1), the new wrinkle would be clients don't have to poll; the arrival > of > > > the > > > > event just signals that the client can expect to succeed the next time > > > > VIDIOC_G_FMT is called. > > > > > > > > So if we do it this way, we can just add the behavior where the arrival of > a > > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > > > correct > > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and > DECODER_INIT > > > is > > > > not implemented), as well as a future where Exynos does not block on > > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib does > > not > > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > > instead > > > > of having to add a new DECODER_INIT change. I think these two approaches > > are > > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I > think > > of > > > > it though I'm less inclined to overload the meaning of the events. We can > > use > > > > the same handling codepaths in the VDA, but that doesn't mean we have to > > make > > > > the V4L2 API overload the meanings of these two. > > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The > > issue > > > is when > > > we do both. 
Decoder initialization through G_FMT trigger in DBI() and again > > > after we get > > > the RESOLUTION_CHANGE event. > > > I have tried this above DecoderInitTasks() and it works fine in my initial > > > tests. > > > Let me test it more and check for Exynos behavior too, I can then upload > > another > > > patchset. > > > Pawel, would you agree to this approach ? Any thoughts ? > > Hi Pawel, > > The handling of resolution change event is done exactly as you have described > above in our stack. > However my concern is about our plan of enqueuing the resolution change event to > signal > that the decoder initialization has happened. > So in a "real" resolution change event everything will happen correctly, i.e > after all the bitstream buffers > are processed , the resolution change event is generated which then frees and > reallocates the yuv (output) buffers correctly. > However if we were already initialized (G_FMT happened to succeed in DBI() ) but > since the underlying stack has enqueued a > resolution change event (in case the client did not get G_FMT to succeed before > emptying all the input buffers), the > resolution change sequence will release the output buffers and since the > bitstream is not going to be restarted > on a SPS or keyframe the decoder will not have correct references. > And that's why I am suggesting to not use resolution change event but rather a > separate DECODER_INIT event. I think we have a misunderstanding here, we should not have a G_FMT call in DBI at all. I think this should solve the problem? > Would you like to have a conf call to discuss this so that we can close on it > quickly ? Sure.
On 2014/03/19 06:21:45, Pawel Osciak wrote: > On 2014/03/19 06:17:55, shivdasp wrote: > > On 2014/03/19 05:44:09, Pawel Osciak wrote: > > > On 2014/03/19 04:23:20, shivdasp wrote: > > > > On 2014/03/18 20:15:22, sheu wrote: > > > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > > > I fixed the rendering helper to reallocate the textures and now it can > > > > handle > > > > > > PPB() twice. > > > > > > However there are two major issues that I see now: > > > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > > > decoded-but-yet-to-be-rendered > > > > > > buffers may get lost since we call StopDevicePoll() which inturn calls > > > > > STREAMOFF > > > > > > which looses the buffers. 7 sub-tests fail on account of less than > > > expected > > > > > > frames returned. > > > > > > > > > > If the frame has been decoded and returned to the VDA client, then > calling > > > > > STREAMOFF on the queue may terminate the video decoder's access to the > > > buffer, > > > > > but it should not be destroyed until the 3D context has released it -- > it > > > > should > > > > > still be renderable. This is the point of the discussion we had above > > about > > > > the > > > > > tegrav4l2 library needing to track buffer lifetimes correctly for both > the > > > 3D > > > > > and the video decode stacks. > > > > > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > > > DQBUF'ed > > > > buffers. > > > > The buffers were ready to be dequeued but a STREAMOFF call would get them > > > > dropped. > > > > > > John is right here. Streamoff has nothing to do with the ownership of the > > > textures by the rendering part. > > > STREAMOFF only removes the ownership of the codec over them, but renderer > > keeps > > > the textures alive and finishes rendering them. > > > Once it's done, it frees the textures and everything gets cleaned up. > > > > > > Does your stack work this way? > > > > > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > > > reference > > > > > > frames and as we are not actually starting the stream from a I frame > > there > > > > > might > > > > > > be corruption. > > > > > > > > > > This is a more serious problem (possible corruption) since the stream does > > not > > > > restart (STREAMOFF for output plane is not called) > > > > so the decoder believes it is starting from a I frame which may not be the > > > case. > > > > > > > > > > STREAMOFF is called at > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > > > > > I'm also not sure what do you mean by "stream restart"? > > > The decoder has to be starting from an I frame, there can be no resolution > > > change on a frame that is not a reference frame. > > > > > > When the decoder sends the resolution change event: > > > - all the output buffers that were to be decoded before resolution change > are > > > ready and can be dequeued; once the client sees the event, it is supposed to > > > dequeue and display all the output buffers and can be sure there will be no > > more > > > until it reallocates the output queue; > > > - the input queue is not to be touched and can operate in parallel without > > > problems (i.e. 
it will keep already enqueued stream buffers and more can be > > > queued at any time) > > > - decoding is at an SPS for H264 (which means a new I-frame should follow > it) > > or > > > an I-frame for VP8 (the codec, when it sees an I-frame with a different > > > resolution, is supposed to stop, return all the outputs from before that > frame > > > and only decode that I-frame with resolution change once it gets new > buffers); > > > further decoding will thus start from an I-frame always; > > > > > > > > > I think we might be better off with having a separate event for > > signalling > > > > the > > > > > > decoder initialization rather than using resolution change sequence. > > > > > > Add a new method DecoderInitTask() , which will be posted from > > > > DequeueEvents() > > > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better > name > > ?? > > > & > > > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 ?? > ) > > > > > > DecoderInitTask(), does nothing if decoder_state_ is already > kDecoding, > > > else > > > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > > > > > This should work with Exynos too since there will not be any > > DECODER_INIT > > > > > event > > > > > > dequeued on Exynos. > > > > > > Thoughts ? > > > > > > > > > > So my thought are that we have two concerns: (1) allowing for devices to > > > > signal > > > > > decoder init through the events system, and also (2) not breaking the > > > existing > > > > > API for existing drivers and users. > > > > > > > > > > For the existing usecase in (2), clients would be expected to poll > > > > VIDIOC_G_FMT > > > > > on the CAPTURE queue until it succeeds, at which point the client knows > > that > > > > the > > > > > initialization has succeeded and decoding can commence. > > > > > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > > > unaffected. > > > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > > > checking the decoder_state_ > > > > in DecoderInitTask() which will no-op if decoder_state_ is already > > kDecoding. > > > > > > > > > For (1), the new wrinkle would be clients don't have to poll; the > arrival > > of > > > > the > > > > > event just signals that the client can expect to succeed the next time > > > > > VIDIOC_G_FMT is called. > > > > > > > > > > So if we do it this way, we can just add the behavior where the arrival > of > > a > > > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would be > > > > correct > > > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and > > DECODER_INIT > > > > is > > > > > not implemented), as well as a future where Exynos does not block on > > > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib > does > > > not > > > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > > > instead > > > > > of having to add a new DECODER_INIT change. I think these two > approaches > > > are > > > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. Now that I > > think > > > of > > > > > it though I'm less inclined to overload the meaning of the events. 
We > can > > > use > > > > > the same handling codepaths in the VDA, but that doesn't mean we have to > > > make > > > > > the V4L2 API overload the meanings of these two. > > > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. The > > > issue > > > > is when > > > > we do both. Decoder initialization through G_FMT trigger in DBI() and > again > > > > after we get > > > > the RESOLUTION_CHANGE event. > > > > I have tried this above DecoderInitTasks() and it works fine in my initial > > > > tests. > > > > Let me test it more and check for Exynos behavior too, I can then upload > > > another > > > > patchset. > > > > Pawel, would you agree to this approach ? Any thoughts ? > > > > Hi Pawel, > > > > The handling of resolution change event is done exactly as you have described > > above in our stack. > > However my concern is about our plan of enqueuing the resolution change event > to > > signal > > that the decoder initialization has happened. > > So in a "real" resolution change event everything will happen correctly, i.e > > after all the bitstream buffers > > are processed , the resolution change event is generated which then frees and > > reallocates the yuv (output) buffers correctly. > > However if we were already initialized (G_FMT happened to succeed in DBI() ) > but > > since the underlying stack has enqueued a > > resolution change event (in case the client did not get G_FMT to succeed > before > > emptying all the input buffers), the > > resolution change sequence will release the output buffers and since the > > bitstream is not going to be restarted > > on a SPS or keyframe the decoder will not have correct references. > > And that's why I am suggesting to not use resolution change event but rather a > > separate DECODER_INIT event. > > I think we have a misunderstanding here, we should not have a G_FMT call in DBI > at all. I think this should solve the problem? Yes I think that solves the problem. As discussed in yesterday over the call, GetFormatInfo() and CreateBuffersForFormat() will be removed from DBI(). This will ensure that the decoder initialization will be triggered only when RESOLUTION_CHANGE event is dequeued. This will temporarily break Exynos until Exynos V4L2 driver also enqueues a RESOLUTION_CHANGE to signal decoder initialization. > > > Would you like to have a conf call to discuss this so that we can close on it > > quickly ? > > Sure.
On 2014/03/20 04:44:16, shivdasp wrote: > On 2014/03/19 06:21:45, Pawel Osciak wrote: > > On 2014/03/19 06:17:55, shivdasp wrote: > > > On 2014/03/19 05:44:09, Pawel Osciak wrote: > > > > On 2014/03/19 04:23:20, shivdasp wrote: > > > > > On 2014/03/18 20:15:22, sheu wrote: > > > > > > On 2014/03/18 12:22:17, shivdasp wrote: > > > > > > > I fixed the rendering helper to reallocate the textures and now it > can > > > > > handle > > > > > > > PPB() twice. > > > > > > > However there are two major issues that I see now: > > > > > > > 1. Since we do a RES_CHANGE sequence, few of the > > > > > > decoded-but-yet-to-be-rendered > > > > > > > buffers may get lost since we call StopDevicePoll() which inturn > calls > > > > > > STREAMOFF > > > > > > > which looses the buffers. 7 sub-tests fail on account of less than > > > > expected > > > > > > > frames returned. > > > > > > > > > > > > If the frame has been decoded and returned to the VDA client, then > > calling > > > > > > STREAMOFF on the queue may terminate the video decoder's access to the > > > > buffer, > > > > > > but it should not be destroyed until the 3D context has released it -- > > it > > > > > should > > > > > > still be renderable. This is the point of the discussion we had above > > > about > > > > > the > > > > > > tegrav4l2 library needing to track buffer lifetimes correctly for both > > the > > > > 3D > > > > > > and the video decode stacks. > > > > > > > > > > > Sorry if I wasn't clear, I was talking about the decoded but yet to be > > > > DQBUF'ed > > > > > buffers. > > > > > The buffers were ready to be dequeued but a STREAMOFF call would get > them > > > > > dropped. > > > > > > > > John is right here. Streamoff has nothing to do with the ownership of the > > > > textures by the rendering part. > > > > STREAMOFF only removes the ownership of the codec over them, but renderer > > > keeps > > > > the textures alive and finishes rendering them. > > > > Once it's done, it frees the textures and everything gets cleaned up. > > > > > > > > Does your stack work this way? > > > > > > > > > > > 2. Relatedly, since we DestroyOutputBuffers(), we might loose the > > > > reference > > > > > > > frames and as we are not actually starting the stream from a I frame > > > there > > > > > > might > > > > > > > be corruption. > > > > > > > > > > > > This is a more serious problem (possible corruption) since the stream > does > > > not > > > > > restart (STREAMOFF for output plane is not called) > > > > > so the decoder believes it is starting from a I frame which may not be > the > > > > case. > > > > > > > > > > > > > STREAMOFF is called at > > > > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu.... > > > > > > > > I'm also not sure what do you mean by "stream restart"? > > > > The decoder has to be starting from an I frame, there can be no resolution > > > > change on a frame that is not a reference frame. > > > > > > > > When the decoder sends the resolution change event: > > > > - all the output buffers that were to be decoded before resolution change > > are > > > > ready and can be dequeued; once the client sees the event, it is supposed > to > > > > dequeue and display all the output buffers and can be sure there will be > no > > > more > > > > until it reallocates the output queue; > > > > - the input queue is not to be touched and can operate in parallel without > > > > problems (i.e. 
it will keep already enqueued stream buffers and more can > be > > > > queued at any time) > > > > - decoding is at an SPS for H264 (which means a new I-frame should follow > > it) > > > or > > > > an I-frame for VP8 (the codec, when it sees an I-frame with a different > > > > resolution, is supposed to stop, return all the outputs from before that > > frame > > > > and only decode that I-frame with resolution change once it gets new > > buffers); > > > > further decoding will thus start from an I-frame always; > > > > > > > > > > > I think we might be better off with having a separate event for > > > signalling > > > > > the > > > > > > > decoder initialization rather than using resolution change sequence. > > > > > > > Add a new method DecoderInitTask() , which will be posted from > > > > > DequeueEvents() > > > > > > > when the event dequeued is of type V4L2_EVENT_DECODER_INIT (better > > name > > > ?? > > > > & > > > > > > > would need to add another enum like RESOLUTION_CHANGE with value 6 > ?? > > ) > > > > > > > DecoderInitTask(), does nothing if decoder_state_ is already > > kDecoding, > > > > else > > > > > > > GetFormatInfo() and CreateBuffersForFormat() and moves to kDecoding. > > > > > > > > > > > > > > This should work with Exynos too since there will not be any > > > DECODER_INIT > > > > > > event > > > > > > > dequeued on Exynos. > > > > > > > Thoughts ? > > > > > > > > > > > > So my thought are that we have two concerns: (1) allowing for devices > to > > > > > signal > > > > > > decoder init through the events system, and also (2) not breaking the > > > > existing > > > > > > API for existing drivers and users. > > > > > > > > > > > > For the existing usecase in (2), clients would be expected to poll > > > > > VIDIOC_G_FMT > > > > > > on the CAPTURE queue until it succeeds, at which point the client > knows > > > that > > > > > the > > > > > > initialization has succeeded and decoding can commence. > > > > > > > > > > > yes we will still keep the sequence in DBI() as is so Exynos would stay > > > > > unaffected. > > > > > And if on Tegra the G_FMT succeeds earlier this can also be protected by > > > > > checking the decoder_state_ > > > > > in DecoderInitTask() which will no-op if decoder_state_ is already > > > kDecoding. > > > > > > > > > > > For (1), the new wrinkle would be clients don't have to poll; the > > arrival > > > of > > > > > the > > > > > > event just signals that the client can expect to succeed the next time > > > > > > VIDIOC_G_FMT is called. > > > > > > > > > > > > So if we do it this way, we can just add the behavior where the > arrival > > of > > > a > > > > > > DECODER_INIT event tells V4L2VDA to re-try VIDIOC_G_FMT. This would > be > > > > > correct > > > > > > for the current Exynos case where VIDIOC_G_FMT is blocking (and > > > DECODER_INIT > > > > > is > > > > > > not implemented), as well as a future where Exynos does not block on > > > > > > VIDIOC_G_FMT and DECODER_INIT is implemented, and where TegraV4L2lib > > does > > > > not > > > > > > block VIDIOC_G_FMT and it implements DECODER_INiT. > > > > > > > > > > > > > > > > > > > > > > > > The other question is whether we can reuse the RESOLUTION_CHANGE event > > > > instead > > > > > > of having to add a new DECODER_INIT change. I think these two > > approaches > > > > are > > > > > > both feasible: a dedicated DECODER_INIT event is the equivalent of a > > > > > > "RESOLUTION_CHANGE && !vidioc_g_fmt_has_succeeded" logic. 
Now that I > > > think > > > > of > > > > > > it though I'm less inclined to overload the meaning of the events. We > > can > > > > use > > > > > > the same handling codepaths in the VDA, but that doesn't mean we have > to > > > > make > > > > > > the V4L2 API overload the meanings of these two. > > > > > Using the RESOLUTION_CHANGE can work if we are not doing G_FMT check. > The > > > > issue > > > > > is when > > > > > we do both. Decoder initialization through G_FMT trigger in DBI() and > > again > > > > > after we get > > > > > the RESOLUTION_CHANGE event. > > > > > I have tried this above DecoderInitTasks() and it works fine in my > initial > > > > > tests. > > > > > Let me test it more and check for Exynos behavior too, I can then upload > > > > another > > > > > patchset. > > > > > Pawel, would you agree to this approach ? Any thoughts ? > > > > > > Hi Pawel, > > > > > > The handling of resolution change event is done exactly as you have > described > > > above in our stack. > > > However my concern is about our plan of enqueuing the resolution change > event > > to > > > signal > > > that the decoder initialization has happened. > > > So in a "real" resolution change event everything will happen correctly, i.e > > > after all the bitstream buffers > > > are processed , the resolution change event is generated which then frees > and > > > reallocates the yuv (output) buffers correctly. > > > However if we were already initialized (G_FMT happened to succeed in DBI() ) > > but > > > since the underlying stack has enqueued a > > > resolution change event (in case the client did not get G_FMT to succeed > > before > > > emptying all the input buffers), the > > > resolution change sequence will release the output buffers and since the > > > bitstream is not going to be restarted > > > on a SPS or keyframe the decoder will not have correct references. > > > And that's why I am suggesting to not use resolution change event but rather > a > > > separate DECODER_INIT event. > > > > I think we have a misunderstanding here, we should not have a G_FMT call in > DBI > > at all. I think this should solve the problem? > Yes I think that solves the problem. As discussed in yesterday over the call, > GetFormatInfo() and CreateBuffersForFormat() > will be removed from DBI(). This will ensure that the decoder initialization > will be triggered only when > RESOLUTION_CHANGE event is dequeued. > This will temporarily break Exynos until Exynos V4L2 driver also enqueues a > RESOLUTION_CHANGE to signal decoder initialization. Great, thank you. Please let me know if the EOS event also solves the EOS problem and whether it works with Exynos.
On 2014/03/20 05:29:41, Pawel Osciak wrote: > Great, thank you. Please let me know if the EOS event also solves the EOS > problem and whether it works with Exynos. I had a chat with Pawel at what was apparently 4:30 am in his timezone about the delayed initialization situation. I'm a little concerned about removing G_FMT completely from the DBI() sequence, since other drivers which don't support the resolution change event might rely on the behavior of being able to block the G_FMT call until the decoder initializes. I proposed that we keep the G_FMT call, but allow it to return failure error without blocking. We would also query G_FMT after a resolution change event, and when the event arrives we will check the returned format against previously returned formats, if any. If the format is the same, we can assume that this is an initialization event and/or no buffer reallocations are necessary, and skip the reallocation. Otherwise we will execute the resolution change, and in the case that this is an initialization event, create the buffers as appropriate. This should not require any changes to TegraVDA from the plan discussed above. G_FMT can be expected to fail if decoder init has not been completed (which should be the case anyways). What do you think?
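To make the proposed flow concrete, here is a rough sketch of how the event handling could look, assuming helper names in the spirit of the existing V4L2VDA code (GetFormatInfo(), frame_buffer_size_, StartResolutionChange()); the function name and exact signatures are illustrative, not the actual patch:

  // Hypothetical handler invoked when a resolution change event is dequeued.
  void V4L2VideoDecodeAccelerator::HandleResolutionChangeEvent() {
    struct v4l2_format format;
    bool again = false;
    // G_FMT may legitimately fail (non-blocking) if decoder init is not done yet.
    if (!GetFormatInfo(&format, &again) || again)
      return;

    gfx::Size new_size(base::checked_cast<int>(format.fmt.pix_mp.width),
                       base::checked_cast<int>(format.fmt.pix_mp.height));
    if (frame_buffer_size_.IsEmpty() || frame_buffer_size_ != new_size) {
      // First initialization or a real resolution change: (re)allocate the
      // output buffers for the new coded size.
      frame_buffer_size_ = new_size;
      StartResolutionChange();
    } else {
      // Same format as before: treat the event as an init notification only
      // and skip the reallocation.
      DVLOG(3) << "Output format unchanged, skipping resolution change";
    }
  }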
On 2014/03/20 21:30:00, sheu wrote:
> On 2014/03/20 05:29:41, Pawel Osciak wrote:
> > Great, thank you. Please let me know if the EOS event also solves the EOS
> > problem and whether it works with Exynos.
>
> I had a chat with Pawel at what was apparently 4:30 am in his timezone about the
> delayed initialization situation. I'm a little concerned about removing G_FMT
> completely from the DBI() sequence, since other drivers which don't support the
> resolution change event might rely on the behavior of being able to block the
> G_FMT call until the decoder initializes.
>
> I proposed that we keep the G_FMT call, but allow it to return failure error
> without blocking. We would also query G_FMT after a resolution change event,
> and when the event arrives we will check the returned format against previously
> returned formats, if any. If the format is the same, we can assume that this is
> an initialization event and/or no buffer reallocations are necessary, and skip
> the reallocation. Otherwise we will execute the resolution change, and in the
> case that this is an initialization event, create the buffers as appropriate.
>
> This should not require any changes to TegraVDA from the plan discussed above.
> G_FMT can be expected to fail if decoder init has not been completed (which
> should be the case anyways). What do you think?
Hmmm. This should be fine. Would there be a case where the decoder detects a new SPS sequence but the format happens to be the same? I.e., not a real resolution change but a restart of the SPS (for example, concatenation of two different streams with the same stream parameters). I am not sure if this is a valid case, though.
Thanks
Addressed all the previous comments too, PTAL. Thanks https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:112: write_whitelist->push_back(kDevVicPath); This whitelisting is only temporary and a change to remove these altogether is up for review as part of another CL.
https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1031: } Pawel brought up the point that it might be a change to the number of reference frames/output buffers required. Can you also check V4L2_CID_MIN_BUFFERS_FOR_CAPTURE as well, and skip only if that is less than or equal to the current allocation count?
https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/g... content/common/gpu/media/video_decode_accelerator_unittest.cc:1549: errno = 0; VaapiVideoDecodeAccelerator seems to get by without explicitly dlopen()-ing here. Can you look into how they do it?
https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/s... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/680001/content/common/s... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:112: write_whitelist->push_back(kDevVicPath); On 2014/03/21 10:53:45, shivdasp wrote:
> This whitelisting is only temporary and a change to remove these altogether is
> up for review as part of another CL.
I'd be inclined not to add whitelisting, even if temporary, if at all possible.
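For context, a minimal sketch of the suggested V4L2_CID_MIN_BUFFERS_FOR_CAPTURE check, assuming the device_->Ioctl() wrapper from this CL; the function name, the output_buffer_map_ member and the kExtraOutputBuffers margin are placeholders, not the actual patch:

  // Returns true if the CAPTURE (output) buffer pool should be reallocated.
  bool V4L2VideoDecodeAccelerator::OutputBuffersNeedReallocation() {
    struct v4l2_control ctrl;
    memset(&ctrl, 0, sizeof(ctrl));
    ctrl.id = V4L2_CID_MIN_BUFFERS_FOR_CAPTURE;
    if (device_->Ioctl(VIDIOC_G_CTRL, &ctrl) != 0)
      return true;  // Cannot query the minimum; err on the side of reallocating.
    // Keep the current pool only if it still covers the decoder's minimum plus
    // the extra pictures handed out to the client.
    return output_buffer_map_.size() <
           static_cast<size_t>(ctrl.value) + kExtraOutputBuffers;
  }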
https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:112: write_whitelist->push_back(kDevVicPath); On 2014/03/21 23:19:54, sheu wrote:
> On 2014/03/21 10:53:45, shivdasp wrote:
> > This whitelisting is only temporary and a change to remove these altogether is
> > up for review as part of another CL.
>
> I'd be inclined not to add whitelisting, even if temporary, if at all possible.
There's a race between this CL and another one which removes the Tegra whitelisting altogether. I will remove this since the other CL is mostly ready to go.
Addressed John's comments, PTAL.

https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1031: } Done, please refer to patchset #9.
On 2014/03/21 23:19:54, sheu wrote:
> Pawel brought up the point that it might be a change to the number of reference
> frames/output buffers required. Can you also check
> V4L2_CID_MIN_BUFFERS_FOR_CAPTURE as well, and skip only if that is less than or
> equal to the current allocation count?
https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/680001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1549: errno = 0; It seems VaapiVDA is using some auto-gen mechanism to avoid the dlopen(). I am trying to understand it more, but I am afraid I might run into x86-only scripts. Do you have any pointers on how this autogen stuff works? It might help me do the change quicker.
On 2014/03/21 23:19:54, sheu wrote:
> VaapiVideoDecodeAccelerator seems to get by without explicitly dlopen()-ing
> here. Can you look into how they do it?
>
> Great, thank you. Please let me know if the EOS event also solves the EOS
> problem and whether it works with Exynos.
If I subscribe to the EOS event, I do get it in DequeueEvents() on Exynos. However, I don't think we really need to make that change in V4L2VDA immediately. I have added a fix in our stack to match the behavior so all buffers are returned.
Thanks
This is tested with all the cases for resolution change events. PTAL
I'll defer reviewing the sandboxing/library loading stuff to Jorge. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.cc:119: const EGLint* /*attrib*/, Ignoring attributes here and not in Exynos device, while doing the opposite with buffer_index argument is not a perfect solution. Also, how do you handle exportbuf ioctl? Do you ignore it? That's also not too great. Also, how does close() on those dmabuf fds work for you now? When you ignore the call, do you return -1? I guess that's what would make close() work. It's hard to come up with a good solution here, but we should at least try to minimize being confusing and. Always doing dmabuf export and ignoring fds, ignoring attributes, ignoring some arguments make it even harder to reason about things. We should instead abstract those operations and have them in V4L2Device, since they depend on the device anyway. But we should at least be explicit about this. So I'm thinking that we should pass only the buffer index to CreateEGLImage and move dmabuf exporting and handling in general to V4L2Device for Exynos. Then we'd also have to handle everything properly for destruction, but I feel that having a DestroyEGLImage() counterpart in V4L2Device is probably better than using V4L2Device::CreteEGLImage for creation, while destroying directly via eglDestroyImageKHR. Also keep in mind that CreateEGLImage is called on the ChildThread, instead of the decoder_thread_, but currently the decoder_thread_ is sleeping until AssignPictureBuffers is done, so we should be fine. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/tegra_v4l2_video_device.h:49: bool InitializeLibrarySymbols(); Methods should come before members. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:270: format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_NV12M; I think we discussed before that this should be fixed please... https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:745: current_format_ = format; Please move this to the CreateBuffersForFormat call to avoid having it in multiple places. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1007: // Check if we already have current_format_ set or this is an event s/or this/or if this/ https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1009: if ((current_format_.fmt.pix_mp.width == 0) || Since you are using the format only for size, I think you can use frame_buffer_size_ instead and drop current_format_. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... 
content/common/gpu/media/v4l2_video_decode_accelerator.cc:1013: } else if (IsResolutionChangeNecessary()) { Also, IsResolutionChangeNecessary() should handle the initialization case as well (i.e. return true if frame_buffer_size_.IsEmpty()) and this if/else shouldn't be needed here. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1951: NOTIFY_ERROR(PLATFORM_FAILURE); This means we will send NOTIFY_ERROR twice in case G_FMT failed. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1957: current_format_ = format; If you are doing this here, then there is no need to do all the format getting again in FinishResolutionChange. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1960: DVLOG(3) << "IsResolutionChangeNecessary(): Dropping resolution change"; No need for the else clause, just move it out please. https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.h:14: #include <linux/videodev2.h> Why did we not need this before? If this is for v4l2_format, we were already using it... https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:59: // This method is used to create the EglImage since each V4L2Device s/the EglImage/an EGLImage/ https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the s/format/may use a different method of acquiring one and associating it to the given texture/
https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:119: const EGLint* /*attrib*/, On 2014/03/25 08:21:08, Pawel Osciak wrote: > Ignoring attributes here and not in Exynos device, while doing the opposite with > buffer_index argument is not a perfect solution. Also, how do you handle > exportbuf ioctl? Do you ignore it? That's also not too great. This is primarily because of the eglImages are created. The attr is not required on Tegra. We cannot remove this argument altogether because on Exynos it is populated with certain fields which I will have to otherwise pass in to this function. Similarly buffer_index is ignored on Exynos but on Tegra we need to send it down to the library in UseEglImage() to associate with the correct picture buffer. > > Also, how does close() on those dmabuf fds work for you now? When you ignore the > call, do you return -1? I guess that's what would make close() work. EXPORTBUF ioctl actually does not populate anything since we do not support it. And yes you are right, we return -1 and that's how the close() is also handled in V4LVDA since it fd is closed only if it is not -1. > > It's hard to come up with a good solution here, but we should at least try to > minimize being confusing and. Always doing dmabuf export and ignoring fds, > ignoring attributes, ignoring some arguments make it even harder to reason about > things. > I think we need to add more interface functions in V4L2Device class to abstract this then. > We should instead abstract those operations and have them in V4L2Device, since > they depend on the device anyway. But we should at least be explicit about this. > > So I'm thinking that we should pass only the buffer index to CreateEGLImage and > move dmabuf exporting and handling in general to V4L2Device for Exynos. Then > we'd also have to handle everything properly for destruction, but I feel that > having a DestroyEGLImage() counterpart in V4L2Device is probably better than > using V4L2Device::CreteEGLImage for creation, while destroying directly via > eglDestroyImageKHR. Okay I will add more interface API to V4L2Device. > > Also keep in mind that CreateEGLImage is called on the ChildThread, instead of > the decoder_thread_, but currently the decoder_thread_ is sleeping until > AssignPictureBuffers is done, so we should be fine. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.h:49: bool InitializeLibrarySymbols(); On 2014/03/25 08:21:08, Pawel Osciak wrote: > Methods should come before members. Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1007: // Check if we already have current_format_ set or this is an event On 2014/03/25 08:21:08, Pawel Osciak wrote: > s/or this/or if this/ Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... 
content/common/gpu/media/v4l2_video_decode_accelerator.cc:1009: if ((current_format_.fmt.pix_mp.width == 0) || On 2014/03/25 08:21:08, Pawel Osciak wrote: > Since you are using the format only for size, I think you can use > frame_buffer_size_ instead and drop current_format_. Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1951: NOTIFY_ERROR(PLATFORM_FAILURE); Since GetFormatInfo() does NOTIFY_ERROR already I will remove it from here. On 2014/03/25 08:21:08, Pawel Osciak wrote: > This means we will send NOTIFY_ERROR twice in case G_FMT failed. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1960: DVLOG(3) << "IsResolutionChangeNecessary(): Dropping resolution change"; On 2014/03/25 08:21:08, Pawel Osciak wrote: > No need for the else clause, just move it out please. Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:14: #include <linux/videodev2.h> This was for v4l2_format but now if we use frame_buffer_size_ then current_format_ can go away and hence this too. On 2014/03/25 08:21:08, Pawel Osciak wrote: > Why did we not need this before? If this is for v4l2_format, we were already > using it... https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:59: // This method is used to create the EglImage since each V4L2Device On 2014/03/25 08:21:08, Pawel Osciak wrote: > s/the EglImage/an EGLImage/ Done. https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the On 2014/03/25 08:21:08, Pawel Osciak wrote: > s/format/may use a different method of acquiring one and associating it to the > given texture/ Done.
https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:119: const EGLint* /*attrib*/, On 2014/03/25 10:36:40, shivdasp wrote: > On 2014/03/25 08:21:08, Pawel Osciak wrote: > > Ignoring attributes here and not in Exynos device, while doing the opposite > with > > buffer_index argument is not a perfect solution. Also, how do you handle > > exportbuf ioctl? Do you ignore it? That's also not too great. > This is primarily because of the eglImages are created. The attr is not required > on > Tegra. We cannot remove this argument altogether because on Exynos it is > populated with certain fields which I will have to otherwise pass in to this > function. Similarly buffer_index is ignored on Exynos but on Tegra we need to > send it down to the library in UseEglImage() to associate with the correct > picture buffer. This is precisely why I'm suggesting attrs shouldn't be in the abstract interface, because they are not used on all platforms. Neither should the dmabufs handling code, including expbufs be in the platform-agnostic V4L2VDA class. That's why I'm suggesting moving all the attrs, dmabufs, etc. related code to CreateEglImage. You should be able to expbufs, fill in the attrs on Exynos and create EGLImages in the ExynosV4L2Device::CreateEglImage. The Tegra implementation would not need it. That's why I'm suggesting using buffer index argument on both platforms, and also size and num planes I would think. I think we should have something like: V4L2Device::CreateEGLImage(EGLDisplay egl_display, int v4l2_buffer_index, GLuint texture_id, gfx::Size size, size_t num_planes); > > > > Also, how does close() on those dmabuf fds work for you now? When you ignore > the > > call, do you return -1? I guess that's what would make close() work. > EXPORTBUF ioctl actually does not populate anything since we do not support it. > And yes you are right, we return -1 and that's how the close() is also handled > in V4LVDA since it fd is closed only if it is not -1. > > > > It's hard to come up with a good solution here, but we should at least try to > > minimize being confusing and. Always doing dmabuf export and ignoring fds, > > ignoring attributes, ignoring some arguments make it even harder to reason > about > > things. > > > I think we need to add more interface functions in V4L2Device class to abstract > this then. I think the only additional method would be DestroyEGLImage(EGLImage image). This and the new CreateEGLImage(). Of course the related data such as fds, and their relation to egl_images should be handled internally by each device. > > We should instead abstract those operations and have them in V4L2Device, since > > they depend on the device anyway. But we should at least be explicit about > this. > > > > So I'm thinking that we should pass only the buffer index to CreateEGLImage > and > > move dmabuf exporting and handling in general to V4L2Device for Exynos. Then > > we'd also have to handle everything properly for destruction, but I feel that > > having a DestroyEGLImage() counterpart in V4L2Device is probably better than > > using V4L2Device::CreteEGLImage for creation, while destroying directly via > > eglDestroyImageKHR. > Okay I will add more interface API to V4L2Device. 
> > > > Also keep in mind that CreateEGLImage is called on the ChildThread, instead of > > the decoder_thread_, but currently the decoder_thread_ is sleeping until > > AssignPictureBuffers is done, so we should be fine. >
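To summarize the direction being proposed above in code form, a sketch of what the device interface could end up looking like; the method set and parameter names are still under discussion here, so treat this as illustrative only (other members such as Ioctl() omitted):

  class V4L2Device {
   public:
    virtual ~V4L2Device() {}

    // Creates an EGLImage for the CAPTURE buffer at |buffer_index| and
    // associates it with |texture_id|. Platform details (dmabuf export and fd
    // handling on Exynos, the vendor library call on Tegra) stay inside the
    // implementation.
    virtual EGLImageKHR CreateEGLImage(EGLDisplay egl_display,
                                       GLuint texture_id,
                                       gfx::Size frame_buffer_size,
                                       unsigned int buffer_index,
                                       size_t planes_count) = 0;

    // Counterpart of CreateEGLImage(), kept for symmetry as discussed.
    virtual EGLBoolean DestroyEGLImage(EGLDisplay egl_display,
                                       EGLImageKHR egl_image) = 0;
  };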
shivdasp@: please coordinate with davidung@ who's working on https://chromiumcodereview.appspot.com/179983006/ for the sandbox changes. Thanks! https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/s... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/s... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); Is this really needed?
Jorge,
Yes, either of us will have to rebase if the other CL lands first.
Shivdas

https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); Yes this is needed for the video decode acceleration on Tegra.
On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote:
> Is this really needed?
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/26 02:54:25, shivdasp wrote: > Yes this is needed for the video decode acceleration on Tegra. I apologize, I wasn't clear. I meant "is this needed here in the sandbox whitelist". If David managed to preload *all* the other Nvidia .so's in his CL, I'm surprised this one cannot be preloaded before enabling the sandbox. > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote: > > Is this really needed? >
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); I get your point. I think all the libraries that David's CL preloaded were part of the graphics stack, which got loaded through GL initialization. We need this lib loaded here since it is not part of the graphics stack and is a wrapper for the multimedia (MM) stack.
On 2014/03/26 03:15:48, Jorge Lucangeli Obes wrote:
> On 2014/03/26 02:54:25, shivdasp wrote:
> > Yes this is needed for the video decode acceleration on Tegra.
>
> I apologize, I wasn't clear. I meant "is this needed here in the sandbox
> whitelist". If David managed to preload *all* the other Nvidia .so's in his CL,
> I'm surprised this one cannot be preloaded before enabling the sandbox.
>
> > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote:
> > > Is this really needed?
> >
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/26 03:34:29, shivdasp wrote: > I get your point. > I think all the libraries that David's CL preloaded were part of graphics stack > which got loaded through some gl initialization. > We need this lib loaded here here since it is not part of graphics stack and is > a wrapper for MM stack. > Thanks for the explanation. Then my only suggestion would be to check that video decode still works after David's change (not just that the change still applies, but that functionality works). > On 2014/03/26 03:15:48, Jorge Lucangeli Obes wrote: > > On 2014/03/26 02:54:25, shivdasp wrote: > > > Yes this is needed for the video decode acceleration on Tegra. > > > > I apologize, I wasn't clear. I meant "is this needed here in the sandbox > > whitelist". If David managed to preload *all* the other Nvidia .so's in his > CL, > > I'm surprised this one cannot be preloaded before enabling the sandbox. > > > > > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote: > > > > Is this really needed? > > > > > >
https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/740001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/26 16:47:24, Jorge Lucangeli Obes wrote: > On 2014/03/26 03:34:29, shivdasp wrote: > > I get your point. > > I think all the libraries that David's CL preloaded were part of graphics > stack > > which got loaded through some gl initialization. > > We need this lib loaded here here since it is not part of graphics stack and > is > > a wrapper for MM stack. > > > > Thanks for the explanation. Then my only suggestion would be to check that video > decode still works after David's change (not just that the change still applies, > but that functionality works). Yes I have tested with David's change too. Nevertheless I will have to rebase once his change lands so it will get tested once more. Thanks. > > > On 2014/03/26 03:15:48, Jorge Lucangeli Obes wrote: > > > On 2014/03/26 02:54:25, shivdasp wrote: > > > > Yes this is needed for the video decode acceleration on Tegra. > > > > > > I apologize, I wasn't clear. I meant "is this needed here in the sandbox > > > whitelist". If David managed to preload *all* the other Nvidia .so's in his > > CL, > > > I'm surprised this one cannot be preloaded before enabling the sandbox. > > > > > > > On 2014/03/25 21:15:14, Jorge Lucangeli Obes wrote: > > > > > Is this really needed? > > > > > > > > > >
Hi Pawel,
I tried to address your proposal of moving the device-specific EGL allocation into the respective V4L2Device classes, to the best of my understanding. Please have a look.
I also added two interfaces, GetCapturePixelFormat() and GetNumberOfPlanes(), to V4L2Device since these are the "expected" values from the driver and they define how the VDA behaves. Using G_FMT will only indicate what the driver has set.
These methods should also help avoid breaking either of the devices should the pixel format or number of planes change in the future.
Thanks
https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:8: #include <libdrm/drm_fourcc.h> Header ordering -- this one above. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:194: device_output_buffer_map_.clear(); This is really brittle. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.h:49: }; Since we already have the output buffers being tracked in the V4L2VDA, I think we could dispense with tracking them here altogether. The CreateEGLImage() call only needs: * EGLDisplay egl_display * GLuint texture_id * gfx::Size frame_buffer_size * unsigned int buffer_index It can call EXPBUF on the buffer_index, create the EGLImage from the DMABUF fd, and then close the fd immediately. The EGLImage is returned to the V4L2VDA and can be closed with the normal eglDestroyImageKHR call. We don't have to hold on to the fd or the EGLImage. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:79: virtual uint8 GetNumberOfPlanes() = 0; GetCapturePlaneCount()? (Something with "Capture" in it)
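A minimal sketch of that approach on the Exynos side, using VIDIOC_EXPBUF and the EGL_EXT_image_dma_buf_import path (needs <libdrm/drm_fourcc.h>, EGL/eglext.h and GLES2/gl2ext.h); the two-plane NV12 attribute list, the pitches and the texture target are illustrative assumptions, not the actual patch:

  EGLImageKHR ExynosV4L2Device::CreateEGLImage(EGLDisplay egl_display,
                                               GLuint texture_id,
                                               gfx::Size frame_buffer_size,
                                               unsigned int buffer_index) {
    // Export each CAPTURE plane of this buffer as a dmabuf fd.
    int fds[2] = {-1, -1};
    for (int plane = 0; plane < 2; ++plane) {
      struct v4l2_exportbuffer expbuf;
      memset(&expbuf, 0, sizeof(expbuf));
      expbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
      expbuf.index = buffer_index;
      expbuf.plane = plane;
      if (Ioctl(VIDIOC_EXPBUF, &expbuf) != 0)
        return EGL_NO_IMAGE_KHR;  // Error handling kept minimal for the sketch.
      fds[plane] = expbuf.fd;
    }

    EGLint attrs[] = {
        EGL_WIDTH, frame_buffer_size.width(),
        EGL_HEIGHT, frame_buffer_size.height(),
        EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_NV12,
        EGL_DMA_BUF_PLANE0_FD_EXT, fds[0],
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
        EGL_DMA_BUF_PLANE0_PITCH_EXT, frame_buffer_size.width(),
        EGL_DMA_BUF_PLANE1_FD_EXT, fds[1],
        EGL_DMA_BUF_PLANE1_OFFSET_EXT, 0,
        EGL_DMA_BUF_PLANE1_PITCH_EXT, frame_buffer_size.width(),
        EGL_NONE};
    EGLImageKHR egl_image = eglCreateImageKHR(
        egl_display, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT, NULL, attrs);

    // The GL driver dup()s the fds and the VDA keeps its reqbufs reference,
    // so the exported fds can be closed right away.
    close(fds[0]);
    close(fds[1]);

    if (egl_image != EGL_NO_IMAGE_KHR) {
      glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture_id);
      glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image);
    }
    return egl_image;
  }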
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:194: device_output_buffer_map_.clear(); Seems this is not needed now after reading your previous comment.
On 2014/03/27 02:00:23, sheu wrote:
> This is really brittle.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.h:49: }; On 2014/03/27 02:00:23, sheu wrote:
> Since we already have the output buffers being tracked in the V4L2VDA, I think
> we could dispense with tracking them here altogether. The CreateEGLImage() call
> only needs:
>
> * EGLDisplay egl_display
> * GLuint texture_id
> * gfx::Size frame_buffer_size
> * unsigned int buffer_index
>
> It can call EXPBUF on the buffer_index, create the EGLImage from the DMABUF fd,
> and then close the fd immediately. The EGLImage is returned to the V4L2VDA and
> can be closed with the normal eglDestroyImageKHR call. We don't have to hold on
> to the fd or the EGLImage.
Ohh, I did not know that the fd can be closed immediately here. All the DeviceOutputRecord and map business was there to make sure the correct fd gets closed during destruction, but it looks like this is not needed at all. All the code around device_output_buffer_map_ will also be removed. Thanks.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:79: virtual uint8 GetNumberOfPlanes() = 0; Alright, will "Capture" this in my next patchset.
On 2014/03/27 02:00:23, sheu wrote:
> GetCapturePlaneCount()? (Something with "Capture" in it)
https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the On 2014/03/25 10:36:40, shivdasp wrote: > On 2014/03/25 08:21:08, Pawel Osciak wrote: > > s/format/may use a different method of acquiring one and associating it to the > > given texture/ > > Done. If I may suggest please, it is an agreed practice in this review tool to respond "done" when uploading a new patch set that actually fixes the issue. This helps with missing some of the comments like in this case. The documentation still needs updating please. Here and in other places please. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.h:49: }; John is right. To add more detail: the owner of buffers allocated with reqbufs(mmap) holds a reference to each buffer (this is not a dmabuf fd-based reference). Expbuf creating dmabufs adds additional reference on the buffers after producing the fds. Passing the fds to EGLImage adds a third reference to each buffer (this is a second reference on the file), because the GL driver calls dup() on the passed fds. Closing the fds here immediately after EGLImage is created is ok, because the VDA retains the first, non-fd-based reference from the initial reqbufs, while the GL driver keeps its dup()ed one. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:991: if (IsResolutionChangeNecessary()) { resolution_change_pending_ = IsResolutionChangeNecessary(); https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1913: if ((static_cast<int>(format.fmt.pix_mp.width) != gfx::Size new_size(base::checked_cast<int>(format.fmt.pix_mp.width), base::checked_cast<int>(format.fmt.pix_mp.height)); if (frame_buffer_size_ != new_size) { ... Also, check if new_size isn't Empty. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; Documentation please. This should also state who is the owner responsible for destroying the images. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; Why do we need this? G_FMT tells us this. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:79: virtual uint8 GetNumberOfPlanes() = 0; G_FMT can also tell us this.
https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/740001/content/common/g... content/common/gpu/media/v4l2_video_device.h:60: // may have its own format. The texture_id is used to bind the Apologies. I missed it. Will make a note of this. On 2014/03/27 05:18:06, Pawel Osciak wrote: > On 2014/03/25 10:36:40, shivdasp wrote: > > On 2014/03/25 08:21:08, Pawel Osciak wrote: > > > s/format/may use a different method of acquiring one and associating it to > the > > > given texture/ > > > > Done. > > If I may suggest please, it is an agreed practice in this review tool to respond > "done" when uploading a new patch set that actually fixes the issue. This helps > with missing some of the comments like in this case. > > The documentation still needs updating please. Here and in other places please. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:991: if (IsResolutionChangeNecessary()) { On 2014/03/27 05:18:06, Pawel Osciak wrote: > resolution_change_pending_ = IsResolutionChangeNecessary(); Done. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1913: if ((static_cast<int>(format.fmt.pix_mp.width) != On 2014/03/27 05:18:06, Pawel Osciak wrote: > gfx::Size new_size(base::checked_cast<int>(format.fmt.pix_mp.width), > base::checked_cast<int>(format.fmt.pix_mp.height)); > > if (frame_buffer_size_ != new_size) { Will do. > ... > > Also, check if new_size isn't Empty. new_size will be populated since GetFormatInfo is successful, so do we need to check if it is still empty ? https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; This method is not required, since we will close the dmabuf fds while in createEGLImage() and eglDestroyImageKHR() is common to both the devices. In next patchset this will be gone. On 2014/03/27 05:18:06, Pawel Osciak wrote: > Documentation please. This should also state who is the owner responsible for > destroying the images. https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; My understanding is that we call S_FMT() in VDA with the pixelformat that we expect and is compatible with the device. This is checked when we do GetFormatInfo(). Similarly number of planes on the CAPTURE_PLANE are also device specific. I added these methods so that if Exynos or Tegra change their pixel format and number of planes in future, it should not break the other. On 2014/03/27 05:18:06, Pawel Osciak wrote: > Why do we need this? G_FMT tells us this.
https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... File content/common/gpu/media/v4l2_video_device.h (right): https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; On 2014/03/27 05:40:37, shivdasp wrote:
> This method is not required, since we will close the dmabuf fds while in
> createEGLImage() and eglDestroyImageKHR() is common to both the devices.
> In next patchset this will be gone.
I still feel we should have it even if only to call eglDestroyImageKHR directly, for symmetry.
https://chromiumcodereview.appspot.com/137023008/diff/810001/content/common/g... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 05:40:37, shivdasp wrote:
> My understanding is that we call S_FMT() in VDA with the pixelformat that we
> expect and is compatible with the device. This is checked when we do
> GetFormatInfo().
This is because we prefer that format for performance reasons. This should be made more dynamic, but I'm ok with not doing it for now.
Please rename to PreferredOutputFormat() and make it static.
> Similarly number of planes on the CAPTURE_PLANE are also device specific.
> I added these methods so that if Exynos or Tegra change their pixel format and
> number of planes in future, it should not break the other.
>
> On 2014/03/27 05:18:06, Pawel Osciak wrote:
> > Why do we need this? G_FMT tells us this.
>
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 05:58:35, Pawel Osciak wrote: > On 2014/03/27 05:40:37, shivdasp wrote: > > My understanding is that we call S_FMT() in VDA with the pixelformat that we > > expect and is compatible with the device. This is checked when we do > > GetFormatInfo(). > > This is because we prefer that format for performance reasons. This should made > more dynamic, but I'm ok with not doing it for now. > > Please rename to PreferredOutputFormat() and make it static. Sorry I didn't get it. I thought this method would be extended by the device specific class since it can be different. Do you mean having a static data members that are initialized to pixel format and number of planes in the ExynosV4L2Device and TegraV4L2Device ? > > > Similarly number of planes on the CAPTURE_PLANE are also device specific. > > I added these methods so that if Exynos or Tegra change their pixel format and > > number of planes in future, it should not break the other. > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > Why do we need this? G_FMT tells us this. > > >
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 06:51:46, shivdasp wrote: > On 2014/03/27 05:58:35, Pawel Osciak wrote: > > On 2014/03/27 05:40:37, shivdasp wrote: > > > My understanding is that we call S_FMT() in VDA with the pixelformat that we > > > expect and is compatible with the device. This is checked when we do > > > GetFormatInfo(). > > > > This is because we prefer that format for performance reasons. This should > made > > more dynamic, but I'm ok with not doing it for now. > > > > Please rename to PreferredOutputFormat() and make it static. > Sorry I didn't get it. > I thought this method would be extended by the device specific class since it > can be different. Do you mean having a static data members that are initialized > to pixel format and number of planes in the ExynosV4L2Device and TegraV4L2Device > ? Yeah sorry please ignore the static part. As I mentioned though, you shouldn't need a plane number getter. G_FMT returns this. > > > > > Similarly number of planes on the CAPTURE_PLANE are also device specific. > > > I added these methods so that if Exynos or Tegra change their pixel format > and > > > number of planes in future, it should not break the other. > > > > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > > Why do we need this? G_FMT tells us this. > > > > > >
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:70: virtual void DestroyEGLImage(unsigned int buffer_index) = 0; On 2014/03/27 05:58:35, Pawel Osciak wrote: > On 2014/03/27 05:40:37, shivdasp wrote: > > This method is not required, since we will close the dmabuf fds while in > > createEGLImage() and eglDestroyImageKHR() is common to both the devices. > > In next patchset this will be gone. > > I still feel we should have it even if only to call eglDestroyImageKHR directly, > for symmetry. IMO we could call it CreateEGLImageForBuffer(..., unsigned int index) and not have to shell out for the Destroy() part. For this particular bit I don't think symmetry is necessary. https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 06:59:56, Pawel Osciak wrote: > On 2014/03/27 06:51:46, shivdasp wrote: > > On 2014/03/27 05:58:35, Pawel Osciak wrote: > > > On 2014/03/27 05:40:37, shivdasp wrote: > > > > My understanding is that we call S_FMT() in VDA with the pixelformat that > we > > > > expect and is compatible with the device. This is checked when we do > > > > GetFormatInfo(). > > > > > > This is because we prefer that format for performance reasons. This should > > made > > > more dynamic, but I'm ok with not doing it for now. > > > > > > Please rename to PreferredOutputFormat() and make it static. > > Sorry I didn't get it. > > I thought this method would be extended by the device specific class since it > > can be different. Do you mean having a static data members that are > initialized > > to pixel format and number of planes in the ExynosV4L2Device and > TegraV4L2Device > > ? > > Yeah sorry please ignore the static part. > > As I mentioned though, you shouldn't need a plane number getter. G_FMT returns > this. > > > > > > > > Similarly number of planes on the CAPTURE_PLANE are also device specific. > > > > I added these methods so that if Exynos or Tegra change their pixel format > > and > > > > number of planes in future, it should not break the other. > > > > > > > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > > > Why do we need this? G_FMT tells us this. > > > > > > > > > > That's right actually. All we use the plane number getter is to DCHECK on it, and if we were DCHECKing on it we should rather be doing it in the device-specific V4L2VideoDevice implementation. And since we're going to hide the EGLImage creation logic inside the V4L2VideoDevice implementation, there's no reason to expose the plane count here.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 07:08:49, sheu wrote: > On 2014/03/27 06:59:56, Pawel Osciak wrote: > > On 2014/03/27 06:51:46, shivdasp wrote: > > > On 2014/03/27 05:58:35, Pawel Osciak wrote: > > > > On 2014/03/27 05:40:37, shivdasp wrote: > > > > > My understanding is that we call S_FMT() in VDA with the pixelformat > that > > we > > > > > expect and is compatible with the device. This is checked when we do > > > > > GetFormatInfo(). > > > > > > > > This is because we prefer that format for performance reasons. This should > > > made > > > > more dynamic, but I'm ok with not doing it for now. > > > > > > > > Please rename to PreferredOutputFormat() and make it static. > > > Sorry I didn't get it. > > > I thought this method would be extended by the device specific class since > it > > > can be different. Do you mean having a static data members that are > > initialized > > > to pixel format and number of planes in the ExynosV4L2Device and > > TegraV4L2Device > > > ? > > > > Yeah sorry please ignore the static part. > > > > As I mentioned though, you shouldn't need a plane number getter. G_FMT returns > > this. > > > > > > > > > > > Similarly number of planes on the CAPTURE_PLANE are also device > specific. > > > > > I added these methods so that if Exynos or Tegra change their pixel > format > > > and > > > > > number of planes in future, it should not break the other. > > > > > > > > > > > > > > > On 2014/03/27 05:18:06, Pawel Osciak wrote: > > > > > > Why do we need this? G_FMT tells us this. > > > > > > > > > > > > > > > > That's right actually. All we use the plane number getter is to DCHECK on it, > and if we were DCHECKing on it we should rather be doing it in the > device-specific V4L2VideoDevice implementation. And since we're going to hide > the EGLImage creation logic inside the V4L2VideoDevice implementation, there's > no reason to expose the plane count here. Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line 1145 too. Though this plane count is actually coming from output_record.fds it is actually device implementation specific. We populate the v4l2 struct arguments based on the num_planes that we expect by the underlying driver. So I think having a getter() will protect any such changes in device implementations. If you agree, I have made some changes to remove the fds from OutputRecord since it is not really needed and it simplifies things a bit, I will upload it in few mins.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi...
File content/common/gpu/media/v4l2_video_device.h (right):

https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi...
content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0;
> Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line 1145
> too. Though this plane count is actually coming from output_record.fds it is
> actually device implementation specific. We populate the v4l2 struct arguments
> based on the num_planes that we expect by the underlying driver.
> So I think having a getter() will protect any such changes in device
> implementations.

The number of planes should come from v4l2_pix_format_mplane.num_planes on G_FMT.
https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 GetCapturePixelFormat() = 0; On 2014/03/27 07:45:06, Pawel Osciak wrote: > > Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line 1145 > > too. Though this plane count is actually coming from output_record.fds it is > > actually device implementation specific. We populate the v4l2 struct arguments > > based on the num_planes that we expect by the underlying driver. > > So I think having a getter() will protect any such changes in device > > implementations. > > The number of planes should come from v4l2_pix_format_mplane.num_planes on > G_FMT. It does come from G_FMT but we have a CHECK_EQ in CreateBuffersForFormat() to compare it with constant 2. Instead of a constant 2 in VDA, I am trying to use a getter() from device specific implementation so that changes in either device should not break because of this CHECK. Please have a look at my next patchset.
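As a reference for the G_FMT approach discussed above, a minimal sketch of querying the CAPTURE plane count from the driver instead of hardcoding it. The raw ioctl() call and the helper name are placeholders; the real code would route this through the V4L2Device wrapper:

#include <linux/videodev2.h>
#include <string.h>
#include <sys/ioctl.h>

// Returns the number of CAPTURE planes reported by the driver via G_FMT, or
// 0 if the format is not available yet (e.g. the stream header has not been
// parsed and the caller should retry later). |device_fd| stands in for the
// already-open decoder fd.
size_t GetOutputPlanesCount(int device_fd) {
  struct v4l2_format format;
  memset(&format, 0, sizeof(format));
  format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  if (ioctl(device_fd, VIDIOC_G_FMT, &format) != 0)
    return 0;
  return format.fmt.pix_mp.num_planes;
}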
Addressed review comments, PTAL.
On 2014/03/27 07:54:54, shivdasp wrote: > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > File content/common/gpu/media/v4l2_video_device.h (right): > > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 > GetCapturePixelFormat() = 0; > On 2014/03/27 07:45:06, Pawel Osciak wrote: > > > Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line > 1145 > > > too. Though this plane count is actually coming from output_record.fds it is > > > actually device implementation specific. We populate the v4l2 struct > arguments > > > based on the num_planes that we expect by the underlying driver. > > > So I think having a getter() will protect any such changes in device > > > implementations. > > > > The number of planes should come from v4l2_pix_format_mplane.num_planes on > > G_FMT. > It does come from G_FMT but we have a CHECK_EQ in CreateBuffersForFormat() to > compare it with constant 2. Instead of a constant 2 in VDA, I am trying to use a > getter() from device specific implementation so that changes in either device > should not break because of this CHECK. Please have a look at my next patchset. I mentioned it before, but that DCHECK shouldn't be there and the class shouldn't have "2" hardcoded in multiple places. It should query and use what G_FMT returns.
On 2014/03/27 08:01:16, Pawel Osciak wrote: > On 2014/03/27 07:54:54, shivdasp wrote: > > > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > > File content/common/gpu/media/v4l2_video_device.h (right): > > > > > https://codereview.chromium.org/137023008/diff/810001/content/common/gpu/medi... > > content/common/gpu/media/v4l2_video_device.h:76: virtual uint32 > > GetCapturePixelFormat() = 0; > > On 2014/03/27 07:45:06, Pawel Osciak wrote: > > > > Apart from DCHECK , we use the plane count in EnqueueOutputRecord() line > > 1145 > > > > too. Though this plane count is actually coming from output_record.fds it > is > > > > actually device implementation specific. We populate the v4l2 struct > > arguments > > > > based on the num_planes that we expect by the underlying driver. > > > > So I think having a getter() will protect any such changes in device > > > > implementations. > > > > > > The number of planes should come from v4l2_pix_format_mplane.num_planes on > > > G_FMT. > > It does come from G_FMT but we have a CHECK_EQ in CreateBuffersForFormat() to > > compare it with constant 2. Instead of a constant 2 in VDA, I am trying to use > a > > getter() from device specific implementation so that changes in either device > > should not break because of this CHECK. Please have a look at my next > patchset. > > I mentioned it before, but that DCHECK shouldn't be there and the class > shouldn't have "2" hardcoded in multiple places. > It should query and use what G_FMT returns. Okay then I will remove the DCHECK. And to remove the hardcoding I guess will have to store the num_planes returned from G_FMT and use that instead.
Addressed review comments regarding num_planes.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; As I mentioned before, num_planes should be passed to this method and not hardcoded. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; If this fails on any other than the first fd, we will never close the ones we've already exported and leak. Please replace all close() calls in this method (including the ones on successful return) with a base::ScopedClosureRunner(): static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { for (size_t i = 0; i < dmabuf_fds->size(); ++i) close(dmabuf_fds->at(i)); } And in this method: linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, -1)); base::Closure dmabuf_fds_cb = base::Bind(&CloseDmabufFds, dmabuf_fds); base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); And here expbufs assigning to dmabuf_fds->at(i). https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; This shouldn't be hardcoded either actually... How about instead of num_planes, size, etc. we just pass v4l2_format to this method and extract all info from there? And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this DRM define for now.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; I think this is do-able quickly.Done. On 2014/03/27 09:09:50, Pawel Osciak wrote: > As I mentioned before, num_planes should be passed to this method and not > hardcoded. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 09:09:50, Pawel Osciak wrote: > If this fails on any other than the first fd, we will never close the ones we've > already exported and leak. > > Please replace all close() calls in this method (including the ones on > successful return) with a base::ScopedClosureRunner(): > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > close(dmabuf_fds->at(i)); > } > > And in this method: > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, -1)); > base::Closure dmabuf_fds_cb = > base::Bind(&CloseDmabufFds, dmabuf_fds); > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > And here expbufs assigning to dmabuf_fds->at(i). I need to understand the usage here. If I understand each of the fds are added in the linked list and then deleted on failure. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/27 09:09:50, Pawel Osciak wrote: > This shouldn't be hardcoded either actually... > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a G_FMT anywhere ? Since it is anyways in device implementation, it is confined for Exynos only. > How about instead of num_planes, size, etc. we just pass v4l2_format to this > method and extract all info from there? Hmm... I think sending the complete v4l2_format struct here may not help. Either we populate the v4l2_format before the call or in this class. I prefer having it in this class. > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this DRM > define for now.
For what it's worth -- I think we're really close. Just bits now. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/27 09:09:50, Pawel Osciak wrote: > As I mentioned before, num_planes should be passed to this method and not > hardcoded. I'd disagree here -- if we're defining a preferred output format then we should know the plane count here. I think that's the simplest way around the issue. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 09:09:50, Pawel Osciak wrote: > If this fails on any other than the first fd, we will never close the ones we've > already exported and leak. > > Please replace all close() calls in this method (including the ones on > successful return) with a base::ScopedClosureRunner(): > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > close(dmabuf_fds->at(i)); > } > > And in this method: > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, -1)); > base::Closure dmabuf_fds_cb = > base::Bind(&CloseDmabufFds, dmabuf_fds); > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > And here expbufs assigning to dmabuf_fds->at(i). See above about preferred format. In the case we go hard-coded on the number of planes, then dmabufds can be an array of ScopedFD and we get the same auto-destruction behavior when they go out of scope. i.e.: ScopedFD dma_bufs[2]; or if it's not hard-coded, even: scoped_ptr<ScopedFD[]> dma_bufs(new ScopedFd[num_planes]); https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/27 09:09:50, Pawel Osciak wrote: > This shouldn't be hardcoded either actually... > > How about instead of num_planes, size, etc. we just pass v4l2_format to this > method and extract all info from there? > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this DRM > define for now. See above about preferred format.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/27 09:40:41, sheu wrote: > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > As I mentioned before, num_planes should be passed to this method and not > > hardcoded. > > I'd disagree here -- if we're defining a preferred output format then we should > know the plane count here. I think that's the simplest way around > the issue. I think the num_planes depend upon the output format. So if we choose to use a different output format, we wouldn't need to change the device implementation if we use the parameterized num_planes here. I think that's what Pawel thinks. But I guess that's partially true since there is rest of the code which would need to change if num planes change. Filling up the attrs etc. I am okay either ways. Let me know. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 09:40:41, sheu wrote: > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > If this fails on any other than the first fd, we will never close the ones > we've > > already exported and leak. > > > > Please replace all close() calls in this method (including the ones on > > successful return) with a base::ScopedClosureRunner(): > > > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > > close(dmabuf_fds->at(i)); > > } > > > > And in this method: > > > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, > -1)); > > base::Closure dmabuf_fds_cb = > > base::Bind(&CloseDmabufFds, dmabuf_fds); > > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > > > And here expbufs assigning to dmabuf_fds->at(i). > > See above about preferred format. > > In the case we go hard-coded on the number of planes, then dmabufds can be an > array of ScopedFD and we get the same auto-destruction behavior when they go out > of scope. i.e.: > > ScopedFD dma_bufs[2]; > > or if it's not hard-coded, even: > > scoped_ptr<ScopedFD[]> dma_bufs(new ScopedFd[num_planes]); Using ScopedFD looks easier, let me make that change. Thanks.
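A minimal sketch of the ScopedFD-style cleanup agreed on above, assuming a hand-rolled RAII wrapper in place of the real ScopedFD type and a raw VIDIOC_EXPBUF loop; names and signatures are illustrative only:

#include <fcntl.h>
#include <linux/videodev2.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

// Stand-in for the ScopedFD type referenced above: owns a dmabuf fd and
// closes it on destruction, so an early return cannot leak exported fds.
class ScopedDmabufFd {
 public:
  ScopedDmabufFd() : fd_(-1) {}
  ~ScopedDmabufFd() {
    if (fd_ >= 0)
      close(fd_);
  }
  void reset(int fd) {
    if (fd_ >= 0)
      close(fd_);
    fd_ = fd;
  }
  int get() const { return fd_; }

 private:
  // Copying is intentionally not supported in this sketch.
  ScopedDmabufFd(const ScopedDmabufFd&);
  void operator=(const ScopedDmabufFd&);
  int fd_;
};

// Exports one dmabuf fd per plane of CAPTURE buffer |buffer_index|.
// |dmabuf_fds| must point to an array of |num_planes| entries owned by the
// caller; on failure, the entries already filled in clean themselves up when
// that array goes out of scope.
bool ExportDmabufFds(int device_fd,
                     unsigned int buffer_index,
                     size_t num_planes,
                     ScopedDmabufFd* dmabuf_fds) {
  for (size_t i = 0; i < num_planes; ++i) {
    struct v4l2_exportbuffer expbuf;
    memset(&expbuf, 0, sizeof(expbuf));
    expbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
    expbuf.index = buffer_index;
    expbuf.plane = i;
    expbuf.flags = O_CLOEXEC;
    if (ioctl(device_fd, VIDIOC_EXPBUF, &expbuf) != 0)
      return false;
    dmabuf_fds[i].reset(expbuf.fd);
  }
  return true;
}

Per the discussion above, the exported fds only need to live until the EGLImage has been created inside CreateEGLImage(), so letting them close automatically on scope exit is sufficient.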
Used ScopedFD for the dmabuf fds. Waiting for consensus on the other comments. PTAL
On 2014/03/27 10:51:41, shivdasp wrote: Hi Pawel, just wanted to confirm whether we are now in a position to proceed with code approval. Thanks for all the help. Kaustubh
https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/27 10:06:46, shivdasp wrote: > On 2014/03/27 09:40:41, sheu wrote: > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > As I mentioned before, num_planes should be passed to this method and not > > > hardcoded. > > > > I'd disagree here -- if we're defining a preferred output format then we > should > > know the plane count here. I think that's the simplest way around > the > issue. > I think the num_planes depend upon the output format. So if we choose to use a > different output format, we wouldn't need to change the device implementation if > we use the parameterized num_planes here. I think that's what Pawel thinks. Yes this is exactly what I mean. There is no need to hardcode both, because the driver should return that value in G_FMT once we set the correct one. So there is no need for this class to have a method for getting number of planes. V4L2VDA can just call G_FMT to find out what it is. > But I guess that's partially true since there is rest of the code which would > need to change if num planes change. Filling up the attrs etc. > I am okay either ways. Let me know. We still need it, since V4L2VDA needs to know for other calls, like qbuf. For Exynos num planes would be 2, but I think you didn't change the Tegra's preferred format. You said before that it was not using 2-plane NV12M? That's why using G_FMT in V4L2VDA and taking num_planes from there is the most universal (and API-conformant), while simple enough way I would say. https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... content/common/gpu/media/exynos_v4l2_video_device.cc:136: return EGL_NO_IMAGE_KHR; On 2014/03/27 10:06:46, shivdasp wrote: > On 2014/03/27 09:40:41, sheu wrote: > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > If this fails on any other than the first fd, we will never close the ones > > we've > > > already exported and leak. > > > > > > Please replace all close() calls in this method (including the ones on > > > successful return) with a base::ScopedClosureRunner(): > > > > > > static void CloseDmabufFds(linked_ptr<std::vector<int> > dmabuf_fds) { > > > for (size_t i = 0; i < dmabuf_fds->size(); ++i) > > > close(dmabuf_fds->at(i)); > > > } > > > > > > And in this method: > > > > > > linked_ptr<std::vector<int> > dmabuf_fds(new std::vector<int>(num_planes, > > -1)); > > > base::Closure dmabuf_fds_cb = > > > base::Bind(&CloseDmabufFds, dmabuf_fds); > > > base::ScopedClosureRunner dmabuf_fds_closer(dmabuf_fds_cb); > > > > > > And here expbufs assigning to dmabuf_fds->at(i). > > > > See above about preferred format. > > > > In the case we go hard-coded on the number of planes, then dmabufds can be an > > array of ScopedFD and we get the same auto-destruction behavior when they go > out > > of scope. i.e.: > > > > ScopedFD dma_bufs[2]; > > > > or if it's not hard-coded, even: > > > > scoped_ptr<ScopedFD[]> dma_bufs(new ScopedFd[num_planes]); > > Using ScopedFD looks easier, let me make that change. Thanks. > Yes, sorry, I forgot we had ScopedFD https://chromiumcodereview.appspot.com/137023008/diff/850001/content/common/g... 
content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/27 09:33:22, shivdasp wrote: > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > This shouldn't be hardcoded either actually... > > > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a G_FMT > anywhere ? Since it is anyways in device implementation, it is confined for > Exynos only. > Ok, let's keep it for now and figure it out later. No need to block this CL on this. > > How about instead of num_planes, size, etc. we just pass v4l2_format to this > > method and extract all info from there? That's exactly what I hoped for. > Hmm... I think sending the complete v4l2_format struct here may not help. > Either we populate the v4l2_format before the call or in this class. I prefer > having it in this class. It's the driver who should populate it on G_FMT. > > > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this > DRM > > define for now. > Yes exactly, agreed.
By the way, for the sake of not holding this CL up too much longer (I really appreciate your patience), I'm fine with hardcoding in each device class. But let's just please not have hardcoding in V4L2VDA if possible. This should be achievable by using the preferred format acquired from the device class and getting num_planes from the G_FMT ioctl.
https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:126: int dmabuf_fds[2] = {-1, -1}; On 2014/03/28 05:33:53, Pawel Osciak wrote: > On 2014/03/27 10:06:46, shivdasp wrote: > > On 2014/03/27 09:40:41, sheu wrote: > > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > > As I mentioned before, num_planes should be passed to this method and not > > > > hardcoded. > > > > > > I'd disagree here -- if we're defining a preferred output format then we > > should > > > know the plane count here. I think that's the simplest way around > the > > issue. > > I think the num_planes depend upon the output format. So if we choose to use a > > different output format, we wouldn't need to change the device implementation > if > > we use the parameterized num_planes here. I think that's what Pawel thinks. > > Yes this is exactly what I mean. There is no need to hardcode both, because the > driver should return that value in G_FMT once we set the correct one. So there > is no need for this class to have a method for getting number of planes. V4L2VDA > can just call G_FMT to find out what it is. Okay I will make a change to send in the num_planes as an additional argument to CreateEGLImage(). That's the only change I hope. > > > But I guess that's partially true since there is rest of the code which would > > need to change if num planes change. Filling up the attrs etc. > > I am okay either ways. Let me know. > > We still need it, since V4L2VDA needs to know for other calls, like qbuf. For > Exynos num planes would be 2, but I think you didn't change the Tegra's > preferred format. You said before that it was not using 2-plane NV12M? > That's why using G_FMT in V4L2VDA and taking num_planes from there is the most > universal (and API-conformant), while simple enough way I would say. We made change in the buffer allocation to have Tegra's preferred format same as Exynos. Anyways since we now have moved code related to output formats into device specific changing either ways will not affect another. https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = DRM_FORMAT_NV12; On 2014/03/28 05:33:53, Pawel Osciak wrote: > On 2014/03/27 09:33:22, shivdasp wrote: > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > This shouldn't be hardcoded either actually... > > > > > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a > G_FMT > > anywhere ? Since it is anyways in device implementation, it is confined for > > Exynos only. > > > > Ok, let's keep it for now and figure it out later. No need to block this CL on > this. > > > > How about instead of num_planes, size, etc. we just pass v4l2_format to this > > > method and extract all info from there? > > That's exactly what I hoped for. > > > Hmm... I think sending the complete v4l2_format struct here may not help. > > Either we populate the v4l2_format before the call or in this class. I prefer > > having it in this class. > > It's the driver who should populate it on G_FMT. > > > > > > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode this > > DRM > > > define for now. > > > > Yes exactly, agreed. I understand there is no more change required here now atleast in this CL.
On 2014/03/28 05:42:37, shivdasp wrote: > We made change in the buffer allocation to have Tegra's preferred format same as > Exynos. Anyways since we now have moved code related to output formats into > device specific changing either ways will not affect another. Are you now allocating two discontiguous memory buffers per each v4l2 buffer and using exactly this pixel format: http://linuxtv.org/downloads/v4l-dvb-apis/re30.html and your converter/GPU now handles this format? V4L2::Dequeue() still needs the hardcoding to be removed, but yes, apart from that I think that should be all for now. Let's not hold this up anymore. > https://codereview.chromium.org/137023008/diff/850001/content/common/gpu/medi... > content/common/gpu/media/exynos_v4l2_video_device.cc:148: attrs[5] = > DRM_FORMAT_NV12; > On 2014/03/28 05:33:53, Pawel Osciak wrote: > > On 2014/03/27 09:33:22, shivdasp wrote: > > > On 2014/03/27 09:09:50, Pawel Osciak wrote: > > > > This shouldn't be hardcoded either actually... > > > > > > > How do you propose we remove the hardcoding of DRM_FORMAT_NV12, is there a > > G_FMT > > > anywhere ? Since it is anyways in device implementation, it is confined for > > > Exynos only. > > > > > > > Ok, let's keep it for now and figure it out later. No need to block this CL on > > this. > > > > > > How about instead of num_planes, size, etc. we just pass v4l2_format to > this > > > > method and extract all info from there? > > > > That's exactly what I hoped for. > > > > > Hmm... I think sending the complete v4l2_format struct here may not help. > > > Either we populate the v4l2_format before the call or in this class. I > prefer > > > having it in this class. > > > > It's the driver who should populate it on G_FMT. > > > > > > > > > And in this case we can DCHECK on V4L2_PIX_FMT_NV12M and then hardcode > this > > > DRM > > > > define for now. > > > > > > > Yes exactly, agreed. > > I understand there is no more change required here now atleast in this CL. Yes, let's leave hardcoding removal in device classes for later.
https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1539: Let's not remove this please. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1005: struct v4l2_plane planes[2]; Remaining plane number hardcode. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1137: struct v4l2_plane qbuf_planes[output_planes_count_]; I don't think this is valid C++, it's an extension. scoped_ptr<struct v4l2_plane []> https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1783: DVLOG(1) << __func__ << " eglDestroyImageKHR failed."; Please update the comment. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) We need to verify that GetFormatInfo doesn't fail and returns something that makes sense. Please remove this, the if (frame_buffer_size_ != new_size) will handle this case as well. https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; s/uint8/size_t/
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (left): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1539: On 2014/03/28 06:21:46, Pawel Osciak wrote: > Let's not remove this please. Done. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1005: struct v4l2_plane planes[2]; Yes I am removing this in a follow up patchset. Since Dequeue() thread starts earlier (and hence output_planes_count_ may not initialized) I will have to move this declaration within the while(output_buffer_queued_count_) loop and leave this one here with value 1 for OUTPUT plane. On 2014/03/28 06:21:46, Pawel Osciak wrote: > Remaining plane number hardcode. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1137: struct v4l2_plane qbuf_planes[output_planes_count_]; Will fix this in next patchset. On 2014/03/28 06:21:46, Pawel Osciak wrote: > I don't think this is valid C++, it's an extension. > scoped_ptr<struct v4l2_plane []> https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1783: DVLOG(1) << __func__ << " eglDestroyImageKHR failed."; On 2014/03/28 06:21:46, Pawel Osciak wrote: > Please update the comment. Will fix in next patchset. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) This basically is used to trigger Decoder initialization through RESOLUTION_CHANGE event. If we remove this, we need to modify GetFormatInfo() which would probably need a change in DBI() as well since it gets called from there as well. If GetFormatInfo() does not get new format, it set again to true. Should we keep this for now ? On 2014/03/28 06:21:46, Pawel Osciak wrote: > We need to verify that GetFormatInfo doesn't fail and returns something that > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) will > handle this case as well. https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; On 2014/03/28 06:21:46, Pawel Osciak wrote: > s/uint8/size_t/ Done.
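For reference, a minimal sketch of the heap-allocated plane array suggested above in place of the variable-length array (a compiler extension, not valid standard C++). The surrounding EnqueueOutputRecord() context is assumed; the argument names mirror the members discussed in the thread and the include path is approximate:

#include <linux/videodev2.h>
#include <string.h>

#include "base/memory/scoped_ptr.h"  // Include path is approximate.

// Fragment of a hypothetical EnqueueOutputRecord().
void EnqueueOutputRecordSketch(int buffer_index, size_t output_planes_count) {
  // Heap-allocate the per-plane array instead of relying on the VLA extension,
  // sized by the plane count learned from G_FMT.
  scoped_ptr<struct v4l2_plane[]> qbuf_planes(
      new struct v4l2_plane[output_planes_count]);
  memset(qbuf_planes.get(), 0,
         sizeof(struct v4l2_plane) * output_planes_count);

  struct v4l2_buffer qbuf;
  memset(&qbuf, 0, sizeof(qbuf));
  qbuf.index = buffer_index;
  qbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  qbuf.memory = V4L2_MEMORY_MMAP;  // Memory type assumed from the MMAP-based code.
  qbuf.m.planes = qbuf_planes.get();
  qbuf.length = output_planes_count;
  // VIDIOC_QBUF with |qbuf| would follow here, as in the existing code.
}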
https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://chromiumcodereview.appspot.com/137023008/diff/860017/content/common/g... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) On 2014/03/28 06:40:43, shivdasp wrote: > This basically is used to trigger Decoder initialization through > RESOLUTION_CHANGE event. Yes. > If we remove this, we need to modify GetFormatInfo() which would probably need a > change in DBI() as well since it gets called from there as well. > If GetFormatInfo() does not get new format, it set again to true. > Should we keep this for now ? Sorry, I don't understand. How would we need to modify GetFormatInfo()? If we get here, then it means we got an event from the driver. So GetFormatInfo cannot fail. Why would we want to set this to true if GetFormatInfo fails? > On 2014/03/28 06:21:46, Pawel Osciak wrote: > > We need to verify that GetFormatInfo doesn't fail and returns something that > > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) will > > handle this case as well. >
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) On 2014/03/28 06:52:34, Pawel Osciak wrote: > On 2014/03/28 06:40:43, shivdasp wrote: > > This basically is used to trigger Decoder initialization through > > RESOLUTION_CHANGE event. > > Yes. > > > If we remove this, we need to modify GetFormatInfo() which would probably need > a > > change in DBI() as well since it gets called from there as well. > > If GetFormatInfo() does not get new format, it set again to true. > > Should we keep this for now ? > > Sorry, I don't understand. How would we need to modify GetFormatInfo()? If we > get here, then it means we got an event from the driver. So GetFormatInfo cannot > fail. > > Why would we want to set this to true if GetFormatInfo fails? > GetFormatInfo() can fail if the asynchronos decoder initialization has not yet completed when we were in DBI(). And this function is triggered through the RESOLUTION_CHANGE event enqueued to trigger decoder initialization. I am sorry, I am not very clear what exactly you would like to change here. Could you please elaborate ? > > On 2014/03/28 06:21:46, Pawel Osciak wrote: > > > We need to verify that GetFormatInfo doesn't fail and returns something that > > > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) > will > > > handle this case as well. > > >
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) On 2014/03/28 07:00:19, shivdasp wrote: > On 2014/03/28 06:52:34, Pawel Osciak wrote: > > On 2014/03/28 06:40:43, shivdasp wrote: > > > This basically is used to trigger Decoder initialization through > > > RESOLUTION_CHANGE event. > > > > Yes. > > > > > If we remove this, we need to modify GetFormatInfo() which would probably > need > > a > > > change in DBI() as well since it gets called from there as well. > > > If GetFormatInfo() does not get new format, it set again to true. > > > Should we keep this for now ? > > > > Sorry, I don't understand. How would we need to modify GetFormatInfo()? If we > > get here, then it means we got an event from the driver. So GetFormatInfo > cannot > > fail. > > > > Why would we want to set this to true if GetFormatInfo fails? > > > > GetFormatInfo() can fail if the asynchronos decoder initialization has not yet > completed when we were in DBI(). And this function is triggered through the > RESOLUTION_CHANGE event enqueued to trigger decoder initialization. > But then the driver should not send the event if it's not ready to receive a G_FMT. The driver may only send the event if it's ready for a G_FMT. And we only get here after receiving the event... Am I missing something? > I am sorry, I am not very clear what exactly you would like to change here. > Could you please elaborate ? > > > > On 2014/03/28 06:21:46, Pawel Osciak wrote: > > > > We need to verify that GetFormatInfo doesn't fail and returns something > that > > > > makes sense. Please remove this, the if (frame_buffer_size_ != new_size) > > will > > > > handle this case as well. > > > > > >
https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/860017/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1887: if (frame_buffer_size_.IsEmpty()) > But then the driver should not send the event if it's not ready to receive a > G_FMT. > The driver may only send the event if it's ready for a G_FMT. > And we only get here after receiving the event... > Am I missing something? > Yes you are right, this if frame_buffer_size_.IsEmpty() check is not really required. The ctrl.value will differ if the decoder initialization has not happened and will trigger resolution change sequence. I will remove this if check from there in the next patchset.
Made the number of planes parameterized and addressed other comments. PTAL
Do let me know if this looks alright.
https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; s/uint8/size_t/
What's the plan for sandbox changes? In any case, I think it's time to ask Ami for another look.
On 2014/03/28 10:19:12, Pawel Osciak wrote: > What's the plan for sandbox changes? I explained to Jorge the need to load the library in the sandbox file. BTW, is there a way to submit a CL with an anticipatory rebase? https://codereview.chromium.org/179983006/ has been in the CQ for some time and I will have to rebase after it lands. > > In any case, I think it's time to ask Ami for another look.
https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/880001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:436: uint8 output_planes_count_; On 2014/03/28 10:18:42, Pawel Osciak wrote: > s/uint8/size_t/ Done.
Mostly nits; I think this is really close to being landable! https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // drop this line https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:104: static bool functions_initialized = InitializeLibrarySymbols(); this is racy: http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:154: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ ## and # are operators (albeit a pre-processor one). Chromium style puts spaces around operators. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; Considering that the containing method is only called if the driver indicated a resolution change, shouldn't this default to true in both the again==true and !ret cases? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1920: return false; Last comment put another way: what can trigger V4L2_EVENT_RESOLUTION_CHANGE but still not want to be a resolution change? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:297: // is indeed required by returning true if either: s/if either/iff/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:298: // - width or height of the new format is different than previous format. s/./; or/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:300: // Returns false otherwise. drop this line if you take my suggestion at l.297 https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:435: // Stores the number of planes (i.e. separate memory buffers) for output. This is decoder-thread state so belongs above. I'd put it at l.397 and add it to the list of variables in the comment at l.373. (read & set on the decoder thread, read on the child thread but only in AssignPictureBuffers which has the comment at l.334 of the .cc file). https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:60: // This method is used to create an EglImage since each V4L2Device "This method is used to " is not adding value (ditto for other methods in this class whose comment starts with "This method " or "These methods are used to " etc.). 
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Function_Comments for details. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:62: // given texture. The texture_id is used to bind the texture to the created s/created/returned/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by s/eglImage/EGLImageKHR/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by s/created/returned/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by s/EglImage/EGLImageKHR/ https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:79: virtual uint32 PreferredOutputFormat() = 0; Both impls return the same (V4L2_PIX_FMT_NV12M). Is this future-proofing or is someone planning a change here? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on dlopen's refcounting nature to avoid sandbox violations in the sandboxed case, b/c the pre-sandbox code will already have dlopen'd this). https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); IDK what the TODO above this line is about, but if this dlopen falls under it, then probably it should move above the TODO, and if it does not, it probably warrants a newline between the TODO above and this dlopen, and maybe even a clarification of the TODO that it refers to the libs _preceding_ it. https://codereview.chromium.org/137023008/diff/900001/content/content_common.... File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/900001/content/content_common.... content/content_common.gypi:627: 'common/gpu/media/tegra_v4l2_video_device.h', OOC what is the impact of this on the size (bytes) of libcontent.a or libcontent_common.a (depending on whether you're building static or shared libs)?
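One common Chromium way to avoid the function-local static initialization race flagged in the review message above is base::LazyInstance, which performs thread-safe, one-time construction. A minimal sketch, with TegraFunctionSymbolFinder and its contents as placeholders rather than the actual patch:

#include "base/lazy_instance.h"

// Resolves the vendor library symbols exactly once, in a thread-safe way.
class TegraFunctionSymbolFinder {
 public:
  TegraFunctionSymbolFinder() : initialized_(false) {
    // dlopen()/dlsym() the library here, once, and record the result.
    initialized_ = true;  // Placeholder for the real symbol-resolution result.
  }
  bool initialized() const { return initialized_; }

 private:
  bool initialized_;
};

base::LazyInstance<TegraFunctionSymbolFinder> g_tegra_function_symbol_finder =
    LAZY_INSTANCE_INITIALIZER;

// In TegraV4L2Device::Initialize(), for example:
//   if (!g_tegra_function_symbol_finder.Get().initialized())
//     return false;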
Ami and John, could you take a look at this please? Thanks
On 2014/03/28 17:34:33, shivdasp wrote: > Ami and John could you take a look at this please. > Thanks I commented on PS#17 but don't see a PS#18. What do you want me to look at?
Patch incoming to address these comments. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:4: // On 2014/03/28 17:10:01, Ami Fischman wrote: > drop this line Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:104: static bool functions_initialized = InitializeLibrarySymbols(); How was/is this done in vaapi_wrapper.c ? There is no real computation happening in InitializeLibrarySymbols() so should it be okay as is ? atleast for now. On 2014/03/28 17:10:01, Ami Fischman wrote: > this is racy: > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:154: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ On 2014/03/28 17:10:01, Ami Fischman wrote: > ## and # are operators (albeit a pre-processor one). Chromium style puts spaces > around operators. Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; See below response. On 2014/03/28 17:10:01, Ami Fischman wrote: > Considering that the containing method is only called if the driver indicated a > resolution change, shouldn't this default to true in both the again==true and > !ret cases? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1920: return false; V4L2_EVENT_RESOLUTION change is enqueued by the driver at the time of decoder initialization. The DBI() sequence also tries to check if format info is set in DBI() using GetFormatInfo(). The decoder initialization is asynchronous in Tegra. So if VDA did detect decoder initialization through GetFormatInfo() in DBI() luckily, we have to drop this V4L2_EVENT_RESOLUTION event since we do not want an un-necessary re-allocation of buffers and hence we return false if neither size nor CID_MIN_BUFFERS_FOR_CAPTURE match. On 2014/03/28 17:10:01, Ami Fischman wrote: > Last comment put another way: what can trigger V4L2_EVENT_RESOLUTION_CHANGE but > still not want to be a resolution change? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:297: // is indeed required by returning true if either: On 2014/03/28 17:10:01, Ami Fischman wrote: > s/if either/iff/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:298: // - width or height of the new format is different than previous format. On 2014/03/28 17:10:01, Ami Fischman wrote: > s/./; or/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:300: // Returns false otherwise. 
On 2014/03/28 17:10:01, Ami Fischman wrote: > drop this line if you take my suggestion at l.297 Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:435: // Stores the number of planes (i.e. separate memory buffers) for output. I will move this to l.397 but I don't think adding this variable in the comment is true since this member is actually modified in the decoder_thread_ context. On 2014/03/28 17:10:01, Ami Fischman wrote: > This is decoder-thread state so belongs above. I'd put it at l.397 and add it > to the list of variables in the comment at l.373. > (read & set on the decoder thread, read on the child thread but only in > AssignPictureBuffers which has the comment at l.334 of the .cc file). https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_device.h (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:60: // This method is used to create an EglImage since each V4L2Device On 2014/03/28 17:10:01, Ami Fischman wrote: > "This method is used to " is not adding value (ditto for other methods in this > class whose comment starts with "This method " or "These methods are used to " > etc.). > > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Function_Comments > for details. Done. I kept a few of "These methods" since the sentences seem alright. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:62: // given texture. The texture_id is used to bind the texture to the created On 2014/03/28 17:10:01, Ami Fischman wrote: > s/created/returned/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:63: // eglImage. buffer_index can be used to associate the created EglImage by On 2014/03/28 17:10:01, Ami Fischman wrote: > s/EglImage/EGLImageKHR/ Done. https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_device.h:79: virtual uint32 PreferredOutputFormat() = 0; This is future-proofing as the formats are now same. On 2014/03/28 17:10:01, Ami Fischman wrote: > Both impls return the same (V4L2_PIX_FMT_NV12M). Is this future-proofing or is > someone planning a change here? https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); I didn't get it. I think the sandbox is enabled here below so loading this library in InitializeLibrarySymbols() will not be allowed after sandbox is enabled. So I load the library here. On 2014/03/28 17:10:01, Ami Fischman wrote: > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on > dlopen's refcounting nature to avoid sandbox violations in the sandboxed case, > b/c the pre-sandbox code will already have dlopen'd this). https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/sandbox_... 
content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:225: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); David's CL https://codereview.chromium.org/179983006/ removes these preload libraries altogether. I see it is merged just a few hours ago. Will rebase and upload another patchset. On 2014/03/28 17:10:01, Ami Fischman wrote: > IDK what the TODO above this line is about, but if this dlopen falls under it, > then probably it should move above the TODO, and if it does not, it probably > warrants a newline between the TODO above and this dlopen, and maybe even a > clarification of the TODO that it refers to the libs _preceding_ it. https://codereview.chromium.org/137023008/diff/900001/content/content_common.... File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/900001/content/content_common.... content/content_common.gypi:627: 'common/gpu/media/tegra_v4l2_video_device.h', Hmm.. I need to check it by comparing with or without this CL. Do you need it now to confirm something ? Should not be much I reckon. On 2014/03/28 17:10:01, Ami Fischman wrote: > OOC what is the impact of this on the size (bytes) of libcontent.a or > libcontent_common.a (depending on whether you're building static or shared > libs)?
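To make the check being discussed concrete, here is a rough sketch of the intended logic: the event is treated as a real resolution change iff the coded size or the driver-reported V4L2_CID_MIN_BUFFERS_FOR_CAPTURE value changed; otherwise the spurious event queued at decoder initialization is dropped. The function and parameter names are illustrative, not the exact code in the patch.

#include <linux/videodev2.h>

bool IsResolutionChangeNecessary(const struct v4l2_format& new_format,
                                 const struct v4l2_format& prev_format,
                                 int new_min_buffers,
                                 int prev_min_buffers) {
  // A real resolution change: the coded width or height differs from the
  // previously negotiated format.
  if (new_format.fmt.pix_mp.width != prev_format.fmt.pix_mp.width ||
      new_format.fmt.pix_mp.height != prev_format.fmt.pix_mp.height)
    return true;
  // The driver now wants a different number of capture buffers
  // (V4L2_CID_MIN_BUFFERS_FOR_CAPTURE).
  if (new_min_buffers != prev_min_buffers)
    return true;
  // Otherwise this is the V4L2_EVENT_RESOLUTION_CHANGE the driver queued at
  // decoder initialization; ignore it to avoid re-allocating buffers.
  return false;
}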
Syncing my code (taking more time than usual) to rebase because of a change in bpf_cros_arm_gpu_policy_linux.cc. Meanwhile please take a look at this patchset. Thanks
https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); On 2014/03/28 18:54:11, shivdasp wrote: > I didn't get it. I think the sandbox is enabled here below so loading this > library in InitializeLibrarySymbols() will not be allowed after sandbox is > enabled. > So I load the library here. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on > > dlopen's refcounting nature to avoid sandbox violations in the sandboxed case, > > b/c the pre-sandbox code will already have dlopen'd this). > The sandbox never used to be triggered by this binary AFAIK. Are you sure it is ever engaged? https://codereview.chromium.org/137023008/diff/900001/content/content_common.... File content/content_common.gypi (right): https://codereview.chromium.org/137023008/diff/900001/content/content_common.... content/content_common.gypi:627: 'common/gpu/media/tegra_v4l2_video_device.h', On 2014/03/28 18:54:11, shivdasp wrote: > Hmm.. I need to check it by comparing with or without this CL. Do you need it > now to confirm something ? Should not be much I reckon. > On 2014/03/28 17:10:01, Ami Fischman wrote: > > OOC what is the impact of this on the size (bytes) of libcontent.a or > > libcontent_common.a (depending on whether you're building static or shared > > libs)? > I also believe it will not be a significant increase but was hoping to get confirmation. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool functions_initialized = InitializeLibrarySymbols(); On 2014/03/28 18:54:11, shivdasp wrote: > How was/is this done in vaapi_wrapper.c ? > There is no real computation happening in InitializeLibrarySymbols() so should > it be okay as is ? atleast for now. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > this is racy: > > > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... > Why add a known race condition when the fix is so easy? https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:153: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ On 2014/03/28 18:54:11, shivdasp wrote: > On 2014/03/28 17:10:01, Ami Fischman wrote: > > ## and # are operators (albeit a pre-processor one). Chromium style puts > spaces > > around operators. > > Done. I don't see spaces. To be clear I mean that TegraV4L2_##name should be TegraV4L2_ ## name and "TegraV4L2_" #name should be "TegraV4L2_" # name and so on https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; On 2014/03/28 18:54:11, shivdasp wrote: > See below response. 
> On 2014/03/28 17:10:01, Ami Fischman wrote: > > Considering that the containing method is only called if the driver indicated > a > > resolution change, shouldn't this default to true in both the again==true and > > !ret cases? > Response below makes sense to me for why you return false at l.1920, but not why it makes sense to return true if GetFormatInfo fails. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:372: // while the Child thread manipulates them. On 2014/03/28 18:54:11, shivdasp wrote: > I will move this to l.397 but I don't think adding this variable in the comment > is true since this member is actually modified in the decoder_thread_ context. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > This is decoder-thread state so belongs above. I'd put it at l.397 and add > it > > to the list of variables in the comment at l.373. > > (read & set on the decoder thread, read on the child thread but only in > > AssignPictureBuffers which has the comment at l.334 of the .cc file). > This comment is talking about vars that are normally read/written on the decoder thread but which are accessed on the child thread during known-safe times. That seems to match output_planes_count_ to me.
https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... File content/common/gpu/media/video_decode_accelerator_unittest.cc (right): https://codereview.chromium.org/137023008/diff/900001/content/common/gpu/medi... content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE); On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > I didn't get it. I think the sandbox is enabled here below so loading this > > library in InitializeLibrarySymbols() will not be allowed after sandbox is > > enabled. > > So I load the library here. > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() (relying on > > > dlopen's refcounting nature to avoid sandbox violations in the sandboxed > case, > > > b/c the pre-sandbox code will already have dlopen'd this). > > > > The sandbox never used to be triggered by this binary AFAIK. > Are you sure it is ever engaged? What do you mean by binary here ? The preload of the library before sandbox helps us to acquire resources (pre-open the device nodes etc.) Without pre-loading it does not work. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool functions_initialized = InitializeLibrarySymbols(); I am not very familiar with LazyInstance, will this be correct way ? base::LazyInstance<bool>::Leaky g_functions_initialized = LAZY_INSTANCE_INITIALIZER; TVDA::Initialize() { if (!g_functions_initialized.Get()) { if (!InitializeLibrarySymbols()) { DLOG(ERROR) << "Unable to initialize functions "; return false; } g_functions_initialized.Get() = true; } } On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > How was/is this done in vaapi_wrapper.c ? > > There is no real computation happening in InitializeLibrarySymbols() so should > > it be okay as is ? atleast for now. > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > this is racy: > > > > > > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... > > > > Why add a known race condition when the fix is so easy? https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:153: TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ Trust me , I really did these changes but for some reason they did not get uploaded. On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > ## and # are operators (albeit a pre-processor one). Chromium style puts > > spaces > > > around operators. > > > > Done. > > I don't see spaces. To be clear I mean that > TegraV4L2_##name > should be > TegraV4L2_ ## name > and > "TegraV4L2_" #name > should be > "TegraV4L2_" # name > > and so on https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return false; GetFormatInfo() on failure will call NOTIFY_PLATFORM() so VDA goes into error state anyways. 
On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > See below response. > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > Considering that the containing method is only called if the driver > indicated > > a > > > resolution change, shouldn't this default to true in both the again==true > and > > > !ret cases? > > > > Response below makes sense to me for why you return false at l.1920, but not why > it makes sense to return true if GetFormatInfo fails. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/v4l2_video_decode_accelerator.h:372: // while the Child thread manipulates them. On 2014/03/28 19:33:40, Ami Fischman wrote: > On 2014/03/28 18:54:11, shivdasp wrote: > > I will move this to l.397 but I don't think adding this variable in the > comment > > is true since this member is actually modified in the decoder_thread_ context. > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > This is decoder-thread state so belongs above. I'd put it at l.397 and add > > it > > > to the list of variables in the comment at l.373. > > > (read & set on the decoder thread, read on the child thread but only in > > > AssignPictureBuffers which has the comment at l.334 of the .cc file). > > > > > This comment is talking about vars that are normally read/written on the decoder > thread but which are accessed on the child thread during known-safe times. That > seems to match output_planes_count_ to me. Done.
On Fri, Mar 28, 2014 at 1:39 PM, <shivdasp@nvidia.com> wrote: > > https://codereview.chromium.org/137023008/diff/900001/ > content/common/gpu/media/video_decode_accelerator_unittest.cc > File content/common/gpu/media/video_decode_accelerator_unittest.cc > (right): > > https://codereview.chromium.org/137023008/diff/900001/ > content/common/gpu/media/video_decode_accelerator_unittest.cc#newcode1548 > content/common/gpu/media/video_decode_accelerator_unittest.cc:1548: > dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | > RTLD_NODELETE); > On 2014/03/28 19:33:40, Ami Fischman wrote: > >> On 2014/03/28 18:54:11, shivdasp wrote: >> > I didn't get it. I think the sandbox is enabled here below so >> > loading this > >> > library in InitializeLibrarySymbols() will not be allowed after >> > sandbox is > >> > enabled. >> > So I load the library here. >> > >> > On 2014/03/28 17:10:01, Ami Fischman wrote: >> > > Why not put this in TegraV4L2Device::InitializeLibrarySymbols() >> > (relying on > >> > > dlopen's refcounting nature to avoid sandbox violations in the >> > sandboxed > >> case, >> > > b/c the pre-sandbox code will already have dlopen'd this). >> > >> > > The sandbox never used to be triggered by this binary AFAIK. >> Are you sure it is ever engaged? >> > What do you mean by binary here ? > The preload of the library before sandbox helps us to acquire resources > (pre-open the device nodes etc.) Without pre-loading it does not work. By "binary" I meant that this is a standalone test program, not part of chrome. I don't believe the sandbox is used for this unittest. > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/tegra_v4l2_video_device.cc > File content/common/gpu/media/tegra_v4l2_video_device.cc (right): > > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/tegra_v4l2_video_device.cc#newcode103 > content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool > functions_initialized = InitializeLibrarySymbols(); > I am not very familiar with LazyInstance, will this be correct way ? > > base::LazyInstance<bool>::Leaky g_functions_initialized = > LAZY_INSTANCE_INITIALIZER; > > TVDA::Initialize() { > if (!g_functions_initialized.Get()) { > if (!InitializeLibrarySymbols()) { > DLOG(ERROR) << "Unable to initialize functions "; > return false; > } > g_functions_initialized.Get() = true; > > } > } > No. You need to make a helper class TegraFunctionSymbolFinder { public: TegraFunctionSymbolFinder() : initialized_(false) { ...do the work... initialized_ = true; } bool initialized() { return initialized_; } private: bool initailized_; }; And then instead of your function-static you do: if (!g_tegra_function_symbol_finder_.Get()->initialized()) return OOPS; (with a global LazyInstance<TegraFunctionSymbolFinder> g_tegra_function_symbol_finder_) > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/tegra_v4l2_video_device.cc#newcode153 > content/common/gpu/media/tegra_v4l2_video_device.cc:153: > TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>( \ > Trust me , I really did these changes but for some reason they did not > get uploaded. 
lol > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.cc > File content/common/gpu/media/v4l2_video_decode_accelerator.cc (right): > > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.cc#newcode1912 > content/common/gpu/media/v4l2_video_decode_accelerator.cc:1912: return > false; > GetFormatInfo() on failure will call NOTIFY_PLATFORM() so VDA goes into > error state anyways. What about the *again=true; return true; path in GetFormatInfo? > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.h > File content/common/gpu/media/v4l2_video_decode_accelerator.h (right): > > https://codereview.chromium.org/137023008/diff/920001/ > content/common/gpu/media/v4l2_video_decode_accelerator.h#newcode372 > content/common/gpu/media/v4l2_video_decode_accelerator.h:372: // while > the Child thread manipulates them. > On 2014/03/28 19:33:40, Ami Fischman wrote: > >> On 2014/03/28 18:54:11, shivdasp wrote: >> > I will move this to l.397 but I don't think adding this variable in >> > the > >> comment >> > is true since this member is actually modified in the >> > decoder_thread_ context. > >> > >> > On 2014/03/28 17:10:01, Ami Fischman wrote: >> > > This is decoder-thread state so belongs above. I'd put it at >> > l.397 and add > >> > it >> > > to the list of variables in the comment at l.373. >> > > (read & set on the decoder thread, read on the child thread but >> > only in > >> > > AssignPictureBuffers which has the comment at l.334 of the .cc >> > file). > >> > >> > > > This comment is talking about vars that are normally read/written on >> > the decoder > >> thread but which are accessed on the child thread during known-safe >> > times. That > >> seems to match output_planes_count_ to me. >> > > Done. > I don't see a new patchset yet. > > https://codereview.chromium.org/137023008/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
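For reference, a minimal compilable version of the helper-class pattern sketched in the email above, assuming base::LazyInstance<T>::Get() returns a reference and reusing the library path and dlopen flags quoted elsewhere in this thread; the real InitializeLibrarySymbols() in the patch also resolves each TegraV4L2_* entry point with dlsym().

#include <dlfcn.h>

#include "base/lazy_instance.h"
#include "base/logging.h"

namespace {

// Stand-in for the patch's InitializeLibrarySymbols(); here it only opens
// the library, while the real one also dlsym()s every symbol it needs.
bool InitializeLibrarySymbols() {
  void* handle = dlopen("/usr/lib/libtegrav4l2.so",
                        RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE);
  return handle != NULL;
}

// Does the one-time lookup in its constructor; LazyInstance guarantees the
// constructor runs at most once, avoiding the racy function-local static.
class TegraFunctionSymbolFinder {
 public:
  TegraFunctionSymbolFinder() : initialized_(false) {
    if (!InitializeLibrarySymbols())
      return;
    initialized_ = true;
  }
  bool initialized() const { return initialized_; }

 private:
  bool initialized_;
};

base::LazyInstance<TegraFunctionSymbolFinder> g_tegra_function_symbol_finder_ =
    LAZY_INSTANCE_INITIALIZER;

}  // namespace

// A caller such as TegraV4L2Device::Initialize() would then simply do:
bool EnsureTegraV4L2SymbolsLoaded() {
  if (!g_tegra_function_symbol_finder_.Get().initialized()) {
    DLOG(ERROR) << "Unable to initialize TegraV4L2 function symbols";
    return false;
  }
  return true;
}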
Can't reply inline for some reason. Regarding GetFormatInfo(), it must succeed since we have received RESOLUTION_CHANGE event, hence returning false is correct. Another patch coming with LazyInstance fix. https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/920001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:103: static bool functions_initialized = InitializeLibrarySymbols(); On 2014/03/28 20:39:26, shivdasp wrote: > I am not very familiar with LazyInstance, will this be correct way ? > > base::LazyInstance<bool>::Leaky g_functions_initialized = > LAZY_INSTANCE_INITIALIZER; > > TVDA::Initialize() { > if (!g_functions_initialized.Get()) { > if (!InitializeLibrarySymbols()) { > DLOG(ERROR) << "Unable to initialize functions "; > return false; > } > g_functions_initialized.Get() = true; > } > } > > On 2014/03/28 19:33:40, Ami Fischman wrote: > > On 2014/03/28 18:54:11, shivdasp wrote: > > > How was/is this done in vaapi_wrapper.c ? > > > There is no real computation happening in InitializeLibrarySymbols() so > should > > > it be okay as is ? atleast for now. > > > > > > On 2014/03/28 17:10:01, Ami Fischman wrote: > > > > this is racy: > > > > > > > > > > http://dev.chromium.org/developers/coding-style/cpp-dos-and-donts#TOC-Static-... > > > > > > > Why add a known race condition when the fix is so easy? > Making this change in next patchset. class declaration TegraSymbolFinder can be in tegra_v4l2_video_device.cc ?
git cl format does not like "spaces around ##" and that's what happened when I did it earlier :) Guess I will obey git cl format then ?
Yes, obey git cl format. Yes, TegraSymbolFinder can be in tegra_v4l2_video_device.cc. Let me know when a new patchset is up. On Fri, Mar 28, 2014 at 3:15 PM, <shivdasp@nvidia.com> wrote: > git cl format does not like "spaces around ##" and that's what happened > when I > did it earlier :) > Guess I will obey git cl format then ? > > https://codereview.chromium.org/137023008/ > To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
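For illustration, the symbol-loading macro being discussed, in the shape git cl format leaves it (no spaces around ## / #). The macro name and the libtegrav4l2_handle variable are assumptions; the TegraV4L2_##name and "TegraV4L2_" #name pieces come from the snippet quoted earlier.

#define TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(name)               \
  do {                                                         \
    TegraV4L2_##name = reinterpret_cast<TegraV4L2##name>(      \
        dlsym(libtegrav4l2_handle, "TegraV4L2_" #name));       \
    if (TegraV4L2_##name == NULL) {                            \
      DLOG(ERROR) << "Failed to dlsym TegraV4L2_" #name;       \
      return false;                                            \
    }                                                          \
  } while (0)

// Hypothetical usage inside InitializeLibrarySymbols(), e.g.:
//   TEGRAV4L2_DLSYM_OR_RETURN_ON_ERROR(Open);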
Updated with rebased patchset, PTAL.
Hi Ami, Pawel, Hope there are no more comments here to be addressed. Thanks for patiently reviewing the changes.
Hi Ami, Pawel, I hope the final patchset which is rebased is good to go for R35. Thanks Kaustubh
Hi Shivdas, Could you please address my question from yesterday about the capture format (V4L2_PIX_FMT_NV12M)? Are you now allocating two discontiguous memory buffers per v4l2 buffer and using exactly this pixel format: http://linuxtv.org/downloads/v4l-dvb-apis/re30.html and does your converter/GPU now handle this format? Thanks.
On 2014/03/29 04:01:30, Pawel Osciak wrote: Hi Pawel, As per my understanding, the capture format seems to be set appropriately and the GPU is able to convert. Shivdas seems to have tested this, and performance-wise there seems to be no issue. If the suggestions are good to have, then we can plan to address them subsequently. Please let me know if you think otherwise. Thanks Kaustubh
LGTM % nits & posciak's say-so. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:81: } nit: newline after this https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:82: bool Initialized() { return initialized_; } nit: simple accessors are typically lowercased: bool initialized() { ... };
On 2014/03/29 06:51:04, kpurandare wrote: > On 2014/03/29 04:01:30, Pawel Osciak wrote: > > Hi Pawel, > > As per my understanding the capture format seems to be set appropriately and the > GPU is able to convert. Shivdas seems to be have tested and performance wise > there seems to be no issue. Hi Kaustubh, I asked about the output format that the Tegra codec uses and we agreed to change the V4L2_PIX_FMT_NV12M define in Tegra device class to the actual format that the codec produces. I'm asking for no more than simply changing this format macro to the one that the Tegra codec actually uses, and not the one that Exynos uses. We agreed with Shivdas to do this 5 weeks ago and since then I asked about it multiple times. Later though Shivdas mentioned that: "We made change in the buffer allocation to have Tegra's preferred format same as Exynos." So I am merely asking for clarification on this sentence and/or follow up on what we agreed to do. I hope you could please promise me that this will be addressed in a follow up CL. Thank you. > If the suggestions are good to have then we can plan to adddress it > subsequently. > > Please let me know if you think otherwise. > > Thanks > Kaustubh
LGTM % one nit, and assuming sandboxing owners approval and that the format issue is addressed as a follow up. Thank you. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:142: } Since we are hardcoding, DCHECK_EQ(planes_count, 2);
Patchset incoming for addressing nits. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/exynos_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/exynos_v4l2_video_device.cc:142: } On 2014/03/29 11:16:12, Pawel Osciak wrote: > Since we are hardcoding, > > DCHECK_EQ(planes_count, 2); Done. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... File content/common/gpu/media/tegra_v4l2_video_device.cc (right): https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:81: } On 2014/03/29 08:23:51, Ami Fischman wrote: > nit: newline after this Done. https://codereview.chromium.org/137023008/diff/960001/content/common/gpu/medi... content/common/gpu/media/tegra_v4l2_video_device.cc:82: bool Initialized() { return initialized_; } On 2014/03/29 08:23:51, Ami Fischman wrote: > nit: simple accessors are typically lowercased: > bool initialized() { ... }; Done.
On 2014/03/29 11:14:20, Pawel Osciak wrote: > On 2014/03/29 06:51:04, kpurandare wrote: > > On 2014/03/29 04:01:30, Pawel Osciak wrote: > > > > Hi Pawel, > > > > As per my understanding the capture format seems to be set appropriately and > the > > GPU is able to convert. Shivdas seems to be have tested and performance wise > > there seems to be no issue. > > Hi Kaustubh, > I asked about the output format that the Tegra codec uses and we agreed to > change the V4L2_PIX_FMT_NV12M define in Tegra device class to the actual format > that the codec produces. I'm asking for no more than simply changing this format > macro to the one that the Tegra codec actually uses, and not the one that Exynos > uses. We agreed with Shivdas to do this 5 weeks ago and since then I asked about > it multiple times. > > Later though Shivdas mentioned that: > "We made change in the buffer allocation to have Tegra's preferred format same > as Exynos." > Hi Pawel, Yes, on Tegra we allocate two non-contiguous surfaces, and I confirmed that earlier. I agree that at some point in time (quite a while ago) we were not reporting the correct format, but we have since changed the buffer allocations for the codecs. From http://linuxtv.org/downloads/v4l-dvb-apis/re30.html: "This is a multi-planar, two-plane version of the YUV 4:2:0 format. The three components are separated into two sub-images or planes. V4L2_PIX_FMT_NV12M differs from V4L2_PIX_FMT_NV12 in that the two planes are non-contiguous in memory" So this format is fine, I believe. > So I am merely asking for clarification on this sentence and/or follow up on > what we agreed to do. I don't think this needs a follow-up CL now. > > I hope you could please promise me that this will be addressed in a follow up > CL. My sincere apologies if I did not make these things categorically clear earlier. Thanks. > > Thank you. > > > If the suggestions are good to have then we can plan to adddress it > > subsequently. > > > > Please let me know if you think otherwise. > > > > Thanks > > Kaustubh
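As a concrete illustration of the format being described, setting the capture queue to V4L2_PIX_FMT_NV12M through the V4L2 multi-planar API looks roughly like this. A sketch only: device_fd and the coded size are placeholders, and the actual VDA negotiates the format through its own format-setting path.

#include <cstring>

#include <linux/videodev2.h>
#include <sys/ioctl.h>

bool SetNV12MCaptureFormat(int device_fd, int coded_width, int coded_height) {
  struct v4l2_format format;
  memset(&format, 0, sizeof(format));
  format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
  format.fmt.pix_mp.width = coded_width;
  format.fmt.pix_mp.height = coded_height;
  // Two-plane YUV 4:2:0: a Y plane plus an interleaved CbCr plane, backed by
  // two non-contiguous memory buffers per frame.
  format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_NV12M;
  format.fmt.pix_mp.num_planes = 2;
  return ioctl(device_fd, VIDIOC_S_FMT, &format) == 0;
}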
Addressed minor nits. PTAL
Jorge, May I request you to take a look at the sandboxing related changes in this CL. Thanks
On 2014/03/29 18:10:17, shivdasp wrote: > Jorge, > May I request you to take a look at the sandboxing related changes in this CL. > > Thanks All review comments are addressed, waiting for approval from OWNERS too to CQ this. Thanks.
Jorgelo@ and piman@ OWNERS please.
https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:183: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); Please add a comment above this line to decouple it from the Mali line above. Something like "// Preload the Tegra V4L2 (video decode acceleration) library."
Addressed comment from Jorgelo@. PTAL https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... File content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc (right): https://codereview.chromium.org/137023008/diff/980001/content/common/sandbox_... content/common/sandbox_linux/bpf_cros_arm_gpu_policy_linux.cc:183: dlopen("/usr/lib/libtegrav4l2.so", dlopen_flag); On 2014/03/31 18:07:25, Jorge Lucangeli Obes wrote: > Please add a comment above this line to decouple it from the Mali line above. > > Something like "// Preload the Tegra V4L2 (video decode acceleration) library." Done.
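For clarity, the pre-sandbox preload that the requested comment refers to amounts to the following. This is a sketch: the flags are the ones from the unittest snippet quoted earlier, while the sandbox policy itself passes its local dlopen_flag value.

#include <dlfcn.h>

void PreloadTegraV4L2Library() {
  // Preload the Tegra V4L2 (video decode acceleration) library.
  dlopen("/usr/lib/libtegrav4l2.so", RTLD_NOW | RTLD_GLOBAL | RTLD_NODELETE);
}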
Sandbox lgtm.
The CQ bit was checked by shivdasp@nvidia.com
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/shivdasp@nvidia.com/137023008/1000001
lgtm
Message was sent while issue was closed.
Change committed as 260661