Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp1754693imu; Thu, 24 Jan 2019 01:04:08 -0800 (PST) X-Google-Smtp-Source: ALg8bN6hYW33ctEf2UfgJjpwlWLNslmvTdllXB+DCqbKdq1gEaR68MM7RK/UA20tiQT6cJOCaV75 X-Received: by 2002:a62:ed0f:: with SMTP id u15mr5521832pfh.188.1548320647898; Thu, 24 Jan 2019 01:04:07 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548320647; cv=none; d=google.com; s=arc-20160816; b=mwHV2jcXaC7gxDxDB6fOHkdMq9wAd4ZzhPh2bs1WxCS4wyvxs4kBRs0hYh8NYwNjuE 7b1Avtva6KSlhl5OZ84LCCdKelG9xsqn7IoAWkbwTQrkiqKIw2iK1rmKXdsG+SWM+q2K whG7q+rUjK2ENNpbKxtvreGOB7ZtrXU/71ADeRRjFV5PnZSY+AXUeG3iIDhvzG3dRME/ JA2WjizFcmfMlFviaewEpUIAn0rLKz8rQMxhMjTzpu4mB0kLbqNqFN62tzzD09XSAH94 k/dqPlEvnkc7LOIhujdLw1qS7uI8skJ2dBXTcknon0tm7tD6h1I4J1PYfo2yJQ3BXNaQ MLWg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=kcNFwDspXZLy7B28+kM1O+EHv84CHBAANhhZNHxZvrE=; b=PiF1nc/xBX6o1d0WKV0d/aiR5+ZriSCKJeBjd05ZjPyDFvFxS6pOyut6WOXON14H1b FEFBjgpgfLHpxFyGYcf2w98QHZ+EpaMOev63Rzhtx4t+NTptdwogaWrI/F34im1RAzXl EwEGsAwwDdw7tWMpAr+jgf584oqND5a5Wg/yqjmnHJa8H45mJVQDCKHphhp9DDPLyl+i Cu1ypCepZPmKZSOF7pr6rrpgBUqcYKtzBsSJnVnKTGXaJKuVJ4iVhlqhHklYGJGHe4j2 dxgVvBGeDMTsjVKJ+c74dPiaaLW5Yuv4V/PvFuND/V3GrYDVAMc1mWF6kYL+PirNyfuQ qvGg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=IC1ZbrCR; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l17si20005299pfd.236.2019.01.24.01.03.51; Thu, 24 Jan 2019 01:04:07 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=IC1ZbrCR; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727349AbfAXJCX (ORCPT + 99 others); Thu, 24 Jan 2019 04:02:23 -0500 Received: from mail-ot1-f68.google.com ([209.85.210.68]:44416 "EHLO mail-ot1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726997AbfAXJCW (ORCPT ); Thu, 24 Jan 2019 04:02:22 -0500 Received: by mail-ot1-f68.google.com with SMTP id g16so584837otg.11 for ; Thu, 24 Jan 2019 01:02:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=kcNFwDspXZLy7B28+kM1O+EHv84CHBAANhhZNHxZvrE=; b=IC1ZbrCRgmhPNtFwKx5Oavoixd4zxkzIlDwOidC1CUQsmVqyW7H4cBPTToG+EU0LbV XqxXc7/L8UbhvNOn9TOx6OhUZQgx4Y18nXCESDSOu4dK+NaQuha3VcNVJjNquODIl0fm ihW5M6pAh0R8y9OWGtjcgUex3vgG4CpqvrIgY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=kcNFwDspXZLy7B28+kM1O+EHv84CHBAANhhZNHxZvrE=; b=rEAJ6Wrqn+yKJPFyyO2b2S5Bqmvm/a//fFrqeKwsd1f6izKuTSx6G9X+iV4TzAoyu/ WTrAjJUb5tOZMKN6nGCg5xx1Zy6r4kWQGFBLzjHtoN4PEb2lujlIXPsT/8JTKvCc7N9E c0BS9HITxjkr2u1MX03qi2/jzv6HXOFAjhDt6tw+4XIP1TTtXGKim/JNu00iCyT9x6GH LMXuwGwOdevV66MqICK+r5TPBL9frkcKC806oL/x9wSmDiI6zHivDCWxkKoNLi4eGlt4 0Eh1XzrW9ypTetn2eEfSBscAfCKMOqn9MbQmvgpWMDIa/2Cd0IFewrStcyvboQKpsT+v nY7Q== X-Gm-Message-State: AJcUukdwE89gBq7t0X65sNvKA2eG+CjbPuz7ySQGM9g9+FeHkJO3YYjF THF/XC3s7CXBNpXp3mmZrWfccPdfXio= X-Received: by 2002:a9d:7dd5:: with SMTP id k21mr3914974otn.214.1548320541013; Thu, 24 Jan 2019 01:02:21 -0800 (PST) Received: from mail-ot1-f45.google.com (mail-ot1-f45.google.com. [209.85.210.45]) by smtp.gmail.com with ESMTPSA id j5sm10934507otc.54.2019.01.24.01.02.19 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Jan 2019 01:02:20 -0800 (PST) Received: by mail-ot1-f45.google.com with SMTP id e12so4595414otl.5 for ; Thu, 24 Jan 2019 01:02:19 -0800 (PST) X-Received: by 2002:a05:6830:1193:: with SMTP id u19mr3892879otq.152.1548320539484; Thu, 24 Jan 2019 01:02:19 -0800 (PST) MIME-Version: 1.0 References: <20181205100121.181765-1-acourbot@chromium.org> <42a24867b3b4506cdb7e738eec5b2b8316f8ca19.camel@bootlin.com> In-Reply-To: From: Tomasz Figa Date: Thu, 24 Jan 2019 18:02:08 +0900 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] media: docs-rst: Document m2m stateless video decoder interface To: Paul Kocialkowski Cc: Alexandre Courbot , Mauro Carvalho Chehab , Hans Verkuil , Pawel Osciak , Linux Media Mailing List , Linux Kernel Mailing List , Thierry Reding Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jan 24, 2019 at 5:59 PM Paul Kocialkowski wrote: > > Hi, > > On Thu, 2019-01-24 at 17:07 +0900, Tomasz Figa wrote: > > On Wed, Jan 23, 2019 at 7:42 PM Paul Kocialkowski > > wrote: > > > Hi Alex, > > > > > > On Wed, 2019-01-23 at 18:43 +0900, Alexandre Courbot wrote: > > > > On Tue, Jan 22, 2019 at 7:10 PM Paul Kocialkowski > > > > wrote: > > > > > Hi, > > > > > > > > > > On Tue, 2019-01-22 at 17:19 +0900, Tomasz Figa wrote: > > > > > > Hi Paul, > > > > > > > > > > > > On Fri, Dec 7, 2018 at 5:30 PM Paul Kocialkowski > > > > > > wrote: > > > > > > > Hi, > > > > > > > > > > > > > > Thanks for this new version! I only have one comment left, see below. > > > > > > > > > > > > > > On Wed, 2018-12-05 at 19:01 +0900, Alexandre Courbot wrote: > > > > > > > > Documents the protocol that user-space should follow when > > > > > > > > communicating with stateless video decoders. > > > > > > > > > > > > > > > > The stateless video decoding API makes use of the new request and tags > > > > > > > > APIs. While it has been implemented with the Cedrus driver so far, it > > > > > > > > should probably still be considered staging for a short while. > > > > > > > > > > > > > > > > Signed-off-by: Alexandre Courbot > > > > > > > > --- > > > > > > > > Removing the RFC flag this time. Changes since RFCv3: > > > > > > > > > > > > > > > > * Included Tomasz and Hans feedback, > > > > > > > > * Expanded the decoding section to better describe the use of requests, > > > > > > > > * Use the tags API. > > > > > > > > > > > > > > > > Documentation/media/uapi/v4l/dev-codec.rst | 5 + > > > > > > > > .../media/uapi/v4l/dev-stateless-decoder.rst | 399 ++++++++++++++++++ > > > > > > > > 2 files changed, 404 insertions(+) > > > > > > > > create mode 100644 Documentation/media/uapi/v4l/dev-stateless-decoder.rst > > > > > > > > > > > > > > > > diff --git a/Documentation/media/uapi/v4l/dev-codec.rst b/Documentation/media/uapi/v4l/dev-codec.rst > > > > > > > > index c61e938bd8dc..3e6a3e883f11 100644 > > > > > > > > --- a/Documentation/media/uapi/v4l/dev-codec.rst > > > > > > > > +++ b/Documentation/media/uapi/v4l/dev-codec.rst > > > > > > > > @@ -6,6 +6,11 @@ > > > > > > > > Codec Interface > > > > > > > > *************** > > > > > > > > > > > > > > > > +.. toctree:: > > > > > > > > + :maxdepth: 1 > > > > > > > > + > > > > > > > > + dev-stateless-decoder > > > > > > > > + > > > > > > > > A V4L2 codec can compress, decompress, transform, or otherwise convert > > > > > > > > video data from one format into another format, in memory. Typically > > > > > > > > such devices are memory-to-memory devices (i.e. devices with the > > > > > > > > diff --git a/Documentation/media/uapi/v4l/dev-stateless-decoder.rst b/Documentation/media/uapi/v4l/dev-stateless-decoder.rst > > > > > > > > new file mode 100644 > > > > > > > > index 000000000000..7a781c89bd59 > > > > > > > > --- /dev/null > > > > > > > > +++ b/Documentation/media/uapi/v4l/dev-stateless-decoder.rst > > > > > > > > @@ -0,0 +1,399 @@ > > > > > > > > +.. -*- coding: utf-8; mode: rst -*- > > > > > > > > + > > > > > > > > +.. _stateless_decoder: > > > > > > > > + > > > > > > > > +************************************************** > > > > > > > > +Memory-to-memory Stateless Video Decoder Interface > > > > > > > > +************************************************** > > > > > > > > + > > > > > > > > +A stateless decoder is a decoder that works without retaining any kind of state > > > > > > > > +between processing frames. This means that each frame is decoded independently > > > > > > > > +of any previous and future frames, and that the client is responsible for > > > > > > > > +maintaining the decoding state and providing it to the decoder with each > > > > > > > > +decoding request. This is in contrast to the stateful video decoder interface, > > > > > > > > +where the hardware and driver maintain the decoding state and all the client > > > > > > > > +has to do is to provide the raw encoded stream. > > > > > > > > + > > > > > > > > +This section describes how user-space ("the client") is expected to communicate > > > > > > > > +with such decoders in order to successfully decode an encoded stream. Compared > > > > > > > > +to stateful codecs, the decoder/client sequence is simpler, but the cost of > > > > > > > > +this simplicity is extra complexity in the client which must maintain a > > > > > > > > +consistent decoding state. > > > > > > > > + > > > > > > > > +Stateless decoders make use of the request API and buffer tags. A stateless > > > > > > > > +decoder must thus expose the following capabilities on its queues when > > > > > > > > +:c:func:`VIDIOC_REQBUFS` or :c:func:`VIDIOC_CREATE_BUFS` are invoked: > > > > > > > > + > > > > > > > > +* The ``V4L2_BUF_CAP_SUPPORTS_REQUESTS`` capability must be set on the > > > > > > > > + ``OUTPUT`` queue, > > > > > > > > + > > > > > > > > +* The ``V4L2_BUF_CAP_SUPPORTS_TAGS`` capability must be set on the ``OUTPUT`` > > > > > > > > + and ``CAPTURE`` queues, > > > > > > > > + > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > > > +Decoding > > > > > > > > +======== > > > > > > > > + > > > > > > > > +For each frame, the client is responsible for submitting a request to which the > > > > > > > > +following is attached: > > > > > > > > + > > > > > > > > +* Exactly one frame worth of encoded data in a buffer submitted to the > > > > > > > > + ``OUTPUT`` queue, > > > > > > > > > > > > > > Although this is still the case in the cedrus driver (but will be fixed > > > > > > > eventually), this requirement should be dropped because metadata is > > > > > > > per-slice and not per-picture in the formats we're currently aiming to > > > > > > > support. > > > > > > > > > > > > > > I think it would be safer to mention something like filling the output > > > > > > > buffer with the minimum unit size for the selected output format, to > > > > > > > which the associated metadata applies. > > > > > > > > > > > > I'm not sure it's a good idea. Some of the reasons why I think so: > > > > > > 1) There are streams that can have even 32 slices. With that, you > > > > > > instantly run out of V4L2 buffers even just for 1 frame. > > > > > > 2) The Rockchip hardware which seems to just pick all the slices one > > > > > > after another and which was the reason to actually put the slice data > > > > > > in the buffer like that. > > > > > > 3) Not all the metadata is per-slice. Actually most of the metadata > > > > > > is per frame and only what is located inside v4l2_h264_slice_param is > > > > > > per-slice. The corresponding control is an array, which has an entry > > > > > > for each slice in the buffer. Each entry includes an offset field, > > > > > > which points to the place in the buffer where the slice is located. > > > > > > > > > > Sorry, I realize that my email wasn't very clear. What I meant to say > > > > > is that the spec should specify that "at least the minimum unit size > > > > > for decoding should be passed in a buffer" (that's maybe not the > > > > > clearest wording), instead of "one frame worth of". > > > > > > > > > > I certainly don't mean to say that each slice should be held in a > > > > > separate buffer and totally agree with all the points you're making :) > > > > > > > > Thanks for clarifying. I will update the document and post v3 accordingly. > > > > > > > > > I just think we should still allow userspace to pass slices with a > > > > > finer granularity than "all the slices required for one frame". > > > > > > > > I'm afraid that doing so could open the door to some ambiguities. If > > > > you allow that, then are you also allowed to send more than one frame > > > > if the decode parameters do not change? How do drivers that only > > > > support full frames react when handled only parts of a frame? > > > > > > IIRC the ability to pass individual slices was brought up regarding a > > > potential latency benefit, but I doubt it would really be that > > > significant. > > > > > > Thinking about it with the points you mentionned in mind, I guess the > > > downsides are much more significant than the potential gain. > > > > > > So let's stick with requiring all the slices for a frame then! > > > > Ack. > > > > My view is that we can still loosen this requirement in the future, > > possibly behind some driver capability flag, but starting with a > > simpler API, with less freedom to the applications and less > > constraints on hardware support sounds like a better practice in > > general. > > Sounds good, a capability flag would definitely make sense for that. > > > > > > Side point: After some discussions with Thierry Reading, who's looking > > > > > into the the Tegra VPU (also stateless), it seems that using the annex- > > > > > b format for h.264 would be best for everyone. So that means including > > > > > the start code, NAL header and "raw" slice data. I guess the same > > > > > should apply to other codecs too. But that should be in the associated > > > > > pixfmt spec, not in this general document. > > > > > > > > > > What do yout think? > > > > Hmm, wouldn't that effectively make it the same as V4L2_PIX_FMT_H264? > > Well, this would only concern the slice NAL unit. As far as I > understood, V4L2_PIX_FMT_H264 takes all sorts of NAL units. > Ah, passing only slice NAL units makes much more sense indeed. > > By the way, I proposed it once, some time ago, but it was rejected > > because VAAPI didn't get the full annex B stream and a V4L2 stateless > > VAAPI backend would have to reconstruct the stream. > > Oh, right I remember. After a close look, this is apparently not the > case, according to the VAAPI docs at: > http://intel.github.io/libva/structVASliceParameterBufferH264.html > > Also looking at ffmpeg, VAAPI and VDPAU seem to pass the same data, > except that VDPAU adds a start code prefix (which IIRC is required by > the rockchip decoder): > > - VAAPI: https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/vaapi_h264.c#L331 > - VDPAU: https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/vdpau_h264.c#L182 > > So I was initially a bit reluctant to make it part of the spec that the > full slice NAL should be passed since that would imply geting the NAL > header info both in parsed form through the control and in raw form > along with the slice data. But it looks like it might be rather common > for decoders to require this if the tegra decoder also needs it. If so, I think it makes perfect sense indeed. Best regards, Tomasz