Message-ID: <19062a24c3aa2cb9e0410cf2884b4589e44c263b.camel@collabora.com>
Subject: Re: [PATCH 2/2] media: docs-rst: Document memory-to-memory video encoder interface
From: Ezequiel Garcia
To: Tomasz Figa, linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Stanimir Varbanov, Mauro Carvalho Chehab, Hans Verkuil, Pawel Osciak, Alexandre Courbot, kamil@wypas.org, a.hajda@samsung.com, Kyungmin Park, jtp.park@samsung.com, Philipp Zabel, Tiffany Lin (林慧珊), Andrew-CT Chen (陳智迪), todor.tomov@linaro.org, nicolas@ndufresne.ca, Paul Kocialkowski, Laurent Pinchart, Dave Stevenson
Date: Fri, 07 Sep 2018 17:17:03 -0300
In-Reply-To: <20180724140621.59624-3-tfiga@chromium.org>
References: <20180724140621.59624-1-tfiga@chromium.org> <20180724140621.59624-3-tfiga@chromium.org>
Organization: Collabora

On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote:
> Due to complexity of the video encoding process, the V4L2 drivers of
> stateful encoder hardware require specific sequences of V4L2 API calls
> to be followed. These include capability enumeration, initialization,
> encoding, encode parameters change, drain and reset.
>
> Specifics of the above have been discussed during Media Workshops at
> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> originated at those events was later implemented by the drivers we already
> have merged in mainline, such as s5p-mfc or coda.
>
> The only thing missing was the real specification included as a part of
> Linux Media documentation. Fix it now and document the encoder part of
> the Codec API.
>
> Signed-off-by: Tomasz Figa
> ---
>  Documentation/media/uapi/v4l/dev-encoder.rst | 550 +++++++++++++++++++
>  Documentation/media/uapi/v4l/devices.rst     |   1 +
>  Documentation/media/uapi/v4l/v4l2.rst        |   2 +
>  3 files changed, 553 insertions(+)
>  create mode 100644 Documentation/media/uapi/v4l/dev-encoder.rst
>
> diff --git a/Documentation/media/uapi/v4l/dev-encoder.rst b/Documentation/media/uapi/v4l/dev-encoder.rst
> new file mode 100644
> index 000000000000..28be1698e99c
> --- /dev/null
> +++ b/Documentation/media/uapi/v4l/dev-encoder.rst
> @@ -0,0 +1,550 @@
> +.. -*- coding: utf-8; mode: rst -*-
> +
> +.. _encoder:
> +
> +****************************************
> +Memory-to-memory Video Encoder Interface
> +****************************************
> +
> +Input data to a video encoder are raw video frames in display order
> +to be encoded into the output bitstream. Output data are complete chunks of
> +valid bitstream, including all metadata, headers, etc. The resulting stream
> +must not need any further post-processing by the client.
> +
> +Performing software stream processing, header generation etc. in the driver
> +in order to support this interface is strongly discouraged. In case such
> +operations are needed, use of Stateless Video Encoder Interface (in
> +development) is strongly advised.
> +
> +Conventions and notation used in this document
> +==============================================
> +
> +1. The general V4L2 API rules apply if not specified in this document
> +   otherwise.
> +
> +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC
> +   2119.
> +
> +3. All steps not marked “optional” are required.
> +
> +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used
> +   interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`,
> +   unless specified otherwise.
> +
> +5. Single-plane API (see spec) and applicable structures may be used
> +   interchangeably with Multi-plane API, unless specified otherwise,
> +   depending on driver capabilities and following the general V4L2
> +   guidelines.
> +
> +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i =
> +   [0..2]: i = 0, 1, 2.
> +
> +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue
> +   containing data (encoded frame/stream) that resulted from processing
> +   buffer A.
> +
> +Glossary
> +========
> +
> +CAPTURE
> +   the destination buffer queue; the queue of buffers containing encoded
> +   bitstream; ``V4L2_BUF_TYPE_VIDEO_CAPTURE```` or
> +   ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the
> +   hardware into ``CAPTURE`` buffers
> +
> +client
> +   application client communicating with the driver implementing this API
> +
> +coded format
> +   encoded/compressed video bitstream format (e.g. H.264, VP8, etc.);
> +   see also: raw format
> +
> +coded height
> +   height for given coded resolution
> +
> +coded resolution
> +   stream resolution in pixels aligned to codec and hardware requirements;
> +   typically visible resolution rounded up to full macroblocks; see also:
> +   visible resolution
> +
> +coded width
> +   width for given coded resolution
> +
> +decode order
> +   the order in which frames are decoded; may differ from display order if
> +   coded format includes a feature of frame reordering; ``CAPTURE`` buffers
> +   must be returned by the driver in decode order
> +
> +display order
> +   the order in which frames must be displayed; ``OUTPUT`` buffers must be
> +   queued by the client in display order
> +
> +IDR
> +   a type of a keyframe in H.264-encoded stream, which clears the list of
> +   earlier reference frames (DPBs)
> +
> +keyframe
> +   an encoded frame that does not reference frames decoded earlier, i.e.
> +   can be decoded fully on its own.
> +
> +macroblock
> +   a processing unit in image and video compression formats based on linear
> +   block transforms (e.g. H264, VP8, VP9); codec-specific, but for most of
> +   popular codecs the size is 16x16 samples (pixels)
> +
> +OUTPUT
> +   the source buffer queue; the queue of buffers containing raw frames;
> +   ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or
> +   ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data
> +   from ``OUTPUT`` buffers
> +
> +PPS
> +   Picture Parameter Set; a type of metadata entity in H.264 bitstream
> +
> +raw format
> +   uncompressed format containing raw pixel data (e.g. YUV, RGB formats)
> +
> +resume point
> +   a point in the bitstream from which decoding may start/continue, without
> +   any previous state/data present, e.g.: a keyframe (VP8/VP9) or
> +   SPS/PPS/IDR sequence (H.264); a resume point is required to start decode
> +   of a new stream, or to resume decoding after a seek
> +
> +source
> +   data fed to the encoder; ``OUTPUT``
> +
> +source height
> +   height in pixels for given source resolution
> +
> +source resolution
> +   resolution in pixels of source frames being source to the encoder and
> +   subject to further cropping to the bounds of visible resolution
> +
> +source width
> +   width in pixels for given source resolution
> +
> +SPS
> +   Sequence Parameter Set; a type of metadata entity in H.264 bitstream
> +
> +stream metadata
> +   additional (non-visual) information contained inside encoded bitstream;
> +   for example: coded resolution, visible resolution, codec profile
> +
> +visible height
> +   height for given visible resolution; display height
> +
> +visible resolution
> +   stream resolution of the visible picture, in pixels, to be used for
> +   display purposes; must be smaller or equal to coded resolution;
> +   display resolution
> +
> +visible width
> +   width for given visible resolution; display width
> +
> +Querying capabilities
> +=====================
> +
> +1. To enumerate the set of coded formats supported by the driver, the
> +   client may call :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``.
> +
> +   * The driver must always return the full set of supported formats,
> +     irrespective of the format set on the ``OUTPUT`` queue.
> +
> +2. To enumerate the set of supported raw formats, the client may call
> +   :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``.
> +
> +   * The driver must return only the formats supported for the format
> +     currently active on ``CAPTURE``.
> +

Paul and I were discussing the default active format on the CAPTURE
and OUTPUT queues, that is, the format that is active (if any) right
after the driver probes.

Currently, the v4l2-compliance tool tests the default active format
by requiring drivers to support:

    fmt = g_fmt()
    s_fmt(fmt)

Is this actually required? Should we also require this for stateful
and stateless codecs? If yes, should it be documented?

Regards,
Ezequiel