From: Tomasz Figa
Date: Wed, 20 Nov 2019 21:40:52 +0900
Subject: Re: [PATCH v3 1/5] media: hantro: Fix H264 motion vector buffer offset
To: Jonas Karlman
Cc: Mauro Carvalho Chehab, Ezequiel Garcia, Hans Verkuil, Boris Brezillon,
    Philipp Zabel, "linux-media@vger.kernel.org", "linux-kernel@vger.kernel.org"

Hi Jonas,

On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman wrote:
>
> A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
> macroblock and is laid out in memory as follow:
>
> +---------------------------+
> | Y-plane   256 bytes x MBs |
> +---------------------------+
> | UV-plane  128 bytes x MBs |
> +---------------------------+
> | MV buffer  64 bytes x MBs |
> +---------------------------+
>
> The motion vector buffer offset is currently correct for 4:2:0 because the
> extra space for motion vectors is overallocated with an extra 64 bytes x MBs.
>
> Wrong offset for both destination and motion vector buffer are used
> for the bottom field of field encoded content, wrong offset is
> also used for 4:0:0 (monochrome) content.
>
> Fix this by setting the motion vector address to the expected 384 bytes x MBs
> offset for 4:2:0 and 256 bytes x MBs offset for 4:0:0 content.
>
> Also use correct destination and motion vector buffer offset
> for the bottom field of field encoded content.
>
> While at it also extend the check for 4:0:0 (monochrome) to include an
> additional check for High Profile (100).
>
> Fixes: dea0a82f3d22 ("media: hantro: Add support for H264 decoding on G1")
> Signed-off-by: Jonas Karlman
> Reviewed-by: Boris Brezillon
> ---
> Changes in v3:
> * address remarks from Boris
>   - use src_fmt instead of dst_fmt
> Changes in v2:
> * address remarks from Philipp and Ezequiel
>   - update commit message
>   - rename offset to bytes_per_mb
>   - remove MV_OFFSET macros
>   - move PIC_MB_WIDTH/HEIGHT_P change to separate patch
> ---
>  .../staging/media/hantro/hantro_g1_h264_dec.c | 31 +++++++++++++------
>  1 file changed, 22 insertions(+), 9 deletions(-)
>

First of all, thanks for the patches! Good to see more members of the
community contributing to the driver. Please find my comments inline.

> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> index 70a6b5b26477..30d977c3d529 100644
> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> @@ -81,7 +81,7 @@ static void set_params(struct hantro_ctx *ctx)
>  		reg |= G1_REG_DEC_CTRL4_CABAC_E;
>  	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
>  		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
> -	if (sps->chroma_format_idc == 0)
> +	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)

I'd rather make this a separate patch with a proper explanation in the
commit message.

>  		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
>  	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
>  		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
> @@ -234,6 +234,7 @@ static void set_buffers(struct hantro_ctx *ctx)
>  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>  	struct hantro_dev *vpu = ctx->dev;
>  	dma_addr_t src_dma, dst_dma;
> +	size_t offset = 0;
>
>  	src_buf = hantro_get_src_buf(ctx);
>  	dst_buf = hantro_get_dst_buf(ctx);
> @@ -244,18 +245,30 @@ static void set_buffers(struct hantro_ctx *ctx)
>
>  	/* Destination (decoded frame) buffer. */
>  	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> -	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
> +	/* Adjust dma addr to start at second line for bottom field */
> +	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
> +		offset = ALIGN(ctx->src_fmt.width, MB_DIM);

Isn't ctx->src_fmt.width already aligned to MB_DIM? Also, offset is in
bytes, so should we rather use the bytesperline field?

> +	vdpu_write_relaxed(vpu, dst_dma + offset, G1_REG_ADDR_DST);
>
>  	/* Higher profiles require DMV buffer appended to reference frames. */
>  	if (ctrls->sps->profile_idc > 66 && ctrls->decode->nal_ref_idc) {
> -		size_t pic_size = ctx->h264_dec.pic_size;
> -		size_t mv_offset = round_up(pic_size, 8);
> -
> +		unsigned int bytes_per_mb = 384;
> +
> +		/* DMV buffer for monochrome start directly after Y-plane */
> +		if (ctrls->sps->profile_idc >= 100 &&
> +		    ctrls->sps->chroma_format_idc == 0)
> +			bytes_per_mb = 256;

nit: Adding a blank line here would make it much easier to read.

> +		offset = bytes_per_mb * MB_WIDTH(ctx->src_fmt.width) *
> +			 MB_HEIGHT(ctx->src_fmt.height);

It's kind of difficult to follow with this idea of bytes_per_mb IMHO.
Would it perhaps make sense to rewrite the code as below?

	luma_size = ctx->src_fmt.planes[0].bytesperline *
		    ctx->src_fmt.height;
	if (ctrls->sps->profile_idc >= 100 &&
	    ctrls->sps->chroma_format_idc == 0)
		chroma_size = 0;
	else
		chroma_size = ctx->src_fmt.planes[0].bytesperline *
			      ctx->src_fmt.height / 2;
	offset = luma_size + chroma_size;

Also, the code only handles 4:2:0 and 4:0:0. How about 4:2:2?

Best regards,
Tomasz

> +
> +		/*
> +		 * DMV buffer is split in two for field encoded frames,
> +		 * adjust offset for bottom field
> +		 */
>  		if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
> -			mv_offset += 32 * MB_WIDTH(ctx->dst_fmt.width);
> -
> -		vdpu_write_relaxed(vpu, dst_dma + mv_offset,
> -				   G1_REG_ADDR_DIR_MV);
> +			offset += 32 * MB_WIDTH(ctx->src_fmt.width) *
> +				  MB_HEIGHT(ctx->src_fmt.height);
> +		vdpu_write_relaxed(vpu, dst_dma + offset, G1_REG_ADDR_DIR_MV);
>  	}
>
>  	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
> --
> 2.17.1
>
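
For reference, the offset arithmetic discussed in this thread can be sketched
as a standalone helper. This is only an illustration, not driver code:
mb_count() and dmv_offset() are hypothetical names, and the per-macroblock
sizes (256 bytes luma, 128 bytes interleaved chroma, 32 bytes of motion
vectors per field) are taken from the quoted commit message and diff. It
assumes an 8-bit format with no line padding and, like the patch, covers only
4:2:0 and 4:0:0.

```c
#include <assert.h>
#include <stddef.h>

#define MB_DIM 16u	/* an H.264 macroblock covers 16x16 luma samples */

/* Hypothetical helper: number of macroblocks covering a width x height frame. */
static size_t mb_count(unsigned int width, unsigned int height)
{
	size_t mb_w = (width + MB_DIM - 1) / MB_DIM;
	size_t mb_h = (height + MB_DIM - 1) / MB_DIM;

	return mb_w * mb_h;
}

/*
 * Hypothetical helper: byte offset of the motion vector (DMV) area from the
 * start of the destination buffer.
 *   monochrome   - non-zero for 4:0:0 content (no UV plane)
 *   bottom_field - non-zero when decoding the bottom field
 */
static size_t dmv_offset(unsigned int width, unsigned int height,
			 int monochrome, int bottom_field)
{
	size_t mbs, luma_size, chroma_size, offset;

	assert(width > 0 && height > 0);

	mbs = mb_count(width, height);
	luma_size = 256 * mbs;				/* Y plane */
	chroma_size = monochrome ? 0 : 128 * mbs;	/* interleaved UV plane */
	offset = luma_size + chroma_size;

	/* The 64 bytes x MBs DMV area is split in two halves, one per field. */
	if (bottom_field)
		offset += 32 * mbs;

	return offset;
}
```

With these assumptions a single-macroblock 4:2:0 frame gives
dmv_offset(16, 16, 0, 0) == 384, and the monochrome case gives 256, which
matches the 384 and 256 bytes x MBs figures in the commit message.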