Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp620271imu; Mon, 26 Nov 2018 15:55:45 -0800 (PST) X-Google-Smtp-Source: AJdET5cuqYz1qs24WcQ0sNrFk7smZ4fM0KqYghJjGVe7Fv4vGLBHpjvE5KGnBC7os2pgXRGv726d X-Received: by 2002:a62:682:: with SMTP id 124-v6mr30472691pfg.161.1543276545194; Mon, 26 Nov 2018 15:55:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1543276545; cv=none; d=google.com; s=arc-20160816; b=OuMqZU42qCS65c+GHtahA9kplq93NneNjYHIjGiwlE9jOevuT7pkFupHDoIL82BNbS DZUEeEqsgA4g5z+Q8iz5Rf89z6Yxc4b2LTfRKabkfAKREii5A2h4qvh8mz4NRjCAspgL LpTbyOojwCsx/duVIv05BdimWGTpCqFhHhXnac3iWWykxwUBVX4XlYlVZbdz3WZNO2Dv esuj6slC+TM6k/gOmcLYJX4tiEVD4hZVSqOK639TtlOi0OtS552xrVVgxv0/6ltWpmxk kzBDldk4aIgVF/al6AfjSiMOSZudynh5aJPdH2GX+DmeKue9hVexMor+Qmx7UaZ8nSL0 LtXw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject; bh=gnMr2VdinD7DnojTrc/ex6vrEgy6sLoL2SkfMMghxro=; b=U6JtW5II8bSY9vcKW6bef/Yti6FDvynypWtTVbEPHyKJ5k/qLt+GxPj8aK/pHY0ADF I7keCkCC2YiElBEi7obZABjc+GJqDtKuU3rAmc/mesGwgofUfMnsNNgeW7MkG/1uBIZk MxpTgKleX/m0aX0nONYndUtyIj5hi95lWavulN9ZhMpBuTG08CCaDYrUDr4R42XD+unw vN+iAODWti1EaLfv5/FZUo8tvJAB15EjhMhM6fTZ7J2pjxY4cLC7TcTqEmw0d23RfN+G 7ae2P/SIjfoGk6TCoeaR5L/3k1mAm3x358fCcp5FLuDnIcxYW5l4hNw0HRdIPwoYI2zh Ij+A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=collabora.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u186si1768321pgd.131.2018.11.26.15.55.25; Mon, 26 Nov 2018 15:55:45 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=collabora.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728101AbeK0Kud (ORCPT + 99 others); Tue, 27 Nov 2018 05:50:33 -0500 Received: from bhuna.collabora.co.uk ([46.235.227.227]:44822 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728056AbeK0Kuc (ORCPT ); Tue, 27 Nov 2018 05:50:32 -0500 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: padovan) with ESMTPSA id 4B722260441 Subject: Re: [PATCH v3] drm/rockchip: update cursors asynchronously through atomic. To: Tomasz Figa , helen.koike@collabora.com, mzoran@crowfest.net, Eric Anholt Cc: Sandy Huang , =?UTF-8?Q?Heiko_St=c3=bcbner?= , David Airlie , "list@263.net:IOMMU DRIVERS" , Joerg Roedel , iommu@lists.linux-foundation.org, "list@263.net:IOMMU DRIVERS" , Joerg Roedel , joro@8bytes.org, "list@263.net:IOMMU DRIVERS" , Joerg Roedel , linux-arm-kernel@lists.infradead.org, Enric Balletbo i Serra , Linux Kernel Mailing List , dri-devel , "open list:ARM/Rockchip SoC..." , Sean Paul , kernel@collabora.com, =?UTF-8?Q?St=c3=a9phane_Marchesin?= References: <20181119190805.19139-1-helen.koike@collabora.com> From: Gustavo Padovan Organization: Collabora Ltd Message-ID: Date: Mon, 26 Nov 2018 21:54:06 -0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Tomasz, On 11/23/18 12:27 AM, Tomasz Figa wrote: > Hi Helen, > > On Fri, Nov 23, 2018 at 8:31 AM Helen Koike wrote: >> Hi Tomasz, >> >> On 11/20/18 4:48 AM, Tomasz Figa wrote: >>> Hi Helen, >>> >>> On Tue, Nov 20, 2018 at 4:08 AM Helen Koike wrote: >>>> From: Enric Balletbo i Serra >>>> >>>> Add support to async updates of cursors by using the new atomic >>>> interface for that. >>>> >>>> Signed-off-by: Enric Balletbo i Serra >>>> [updated for upstream] >>>> Signed-off-by: Helen Koike >>>> >>>> --- >>>> Hello, >>>> >>>> This is the third version of the async-plane update suport to the >>>> Rockchip driver. >>>> >>> Thanks for a quick respin. Please see my comments inline. (I'll try to >>> be better at responding from now on...) >>> >>>> I tested running igt kms_cursor_legacy and kms_atomic tests using a 96Boards Ficus. >>>> >>>> Note that before the patch, the following igt tests failed: >>>> >>>> basic-flip-before-cursor-atomic >>>> basic-flip-before-cursor-legacy >>>> cursor-vs-flip-atomic >>>> cursor-vs-flip-legacy >>>> cursor-vs-flip-toggle >>>> flip-vs-cursor-atomic >>>> flip-vs-cursor-busy-crc-atomic >>>> flip-vs-cursor-busy-crc-legacy >>>> flip-vs-cursor-crc-atomic >>>> flip-vs-cursor-crc-legacy >>>> flip-vs-cursor-legacy >>>> >>>> Full log: https://people.collabora.com/~koike/results-4.20/html/ >>>> >>>> Now with the patch applied the following were fixed: >>>> basic-flip-before-cursor-atomic >>>> basic-flip-before-cursor-legacy >>>> flip-vs-cursor-atomic >>>> flip-vs-cursor-legacy >>>> >>>> Full log: https://people.collabora.com/~koike/results-4.20-async/html/ >>> Could you also test modetest, with the -C switch to test the legacy >>> cursor API? I remember it triggering crashes due to synchronization >>> issues easily. >> Sure. I tested with >> $ modetest -M rockchip -s 37:1920x1080 -C >> >> I also vary the mode but I couldn't trigger any crashes. >> >>>> Tomasz, as you mentined in v2 about waiting the hardware before updating >>>> the framebuffer, now I call the loop you pointed out in the async path, >>>> was that what you had in mind? Or do you think I would make sense to >>>> call the vop_crtc_atomic_flush() instead of just exposing that loop? >>>> >>>> Thanks >>>> Helen >>>> >>>> Changes in v3: >>>> - Rebased on top of drm-misc >>>> - Fix missing include in rockchip_drm_vop.c >>>> - New function vop_crtc_atomic_commit_flush >>>> >>>> Changes in v2: >>>> - v2: https://patchwork.freedesktop.org/patch/254180/ >>>> - Change the framebuffer as well to cover jumpy cursor when hovering >>>> text boxes or hyperlink. (Tomasz) >>>> - Use the PSR inhibit mechanism when accessing VOP hardware instead of >>>> PSR flushing (Tomasz) >>>> >>>> Changes in v1: >>>> - Rebased on top of drm-misc >>>> - In async_check call drm_atomic_helper_check_plane_state to check that >>>> the desired plane is valid and update various bits of derived state >>>> (clipped coordinates etc.) >>>> - In async_check allow to configure new scaling in the fast path. >>>> - In async_update force to flush all registered PSR encoders. >>>> - In async_update call atomic_update directly. >>>> - In async_update call vop_cfg_done needed to set the vop registers and take effect. >>>> >>>> drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 36 ------- >>>> drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 37 +++++++ >>>> drivers/gpu/drm/rockchip/rockchip_drm_psr.h | 3 + >>>> drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 108 +++++++++++++++++--- >>>> 4 files changed, 131 insertions(+), 53 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c >>>> index ea18cb2a76c0..08bec50d9c5d 100644 >>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c >>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c >>>> @@ -127,42 +127,6 @@ rockchip_user_fb_create(struct drm_device *dev, struct drm_file *file_priv, >>>> return ERR_PTR(ret); >>>> } >>>> >>>> -static void >>>> -rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state) >>>> -{ >>>> - struct drm_crtc *crtc; >>>> - struct drm_crtc_state *crtc_state; >>>> - struct drm_encoder *encoder; >>>> - u32 encoder_mask = 0; >>>> - int i; >>>> - >>>> - for_each_old_crtc_in_state(state, crtc, crtc_state, i) { >>>> - encoder_mask |= crtc_state->encoder_mask; >>>> - encoder_mask |= crtc->state->encoder_mask; >>>> - } >>>> - >>>> - drm_for_each_encoder_mask(encoder, state->dev, encoder_mask) >>>> - rockchip_drm_psr_inhibit_get(encoder); >>>> -} >>>> - >>>> -static void >>>> -rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state) >>>> -{ >>>> - struct drm_crtc *crtc; >>>> - struct drm_crtc_state *crtc_state; >>>> - struct drm_encoder *encoder; >>>> - u32 encoder_mask = 0; >>>> - int i; >>>> - >>>> - for_each_old_crtc_in_state(state, crtc, crtc_state, i) { >>>> - encoder_mask |= crtc_state->encoder_mask; >>>> - encoder_mask |= crtc->state->encoder_mask; >>>> - } >>>> - >>>> - drm_for_each_encoder_mask(encoder, state->dev, encoder_mask) >>>> - rockchip_drm_psr_inhibit_put(encoder); >>>> -} >>>> - >>>> static void >>>> rockchip_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state) >>>> { >>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c >>>> index 01ff3c858875..22a70ab6e214 100644 >>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c >>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c >>>> @@ -13,6 +13,7 @@ >>>> */ >>>> >>>> #include >>>> +#include >>>> #include >>>> >>>> #include "rockchip_drm_drv.h" >>>> @@ -109,6 +110,42 @@ int rockchip_drm_psr_inhibit_put(struct drm_encoder *encoder) >>>> } >>>> EXPORT_SYMBOL(rockchip_drm_psr_inhibit_put); >>>> >>>> +void rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state) >>>> +{ >>>> + struct drm_crtc *crtc; >>>> + struct drm_crtc_state *crtc_state; >>>> + struct drm_encoder *encoder; >>>> + u32 encoder_mask = 0; >>>> + int i; >>>> + >>>> + for_each_old_crtc_in_state(state, crtc, crtc_state, i) { >>>> + encoder_mask |= crtc_state->encoder_mask; >>>> + encoder_mask |= crtc->state->encoder_mask; >>>> + } >>>> + >>>> + drm_for_each_encoder_mask(encoder, state->dev, encoder_mask) >>>> + rockchip_drm_psr_inhibit_get(encoder); >>>> +} >>>> +EXPORT_SYMBOL(rockchip_drm_psr_inhibit_get_state); >>>> + >>>> +void rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state) >>>> +{ >>>> + struct drm_crtc *crtc; >>>> + struct drm_crtc_state *crtc_state; >>>> + struct drm_encoder *encoder; >>>> + u32 encoder_mask = 0; >>>> + int i; >>>> + >>>> + for_each_old_crtc_in_state(state, crtc, crtc_state, i) { >>>> + encoder_mask |= crtc_state->encoder_mask; >>>> + encoder_mask |= crtc->state->encoder_mask; >>>> + } >>>> + >>>> + drm_for_each_encoder_mask(encoder, state->dev, encoder_mask) >>>> + rockchip_drm_psr_inhibit_put(encoder); >>>> +} >>>> +EXPORT_SYMBOL(rockchip_drm_psr_inhibit_put_state); >>>> + >>>> /** >>>> * rockchip_drm_psr_inhibit_get - acquire PSR inhibit on given encoder >>>> * @encoder: encoder to obtain the PSR encoder >>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.h b/drivers/gpu/drm/rockchip/rockchip_drm_psr.h >>>> index 860c62494496..25350ba3237b 100644 >>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.h >>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.h >>>> @@ -20,6 +20,9 @@ void rockchip_drm_psr_flush_all(struct drm_device *dev); >>>> int rockchip_drm_psr_inhibit_put(struct drm_encoder *encoder); >>>> int rockchip_drm_psr_inhibit_get(struct drm_encoder *encoder); >>>> >>>> +void rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state); >>>> +void rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state); >>>> + >>>> int rockchip_drm_psr_register(struct drm_encoder *encoder, >>>> int (*psr_set)(struct drm_encoder *, bool enable)); >>>> void rockchip_drm_psr_unregister(struct drm_encoder *encoder); >>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>> index fb70fb486fbf..176d6e8207ed 100644 >>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>> @@ -15,6 +15,7 @@ >>>> #include >>>> #include >>>> #include >>>> +#include >>>> #include >>>> #include >>>> #include >>>> @@ -819,10 +820,99 @@ static void vop_plane_atomic_update(struct drm_plane *plane, >>>> spin_unlock(&vop->reg_lock); >>>> } >>>> >>>> +static int vop_plane_atomic_async_check(struct drm_plane *plane, >>>> + struct drm_plane_state *state) >>>> +{ >>>> + struct vop_win *vop_win = to_vop_win(plane); >>>> + const struct vop_win_data *win = vop_win->data; >>>> + int min_scale = win->phy->scl ? FRAC_16_16(1, 8) : >>>> + DRM_PLANE_HELPER_NO_SCALING; >>>> + int max_scale = win->phy->scl ? FRAC_16_16(8, 1) : >>>> + DRM_PLANE_HELPER_NO_SCALING; >>>> + struct drm_crtc_state *crtc_state; >>>> + int ret; >>>> + >>>> + if (plane != state->crtc->cursor) >>>> + return -EINVAL; >>>> + >>>> + if (!plane->state) >>>> + return -EINVAL; >>>> + >>>> + if (!plane->state->fb) >>>> + return -EINVAL; >>>> + >>>> + if (state->state) >>>> + crtc_state = drm_atomic_get_existing_crtc_state(state->state, >>>> + state->crtc); >>>> + else /* Special case for asynchronous cursor updates. */ >>>> + crtc_state = plane->crtc->state; >>>> + >>>> + ret = drm_atomic_helper_check_plane_state(plane->state, >>>> + crtc_state, >>>> + min_scale, max_scale, >>>> + true, true); >>>> + return ret; >>>> +} >>>> + >>>> +static void vop_crtc_atomic_commit_flush(struct drm_crtc *crtc, >>>> + struct drm_crtc_state *old_crtc_state) >>>> +{ >>>> + struct drm_atomic_state *old_state = old_crtc_state->state; >>>> + struct drm_plane_state *old_plane_state, *new_plane_state; >>>> + struct vop *vop = to_vop(crtc); >>>> + struct drm_plane *plane; >>>> + int i; >>>> + >>>> + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, >>>> + new_plane_state, i) { >>> Hmm, from what I can see, we're not going through the full atomic >>> commit sequence, with state flip, so I'm not sure where we would get >>> the new state here from. >>> >>>> + if (!old_plane_state->fb) >>>> + continue; >>>> + >>>> + if (old_plane_state->fb == new_plane_state->fb) >>>> + continue; >>>> + >>>> + drm_framebuffer_get(old_plane_state->fb); >>>> + WARN_ON(drm_crtc_vblank_get(crtc) != 0); >>>> + drm_flip_work_queue(&vop->fb_unref_work, old_plane_state->fb); >>>> + set_bit(VOP_PENDING_FB_UNREF, &vop->pending); >>>> + } >>>> +} >>>> + >>>> +static void vop_plane_atomic_async_update(struct drm_plane *plane, >>>> + struct drm_plane_state *new_state) >>>> +{ >>>> + struct vop *vop = to_vop(plane->state->crtc); >>>> + >>>> + if (vop->crtc.state->state) >>>> + vop_crtc_atomic_commit_flush(&vop->crtc, vop->crtc.state); >>> Since we just operate on one plane here, we could just do like this: >>> >>> if (plane->state->fb && plane->state->fb != new_state->fb) { >>> drm_framebuffer_get(plane->state->fb); >>> WARN_ON(drm_crtc_vblank_get(crtc) != 0); >>> drm_flip_work_queue(&vop->fb_unref_work, plane->state->fb); >>> set_bit(VOP_PENDING_FB_UNREF, &vop->pending); >>> } >>> >>> However, we cannot simply to this here, because it races with the >>> vblank interrupt. We need to program the hw plane with the new fb >>> first and trigger the update. This needs all the careful handling that >>> is done in vop_crtc_atomic_flush() and so my original suggestion to >>> just call it. >> vop_crtc_atomic_flush() also updates the crtc->state->event, I don't >> think we want that. > Good point, we don't want to touch the event. > >> And actually I don't think we have this race condition, please see below >> > Just to clarify, the race conditions I'm talking here are not anything > new to this patch alone. I had actually observed those when > implementing the Chrome OS downstream async cursor code before: > > https://chromium.googlesource.com/chromiumos/third_party/kernel/+/65d4ff0af3f8c7ebecad8f3b402b546f277d9225/drivers/gpu/drm/rockchip/rockchip_drm_vop.c#1015 > > Note that the code there actually duplicates the current plane state, > updates the copy, swaps both states and then update the hardware based > on both new and old states, as the regular atomic commit flow would > do. With that, no reference counts are dropped until the old state > gets destroyed at the end of the function. Also note the if block that > handles the condition of (plane_state->fb && plane_state->fb != > plane->state->fb) (plane_state is the old state after the swap). > >>> Of course to call it in its current shape, one needs to have a full >>> atomic state from a commit, after a flip, but we only have the new >>> plane state here. Perhaps you could duplicate existing state, update >>> the desired plane state, flip and then call vop_crtc_atomic_flush()? >> Could you please clarify your proposal? You mean duplicating >> plane->state ? I'm trying to see how this would fit it in the code. >> The drm_atomic_state structure at plate->state->state is actually always >> NULL (as an async update is applied right away). >> >> >>> Best regards, >>> Tomasz >>> >> From your comment in v2: >> >>> Isn't this going to drop the old fb reference on the floor without >>> waiting for the hardware to actually stop scanning out from it? >> I've been trying to analyze this better, I also got some help from >> Gustavo Padovan, and I think there is no problem here as the >> configuration we are doing here will just be taken into consideration by >> the hardware in the next vblank, so I guess we can set and re-set >> plane->fb as much as we want. > The point here is not about setting and resetting the plane->fb > pointer. It's about what happens inside drm_atomic_set_fb_for_plane(). > > It calls drm_framebuffer_get() for the new fb and > drm_framebuffer_put() for the old fb. In result, if the fb changes, > the old fb, which had its reference count incremented in the atomic > commit that set it to the plane before, has its reference count > decremented. Moreover, if the new reference count becomes 0, > drm_framebuffer_put() will immediately free the buffer. > > Freeing a buffer when the hardware is still scanning out of it isn't a > good idea, is it? My understanding is that in the async update path we never touch the fb that is still being scanned out - what you are referring as old fb. That is the design of async update. drm_atomic_set_fb_for_plane() replace pointers and refs between the fb we just received from userspace with the one on plane->state (that after the atomic swap holds the new state) which won't be scanned out until the next flip. However I don't understand this part of the rockchip hardware, the fb is not the one being scanned out. Is this still a problem? Regards, Gustavo -- Gustavo Padovan Collabora Ltd