Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp593195ybh; Wed, 22 Jul 2020 08:20:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy2E0UkMFQCQ0Evi1rTGeuS2zHIZj4+Hj2Ykn6njrxe2iOFBAa/RuYEa4Ogsw7Yf0Up3Wov X-Received: by 2002:a17:906:1e43:: with SMTP id i3mr79838ejj.92.1595431241311; Wed, 22 Jul 2020 08:20:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595431241; cv=none; d=google.com; s=arc-20160816; b=1Lt79lZ3dufRu/onphdDsjYWAxxIHuzy2/O1Ol/33P9S3NeNFLoS/ksNh0x1jBovg+ PWF6ct3sdhUYBLLIbrFJJ30J8P7AUy4NbUehnth4xwXg3IX7jPkaeURBv078Iw03sVLT HJ04WJQhV36iDDdtccBBFkEHECl9bmUy5vU2fZRJa8Tz12GcmcjMP7PC7qxHNRy1Nk+l 2GE2itUlfMn4h7HP1e7ap2HS1oz906m6YrpWSMm2IloUWXBc9tOf8YG4OkgD4WVihvOi nCbdmO/zW3Wzw9iCQdKSQsyN5CEFlcQnz9Hj8/pngDF5ysxC3yShrpCmvqzsZJ0zwvGu 6Gxw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=GFIb9RkIl/5PngvU9D0g83i5c49gu3ClHdgFBkg+KM8=; b=XdMfPnu7BSbYhpNB9KtYGsEhYupUvpj4x4b5AHD5IjoCb+c3SAzNjJ/yjcBF6/CxB1 YSyjy4MzPw7+c0lwvcqWt4+f+yKJZrjjW/lAvFPuKd8eXBWueqhnph7yfvmr1SJ9AvWJ xJJX7fzlpz7MGaLbqH7QmmHvmXCBxMR9KIWUPlBc9nx5aJgtcC5UuMPQ0WgDYuD6NF7Y 5k2VxHqZanuQRBDYq+Zbkzx/afBkML84DUTMWMsVjbr6NjbhNHWydINQ2CtrKZt5L78v EiYv7ANGhJ+IP3qcVOs1ebBcaZNcU8+pYTjtnAYc4heWbOpep8NstZLMNnPbcXjeVoYX V/iw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ffwll.ch header.s=google header.b=RXUSEw8q; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id mc27si194828ejb.272.2020.07.22.08.20.17; Wed, 22 Jul 2020 08:20:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@ffwll.ch header.s=google header.b=RXUSEw8q; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732628AbgGVPRS (ORCPT + 99 others); Wed, 22 Jul 2020 11:17:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728365AbgGVPRR (ORCPT ); Wed, 22 Jul 2020 11:17:17 -0400 Received: from mail-ot1-x344.google.com (mail-ot1-x344.google.com [IPv6:2607:f8b0:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81591C0619DC for ; Wed, 22 Jul 2020 08:17:17 -0700 (PDT) Received: by mail-ot1-x344.google.com with SMTP id g37so2000490otb.9 for ; Wed, 22 Jul 2020 08:17:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=GFIb9RkIl/5PngvU9D0g83i5c49gu3ClHdgFBkg+KM8=; b=RXUSEw8qJFw8zLM/pCnMRbbGAaNGa6zJKqY2tKmsL3PX1BCaa3s3lIOXS2gBkQs1UB dfezQ5FiOvR54RgsynRvuMmNUJqxnTxnC8kS3N1z5b7efvEqVJ7/2vT9mF4SDvhmnzDg efRMbUpNZ1UUrXfwqg3k4U7dyLaEdiKBtaY0Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=GFIb9RkIl/5PngvU9D0g83i5c49gu3ClHdgFBkg+KM8=; b=YoPiwvJmXLK9i77LAnujDToTp+uv+onlliFopSfV4R3BZqhMqMCBiDSJvgXx1LMS/r yll4257qXoj02cOP7KURuajTk0fXGs/+JaVV2WzgIFFzP9904EAkueEY4TiSfh5o2zxe Pk3L+EVyzxNCfNQVWRM1xuWWEPCKyNOj7NKZAWl5YorK7yBffn522QLrO+hKfO+W76X2 gmpjugs4ob6xYBNj4W7Hk5a5OO2TAAYOBTWxHKnYzJKdcd2UOcSqgm7XT93iKyrQT/lv rR7AvL9EqK3He/iYZcpRy/hFBMG8HUKVzdN0IjNiNyd0iQjYjiYAvCbUtQxqSNJHR8Th CNVw== X-Gm-Message-State: AOAM533PvYZJnHPH9KiEyDymLOedGPfVaBR1cAgINGLHe9x9qcnNf9yC GYTaxrk4qFkP7ATYTiRxTOUlG6LIwvTll0Aq568VVQ== X-Received: by 2002:a05:6830:1613:: with SMTP id g19mr366916otr.303.1595431036827; Wed, 22 Jul 2020 08:17:16 -0700 (PDT) MIME-Version: 1.0 References: <20200722110411.ebkv6knvc6mzw5uf@smtp.gmail.com> <20200722120502.GK6419@phenom.ffwll.local> <20200722140604.27dfzfnzug5vb75r@smtp.gmail.com> In-Reply-To: <20200722140604.27dfzfnzug5vb75r@smtp.gmail.com> From: Daniel Vetter Date: Wed, 22 Jul 2020 17:17:05 +0200 Message-ID: Subject: Re: [PATCH] drm/vkms: add missing drm_crtc_vblank_put to the get/put pair on flush To: Melissa Wen Cc: Rodrigo Siqueira , Haneen Mohammed , David Airlie , Rodrigo Siqueira , dri-devel , Linux Kernel Mailing List , kernel-usp@googlegroups.com, Trevor Woerner Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 22, 2020 at 4:06 PM Melissa Wen wrote: > > On 07/22, daniel@ffwll.ch wrote: > > On Wed, Jul 22, 2020 at 08:04:11AM -0300, Melissa Wen wrote: > > > This patch adds a missing drm_crtc_vblank_put op to the pair > > > drm_crtc_vblank_get/put (inc/decrement counter to guarantee vblanks). > > > > > > It clears the execution of the following kms_cursor_crc subtests: > > > 1. pipe-A-cursor-[size,alpha-opaque, NxN-(on-screen, off-screen, sliding, > > > random, fast-moving])] - successful when running individually. > > > 2. pipe-A-cursor-dpms passes again > > > 3. pipe-A-cursor-suspend also passes > > > > > > The issue was initially tracked in the sequential execution of IGT > > > kms_cursor_crc subtest: when running the test sequence or one of its > > > subtests twice, the odd execs complete and the pairs get stuck in an > > > endless wait. In the IGT code, calling a wait_for_vblank before the start > > > of CRC capture prevented the busy-wait. But the problem persisted in the > > > pipe-A-cursor-dpms and -suspend subtests. > > > > > > Checking the history, the pipe-A-cursor-dpms subtest was successful when, > > > in vkms_atomic_commit_tail, instead of using the flip_done op, it used > > > wait_for_vblanks. Another way to prevent blocking was wait_one_vblank when > > > enabling crtc. However, in both cases, pipe-A-cursor-suspend persisted > > > blocking in the 2nd start of CRC capture, which may indicate that > > > something got stuck in the step of CRC setup. Indeed, wait_one_vblank in > > > the crc setup was able to sync things and free all kms_cursor_crc > > > subtests. > > > > > > Tracing and comparing a clean run with a blocked one: > > > - in a clean one, vkms_crtc_atomic_flush enables vblanks; > > > - when blocked, only in next op, vkms_crtc_atomic_enable, the vblanks > > > started. Moreover, a series of vkms_vblank_simulate flow out until > > > disabling vblanks. > > > Also watching the steps of vkms_crtc_atomic_flush, when the very first > > > drm_crtc_vblank_get returned an error, the subtest crashed. On the other > > > hand, when vblank_get succeeded, the subtest completed. Finally, checking > > > the flush steps: it increases counter to hold a vblank reference (get), > > > but there isn't a op to decreased it and release vblanks (put). > > > > > > Cc: Daniel Vetter > > > Cc: Rodrigo Siqueira > > > Cc: Haneen Mohammed > > > Signed-off-by: Melissa Wen > > > --- > > > drivers/gpu/drm/vkms/vkms_crtc.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c > > > index ac85e17428f8..a99d6b4a92dd 100644 > > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c > > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c > > > @@ -246,6 +246,7 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc, > > > > > > spin_unlock(&crtc->dev->event_lock); > > > > > > + drm_crtc_vblank_put(crtc); > > > > Uh so I reviewed this a bit more carefully now, and I dont think this is > > the correct bugfix. From the kerneldoc of drm_crtc_arm_vblank_event(): > > > > * Caller must hold a vblank reference for the event @e acquired by a > > * drm_crtc_vblank_get(), which will be dropped when the next vblank arrives. > > > > So when we call drm_crtc_arm_vblank_event then the vblank_put gets called > > for us. And that's the only case where we successfully acquired a vblank > > interrupt reference since on failure of drm_crtc_vblank_get (0 indicates > > success for that function, failure negative error number) we directly send > > out the event. > > > > So something else fishy is going on, and now I'm totally confused why this > > even happens. > > > > We also have a pile of WARN_ON checks in drm_crtc_vblank_put to make sure > > we don't underflow the refcount, so it's also not that I think (except if > > this patch creates more WARNING backtraces). > > > > But clearly it changes behaviour somehow ... can you try to figure out > > what changes? Maybe print out the vblank->refcount at various points in > > the driver, and maybe also trace when exactly the fake vkms vblank hrtimer > > is enabled/disabled ... > > :( > > I can check these, but I also have other suspicions. When I place the > drm_crct_vblank_put out of the if (at the end of flush), it not only solve > the issue of blocking on kms_cursor_crc, but also the WARN_ON on kms_flip > doesn't appear anymore (a total cleanup). Just after: > > vkms_output->composer_state = to_vkms_crtc_state(crtc->state); > > looks like there is something stuck around here. Hm do you have the full WARNING for this? Maybe this gives me an idea what's going wrong. > Besides, there is a lock at atomic_begin: > > /* This lock is held across the atomic commit to block vblank timer > * from scheduling vkms_composer_worker until the composer is updated > */ > spin_lock_irq(&vkms_output->lock); > > that seems to be released on atomic_flush and make me suspect something > missing on the composer update. atomic_begin/atomic_flush are symmetric functions an always called around all the plane updates. So having the spin_lock in _begin and the spin_unlock in _flush should be symmetric and correct. If you want to make sure, recompile with CONFIG_PROVE_LOCKING, which should immmediately give you a huge splat in dmesg if there's anything unbalanced with locking. > I'll check all these things and come back with news (hope) :) Have fun chasing stuff :-) Cheers, Daniel > > Thanks, > > Melissa > > > > I'm totally confused about what's going on here now. > > -Daniel > > > > > crtc->state->event = NULL; > > > } > > > > > > -- > > > 2.27.0 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch