From: Thomas Gleixner
To: Linus Torvalds, Tetsuo Handa
Cc: Rodrigo Siqueira, Melissa Wen, Maira Canal, Haneen Mohammed, Daniel Vetter, David Airlie, DRI, syzkaller@googlegroups.com, LKML, Hillf Danton, Sanan Hasanov
Subject: Re: drm/vkms: deadlock between dev->event_lock and timer
References: <20230913110709.6684-1-hdanton@sina.com> <99d99007-8385-31df-a659-665bf50193bc@I-love.SAKURA.ne.jp>
Date: Wed, 13 Sep 2023 23:08:21 +0200
Message-ID: <87pm2lzsqi.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 13 2023 at 09:47, Linus Torvalds wrote:
> On Wed, 13 Sept 2023 at 07:21, Tetsuo Handa wrote:
>>
>> Hello. A deadlock was reported in drivers/gpu/drm/vkms/ .
>> It looks like this locking pattern remains as of 6.6-rc1. Please fix.
>>
>> void drm_crtc_vblank_off(struct drm_crtc *crtc) {
>>   spin_lock_irq(&dev->event_lock);
>>   drm_vblank_disable_and_save(dev, pipe) {
>>     __disable_vblank(dev, pipe) {
>>       crtc->funcs->disable_vblank(crtc) == vkms_disable_vblank {
>>         hrtimer_cancel(&out->vblank_hrtimer) { // Retries with dev->event_lock held until lock_hrtimer_base() succeeds.
>>           ret = hrtimer_try_to_cancel(timer) {
>>             base = lock_hrtimer_base(timer, &flags); // Fails forever because vkms_vblank_simulate() is in progress.
>
> Heh. Ok. This is clearly a bug, but it does seem to be limited to just
> the vkms driver, and literally only to the "simulate vblank" case.
>
> The worst part about it is that it's so subtle and not obvious.
>
> Some light grepping seems to show that amdgpu has almost the exact
> same pattern in its own vkms thing, except it uses
>
>         hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer);
>
> directly, which presumably fixes the livelock, but means that the
> cancel will fail if it's currently running.
>
> So just doing the same thing in the vkms driver probably fixes things.
>
> Maybe the vkms people need to add a flag to say "it's canceled" so
> that it doesn't then get re-enabled? Or maybe it doesn't matter
> and/or already happens for some reason I didn't look into.

Maybe the VKMS people need to understand locking in the first place. The
first thing I saw in this code is:

static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
{
   ...
   mutex_unlock(&output->enabled_lock);

What?

Unlocking a mutex in the context of a hrtimer callback is simply
violating all mutex locking rules.

How has this code ever survived lock debugging without triggering a big
fat warning?

Thanks,

        tglx