Subject: Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.
To: Peter Zijlstra, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, Liam Girdwood, Mark Brown, linux-kernel@vger.kernel.org
References: <20210907132044.157225-1-maarten.lankhorst@linux.intel.com> <96ab9cf1-250a-8f34-51ec-4a7f66a87b39@linux.intel.com> <205e1591-343b-fb77-cfca-9c16af1484bd@linux.intel.com>
From: Maarten Lankhorst
Date: Thu, 16 Sep 2021 15:00:39 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 14-09-2021 at 15:54, Daniel Vetter wrote:
> On Tue, Sep 14, 2021 at 02:43:02PM +0200, Maarten Lankhorst wrote:
>> On 14-09-2021 at 08:50, Peter Zijlstra wrote:
>>> On Mon, Sep 13, 2021 at 10:42:36AM +0200, Maarten Lankhorst wrote:
>>>
>>>>> +/**
>>>>> + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire context
>>>>> + * @ww: mutex to lock
>>>>> + * @ww_ctx: optional w/w acquire context
>>>>> + *
>>>>> + * Trylocks a mutex with the optional acquire context; no deadlock detection is
>>>>> + * possible. Returns 1 if the mutex has been acquired successfully, 0 otherwise.
>>>>> + *
>>>>> + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a @ctx is
>>>>> + * specified, -EALREADY handling may happen in calls to ww_mutex_trylock.
>>>>> + *
>>>>> + * A mutex acquired with this function must be released with ww_mutex_unlock.
>>>>> + */
>>>>> +int ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx)
>>>>> +{
>>>>> +	if (!ww_ctx)
>>>>> +		return mutex_trylock(&ww->base);
>>>>> +
>>>>> +	MUTEX_WARN_ON(ww->base.magic != &ww->base);
>>>>> +
>>>>> +	if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
>>>>> +		return -EALREADY;
>>>> I'm not 100% sure this is a good idea, because it would make the
>>>> trylock weird. For i915 I checked manually, because I didn't want to
>>>> change the function signature. This is probably the other extreme.
>>>>
>>>> "if (ww_mutex_trylock())" would look correct, but actually be wrong
>>>> and lead to double unlock without adjustments. Maybe we could make a
>>>> ww_mutex_trylock_ctx_err, which would return -EALREADY or -EBUSY on
>>>> failure, and 0 on success? We could keep ww_mutex_trylock without
>>>> ctx, probably just #define as (!ww_mutex_trylock_ctx_err(lock, NULL))
>>> Urgh, yeah. Also, I suppose that if we already own it, we'll just fail
>>> the trylock anyway. Let me take this out.
>>>
>>>>> +	/*
>>>>> +	 * Reset the wounded flag after a kill. No other process can
>>>>> +	 * race and wound us here, since they can't have a valid owner
>>>>> +	 * pointer if we don't have any locks held.
>>>>> +	 */
>>>>> +	if (ww_ctx->acquired == 0)
>>>>> +		ww_ctx->wounded = 0;
>>>> Yeah I guess this needs fixing too. Not completely sure since trylock
>>>> wouldn't do the whole ww dance, but since it's our first lock,
>>>> probably best to do so regardless so other users don't trip over it.
>>> This is actually critical, because if this trylock is the first lock
>>> acquisition for the context, there won't be any other opportunity to
>>> reset this value.
>>>
>>>>> +
>>>>> +	if (__mutex_trylock(&ww->base)) {
>>>>> +		ww_mutex_set_context_fastpath(ww, ww_ctx);
>>>>> +		mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ww_ctx->dep_map, _RET_IP_);
>>>>> +		return 1;
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +EXPORT_SYMBOL(ww_mutex_trylock);
>>> Updated version below...
>>>
>>> ---
>>> Subject: kernel/locking: Add context to ww_mutex_trylock()
>>> From: Maarten Lankhorst
>>> Date: Thu, 9 Sep 2021 11:32:18 +0200
>>>
>>> From: Maarten Lankhorst
>>>
>>> i915 will soon gain an eviction path that trylock a whole lot of locks
>>> for eviction, getting dmesg failures like below:
>>>
>>> BUG: MAX_LOCK_DEPTH too low!
>>> turning off the locking correctness validator.
>>> depth: 48 max: 48!
>>> 48 locks held by i915_selftest/5776:
>>> #0: ffff888101a79240 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x88/0x160
>>> #1: ffffc900009778c0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_vma_pin.constprop.63+0x39/0x1b0 [i915]
>>> #2: ffff88800cf74de8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.63+0x5f/0x1b0 [i915]
>>> #3: ffff88810c7f9e38 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x1c4/0x9d0 [i915]
>>> #4: ffff88810bad5768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915]
>>> #5: ffff88810bad60e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915]
>>> ...
>>> #46: ffff88811964d768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915]
>>> #47: ffff88811964e0e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915]
>>> INFO: lockdep is turned off.
>>>
>>> Fixing eviction to nest into ww_class_acquire is a high priority, but
>>> it requires a rework of the entire driver, which can only be done one
>>> step at a time.
>>>
>>> As an intermediate solution, add an acquire context to
>>> ww_mutex_trylock, which allows us to do proper nesting annotations on
>>> the trylocks, making the above lockdep splat disappear.
>>>
>>> This is also useful in regulator_lock_nested, which may avoid dropping
>>> regulator_nesting_mutex in the uncontended path, so use it there.
>>>
>>> TTM may be another user for this, where we could lock a buffer in a
>>> fastpath with list locks held, without dropping all locks we hold.
>>>
>>> [peterz: rework actual ww_mutex_trylock() implementations]
>>> Signed-off-by: Maarten Lankhorst
>>> Signed-off-by: Peter Zijlstra (Intel)
>>> ---
>> My original patch series with this patch in place still passes i915 selftests, looks good to me. :)
> For merge logistics, can we pls have a stable branch? I expect that the
> i915 patches will be ready for 5.16.
>
> Or send it in for -rc2 so that the interface change doesn't cause needless
> conflicts, whatever you think is best.
> -Daniel

Yeah, a central branch that drm could pull from would make upstreaming the patches that depend on it easier. :)
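
For reference, a minimal userspace sketch of the "if (ww_mutex_trylock())" pitfall discussed in the quoted review. This is not kernel code: every toy_* name and type below is a hypothetical stand-in, and the "locking" is single-threaded bookkeeping only. It contrasts the first draft's tri-state return (1, 0 or -EALREADY) with a plain boolean return, where trying a lock the context already holds simply fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ww_acquire_ctx and ww_mutex (assumption). */
struct toy_ctx { int acquired; };
struct toy_ww_mutex { bool locked; struct toy_ctx *ctx; };

typedef int (*toy_trylock_fn)(struct toy_ww_mutex *, struct toy_ctx *);

/* Boolean variant: 1 if acquired, 0 otherwise (including "already ours"). */
int toy_trylock_bool(struct toy_ww_mutex *m, struct toy_ctx *ctx)
{
	assert(m != NULL);
	if (m->locked)
		return 0;
	m->locked = true;
	m->ctx = ctx;
	if (ctx)
		ctx->acquired++;
	return 1;
}

/* Tri-state variant modelled on the first draft: may return -EALREADY. */
int toy_trylock_tristate(struct toy_ww_mutex *m, struct toy_ctx *ctx)
{
	assert(m != NULL);
	if (ctx && m->locked && m->ctx == ctx)
		return -EALREADY;	/* nonzero, so "true" in an if () */
	return toy_trylock_bool(m, ctx);
}

/* Returns false when the mutex was not held, i.e. a double unlock. */
bool toy_unlock(struct toy_ww_mutex *m)
{
	assert(m != NULL);
	if (!m->locked)
		return false;
	if (m->ctx)
		m->ctx->acquired--;
	m->locked = false;
	m->ctx = NULL;
	return true;
}

/*
 * A naive caller that treats any nonzero return as "I took the lock"
 * and unlocks once per claimed acquisition. Returns how many of those
 * unlocks hit a mutex that was no longer held.
 */
int naive_double_unlocks(toy_trylock_fn trylock)
{
	struct toy_ctx ctx = { 0 };
	struct toy_ww_mutex m = { false, NULL };
	int claims = 0, bad_unlocks = 0;

	if (trylock(&m, &ctx))
		claims++;
	if (trylock(&m, &ctx))	/* second attempt on a lock we already hold */
		claims++;

	while (claims-- > 0)
		if (!toy_unlock(&m))
			bad_unlocks++;

	return bad_unlocks;
}
```

With the tri-state variant the naive caller counts the -EALREADY return as a second acquisition and unlocks twice; with the boolean variant the second attempt just fails and the unlock count matches.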