From: Ralph Campbell <rcampbell@nvidia.com>
To: , , , ,
CC: Jerome Glisse, John Hubbard, Christoph Hellwig, Jason Gunthorpe,
    Andrew Morton, Ben Skeggs, Shuah Khan, Ralph Campbell
Subject: [PATCH v6 2/6] mm/mmu_notifier: add mmu_interval_notifier_put()
Date: Mon, 13 Jan 2020 14:46:59 -0800
Message-ID: <20200113224703.5917-3-rcampbell@nvidia.com>
In-Reply-To: <20200113224703.5917-1-rcampbell@nvidia.com>
References: <20200113224703.5917-1-rcampbell@nvidia.com>

mmu_interval_notifier_remove() can't be called safely from inside the
invalidate() callback because it sleeps waiting for invalidate callbacks
to finish. Removals might be needed when the invalidate() callback is
for munmap() (i.e., the event type MMU_NOTIFY_UNMAP) and the interval
being tracked is no longer needed.

Add a new function mmu_interval_notifier_put() which is safe to call
from the invalidate() callback. The ops->release() function will be
called when all callbacks are finished and no CPUs are accessing the
mmu_interval_notifier.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
---
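A minimal sketch of how a driver might use the new interface (not part
of this patch; the my_* names are hypothetical, and driver-specific
locking and invalidation-sequence handling are elided):

struct my_range {
	struct mmu_interval_notifier notifier;
	/* driver-private state for the tracked interval */
};

static bool my_invalidate(struct mmu_interval_notifier *mni,
			  const struct mmu_notifier_range *range,
			  unsigned long cur_seq)
{
	/* driver-specific invalidation of device mappings elided */

	/*
	 * The tracked interval is going away. Unlike
	 * mmu_interval_notifier_remove(), mmu_interval_notifier_put()
	 * is safe to call from this callback; release() runs once all
	 * concurrent invalidations have finished.
	 */
	if (range->event == MMU_NOTIFY_UNMAP)
		mmu_interval_notifier_put(mni);
	return true;
}

static void my_release(struct mmu_interval_notifier *mni)
{
	struct my_range *r = container_of(mni, struct my_range, notifier);

	/* no more callbacks will be generated; safe to free the container */
	kfree(r);
}

static const struct mmu_interval_notifier_ops my_ops = {
	.invalidate = my_invalidate,
	.release = my_release,
};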
 include/linux/mmu_notifier.h |  6 +++
 mm/mmu_notifier.c            | 86 ++++++++++++++++++++++++++++--------
 2 files changed, 74 insertions(+), 18 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 027c9c8f3a69..6dcaa632eef7 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -233,11 +233,16 @@ struct mmu_notifier {
  * @invalidate: Upon return the caller must stop using any SPTEs within this
  *              range. This function can sleep. Return false only if sleeping
  *              was required but mmu_notifier_range_blockable(range) is false.
+ * @release: This function should be defined when using
+ *           mmu_interval_notifier_put(). It will be called when the
+ *           mmu_interval_notifier is removed from the interval tree.
+ *           No other callbacks will be generated after this returns.
  */
 struct mmu_interval_notifier_ops {
 	bool (*invalidate)(struct mmu_interval_notifier *mni,
 			   const struct mmu_notifier_range *range,
 			   unsigned long cur_seq);
+	void (*release)(struct mmu_interval_notifier *mni);
 };
 
 struct mmu_interval_notifier {
@@ -304,6 +309,7 @@ int mmu_interval_notifier_insert_safe(
 	unsigned long start, unsigned long length,
 	const struct mmu_interval_notifier_ops *ops);
 void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni);
+void mmu_interval_notifier_put(struct mmu_interval_notifier *mni);
 
 /**
  * mmu_interval_set_seq - Save the invalidation sequence
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index a5ff19cd1bc5..40c837ae8d90 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -129,6 +129,7 @@ static void mn_itree_inv_end(struct mmu_notifier_mm *mmn_mm)
 {
 	struct mmu_interval_notifier *mni;
 	struct hlist_node *next;
+	struct hlist_head removed_list;
 
 	spin_lock(&mmn_mm->lock);
 	if (--mmn_mm->active_invalidate_ranges ||
@@ -144,20 +145,35 @@ static void mn_itree_inv_end(struct mmu_notifier_mm *mmn_mm)
 	 * The inv_end incorporates a deferred mechanism like rtnl_unlock().
 	 * Adds and removes are queued until the final inv_end happens then
 	 * they are progressed. This arrangement for tree updates is used to
-	 * avoid using a blocking lock during invalidate_range_start.
+	 * avoid using a blocking lock while walking the interval tree.
 	 */
+	INIT_HLIST_HEAD(&removed_list);
 	hlist_for_each_entry_safe(mni, next, &mmn_mm->deferred_list,
 				  deferred_item) {
+		hlist_del(&mni->deferred_item);
 		if (RB_EMPTY_NODE(&mni->interval_tree.rb))
 			interval_tree_insert(&mni->interval_tree,
 					     &mmn_mm->itree);
-		else
+		else {
 			interval_tree_remove(&mni->interval_tree,
 					     &mmn_mm->itree);
-		hlist_del(&mni->deferred_item);
+			if (mni->ops->release)
+				hlist_add_head(&mni->deferred_item,
+					       &removed_list);
+		}
 	}
 	spin_unlock(&mmn_mm->lock);
 
+	hlist_for_each_entry_safe(mni, next, &removed_list, deferred_item) {
+		struct mm_struct *mm = mni->mm;
+
+		hlist_del(&mni->deferred_item);
+		mni->ops->release(mni);
+
+		/* pairs with mmgrab() in __mmu_interval_notifier_insert() */
+		mmdrop(mm);
+	}
+
 	wake_up_all(&mmn_mm->wq);
 }
 
@@ -1006,24 +1022,13 @@ int mmu_interval_notifier_insert_safe(
 }
 EXPORT_SYMBOL_GPL(mmu_interval_notifier_insert_safe);
 
-/**
- * mmu_interval_notifier_remove - Remove a interval notifier
- * @mni: Interval notifier to unregister
- *
- * This function must be paired with mmu_interval_notifier_insert(). It cannot
- * be called from any ops callback.
- *
- * Once this returns ops callbacks are no longer running on other CPUs and
- * will not be called in future.
- */
-void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
+static unsigned long __mmu_interval_notifier_put(
+	struct mmu_interval_notifier *mni)
 {
 	struct mm_struct *mm = mni->mm;
 	struct mmu_notifier_mm *mmn_mm = mm->mmu_notifier_mm;
 	unsigned long seq = 0;
 
-	might_sleep();
-
 	spin_lock(&mmn_mm->lock);
 	if (mn_itree_is_invalidating(mmn_mm)) {
 		/*
@@ -1043,6 +1048,28 @@ void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
 	}
 	spin_unlock(&mmn_mm->lock);
 
+	return seq;
+}
+
+/**
+ * mmu_interval_notifier_remove - Remove an interval notifier
+ * @mni: Interval notifier to unregister
+ *
+ * This function must be paired with one of the mmu_interval_notifier_insert()
+ * functions. It cannot be called from any ops callback.
+ * Once this returns, ops callbacks are no longer running on other CPUs and
+ * will not be called in future.
+ */
+void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
+{
+	struct mm_struct *mm = mni->mm;
+	struct mmu_notifier_mm *mmn_mm = mm->mmu_notifier_mm;
+	unsigned long seq;
+
+	might_sleep();
+
+	seq = __mmu_interval_notifier_put(mni);
+
 	/*
 	 * The possible sleep on progress in the invalidation requires the
 	 * caller not hold any locks held by invalidation callbacks.
@@ -1053,11 +1080,34 @@ void mmu_interval_notifier_remove(struct mmu_interval_notifier *mni)
 	wait_event(mmn_mm->wq,
 		   READ_ONCE(mmn_mm->invalidate_seq) != seq);
 
-	/* pairs with mmgrab in mmu_interval_notifier_insert() */
-	mmdrop(mm);
+	/* pairs with mmgrab() in __mmu_interval_notifier_insert() */
+	if (!mni->ops->release)
+		mmdrop(mm);
 }
 EXPORT_SYMBOL_GPL(mmu_interval_notifier_remove);
 
+/**
+ * mmu_interval_notifier_put - Unregister an interval notifier
+ * @mni: Interval notifier to unregister
+ *
+ * This function must be paired with one of the mmu_interval_notifier_insert()
+ * functions. It is safe to call from the invalidate() callback.
+ * Once this returns, ops callbacks may still be running on other CPUs and
+ * the release() callback will be called when they finish.
+ */
+void mmu_interval_notifier_put(struct mmu_interval_notifier *mni)
+{
+	struct mm_struct *mm = mni->mm;
+
+	if (!__mmu_interval_notifier_put(mni)) {
+		mni->ops->release(mni);
+
+		/* pairs with mmgrab() in __mmu_interval_notifier_insert() */
+		mmdrop(mm);
+	}
+}
+EXPORT_SYMBOL_GPL(mmu_interval_notifier_put);
+
 /**
  * mmu_notifier_synchronize - Ensure all mmu_notifiers are freed
  *
-- 
2.20.1
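For completeness, a hypothetical registration path that pairs with the
put()-based teardown sketched above (my_* names as before; assumes the
mmu_interval_notifier_insert() signature as of v5.5):

static struct my_range *my_range_create(struct mm_struct *mm,
					unsigned long start,
					unsigned long length)
{
	struct my_range *r;
	int ret;

	r = kzalloc(sizeof(*r), GFP_KERNEL);
	if (!r)
		return ERR_PTR(-ENOMEM);

	/* the core mmgrab()s mm; it is mmdrop()ed after release() runs */
	ret = mmu_interval_notifier_insert(&r->notifier, mm, start, length,
					   &my_ops);
	if (ret) {
		kfree(r);
		return ERR_PTR(ret);
	}
	return r;
}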