From: Michel Lespinasse
To: Linux-MM, linux-kernel@vger.kernel.org, Andrew Morton
Cc: kernel-team@fb.com, Laurent Dufour, Jerome Glisse, Peter Zijlstra,
    Michal Hocko, Vlastimil Babka, Davidlohr Bueso, Matthew Wilcox,
    Liam Howlett, Rik van Riel, Paul McKenney, Song Liu,
    Suren Baghdasaryan, Minchan Kim, Joel Fernandes, David Rientjes,
    Axel Rasmussen, Andy Lutomirski, Michel Lespinasse
Subject: [PATCH v2 10/35] mm: add per-mm mmap sequence counter for speculative page fault handling.
Date: Fri, 28 Jan 2022 05:09:41 -0800
Message-Id: <20220128131006.67712-11-michel@lespinasse.org>
In-Reply-To: <20220128131006.67712-1-michel@lespinasse.org>
References: <20220128131006.67712-1-michel@lespinasse.org>
X-Mailer: git-send-email 2.20.1
X-Mailing-List: linux-kernel@vger.kernel.org

The counter's write side is hooked into the existing mmap locking API:
mmap_write_lock() increments the counter to the next (odd) value, and
mmap_write_unlock() increments it again to the next (even) value.

The counter's speculative read side is supposed to be used as follows:

	seq = mmap_seq_read_start(mm);
	if (seq & 1)
		goto fail;
	.... speculative handling here ....
	if (!mmap_seq_read_check(mm, seq))
		goto fail;

This API guarantees that, if none of the "fail" tests abort
speculative execution, the speculative code section did not run
concurrently with any mmap writer.

This is very similar to a seqlock, but both the writer and speculative
readers are allowed to block. In the fail case, the speculative reader
does not spin on the sequence counter; instead it should fall back to
a different mechanism such as grabbing the mmap lock read side.

Signed-off-by: Michel Lespinasse
---
 include/linux/mm_types.h  |  4 +++
 include/linux/mmap_lock.h | 58 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 0ae3bf854aad..e4965a6f34f2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -523,6 +523,10 @@ struct mm_struct {
 		 * cacheline.
 		 */
 		struct rw_semaphore mmap_lock;
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+		unsigned long mmap_seq;
+#endif
+
 		struct list_head mmlist; /* List of maybe swapped mm's.	These
 					  * are globally strung together off
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 1b14468183d7..a2459eb15a33 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -8,8 +8,16 @@
 #include
 #include
 
-#define MMAP_LOCK_INITIALIZER(name) \
-	.mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+#define MMAP_LOCK_SEQ_INITIALIZER(name) \
+	.mmap_seq = 0,
+#else
+#define MMAP_LOCK_SEQ_INITIALIZER(name)
+#endif
+
+#define MMAP_LOCK_INITIALIZER(name) \
+	.mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), \
+	MMAP_LOCK_SEQ_INITIALIZER(name)
 
 DECLARE_TRACEPOINT(mmap_lock_start_locking);
 DECLARE_TRACEPOINT(mmap_lock_acquire_returned);
@@ -63,13 +71,52 @@ static inline void __mmap_lock_trace_released(struct mm_struct *mm, bool write)
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	mm->mmap_seq = 0;
+#endif
 }
 
+static inline void __mmap_seq_write_lock(struct mm_struct *mm)
+{
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	VM_BUG_ON_MM(mm->mmap_seq & 1, mm);
+	mm->mmap_seq++;
+	smp_wmb();
+#endif
+}
+
+static inline void __mmap_seq_write_unlock(struct mm_struct *mm)
+{
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	smp_wmb();
+	mm->mmap_seq++;
+	VM_BUG_ON_MM(mm->mmap_seq & 1, mm);
+#endif
+}
+
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+static inline unsigned long mmap_seq_read_start(struct mm_struct *mm)
+{
+	unsigned long seq;
+
+	seq = READ_ONCE(mm->mmap_seq);
+	smp_rmb();
+	return seq;
+}
+
+static inline bool mmap_seq_read_check(struct mm_struct *mm, unsigned long seq)
+{
+	smp_rmb();
+	return seq == READ_ONCE(mm->mmap_seq);
+}
+#endif
+
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
+	__mmap_seq_write_lock(mm);
 }
 
 static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
@@ -77,6 +124,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
+	__mmap_seq_write_lock(mm);
 }
 
 static inline int mmap_write_lock_killable(struct mm_struct *mm)
@@ -86,6 +134,8 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
 	__mmap_lock_trace_start_locking(mm, true);
 	error = down_write_killable(&mm->mmap_lock);
 	__mmap_lock_trace_acquire_returned(mm, true, !error);
+	if (likely(!error))
+		__mmap_seq_write_lock(mm);
 	return error;
 }
 
@@ -96,18 +146,22 @@ static inline bool mmap_write_trylock(struct mm_struct *mm)
 	__mmap_lock_trace_start_locking(mm, true);
 	ok = down_write_trylock(&mm->mmap_lock) != 0;
 	__mmap_lock_trace_acquire_returned(mm, true, ok);
+	if (likely(ok))
+		__mmap_seq_write_lock(mm);
 	return ok;
 }
 
 static inline void mmap_write_unlock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_released(mm, true);
+	__mmap_seq_write_unlock(mm);
 	up_write(&mm->mmap_lock);
 }
 
 static inline void mmap_write_downgrade(struct mm_struct *mm)
 {
 	__mmap_lock_trace_acquire_returned(mm, false, true);
+	__mmap_seq_write_unlock(mm);
 	downgrade_write(&mm->mmap_lock);
 }
 
-- 
2.20.1
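
For readers who want to experiment with the protocol outside the kernel, here is a minimal userspace sketch of the odd/even sequence counter described in the commit message. It is an illustration under assumptions, not part of the patch: struct demo_mm and every demo_* name are hypothetical, the writer is assumed to already hold an exclusive lock, and default (sequentially consistent) C11 atomics stand in for the kernel's READ_ONCE(), smp_wmb() and smp_rmb(), which is stronger ordering than the patch itself uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for mm->mmap_seq; not a kernel type. */
struct demo_mm {
	atomic_ulong seq;	/* even: no writer active, odd: writer active */
	atomic_int value;	/* state a speculative reader wants to sample */
};

static inline void demo_mm_init(struct demo_mm *mm, int value)
{
	atomic_init(&mm->seq, 0);
	atomic_init(&mm->value, value);
}

/* Writer entry; the caller is assumed to hold an exclusive lock already. */
static inline void demo_seq_write_lock(struct demo_mm *mm)
{
	unsigned long old = atomic_fetch_add(&mm->seq, 1);

	assert(!(old & 1));	/* counter must have been even, as in VM_BUG_ON_MM() */
	(void)old;
}

/* Writer exit; must run before the exclusive lock is dropped. */
static inline void demo_seq_write_unlock(struct demo_mm *mm)
{
	unsigned long old = atomic_fetch_add(&mm->seq, 1);

	assert(old & 1);	/* counter must have been odd while writing */
	(void)old;
}

static inline unsigned long demo_seq_read_start(struct demo_mm *mm)
{
	return atomic_load(&mm->seq);
}

static inline bool demo_seq_read_check(struct demo_mm *mm, unsigned long seq)
{
	return atomic_load(&mm->seq) == seq;
}

/*
 * Speculative read: returns true and stores the sampled value in *out only
 * if no writer was active at the start and none started before the final
 * check. On false the caller should fall back to taking the lock; this
 * function never spins on the counter.
 */
static inline bool demo_read_speculative(struct demo_mm *mm, int *out)
{
	unsigned long seq = demo_seq_read_start(mm);
	int v;

	if (seq & 1)
		return false;
	v = atomic_load(&mm->value);
	if (!demo_seq_read_check(mm, seq))
		return false;
	*out = v;
	return true;
}
```

A caller would try demo_read_speculative() first and, on failure, take the real lock in read mode and retry under it, mirroring the "fall back to a different mechanism" rule above.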