Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp15685846rwb; Mon, 28 Nov 2022 14:58:50 -0800 (PST) X-Google-Smtp-Source: AA0mqf5+cEgZG9x0CY9+gaTyxf2RWTpSQD3GZS8937iRgHd9ObGkxEy7oN+v0bUvuBBp4z+Cq+gd X-Received: by 2002:a17:90a:dd83:b0:218:61bd:d00d with SMTP id l3-20020a17090add8300b0021861bdd00dmr60213459pjv.236.1669676330219; Mon, 28 Nov 2022 14:58:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669676330; cv=none; d=google.com; s=arc-20160816; b=UaJ8dGq+G5NumGj3JoFNNcxZuRYQmFPLtcFUuWRBil1NKgrts5GlmEG9b03X9zktGn yK3VLvheWvgn9a7RSJieKgRX4X/UxN2FlAO25EemRTS/wve/R20o1a9veHOLCUTnSzUw ADqVFsYsb0pQJbj3GG97StSrs5nFbO+ilrtRY3A3ejAgXumpd+WdtfIU5sePc7D+6kpV n0El3yPmTgyNoHhvBmPn3yMbfwbtvP1Y6fOr975C29js2du+UC6JNrHdvfcjHh0Ht48B 3NBtGL5ia7PsTtcWFgHAW9Hr4ocAFv1w1pFbOZYeYgahgE//QW+2A49s8+WaPN18VBsN 1d1w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=WQ1MDRW1e7gVGBgFgV8gW3QkC9WMZddxzT1WNY3Uo4I=; b=xqbpD6onH8PIhQ3XjOB5jP+bj/XhjBDZGztKkIvuivLDriz89NerY0eRJSSbtORyh3 HXtBtioEGfSQec+fWBtn5JSC70gi1ecgpNRfZxx5B6ouwj2cbZN0prqQEPGHmawh5aVN 0khiGDCN0uMoRD9dHNYh2++YYpWKwXwaASNeFXhNZBn2/8Kd5EkUN5cwvPr3Lzs5C9lk b3nu/lJTp+4HmCPbeW2R9m/L0iHkLL6QXpkI4tmwl951UwYGIAzTp0rfuzVI5sLcR+uR 3asGDYb0VZ7JzANUcElOkjakMFE8bN3vuEOVdyyxPoqxxuPi4MinLH1lvq8HCXgFqlii upww== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=CAQUbiNJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id u2-20020a170902e5c200b00188dba5d7dbsi14385546plf.78.2022.11.28.14.58.39; Mon, 28 Nov 2022 14:58:50 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=CAQUbiNJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233897AbiK1WKj (ORCPT + 83 others); Mon, 28 Nov 2022 17:10:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51936 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232531AbiK1WKe (ORCPT ); Mon, 28 Nov 2022 17:10:34 -0500 Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4A4FD22B19 for ; Mon, 28 Nov 2022 14:10:33 -0800 (PST) Received: by mail-pf1-x42c.google.com with SMTP id b4so11868047pfb.9 for ; Mon, 28 Nov 2022 14:10:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=WQ1MDRW1e7gVGBgFgV8gW3QkC9WMZddxzT1WNY3Uo4I=; b=CAQUbiNJicl7IXT9VbNf06kK+1kweVcWbM161FF+QmwTQY+Kb5Wbo4aDrG0xE624ea At5gqb2RPP+Xs2x0oZp0WgbY1Qfm2JX8ibf5a+SXz4FB6XrbW9wEdDc+aiDuERjx9cag Sye31QhoC5MSuhoyFtKFevf2GTpQMRbcOQazEVIEQrMy6EyRrncl56EZVlk9ftPLOuwu EPK8N3kQCGUmW+v73jA6zRCgpkIw5G29Phxp4fyslDvidAIiZcw1tp7X703kbDVR65mD n5JquNcqqCCTKXykeld0eVnFOddCqKJXAqN9eQaaL4/Dzj2uSbPsZvqLPyDmbTcxAbsa KRng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=WQ1MDRW1e7gVGBgFgV8gW3QkC9WMZddxzT1WNY3Uo4I=; b=INKzhCUk5QilA+oVN+PZPD5M+W0hQVm79lGB32XjkGYVDTSxKhepqx+oTMIuZ/4Ncy RhNSlZPrReT2CK5MQja/MDCzwpon43yxxxbXKyZJuHKiFy5eCCX5ATV29vwFsbQgr345 YXV8TqoJg2zoVqunsR1mK6iYxyJEotgkvCvnI/+NG7MQ/x+gfCk2d2Grp59+agvPDqIt 22SYWrrjG4oUWI0ldlYZzxRdOBw0vHgreYBbVRrfBuFE04N0IOmLvGdP6gxjBkX4X0lQ 7VmoVpDPhbO+QBm6wSBr2Dx3u+3hH29WGUrHwBBBruX6Bn+tW6J0yKREo/nJKlqU3N4w 7bvA== X-Gm-Message-State: ANoB5pmII/1LHMK+148jyX06xu+vXstkT9BGu66wn1bDxQ/tFPLMGwc5 +bBlggx9GA9fXVW7RrC4IC2lmX33lRZb7eUQoQw= X-Received: by 2002:a63:225d:0:b0:477:beb8:70f8 with SMTP id t29-20020a63225d000000b00477beb870f8mr23454969pgm.281.1669673432639; Mon, 28 Nov 2022 14:10:32 -0800 (PST) MIME-Version: 1.0 References: <20221128180252.1684965-1-jannh@google.com> <20221128180252.1684965-2-jannh@google.com> In-Reply-To: From: Yang Shi Date: Mon, 28 Nov 2022 14:10:21 -0800 Message-ID: Subject: Re: [PATCH v4 2/3] mm/khugepaged: Fix GUP-fast interaction by sending IPI To: Jann Horn Cc: security@kernel.org, Andrew Morton , David Hildenbrand , Peter Xu , John Hubbard , linux-kernel@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 28, 2022 at 12:12 PM Jann Horn wrote: > > On Mon, Nov 28, 2022 at 9:10 PM Yang Shi wrote: > > On Mon, Nov 28, 2022 at 11:57 AM Jann Horn wrote: > > > > > > On Mon, Nov 28, 2022 at 8:54 PM Yang Shi wrote: > > > > > > > > On Mon, Nov 28, 2022 at 10:03 AM Jann Horn wrote: > > > > > > > > > > Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP > > > > > collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to > > > > > ensure that the page table was not removed by khugepaged in between. > > > > > > > > > > However, lockless_pages_from_mm() still requires that the page table is not > > > > > concurrently freed or reused to store non-PTE data. Otherwise, problems > > > > > can occur because: > > > > > > > > > > - deposited page tables can be freed when a THP page somewhere in the > > > > > mm is removed > > > > > - some architectures store non-PTE information inside deposited page > > > > > tables (see radix__pgtable_trans_huge_deposit()) > > > > > > > > > > Additionally, lockless_pages_from_mm() is also somewhat brittle with > > > > > regards to page tables being repeatedly moved back and forth, but > > > > > that shouldn't be an issue in practice. > > > > > > > > > > Fix it by sending IPIs (if the architecture uses > > > > > semi-RCU-style page table freeing) before freeing/reusing page tables. > > > > > > > > > > As noted in mm/gup.c, on configs that define CONFIG_HAVE_FAST_GUP, > > > > > there are two possible cases: > > > > > > > > > > 1. CONFIG_MMU_GATHER_RCU_TABLE_FREE is set, causing > > > > > tlb_remove_table_sync_one() to send an IPI to synchronize with > > > > > lockless_pages_from_mm(). > > > > > 2. CONFIG_MMU_GATHER_RCU_TABLE_FREE is unset, indicating that all > > > > > TLB flushes are already guaranteed to send IPIs. > > > > > tlb_remove_table_sync_one() will do nothing, but we've already > > > > > run pmdp_collapse_flush(), which did a TLB flush, which must have > > > > > involved IPIs. > > > > > > > > I'm trying to catch up with the discussion after the holiday break. I > > > > understand you switched from always allocating a new page table page > > > > (we decided before) to sending IPIs to serialize against fast-GUP, > > > > this is fine to me. > > > > > > > > So the code now looks like: > > > > pmdp_collapse_flush() > > > > sending IPI > > > > > > > > But the missing part is how we reached "TLB flushes are already > > > > guaranteed to send IPIs" when CONFIG_MMU_GATHER_RCU_TABLE_FREE is > > > > unset? ARM64 doesn't do it IIRC. Or did I miss something? > > > > > > From arch/arm64/Kconfig: > > > > > > select MMU_GATHER_RCU_TABLE_FREE > > > > > > CONFIG_MMU_GATHER_RCU_TABLE_FREE is not a config option that the user > > > can freely toggle; it is an option selected by the architecture. > > > > Aha, I see :-) BTW, shall we revert "mm: gup: fix the fast GUP race > > against THP collapse"? It seems not necessary anymore if this approach > > is used IIUC. > > Yeah, I agree. Since this patch could solve two problems: the use-after-free of the data page (pinned by fast-GUP) and the page table page and my patch will be reverted, so could you please catch both issues in this patch's commit log? I'd like to preserve the description of the issue fixed by my patch. I think that it is helpful to see the information about all the fixed problems in one commit instead of digging into another reverted commit. > > > Reviewed-by: Yang Shi > > Thanks!