Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp1149469pxb; Thu, 23 Sep 2021 20:10:37 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzIRdhTRf0mzS4albw2R1fbUDgxo7EcZ+NFFLcDdJVXlnzHbFKCXXapRvXewsS8ytVIskPO X-Received: by 2002:a50:d8ce:: with SMTP id y14mr2577260edj.92.1632453037287; Thu, 23 Sep 2021 20:10:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632453037; cv=none; d=google.com; s=arc-20160816; b=YZTu464tcDti5YiFdMFivWLjFxq+Z9VIiFjQkcswPv1GoIBixUWrEdDIUSgwPBE/CF XdBEtLQD9FQ1kWaqKoN1P2a4UObGWYuznMzInkssH0HNLkyyham/DMakyOHEUBzok1OM sqtfZIe5aZm59pQV/b7xUOQUO7sybWOT17/hkMfGko2T9zL3/DfOdUOmdY1QiIa7VF49 UHteTybVBqC52quaO0Q2Sm1+thR7Yeg0BuJWKEHQ4IaZTnRYBJShBNKpCp43Wyk7zwPb Lv3XwtSZW/U9ForMeSeWzHVixOrOsJxPdW+hpkW26PUXQyt253KOytaopal3x32bWuG2 XbOg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=pqOUF2ID5Z3ZhgcypPTAUuh9KgfVjRAG/Kv6sgvowxc=; b=ntVTjW57l67VUvjn2Zu7QGw0bsiqv4uxa0T8wFVAv1cSL9CUldJB+SDwh5J6UnBnH8 Nf7KZFkDgIjnDaIVd99+6PxCowTeMXxhkwpyAWmEr+8JNE/D0kFNZaIpclqGgYHoLrnE ME8j+kfbFOyTjn7Wc7+jK2XzwjsI7ln63vnQuUk0O6nkTNSWRZLN2GQ0KTyoPwByYpjP sqeZJMRpFJl1yDv4GiN1Pe7VF+yF+//AAoc+vAseY9mjjOyfPx0+D1w2xX/bD0014u1A qIOyM6Ohsgos3aphvjFzH3n3HAU4Gxst7rAEDoIxbA7Httn+60so/6p5pRjFhmvgOeIv JckA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=XcpN0Omn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id 8si8347181edx.613.2021.09.23.20.10.09; Thu, 23 Sep 2021 20:10:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=XcpN0Omn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243954AbhIXDKR (ORCPT + 99 others); Thu, 23 Sep 2021 23:10:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52502 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230123AbhIXDKR (ORCPT ); Thu, 23 Sep 2021 23:10:17 -0400 Received: from mail-ed1-x52f.google.com (mail-ed1-x52f.google.com [IPv6:2a00:1450:4864:20::52f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A89EDC061574 for ; Thu, 23 Sep 2021 20:08:44 -0700 (PDT) Received: by mail-ed1-x52f.google.com with SMTP id bx4so30655755edb.4 for ; Thu, 23 Sep 2021 20:08:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=pqOUF2ID5Z3ZhgcypPTAUuh9KgfVjRAG/Kv6sgvowxc=; b=XcpN0OmnZiCmTRLGp6cKERD7HTR89Kj9jpH7/CHRyc9RFZt/YZRSVGdqasbne76v2v vSX6EMJiGswIs9cCNO7/M9UNU3c+M6+5GxBupYFV7BbGAyBcfDVJ3kGoknQsh5BMRDfC hq2Udpf/5k84jW9VLPeSpZssJSRbJK3Bbr0at69y57bm+SrcMrwZiWPP3ziPw9qWMWxv dwIfZyCvdf9Lk9dyx23VrJga56LJQIOMRaEtHzDaaDP8A0spxD5fbuzqWXMG8IfxczV0 LhPtRN/QwNcIxDR+ZVj6UL1qMBZ39hzfBQxeFFy5VF71j6AK6rYOKPUzucIVVUcyNaBb Y4Qw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=pqOUF2ID5Z3ZhgcypPTAUuh9KgfVjRAG/Kv6sgvowxc=; b=exU82XFYLQRo+WyVjVNHXXVhgCJmBnPNnM3wJ+8fCh9M0dRYNxqCghaeyGKNWkS4cK lIQsRruKPRBsZPZtsRj/nUiQ1cP7SIYT6NbobOueg9ceApkVUkXMHAmj976yw5NRWjlx IAL5GsB1YJjPBKv7pu2M6eyEYpzZpsJ1DSOEADDZBjpi0Uu8cbCRn1cn9+R0Jg4ailXb GesGJIa1c1TIVu2xOxAski0zHICNEia3pUDdyGkBlpnVSJUoXKMMd3FQ4ttYG+BmKdwR mkF1DyBag/ROnOyoPXBQW+YYB0lsp2YO7Pdqz0ZGc86gCM8A4jtvr5fnlOhIp4Cbv7RB MICw== X-Gm-Message-State: AOAM533zJ7WbdoFfNOO7+v0GSxOihPJWJ9ylO77+HlSD9KLxXGVrNfBx oK2B6MNTFbWok6tLUNgctrMy6dAiGUlrhbAuJGs= X-Received: by 2002:a17:906:3854:: with SMTP id w20mr8488896ejc.537.1632452923167; Thu, 23 Sep 2021 20:08:43 -0700 (PDT) MIME-Version: 1.0 References: <20210906121200.57905-1-rongwei.wang@linux.alibaba.com> <20210922070645.47345-2-rongwei.wang@linux.alibaba.com> <20210923194343.ca0f29e1c4d361170343a6f2@linux-foundation.org> In-Reply-To: <20210923194343.ca0f29e1c4d361170343a6f2@linux-foundation.org> From: Yang Shi Date: Thu, 23 Sep 2021 20:08:31 -0700 Message-ID: Subject: Re: [PATCH v2 1/2] mm, thp: check page mapping when truncating page cache To: Andrew Morton Cc: Rongwei Wang , Matthew Wilcox , Linux MM , Linux Kernel Mailing List , song@kernel.org, william.kucharski@oracle.com, Hugh Dickins Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 23, 2021 at 7:43 PM Andrew Morton w= rote: > > On Thu, 23 Sep 2021 01:04:54 +0800 Rongwei Wang wrote: > > > > > > > > On Sep 22, 2021, at 7:37 PM, Matthew Wilcox wro= te: > > > > > > On Wed, Sep 22, 2021 at 03:06:44PM +0800, Rongwei Wang wrote: > > >> Transparent huge page has supported read-only non-shmem files. The f= ile- > > >> backed THP is collapsed by khugepaged and truncated when written (fo= r > > >> shared libraries). > > >> > > >> However, there is race in two possible places. > > >> > > >> 1) multiple writers truncate the same page cache concurrently; > > >> 2) collapse_file rolls back when writer truncates the page cache; > > > > > > As I've said before, the bug here is that somehow there is a writable= fd > > > to a file with THPs. That's what we need to track down and fix. > > Hi, Matthew > > I am not sure get your means. We know =E2=80=9Cmm, thp: relax the VM_DE= NYWRITE constraint on file-backed THPs" > > Introduced file-backed THPs for DSO. It is possible {very rarely} for D= SO to be opened in writeable way. > > > > ... > > > > > https://lore.kernel.org/linux-mm/YUdL3lFLFHzC80Wt@casper.infradead.or= g/ > > All in all, what you mean is that we should solve this race at the sour= ce? > > Matthew is being pretty clear here: we shouldn't be permitting > userspace to get a writeable fd for a thp-backed file. No, he doesn't mean it IIRC. Actually we had the same conversation for another patch. Quoted below: " > > Things have already gone wrong before we get to this point. See > > do_dentry_open(). You aren't supposed to be able to get a writable fil= e > > descriptor on a file which has had huge pages added to the page cache > > without the filesystem's knowledge. That's the problem that needs to > > be fixed. > > I don't quite understand your point here. Do you mean do_dentry_open() > should fail for such cases instead of truncating the page cache? No, do_dentry_open() should have truncated the page cache when it was called and found that there were THPs in the cache. Then khugepaged should see that someone has the file open for write and decline to create new THPs. So it shouldn't be possible to get here with THPs in the cache." Please see https://lore.kernel.org/linux-mm/YUkCI2I085Sos%2F64@casper.infra= dead.org/ But actually "mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs" did so exactly. > > Why are we permitting the DSO to be opened writeably? If there's a > legitimate case for doing this then presumably "mm, thp: relax the > VM_DENYWRITE constraint on file-backed THPs: should be fixed or > reverted. Unfortunately we can't revert this commit anymore since VM_DENYWRITE is gone due to commit 8d0920bde5eb ("mm: remove VM_DENYWRITE") > > If there is no legitimate use case for returning a writeable fd for a > thp-backed file then we should fail such an attempt at open(). This > approach has back-compatibility issues which need to be thought about. > Perhaps we should permit the open-writeably attempt to appear to > succeed, but to really return a read-only fd? > >