From: Lorenzo Stoakes
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton
Cc: Jason Gunthorpe, Jens Axboe, Matthew Wilcox, Dennis Dalessandro,
    Leon Romanovsky, Christian Benvenuti, Nelson Escobar, Bernard Metzler,
    Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
    Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter,
    Bjorn Topel, Magnus Karlsson, Maciej Fijalkowski, Jonathan Lemon,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Christian Brauner, Richard Cochran, Alexei Starovoitov, Daniel Borkmann,
    Jesper Dangaard Brouer, John Fastabend, linux-fsdevel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, Oleg Nesterov, Jason Gunthorpe, John Hubbard,
    Jan Kara, "Kirill A. Shutemov", Pavel Begunkov, Mika Penttila,
    David Hildenbrand, Dave Chinner, Theodore Ts'o, Peter Xu, Lorenzo Stoakes
Subject: [PATCH v6 0/3] mm/gup: disallow GUP writing to file-backed mappings by default
Date: Tue, 2 May 2023 00:11:46 +0100

Writing to file-backed mappings which require folio dirty tracking using GUP
is a fundamentally broken operation, as kernel write access to GUP mappings
does not adhere to the semantics expected by a file system.

A GUP caller uses the direct mapping to access the folio, which does not
cause write notify to trigger, nor does it enforce that the caller marks the
folio dirty.

The problem arises when, after an initial write to the folio, writeback
results in the folio being cleaned and then the caller, via the GUP
interface, writes to the folio again.

As a result of the use of this secondary, direct mapping to the folio, no
write notify will occur, and if the caller does mark the folio dirty, this
will be done unexpectedly.

For example, consider the following scenario:

1. A folio is written to via GUP which write-faults the memory, notifying
   the file system and dirtying the folio.
2. Later, writeback is triggered, resulting in the folio being cleaned and
   the PTE being marked read-only.
3. The GUP caller writes to the folio, as it is mapped read/write via the
   direct mapping.
4. The GUP caller, now done with the page, unpins it and sets it dirty
   (though it does not have to).

This change updates both the PUP FOLL_LONGTERM slow and fast APIs. As
pin_user_pages_fast_only() does not exist, we can rely on a slightly
imperfect whitelisting in the PUP-fast case and fall back to the slow case
should this fail.

v6:
- Rebased on latest mm-unstable as of 28th April 2023.
- Add PUP-fast check with handling for rcu-locked TLB shootdown to
  synchronise correctly.
- Split patch series into 3 to make it more digestible.

v5:
- Rebased on latest mm-unstable as of 25th April 2023.
- Some small refactorings suggested by John.
- Added an extended description of the problem in the comment around
  writeable_file_mapping_allowed() for clarity.
- Updated commit message as suggested by Mika and John.
https://lore.kernel.org/all/6b73e692c2929dc4613af711bdf92e2ec1956a66.1682638385.git.lstoakes@gmail.com/

v4:
- Split out vma_needs_dirty_tracking() from vma_wants_writenotify() to
  reduce duplication and update to use this in the GUP check. Note that both
  separately check vm_ops_needs_writenotify(), as the latter needs to test
  this before the vm_pgprot_modify() test, resulting in
  vma_wants_writenotify() checking this twice. However, it is such a small
  check that this should not be egregious.
https://lore.kernel.org/all/3b92d56f55671a0389252379237703df6e86ea48.1682464032.git.lstoakes@gmail.com/

v3:
- Rebased on latest mm-unstable as of 24th April 2023.
- Explicitly check whether file system requires folio dirtying. Note that
  vma_wants_writenotify() could not be used directly as it is very much
  focused on determining if the PTE r/w should be set (e.g. assuming private
  mapping does not require it as already set, soft dirty considerations).
- Tested code against shmem and hugetlb mappings - confirmed that these are
  not disallowed by the check.
- Eliminate FOLL_ALLOW_BROKEN_FILE_MAPPING flag and instead perform check
  only for FOLL_LONGTERM pins.
- As a result, limit check to internal GUP code.
https://lore.kernel.org/all/23c19e27ef0745f6d3125976e047ee0da62569d4.1682406295.git.lstoakes@gmail.com/

v2:
- Add accidentally excluded ptrace_access_vm() use of
  FOLL_ALLOW_BROKEN_FILE_MAPPING.
- Tweak commit message.
https://lore.kernel.org/all/c8ee7e02d3d4f50bb3e40855c53bda39eec85b7d.1682321768.git.lstoakes@gmail.com/

v1:
https://lore.kernel.org/all/f86dc089b460c80805e321747b0898fd1efe93d7.1682168199.git.lstoakes@gmail.com/

Lorenzo Stoakes (3):
  mm/mmap: separate writenotify and dirty tracking logic
  mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings
  mm/gup: disallow FOLL_LONGTERM GUP-fast writing to file-backed mappings

 include/linux/mm.h |   1 +
 mm/gup.c           | 128 +++++++++++++++++++++++++++++++++++++++++++--
 mm/mmap.c          |  36 +++++++++----
 3 files changed, 153 insertions(+), 12 deletions(-)

-- 
2.40.1
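
Illustration (not part of the series): the four-step scenario described in the cover letter can be sketched as a small userspace toy model. Everything below is an assumption made for illustration only. The struct and the toy_* functions are hypothetical and do not correspond to any kernel API, and the check in toy_pin_for_write() is a heavily simplified stand-in for the idea of refusing FOLL_LONGTERM write pins on mappings that need dirty tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model: every name here is hypothetical, not a kernel API. */
struct toy_folio {
	bool needs_dirty_tracking; /* file system relies on write notify */
	bool dirty;                /* the file system's view of the folio */
	bool pte_writable;         /* userspace PTE permits writes */
	bool pinned;               /* a GUP-style pin is held */
	int data;                  /* contents in memory */
	int on_disk;               /* contents as of the last writeback */
	int fs_notifications;      /* times the file system was told of a write */
};

/*
 * Step 1: pin for write. A read-only PTE is write-faulted, which notifies
 * the file system and dirties the folio. Long-term pins on folios that
 * need dirty tracking are refused (simplified stand-in for the policy).
 */
static bool toy_pin_for_write(struct toy_folio *f, bool longterm)
{
	assert(f != NULL);
	if (longterm && f->needs_dirty_tracking)
		return false;
	if (!f->pte_writable) {
		f->fs_notifications++;
		f->dirty = true;
		f->pte_writable = true;
	}
	f->pinned = true;
	return true;
}

/* Step 2: writeback cleans the folio and write-protects the PTE. */
static void toy_writeback(struct toy_folio *f)
{
	if (!f->dirty)
		return;
	f->on_disk = f->data;
	f->dirty = false;
	f->pte_writable = false;
}

/*
 * Step 3: the pin holder writes via the direct mapping. The PTE is not
 * consulted, so no fault occurs and the file system is not notified.
 */
static void toy_gup_write(struct toy_folio *f, int value)
{
	assert(f->pinned);
	f->data = value;
}

/* Step 4: unpin, optionally dirtying the folio behind the fs's back. */
static void toy_unpin(struct toy_folio *f, bool mark_dirty)
{
	assert(f->pinned);
	f->pinned = false;
	if (mark_dirty)
		f->dirty = true;
}

/* True when memory differs from disk yet the folio is considered clean. */
static bool toy_fs_view_is_stale(const struct toy_folio *f)
{
	return !f->dirty && f->data != f->on_disk;
}
```

In this model, running pin, writeback, then a direct write with a non-long-term pin leaves toy_fs_view_is_stale() true while fs_notifications stays at 1, which is the inconsistency the scenario describes. A long-term pin on the same folio is refused before any of that can happen.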