From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, linux-s390@vger.kernel.org,
    kvm@vger.kernel.org, David Hildenbrand, Andrew Morton, Yonghua Huang,
    Fei Li, Christoph Hellwig, Gerald Schaefer, Heiko Carstens,
    Ingo Molnar, Alex Williamson, Paolo Bonzini
Subject: [PATCH v1 3/3] mm: follow_pte() improvements
Date: Wed, 10 Apr 2024 17:55:27 +0200
Message-ID: <20240410155527.474777-4-david@redhat.com>
In-Reply-To: <20240410155527.474777-1-david@redhat.com>
References: <20240410155527.474777-1-david@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

follow_pte() is now our main function to lookup PTEs in VM_PFNMAP/VM_IO
VMAs. Let's perform some more sanity checks to make this exported
function harder to abuse.

Further, extend the doc a bit, it still focuses on the KVM use case with
MMU notifiers. Drop the KVM+follow_pfn() comment, follow_pfn() is no
more, and we have other users nowadays.

Also extend the doc regarding refcounted pages and the interaction with
MMU notifiers.

KVM is one example that uses MMU notifiers and can deal with refcounted
pages properly. VFIO is one example that doesn't use MMU notifiers, and
to prevent use-after-free, rejects refcounted pages: pfn_valid(pfn) &&
!PageReserved(pfn_to_page(pfn)). Protection changes are less of a concern
for users like VFIO: the behavior is similar to longterm-pinning a page,
and getting the PTE protection changed afterwards.

The primary concern with refcounted pages is use-after-free, which callers
should be aware of.

Signed-off-by: David Hildenbrand
---
 mm/memory.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ab01fb69dc72..535ef2686f95 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5935,15 +5935,21 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
  *
  * On a successful return, the pointer to the PTE is stored in @ptepp;
  * the corresponding lock is taken and its location is stored in @ptlp.
- * The contents of the PTE are only stable until @ptlp is released;
- * any further use, if any, must be protected against invalidation
- * with MMU notifiers.
+ *
+ * The contents of the PTE are only stable until @ptlp is released using
+ * pte_unmap_unlock(). This function will fail if the PTE is non-present.
+ * Present PTEs may include PTEs that map refcounted pages, such as
+ * anonymous folios in COW mappings.
+ *
+ * Callers must be careful when relying on PTE content after
+ * pte_unmap_unlock(). Especially if the PTE maps a refcounted page,
+ * callers must protect against invalidation with MMU notifiers; otherwise
+ * access to the PFN at a later point in time can trigger use-after-free.
  *
  * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
  * should be taken for read.
  *
- * KVM uses this function. While it is arguably less bad than the historic
- * ``follow_pfn``, it is not a good general-purpose API.
+ * This function must not be used to modify PTE content.
  *
  * Return: zero on success, -ve otherwise.
  */
@@ -5957,6 +5963,10 @@ int follow_pte(struct vm_area_struct *vma, unsigned long address,
 	pmd_t *pmd;
 	pte_t *ptep;
 
+	mmap_assert_locked(mm);
+	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
+		goto out;
+
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		goto out;
 
-- 
2.44.0
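
Illustration only, not part of the patch: the entry checks that
follow_pte() performs after this change can be modelled in plain
userspace C. Everything below is an assumption made for the sketch:
struct vma_model, the MODEL_VM_* bit values, the mmap_locked field and
follow_pte_precheck() are hypothetical names, and -EINVAL stands in for
the "-ve" return value mentioned in the kernel-doc. The real function
operates on struct vm_area_struct and goes on to walk the page tables,
which this model does not attempt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits for this model only; not taken from the patch. */
#define MODEL_VM_PFNMAP 0x0400UL
#define MODEL_VM_IO     0x4000UL

/* Hypothetical, stripped-down stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;  /* first address covered by the mapping */
	unsigned long vm_end;    /* first address past the mapping */
	unsigned long vm_flags;  /* MODEL_VM_* bits */
	bool mmap_locked;        /* stands in for the mmap lock state */
};

/*
 * Model of the checks done before the page table walk.
 * Returns 0 if the walk may proceed, -EINVAL otherwise.
 */
static int follow_pte_precheck(const struct vma_model *vma,
			       unsigned long address)
{
	/* Stand-in for mmap_assert_locked(): flags a caller bug. */
	assert(vma->mmap_locked);

	/* Added by the patch: address must lie in [vm_start, vm_end). */
	if (address < vma->vm_start || address >= vma->vm_end)
		return -EINVAL;

	/* Pre-existing check: only VM_IO / VM_PFNMAP mappings qualify. */
	if (!(vma->vm_flags & (MODEL_VM_IO | MODEL_VM_PFNMAP)))
		return -EINVAL;

	return 0;
}
```

In this model, a VMA covering [0x1000, 0x3000) with MODEL_VM_PFNMAP set
accepts 0x1000 and 0x2fff, and rejects 0x0fff and 0x3000; the same range
with no MODEL_VM_* bit set is rejected for every address.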