From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Peter Xu, John Hubbard,
 David Hildenbrand, Hugh Dickins, Alistair Popple, Andrea Arcangeli,
 "Kirill A . Shutemov", Matthew Wilcox, Vlastimil Babka, Yang Shi,
 Andrew Morton, Linus Torvalds
Subject: [PATCH 5.10 161/171] mm: dont skip swap entry even if zap_details specified
Date: Tue, 12 Apr 2022 08:30:52 +0200
Message-Id: <20220412062932.557154134@linuxfoundation.org>
In-Reply-To: <20220412062927.870347203@linuxfoundation.org>
References: <20220412062927.870347203@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Xu

commit 5abfd71d936a8aefd9f9ccd299dea7a164a5d455 upstream.

Patch series "mm: Rework zap ptes on swap entries", v5.

Patch 1 should fix a long-standing bug for zap_pte_range() on zap_details
usage. The risk is we could have some swap entries skipped while we should
have zapped them.

Migration entries are not the major concern because file-backed memory is
always zapped in the pattern "first time without page lock, then re-zap
with page lock", hence the 2nd zap will always make sure all migration
entries are already recovered.
However, there can be issues with real swap entries being skipped
erroneously. There's a reproducer provided in the commit message of
patch 1 for that.

Patch 2-4 are cleanups that are based on patch 1. After the whole patchset
is applied, we should have a very clean view of zap_pte_range().

Only patch 1 needs to be backported to stable if necessary.

This patch (of 4):

The "details" pointer shouldn't be the token to decide whether we should
skip swap entries. For example, when the caller specifies
details->zap_mapping==NULL, it means the user wants to zap all the pages
(including COWed pages), so we need to look into swap entries because
there can be private COWed pages that were swapped out.

Skipping some swap entries when details is non-NULL may lead to wrongly
leaving some of the swap entries while we should have zapped them.

A reproducer of the problem:

===8<===
        #define _GNU_SOURCE         /* See feature_test_macros(7) */
        #include <stdio.h>
        #include <assert.h>
        #include <unistd.h>
        #include <sys/mman.h>
        #include <sys/types.h>

        int page_size;
        int shmem_fd;
        char *buffer;

        void main(void)
        {
                int ret;
                char val;

                page_size = getpagesize();
                shmem_fd = memfd_create("test", 0);
                assert(shmem_fd >= 0);

                ret = ftruncate(shmem_fd, page_size * 2);
                assert(ret == 0);

                buffer = mmap(NULL, page_size * 2, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE, shmem_fd, 0);
                assert(buffer != MAP_FAILED);

                /* Write private page, swap it out */
                buffer[page_size] = 1;
                madvise(buffer, page_size * 2, MADV_PAGEOUT);

                /* This should drop private buffer[page_size] already */
                ret = ftruncate(shmem_fd, page_size);
                assert(ret == 0);
                /* Recover the size */
                ret = ftruncate(shmem_fd, page_size * 2);
                assert(ret == 0);

                /* Re-read the data, it should be all zero */
                val = buffer[page_size];
                if (val == 0)
                        printf("Good\n");
                else
                        printf("BUG\n");
        }
===8<===

We don't need to touch up the pmd path, because pmd never had an issue
with swap entries. For example, shmem pmd migration will always be split
into pte level, and the same goes for swapping of anonymous memory.

Add another helper should_zap_cows() so that we can also check whether we
should zap private mappings when there's no page pointer specified.

This patch drops the old trick of skipping every swap entry whenever
details is non-NULL, so we handle swap ptes coherently. Meanwhile we should
do the same check upon migration entry, hwpoison entry and genuine swap
entries too.

To be explicit, we should still remember to keep the private entries if
even_cows==false, and always zap them when even_cows==true.

The issue seems to exist starting from the initial commit of git.

[peterx@redhat.com: comment tweaks]
  Link: https://lkml.kernel.org/r/20220217060746.71256-2-peterx@redhat.com
Link: https://lkml.kernel.org/r/20220217060746.71256-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20220216094810.60572-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20220216094810.60572-2-peterx@redhat.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Peter Xu
Reviewed-by: John Hubbard
Cc: David Hildenbrand
Cc: Hugh Dickins
Cc: Alistair Popple
Cc: Andrea Arcangeli
Cc: "Kirill A . Shutemov"
Cc: Matthew Wilcox
Cc: Vlastimil Babka
Cc: Yang Shi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/memory.c |   25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1204,6 +1204,17 @@ copy_page_range(struct vm_area_struct *d
 	return ret;
 }
 
+/* Whether we should zap all COWed (private) pages too */
+static inline bool should_zap_cows(struct zap_details *details)
+{
+	/* By default, zap all pages */
+	if (!details)
+		return true;
+
+	/* Or, we zap COWed pages only if the caller wants to */
+	return !details->check_mapping;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1295,16 +1306,18 @@ again:
 			continue;
 		}
 
-		/* If details->check_mapping, we leave swap entries. */
-		if (unlikely(details))
-			continue;
-
-		if (!non_swap_entry(entry))
+		if (!non_swap_entry(entry)) {
+			/* Genuine swap entry, hence a private anon page */
+			if (!should_zap_cows(details))
+				continue;
 			rss[MM_SWAPENTS]--;
-		else if (is_migration_entry(entry)) {
+		} else if (is_migration_entry(entry)) {
 			struct page *page;
 
 			page = migration_entry_to_page(entry);
+			if (details && details->check_mapping &&
+			    details->check_mapping != page_rmapping(page))
+				continue;
 			rss[mm_counter(page)]--;
 		}
 		if (unlikely(!free_swap_and_cache(entry)))
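
The decision the second hunk encodes can be modelled outside the kernel.
What follows is only a userspace sketch under stated assumptions, not
kernel code: struct address_space and struct zap_details are cut-down
stand-ins, and enum entry_kind, zaps_entry(), old_zaps_entry() and
check_zap_rules() are hypothetical names that do not exist in mm/memory.c.
The page_mapping argument stands in for page_rmapping(page). Only
should_zap_cows() mirrors the helper added by the patch. The commit
message says zap_mapping while the 5.10 diff uses check_mapping; the
sketch follows the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types (assumption, not the real layout). */
struct address_space { int id; };
struct zap_details { struct address_space *check_mapping; };

/* Hypothetical tag for the two non-present pte kinds modelled here. */
enum entry_kind { ENTRY_SWAP, ENTRY_MIGRATION };

/* Same logic as the should_zap_cows() helper added by the patch. */
bool should_zap_cows(const struct zap_details *details)
{
	/* By default, zap all pages */
	if (!details)
		return true;
	/* Otherwise zap COWed pages only if no mapping filter is set */
	return !details->check_mapping;
}

/* Pre-patch rule: any non-NULL details pointer skipped the entry. */
bool old_zaps_entry(const struct zap_details *details)
{
	return details == NULL;
}

/*
 * Post-patch rule. Returns true if the entry would be zapped.
 * page_mapping models page_rmapping(page) and is only read for
 * migration entries.
 */
bool zaps_entry(const struct zap_details *details, enum entry_kind kind,
		const struct address_space *page_mapping)
{
	if (kind == ENTRY_SWAP)
		return should_zap_cows(details);
	if (details && details->check_mapping &&
	    details->check_mapping != page_mapping)
		return false;
	return true;
}

/* Walks the cases from the commit message; aborts if one is wrong. */
void check_zap_rules(void)
{
	struct address_space file = { 1 }, other = { 2 };
	struct zap_details even_cows = { NULL };   /* no filter: zap everything */
	struct zap_details file_only = { &file };  /* filter set: keep private pages */

	assert(zaps_entry(NULL, ENTRY_SWAP, NULL));
	assert(zaps_entry(&even_cows, ENTRY_SWAP, NULL));
	assert(!old_zaps_entry(&even_cows));       /* the case being fixed */
	assert(!zaps_entry(&file_only, ENTRY_SWAP, NULL));
	assert(zaps_entry(&file_only, ENTRY_MIGRATION, &file));
	assert(!zaps_entry(&file_only, ENTRY_MIGRATION, &other));
}
```

Calling check_zap_rules() from any main() exercises the table. The third
assertion is the case the commit message describes: details is non-NULL
but carries no mapping filter, so the old rule skipped a swap entry that
the new rule zaps.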