From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Hugh Dickins, Andrew Morton, "Kirill A.
 Shutemov", Andrea Arcangeli, Mike Kravetz, Song Liu, Linus Torvalds
Subject: [PATCH 5.8 080/232] khugepaged: retract_page_tables() remember to test exit
Date: Thu, 20 Aug 2020 11:18:51 +0200
Message-Id: <20200820091616.689152030@linuxfoundation.org>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200820091612.692383444@linuxfoundation.org>
References: <20200820091612.692383444@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Hugh Dickins

commit 18e77600f7a1ed69f8ce46c9e11cad0985712dfa upstream.

Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.

The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.

In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock.  Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example.  But retract_page_tables() did not: fix that.

The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.

Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Acked-by: Kirill A. Shutemov
Cc: Andrea Arcangeli
Cc: Mike Kravetz
Cc: Song Liu
Cc: [4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/khugepaged.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ out:
 static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 {
 	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr;
 	pmd_t *pmd, _pmd;
 
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct a
 			continue;
 		if (vma->vm_end < addr + HPAGE_PMD_SIZE)
 			continue;
-		pmd = mm_find_pmd(vma->vm_mm, addr);
+		mm = vma->vm_mm;
+		pmd = mm_find_pmd(mm, addr);
 		if (!pmd)
 			continue;
 		/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct a
 		 * mmap_lock while holding page lock. Fault path does it in
 		 * reverse order. Trylock is a way to avoid deadlock.
 		 */
-		if (mmap_write_trylock(vma->vm_mm)) {
-			spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
-			/* assume page table is clear */
-			_pmd = pmdp_collapse_flush(vma, addr, pmd);
-			spin_unlock(ptl);
-			mmap_write_unlock(vma->vm_mm);
-			mm_dec_nr_ptes(vma->vm_mm);
-			pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+		if (mmap_write_trylock(mm)) {
+			if (!khugepaged_test_exit(mm)) {
+				spinlock_t *ptl = pmd_lock(mm, pmd);
+				/* assume page table is clear */
+				_pmd = pmdp_collapse_flush(vma, addr, pmd);
+				spin_unlock(ptl);
+				mm_dec_nr_ptes(mm);
+				pte_free(mm, pmd_pgtable(_pmd));
+			}
+			mmap_write_unlock(mm);
 		} else {
 			/* Try again later */
-			khugepaged_add_pte_mapped_thp(mm, addr);
+			khugepaged_add_pte_mapped_thp(mm, addr);
 		}
 	}
 	i_mmap_unlock_write(mapping);
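
For readers who want the locking pattern in isolation, here is a minimal
userspace sketch of what the patch does: trylock, test for exit while
holding the lock, touch the protected state only if the owner is still
alive, and unlock in either case. It is not kernel code. The names
mm_sketch, sketch_test_exit() and sketch_retract() are hypothetical, a
pthread mutex stands in for mmap lock, and an atomic counter stands in for
mm_users.

```c
/*
 * Userspace illustration only. Every name here is a hypothetical
 * stand-in; none of this is a kernel API.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct mm_sketch {
	pthread_mutex_t lock;	/* stands in for mmap lock */
	atomic_int users;	/* stands in for mm_users */
	int page_tables;	/* state the exit path tears down unlocked */
};

enum sketch_result {
	SKETCH_RETRACTED,	/* lock taken, owner alive, state updated */
	SKETCH_SKIPPED_EXIT,	/* lock taken, owner exiting, state untouched */
	SKETCH_RETRY_LATER	/* lock busy, caller should retry later */
};

/* Same idea as khugepaged_test_exit(): no users left means exit began. */
static bool sketch_test_exit(struct mm_sketch *mm)
{
	return atomic_load(&mm->users) == 0;
}

static enum sketch_result sketch_retract(struct mm_sketch *mm)
{
	enum sketch_result res = SKETCH_SKIPPED_EXIT;

	/* Trylock, not lock: the real caller must avoid a lock-order deadlock. */
	if (pthread_mutex_trylock(&mm->lock) != 0)
		return SKETCH_RETRY_LATER;

	/* Check for exit after taking the lock, before touching any state. */
	if (!sketch_test_exit(mm)) {
		mm->page_tables--;
		res = SKETCH_RETRACTED;
	}

	/* Unlock on both paths, as the patched code does. */
	pthread_mutex_unlock(&mm->lock);
	return res;
}
```

With users set to 1 the call returns SKETCH_RETRACTED and decrements
page_tables; once users drops to 0 it returns SKETCH_SKIPPED_EXIT and
leaves page_tables alone; while another holder has the lock it returns
SKETCH_RETRY_LATER, which corresponds to the "Try again later" branch in
the patch.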