From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Andrea Arcangeli, Jann Horn, "Kirill A.
 Shutemov", Linus Torvalds
Subject: [PATCH 4.9 098/128] mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()
Date: Fri, 19 Jun 2020 16:33:12 +0200
Message-Id: <20200619141625.314982137@linuxfoundation.org>
In-Reply-To: <20200619141620.148019466@linuxfoundation.org>
References: <20200619141620.148019466@linuxfoundation.org>

From: Andrea Arcangeli

commit c444eb564fb16645c172d550359cb3d75fe8a040 upstream.

Write-protect anon page faults require an accurate mapcount to decide
whether to break the COW. In the THP path this is implemented by
reuse_swap_page() ->
page_trans_huge_map_swapcount()/page_trans_huge_mapcount().

If the COW triggers while the other processes sharing the page are in
the middle of a huge pmd split, an accurate reading requires that the
mapcount is not computed while it is being transferred from the head
page to the tail pages.

reuse_swap_page() already runs serialized by the page lock, so it is
enough to take the page lock around __split_huge_pmd_locked() too, in
order to add the missing serialization.

Note: the commit in "Fixes" is listed only to facilitate backporting.
The code before that commit did not attempt an accurate THP mapcount
calculation; it used page_count() to decide whether to COW. Both the
page_count and the pin_count are THP-wide refcounts, so they are
inaccurate if used in reuse_swap_page(). Reverting that commit (besides
the unrelated fix to the local anon_vma assignment) would also have
opened the window for memory corruption side effects on certain
workloads, as documented in that commit's header.

Signed-off-by: Andrea Arcangeli
Suggested-by: Jann Horn
Reported-by: Jann Horn
Acked-by: Kirill A. Shutemov
Fixes: 6d0a07edd17c ("mm: thp: calculate the mapcount correctly for THP pages during WP faults")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/huge_memory.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1755,6 +1755,8 @@ void __split_huge_pmd(struct vm_area_str
 	spinlock_t *ptl;
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long haddr = address & HPAGE_PMD_MASK;
+	bool was_locked = false;
+	pmd_t _pmd;
 
 	mmu_notifier_invalidate_range_start(mm, haddr, haddr + HPAGE_PMD_SIZE);
 	ptl = pmd_lock(mm, pmd);
@@ -1764,11 +1766,32 @@ void __split_huge_pmd(struct vm_area_str
 	 * pmd against. Otherwise we can end up replacing wrong page.
 	 */
 	VM_BUG_ON(freeze && !page);
-	if (page && page != pmd_page(*pmd))
-		goto out;
+	if (page) {
+		VM_WARN_ON_ONCE(!PageLocked(page));
+		was_locked = true;
+		if (page != pmd_page(*pmd))
+			goto out;
+	}
 
+repeat:
 	if (pmd_trans_huge(*pmd)) {
-		page = pmd_page(*pmd);
+		if (!page) {
+			page = pmd_page(*pmd);
+			if (unlikely(!trylock_page(page))) {
+				get_page(page);
+				_pmd = *pmd;
+				spin_unlock(ptl);
+				lock_page(page);
+				spin_lock(ptl);
+				if (unlikely(!pmd_same(*pmd, _pmd))) {
+					unlock_page(page);
+					put_page(page);
+					page = NULL;
+					goto repeat;
+				}
+				put_page(page);
+			}
+		}
 		if (PageMlocked(page))
 			clear_page_mlock(page);
 	} else if (!pmd_devmap(*pmd))
@@ -1776,6 +1799,8 @@ void __split_huge_pmd(struct vm_area_str
 	__split_huge_pmd_locked(vma, pmd, haddr, freeze);
 out:
 	spin_unlock(ptl);
+	if (!was_locked && page)
+		unlock_page(page);
 	mmu_notifier_invalidate_range_end(mm, haddr, haddr + HPAGE_PMD_SIZE);
 }
 
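
The locking sequence added to __split_huge_pmd() above (trylock the
page, and on failure pin it, drop the pmd lock, sleep on the page lock,
retake the pmd lock and revalidate) can be illustrated outside the
kernel. What follows is only a minimal userspace sketch, not kernel
code: it assumes pthread mutexes stand in for both the pmd spinlock and
the page lock, and struct slot, struct obj and lock_current() are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: a sleeping lock plus a pin count. */
struct obj {
	pthread_mutex_t lock;	/* plays the role of the page lock */
	int pins;		/* only modified with slot->lock held */
};

/* Hypothetical stand-in for a pmd entry and its spinlock (ptl). */
struct slot {
	pthread_mutex_t lock;	/* inner lock, must not be held while sleeping */
	struct obj *cur;	/* object the slot currently maps, or NULL */
};

/*
 * Must be called with s->lock held. Returns the object the slot points
 * to with o->lock held (s->lock is still held on return), or NULL if
 * the slot is empty.
 *
 * The lock order is o->lock first, then s->lock. Because the caller
 * already holds s->lock, only a trylock on o->lock is safe. If that
 * fails, s->lock is dropped before blocking, and the slot is
 * revalidated once both locks are held.
 */
struct obj *lock_current(struct slot *s)
{
	struct obj *o;

	for (;;) {
		o = s->cur;
		if (!o)
			return NULL;

		/* Fast path: lock taken without dropping s->lock. */
		if (pthread_mutex_trylock(&o->lock) == 0)
			return o;

		/* Slow path: pin o so it stays valid while s->lock is dropped. */
		o->pins++;
		pthread_mutex_unlock(&s->lock);
		pthread_mutex_lock(&o->lock);	/* may block */
		pthread_mutex_lock(&s->lock);
		o->pins--;

		/* Revalidate: the slot may have changed while unlocked. */
		if (s->cur == o)
			return o;

		/* Stale object: release it and retry with the new value. */
		pthread_mutex_unlock(&o->lock);
	}
}
```

A caller would take s->lock, call lock_current(), do its work with both
locks held, and then release s->lock followed by the returned object's
lock, which mirrors the "if (!was_locked && page) unlock_page(page)"
step after spin_unlock(ptl) in the patch.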