Subject: Re: [PATCH] mm: Close race between munmap() and expand_upwards()/downwards()
To: "Kirill A. Shutemov", Andrew Morton
Shutemov" , Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jann Horn , stable@vger.kernel.org, Vlastimil Babka , Oleg Nesterov , Matthew Wilcox References: <20200709105309.42495-1-kirill.shutemov@linux.intel.com> From: Yang Shi Message-ID: Date: Fri, 10 Jul 2020 10:35:50 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20200709105309.42495-1-kirill.shutemov@linux.intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 7/9/20 3:53 AM, Kirill A. Shutemov wrote: > VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under > mmap_read_lock(). It can lead to race with __do_munmap(): > > Thread A Thread B > __do_munmap() > detach_vmas_to_be_unmapped() > mmap_write_downgrade() > expand_downwards() > vma->vm_start = address; > // The VMA now overlaps with > // VMAs detached by the Thread A > // page fault populates expanded part > // of the VMA > unmap_region() > // Zaps pagetables partly > // populated by Thread B > > Similar race exists for expand_upwards(). > > The fix is to avoid downgrading mmap_lock in __do_munmap() if detached > VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA. Thanks for catching this. The fix makes sense to me. Reviewed-by: Yang Shi > > Signed-off-by: Kirill A. Shutemov > Reported-by: Jann Horn > Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") > Cc: # 4.20 > Cc: Yang Shi > Cc: Vlastimil Babka > Cc: Oleg Nesterov > Cc: Matthew Wilcox > --- > mm/mmap.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 59a4682ebf3f..71df4b36b42a 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -2620,7 +2620,7 @@ static void unmap_region(struct mm_struct *mm, > * Create a list of vma's touched by the unmap, removing them from the mm's > * vma list as we go.. > */ > -static void > +static bool > detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma, > struct vm_area_struct *prev, unsigned long end) > { > @@ -2645,6 +2645,17 @@ detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma, > > /* Kill the cache */ > vmacache_invalidate(mm); > + > + /* > + * Do not downgrade mmap_sem if we are next to VM_GROWSDOWN or > + * VM_GROWSUP VMA. Such VMAs can change their size under > + * down_read(mmap_sem) and collide with the VMA we are about to unmap. > + */ > + if (vma && (vma->vm_flags & VM_GROWSDOWN)) > + return false; > + if (prev && (prev->vm_flags & VM_GROWSUP)) > + return false; > + return true; > } > > /* > @@ -2825,7 +2836,8 @@ int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, > } > > /* Detach vmas from rbtree */ > - detach_vmas_to_be_unmapped(mm, vma, prev, end); > + if (!detach_vmas_to_be_unmapped(mm, vma, prev, end)) > + downgrade = false; > > if (downgrade) > mmap_write_downgrade(mm);