Date: Sun, 3 Mar 2019 21:13:28 -0300
From: Rafael Aquini
To: Jan Stancek
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
 peterz@infradead.org, riel@surriel.com, mhocko@suse.com,
 ying.huang@intel.com, jrdr.linux@gmail.com, jglisse@redhat.com,
 aneesh.kumar@linux.ibm.com, david@redhat.com, aarcange@redhat.com,
 raquini@redhat.com, rientjes@google.com, kirill@shutemov.name,
 mgorman@techsingularity.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/memory.c: do_fault: avoid usage of stale vm_area_struct
Message-ID: <20190304001328.GA27580@x230.aquini.net>
References:
 <20190302185144.GD31083@redhat.com>
 <5b3fdf19e2a5be460a384b936f5b56e13733f1b8.1551595137.git.jstancek@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5b3fdf19e2a5be460a384b936f5b56e13733f1b8.1551595137.git.jstancek@redhat.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Mar 03, 2019 at 08:28:04AM +0100, Jan Stancek wrote:
> LTP testcase mtest06 [1] can trigger a crash on s390x running 5.0.0-rc8.
> This is a stress test, where one thread mmaps/writes/munmaps memory area
> and other thread is trying to read from it:
>
>   CPU: 0 PID: 2611 Comm: mmap1 Not tainted 5.0.0-rc8+ #51
>   Hardware name: IBM 2964 N63 400 (z/VM 6.4.0)
>   Krnl PSW : 0404e00180000000 00000000001ac8d8 (__lock_acquire+0x7/0x7a8)
>   Call Trace:
>   ([<0000000000000000>]           (null))
>    [<00000000001adae4>] lock_acquire+0xec/0x258
>    [<000000000080d1ac>] _raw_spin_lock_bh+0x5c/0x98
>    [<000000000012a780>] page_table_free+0x48/0x1a8
>    [<00000000002f6e54>] do_fault+0xdc/0x670
>    [<00000000002fadae>] __handle_mm_fault+0x416/0x5f0
>    [<00000000002fb138>] handle_mm_fault+0x1b0/0x320
>    [<00000000001248cc>] do_dat_exception+0x19c/0x2c8
>    [<000000000080e5ee>] pgm_check_handler+0x19e/0x200
>
> page_table_free() is called with NULL mm parameter, but because
> "0" is a valid address on s390 (see S390_lowcore), it keeps
> going until it eventually crashes in lockdep's lock_acquire.
> This crash is reproducible at least since 4.14.
>
> Problem is that "vmf->vma" used in do_fault() can become stale.
> Because mmap_sem may be released, other threads can come in,
> call munmap() and cause "vma" be returned to kmem cache, and
> get zeroed/re-initialized and re-used:
>
>   handle_mm_fault                         |
>     __handle_mm_fault                     |
>       do_fault                            |
>         vma = vmf->vma                    |
>         do_read_fault                     |
>           __do_fault                      |
>             vma->vm_ops->fault(vmf);      |
>               mmap_sem is released        |
>                                           |
>                                           | do_munmap()
>                                           |   remove_vma_list()
>                                           |     remove_vma()
>                                           |       vm_area_free()
>                                           |         # vma is released
>                                           | ...
>                                           | # same vma is allocated
>                                           | # from kmem cache
>                                           | do_mmap()
>                                           |   vm_area_alloc()
>                                           |     memset(vma, 0, ...)
>                                           |
>       pte_free(vma->vm_mm, ...);          |
>         page_table_free                   |
>           spin_lock_bh(&mm->context.lock);|
>                                           |
>
> Cache mm_struct to avoid using potentially stale "vma".
>
> [1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/mtest06/mmap1.c
>
> Signed-off-by: Jan Stancek
> Reviewed-by: Andrea Arcangeli
> ---
>  mm/memory.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e11ca9dd823f..e8d69ade5acc 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3517,10 +3517,13 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>   * but allow concurrent faults).
>   * The mmap_sem may have been released depending on flags and our
>   * return value.  See filemap_fault() and __lock_page_or_retry().
> + * If mmap_sem is released, vma may become invalid (for example
> + * by other thread calling munmap()).
>   */
>  static vm_fault_t do_fault(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
> +	struct mm_struct *vm_mm = vma->vm_mm;
>  	vm_fault_t ret;
>  
>  	/*
> @@ -3561,7 +3564,7 @@ static vm_fault_t do_fault(struct vm_fault *vmf)
>  
>  	/* preallocated pagetable is unused: free it */
>  	if (vmf->prealloc_pte) {
> -		pte_free(vma->vm_mm, vmf->prealloc_pte);
> +		pte_free(vm_mm, vmf->prealloc_pte);
>  		vmf->prealloc_pte = NULL;
>  	}
>  	return ret;
> -- 
> 1.8.3.1
>
Acked-by: Rafael Aquini
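
For readers following the thread, the pattern the patch relies on can be sketched in plain user-space C. This is only an illustration, not kernel code: every toy_* name below is hypothetical, the "fault handler" simply zeroes the structure to stand in for the vma being freed and re-allocated from the kmem cache, and toy_pte_free() returns -1 where the real page_table_free() would go on to crash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct toy_mm {
	int freed_tables;	/* counts page tables handed back */
};

struct toy_vma {
	struct toy_mm *mm;	/* plays the role of vma->vm_mm */
};

/*
 * Models a ->fault() handler that drops the lock: by the time it
 * returns, the vma has been released and the same memory handed out
 * again, zeroed.
 */
static void toy_fault_drops_lock(struct toy_vma *vma)
{
	memset(vma, 0, sizeof(*vma));
}

/* Models pte_free(): needs a valid mm to return the page table to. */
static int toy_pte_free(struct toy_mm *mm)
{
	if (mm == NULL)
		return -1;	/* the kernel would crash later instead */
	mm->freed_tables++;
	return 0;
}

/* Pre-patch shape: dereferences vma again after the handler ran. */
static int toy_do_fault_stale(struct toy_vma *vma)
{
	toy_fault_drops_lock(vma);
	return toy_pte_free(vma->mm);	/* vma->mm is NULL by now */
}

/* Patched shape: mm is cached while vma is still known to be valid. */
static int toy_do_fault_cached(struct toy_vma *vma)
{
	struct toy_mm *mm = vma->mm;

	toy_fault_drops_lock(vma);
	return toy_pte_free(mm);
}

/* Small self-check exercising both shapes on a fresh vma. */
static void toy_selfcheck(void)
{
	struct toy_mm mm = { 0 };
	struct toy_vma vma = { &mm };

	assert(toy_do_fault_cached(&vma) == 0);
	vma.mm = &mm;
	assert(toy_do_fault_stale(&vma) == -1);
}
```

The cached variant never touches the vma after the handler returns, which is the same idea as the added vm_mm local in do_fault(). The sketch assumes the mm itself outlives the fault, as it does for the faulting task in the scenario described above.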