References: <20220215201922.1908156-1-surenb@google.com>
 <20220224201859.a38299b6c9d52cb51e6738ea@linux-foundation.org>
 <20220310155454.g6lt54yxel3ixnp3@revolver>
 <20220310222206.dttvvlgfqysrcl2s@revolver>
In-Reply-To: <20220310222206.dttvvlgfqysrcl2s@revolver>
From: Suren Baghdasaryan
Date: Thu, 10 Mar 2022 15:31:04 -0800
Subject: Re: [PATCH 1/1] mm: fix use-after-free bug when mm->mmap is reused after being freed
To: Liam Howlett
Cc: Matthew Wilcox, Andrew Morton, "mhocko@kernel.org",
 "mhocko@suse.com", "shy828301@gmail.com", "rientjes@google.com",
 "hannes@cmpxchg.org", "guro@fb.com", "riel@surriel.com",
 "minchan@kernel.org", "kirill@shutemov.name", "aarcange@redhat.com",
 "brauner@kernel.org", "christian@brauner.io", "hch@infradead.org",
 "oleg@redhat.com", "david@redhat.com", "jannh@google.com",
 "shakeelb@google.com", "luto@kernel.org",
 "christian.brauner@ubuntu.com", "fweimer@redhat.com", "jengelh@inai.de",
 "timmurray@google.com", "linux-mm@kvack.org",
 "linux-kernel@vger.kernel.org", "kernel-team@android.com",
 "syzbot+2ccf63a4bd07cf39cab0@syzkaller.appspotmail.com"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 10, 2022 at 2:22 PM Liam Howlett wrote:
>
> * Suren Baghdasaryan [220310 11:28]:
> > On Thu, Mar 10, 2022 at 7:55 AM Liam Howlett wrote:
> > >
> > > * Suren Baghdasaryan [220225 00:51]:
> > > > On Thu, Feb 24, 2022 at 8:23 PM Matthew Wilcox wrote:
> > > > >
> > > > > On Thu, Feb 24, 2022 at 08:18:59PM -0800, Andrew Morton wrote:
> > > > > > On Tue, 15 Feb 2022 12:19:22 -0800 Suren Baghdasaryan wrote:
> > > > > >
> > > > > > > After exit_mmap frees all vmas in the mm, mm->mmap needs to be reset,
> > > > > > > otherwise it points to a vma that was freed and when reused leads to
> > > > > > > a use-after-free bug.
> > > > > > >
> > > > > > > ...
> > > > > > >
> > > > > > > --- a/mm/mmap.c
> > > > > > > +++ b/mm/mmap.c
> > > > > > > @@ -3186,6 +3186,7 @@ void exit_mmap(struct mm_struct *mm)
> > > > > > >                 vma = remove_vma(vma);
> > > > > > >                 cond_resched();
> > > > > > >         }
> > > > > > > +       mm->mmap = NULL;
> > > > > > >         mmap_write_unlock(mm);
> > > > > > >         vm_unacct_memory(nr_accounted);
> > > > > > >  }
> > > > > >
> > > > > > After the Maple tree patches, mm_struct.mmap doesn't exist. So I'll
> > > > > > revert this fix as part of merging the maple-tree parts of linux-next.
> > > > > > I'll be sending this fix to Linus this week.
> > > > > >
> > > > > > All of which means that the thusly-resolved Maple tree patches might
> > > > > > reintroduce this use-after-free bug.
> > > > >
> > > > > I don't think so? The problem is that VMAs are (currently) part of
> > > > > two data structures -- the rbtree and the linked list. remove_vma()
> > > > > only removes VMAs from the rbtree; it doesn't set mm->mmap to NULL.
> > > > >
> > > > > With maple tree, the linked list goes away. remove_vma() removes VMAs
> > > > > from the maple tree. So anyone looking to iterate over all VMAs has to
> > > > > go and look in the maple tree for them ... and there's nothing there.
> > > >
> > > > Yes, I think you are right. With maple trees we don't need this fix.
> > >
> > >
> > > Yes, this is correct. The maple tree removes the entire linked list...
> > > but since the mm is unstable in the exit_mmap(), I had added the
> > > destruction of the maple tree there. Maybe this is the wrong place to
> > > be destroying the tree tracking the VMAs (although this patch partially
> > > destroys the VMA tracking linked list), but it brought my attention to
> > > the race that this patch solves and the process_mrelease() function.
> > > Couldn't this be avoided by using mmget_not_zero() instead of mmgrab()
> > > in process_mrelease()?
> >
> > That's what we were doing before [1]. That unfortunately has a problem
> > of process_mrelease possibly calling the last mmput and being blocked
> > on IO completion in exit_aio.
>
> Oh, I see. Thanks.
> >
> > The race between exit_mmap and
> > process_mrelease is solved by using mmap_lock.
>
> I think an important part of the race fix isn't just the lock holding
> but the setting of the start of the linked list to NULL above. That
> means the code in __oom_reap_task_mm() via process_mrelease() will
> continue to execute but iterate for zero VMAs.
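
As a rough illustration of the point above about resetting the list head, here is a userspace sketch, not kernel code. `toy_mm`, `toy_vma` and the `toy_*` helpers are hypothetical names made up for this example, and the mmap_lock that serializes the two paths in the kernel is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for vm_area_struct and mm_struct. */
struct toy_vma {
    struct toy_vma *next;
};

struct toy_mm {
    struct toy_vma *mmap;   /* head of the VMA linked list */
};

/* Push n freshly allocated nodes onto the list; returns 0 on success. */
static int toy_populate(struct toy_mm *mm, int n)
{
    assert(mm != NULL);
    for (int i = 0; i < n; i++) {
        struct toy_vma *vma = malloc(sizeof(*vma));
        if (!vma)
            return -1;
        vma->next = mm->mmap;
        mm->mmap = vma;
    }
    return 0;
}

/* Models the tail of exit_mmap(): free every node, then reset the head. */
static void toy_exit_mmap(struct toy_mm *mm)
{
    struct toy_vma *vma = mm->mmap;

    while (vma) {
        struct toy_vma *next = vma->next;
        free(vma);
        vma = next;
    }
    /* Without this reset, mm->mmap would still point at freed memory. */
    mm->mmap = NULL;
}

/* Models a later walker (the reaper): count nodes reachable from the head. */
static int toy_reap(const struct toy_mm *mm)
{
    int visited = 0;

    for (const struct toy_vma *vma = mm->mmap; vma; vma = vma->next)
        visited++;
    return visited;
}
```

With three nodes populated, `toy_reap()` visits 3 nodes before `toy_exit_mmap()` and 0 after it, because the walk then starts from a NULL head instead of a dangling pointer.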
>
> > I think by destroying the maple tree in exit_mmap before the
> > mmap_write_unlock call, you keep things working and functionality
> > intact. Is there any reason this can't be done?
>
> Yes, unfortunately. If MMF_OOM_SKIP is not set, then process_mrelease()
> will call __oom_reap_task_mm() which will get a null pointer dereference
> or a use after free in the vma iterator as it tries to iterate the maple
> tree. I think the best plan is to set MMF_OOM_SKIP unconditionally
> when the mmap_write_lock() is acquired. Doing so will ensure nothing
> will try to gain memory by reaping a task that no longer has memory to
> yield - or at least won't shortly. If we do use MMF_OOM_SKIP in such a
> way, then I think it is safe to quickly drop the lock?

That would technically work, but it changes the semantics of the
MMF_OOM_SKIP flag from "mm is of no interest for the OOM killer" to
something like "mm is empty", akin to mm->mmap == NULL. So, there is no
way for the maple tree to indicate that it is empty?

>
> Also, should process_mrelease() be setting MMF_OOM_VICTIM on this mm?
> It would enable the fast path on a race with exit_mmap() - though that
> may not be desirable?

Michal does not like that approach because, again, process_mrelease is
not the oom-killer and so should not set the MMF_OOM_VICTIM flag.
Besides, we want to get rid of that special mm_is_oom_victim(mm) branch
inside exit_mmap. Which reminds me to look into it again.

>
> >
> > [1] ba535c1caf3ee78a ("mm/oom_kill: allow process_mrelease to run
> > under mmap_lock protection")
> >
> > > That would ensure we aren't stepping on an
> > > exit_mmap() and potentially the locking change in exit_mmap() wouldn't
> > > be needed either? Logically, I view this as process_mrelease() having
> > > issue with the fact that the mmaps are no longer stable in tear down
> > > regardless of the data structure that is used.
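
The mmgrab() versus mmget_not_zero() trade-off discussed in this thread can be sketched with two plain counters. This is a single-threaded toy model under stated assumptions, not the kernel implementation: the kernel counters are atomic, `toy_mm` and every `toy_*` function are hypothetical stand-ins, and `torn_down` merely represents the work done on the last mmput (exit_aio and exit_mmap among it).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two reference counts on an mm. */
struct toy_mm {
    int mm_users;    /* users of the address space */
    int mm_count;    /* references to the structure itself */
    bool torn_down;  /* address space teardown has run */
    bool freed;      /* structure would have been freed */
};

static void toy_mm_init(struct toy_mm *mm)
{
    mm->mm_users = 1;   /* the owning process */
    mm->mm_count = 1;   /* held on behalf of all mm_users together */
    mm->torn_down = false;
    mm->freed = false;
}

/* Pin only the structure; does not keep the address space alive. */
static void toy_mmgrab(struct toy_mm *mm)
{
    mm->mm_count++;
}

static void toy_mmdrop(struct toy_mm *mm)
{
    if (--mm->mm_count == 0)
        mm->freed = true;
}

/* Pin the address space, unless its last user is already gone. */
static bool toy_mmget_not_zero(struct toy_mm *mm)
{
    if (mm->mm_users == 0)
        return false;
    mm->mm_users++;
    return true;
}

/*
 * Drop an address-space reference. Whoever drops the last one performs
 * the teardown in its own context; returns true in that case.
 */
static bool toy_mmput(struct toy_mm *mm)
{
    if (--mm->mm_users != 0)
        return false;
    mm->torn_down = true;   /* stands in for exit_aio() + exit_mmap() */
    toy_mmdrop(mm);
    return true;
}
```

In this model, a reaper that took its reference with `toy_mmget_not_zero()` can end up being the caller for which `toy_mmput()` returns true, which corresponds to the concern about blocking in exit_aio. A reaper that only called `toy_mmgrab()` never does the teardown, but it has to cope with the address space being torn down underneath it.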
> > >
> > > Thanks,
> > > Liam
> > >
> > > --
> > > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> > >
>