From: Suren Baghdasaryan
Date: Fri, 9 Sep 2022 09:29:42 -0700
Subject: Re: [RFC PATCH RESEND 07/28] kernel/fork: mark VMAs as locked before copying pages during fork
To: Laurent Dufour
Cc: akpm@linux-foundation.org, michel@lespinasse.org, jglisse@google.com,
    mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de,
    dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
    peterz@infradead.org, laurent.dufour@fr.ibm.com, paulmck@kernel.org,
    luto@kernel.org, songliubraving@fb.com, peterx@redhat.com,
    david@redhat.com, dhowells@redhat.com, hughd@google.com,
    bigeasy@linutronix.de, kent.overstreet@linux.dev, rientjes@google.com,
    axelrasmussen@google.com, joelaf@google.com, minchan@google.com,
    kernel-team@android.com, linux-mm@kvack.org,
    linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    x86@kernel.org, linux-kernel@vger.kernel.org
References: <20220901173516.702122-1-surenb@google.com>
    <20220901173516.702122-8-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 9, 2022 at 6:27 AM Laurent Dufour wrote:
>
> On 09/09/2022 at 01:57, Suren Baghdasaryan wrote:
> > On Tue, Sep 6, 2022 at 7:38 AM Laurent Dufour wrote:
> >>
> >> On 01/09/2022 at 19:34, Suren Baghdasaryan wrote:
> >>> Protect VMAs from concurrent page fault handler while performing
> >>> copy_page_range for VMAs having VM_WIPEONFORK flag set.
> >>
> >> I'm wondering why is that necessary.
> >> The copied mm is write locked, and the destination one is not reachable.
> >> If any other readers are using the VMA, this is only for page fault handling.
> >
> > Correct, this is done to prevent page faulting in the VMA being
> > duplicated. I assume we want to prevent the pages in that VMA from
> > changing when we are calling copy_page_range(). Am I wrong?
>
> If a page is faulted while copy_page_range() is in progress, the page may
> not be backed on the child side (PTE lock should protect the copy, isn't it).
> Is that a real problem? It will be backed later if accessed on the child side.
> Maybe the per process pages accounting could be incorrect...

This feels to me like walking on the edge. Maybe we can discuss this
with more people at LPC before trying it?

>
> >
> >
> >> I should have miss something because I can't see any need to mark the lock
> >> VMA here.
> >>
> >>> Signed-off-by: Suren Baghdasaryan
> >>> ---
> >>>  kernel/fork.c | 4 +++-
> >>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/kernel/fork.c b/kernel/fork.c
> >>> index bfab31ecd11e..1872ad549fed 100644
> >>> --- a/kernel/fork.c
> >>> +++ b/kernel/fork.c
> >>> @@ -709,8 +709,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> >>>  		rb_parent = &tmp->vm_rb;
> >>>
> >>>  		mm->map_count++;
> >>> -		if (!(tmp->vm_flags & VM_WIPEONFORK))
> >>> +		if (!(tmp->vm_flags & VM_WIPEONFORK)) {
> >>> +			vma_mark_locked(mpnt);
> >>>  			retval = copy_page_range(tmp, mpnt);
> >>> +		}
> >>>
> >>>  		if (tmp->vm_ops && tmp->vm_ops->open)
> >>>  			tmp->vm_ops->open(tmp);
> >>
>
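
To make the question above a bit more concrete, here is a toy userspace
model of what "marking a VMA locked" could mean for the fault path. It is
only a sketch under stated assumptions: it assumes a sequence-number
scheme in which a marked VMA carries the same sequence value as its mm,
and dropping the mm write lock bumps that value so every mark expires at
once. Nothing quoted in this thread shows how vma_mark_locked() is
actually implemented, and every toy_* name below is hypothetical rather
than kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. All toy_* names are hypothetical, not kernel API. */
struct toy_mm {
	bool write_locked;      /* stands in for the mm being write locked */
	unsigned long lock_seq; /* bumped each time the write lock is dropped */
};

struct toy_vma {
	struct toy_mm *mm;
	unsigned long lock_seq; /* equals mm->lock_seq while marked locked */
};

static void toy_mm_init(struct toy_mm *mm)
{
	mm->write_locked = false;
	mm->lock_seq = 1;
}

static void toy_vma_init(struct toy_vma *vma, struct toy_mm *mm)
{
	vma->mm = mm;
	vma->lock_seq = 0; /* differs from the initial mm->lock_seq of 1 */
}

static void toy_mmap_write_lock(struct toy_mm *mm)
{
	assert(!mm->write_locked);
	mm->write_locked = true;
}

static void toy_mmap_write_unlock(struct toy_mm *mm)
{
	assert(mm->write_locked);
	mm->lock_seq++; /* assumption: this expires every VMA mark at once */
	mm->write_locked = false;
}

/* Mark one VMA as locked; only valid while the mm is write locked. */
static void toy_vma_mark_locked(struct toy_vma *vma)
{
	assert(vma->mm->write_locked);
	vma->lock_seq = vma->mm->lock_seq;
}

/* Fault path check: true if a fault may proceed on this VMA. */
static bool toy_fault_may_proceed(const struct toy_vma *vma)
{
	return vma->lock_seq != vma->mm->lock_seq;
}
```

In this model, a fault on a VMA that dup_mmap() has marked is turned away
until the write lock is dropped, while faults on unmarked VMAs of the same
mm still go through. That is the window Laurent is asking about: without
the mark, a fault could populate a page in the parent while
copy_page_range() is walking the same VMA.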