From: Suren Baghdasaryan
Date: Sat, 8 Jul 2023 16:03:35 -0700
Subject: Re: [PATCH v2 3/3] fork: lock VMAs of the parent process when forking
To: Linus Torvalds
Cc: David Hildenbrand, akpm@linux-foundation.org,
    regressions@leemhuis.info, bagasdotme@gmail.com, jacobly.alt@gmail.com,
    willy@infradead.org, liam.howlett@oracle.com, peterx@redhat.com,
    ldufour@linux.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
    gregkh@linuxfoundation.org, regressions@lists.linux.dev, Jiri Slaby,
    Holger Hoffstätte, stable@vger.kernel.org
References: <20230708191212.4147700-1-surenb@google.com>
    <20230708191212.4147700-3-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 8, 2023 at 3:54 PM Linus Torvalds wrote:
>
> On Sat, 8 Jul 2023 at 15:36, Suren Baghdasaryan wrote:
> >
> > On Sat, Jul 8, 2023 at 2:18 PM Linus Torvalds wrote:
> > >
> > > Again - maybe I messed up, but it really feels like the missing
> > > vma_start_write() was more fundamental, and not some "TLB coherency"
> > > issue.
> >
> > Sounds plausible. I'll try to use the reproducer to verify if that's
> > indeed happening here.
>
> I really don't think that's what people are reporting, I was just
> trying to make up a completely different case that has nothing to do
> with any TLB issues.
>
> My real point was simply this one:
>
> > It's likely there are multiple problematic
> > scenarios due to this missing lock though.
>
> Right. That's my issue. I felt your explanation was *too* targeted at
> some TLB non-coherency thing, when I think the problem was actually a
> much larger "page faults simply must not happen while we're copying
> the page tables because data isn't coherent".
>
> The anon_vma case was just meant as another random example of the
> other kinds of things I suspect can go wrong, because we're simply not
> able to do this whole "copy vma while it's being modified by page
> faults".
>
> Now, I agree that the PTE problem is real, and probably the main
> thing, ie when we as part of fork() do this:
>
>         /*
>          * If it's a COW mapping, write protect it both
>          * in the parent and the child
>          */
>         if (is_cow_mapping(vm_flags) && pte_write(pte)) {
>                 ptep_set_wrprotect(src_mm, addr, src_pte);
>                 pte = pte_wrprotect(pte);
>         }
>
> and the thing that can go wrong before the TLB flush happens is that -
> because the TLBs haven't been flushed yet - some threads in the
> parent happily continue to write to the page and didn't see the
> wrprotect happening.
>
> And then you get into the situation where *some* threads see the page
> protections change (maybe they had a TLB flush event on that CPU for
> random reasons), and they will take a page fault and do the COW thing
> and create a new page.
>
> And all the while *other* threads still see the old writeable TLB
> state, and continue to write to the old page.
>
> So now you have a page that gets its data copied *while* somebody is
> still writing to it, and the end result is that some write easily gets
> lost, and so when that new copy is installed, you see it as data
> corruption.
>
> And I agree completely that that is probably the thing that most
> people actually saw and reacted to as corruption.
>
> But the reason I didn't like the explanation was that I think this is
> just one random example of the more fundamental issue of "we simply
> must not take page faults while copying".
>
> Your explanation made me think "stale TLB is the problem", and *that*
> was what I objected to. The stale TLB was just one random sign of the
> much larger problem.
>
> It might even have been the most common symptom, but I think it was
> just a *symptom*, not the *cause* of the problem.
>
> And I must have been bad at explaining that, because David Hildenbrand
> also reacted negatively to my change.
>
> So I'll happily take a patch that adds more commentary about this, and
> gives several examples of the things that go wrong.

How about adding your example to the original description as yet
another scenario which is broken without this change? I guess having
both issues described would not hurt.

>
>               Linus
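
To make the lost-write scenario quoted above concrete, here is a minimal
userspace sketch, not kernel code. It assumes a toy 16-byte "page" and a
hypothetical helper, cow_copy_loses_write(), that replays one interleaving
deterministically: a COW fault copies the old page byte by byte while a
thread with a stale writable TLB entry stores into the old page partway
through the copy. TOY_PAGE_SIZE and the helper are made up for
illustration; nothing here corresponds to an actual kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16 /* illustrative only, not a real page size */

/*
 * Hypothetical model of the race: copy old_page to new_page one byte at a
 * time (the COW copy), and let a "stale TLB" writer store 0xAA at write_off
 * in old_page once copied_before_write bytes have already been copied.
 *
 * Returns 1 if the store is missing from new_page (the write was lost),
 * 0 if the copy happened to pick it up.
 */
int cow_copy_loses_write(size_t write_off, size_t copied_before_write)
{
    unsigned char old_page[TOY_PAGE_SIZE];
    unsigned char new_page[TOY_PAGE_SIZE];

    assert(write_off < TOY_PAGE_SIZE);
    assert(copied_before_write <= TOY_PAGE_SIZE);

    memset(old_page, 0, sizeof(old_page));
    memset(new_page, 0, sizeof(new_page));

    /* One extra iteration so the store can also land after the copy ends. */
    for (size_t i = 0; i <= TOY_PAGE_SIZE; i++) {
        if (i == copied_before_write)
            old_page[write_off] = 0xAA; /* writer still sees the old page */
        if (i < TOY_PAGE_SIZE)
            new_page[i] = old_page[i];  /* COW copy of byte i */
    }

    return new_page[write_off] != 0xAA;
}
```

For example, cow_copy_loses_write(2, 8) reports a lost write because byte 2
had already been copied when the store landed, while
cow_copy_loses_write(12, 8) does not because byte 12 was copied afterwards.
The sketch covers only the stale-TLB symptom; the broader point in the
thread is that page faults should not run at all while fork() is copying,
which is what taking vma_start_write() on the parent's VMAs is meant to
ensure.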