Date: Fri, 11 Mar 2022 00:54:15 -0800 (PST)
From: Hugh Dickins
To: Vlastimil Babka
Cc: Hugh Dickins, Liam Howlett, Andrew Morton, Oleg Nesterov,
    "linux-kernel@vger.kernel.org", "linux-mm@kvack.org"
Subject: Re: [PATCH mmotm] mempolicy: mbind_range() set_policy() after vma_merge()
In-Reply-To: <105e1620-5cf2-fecd-27e7-21a6045cc3ac@suse.cz>
Message-ID: <173fbbd0-d631-ede7-4641-39ead6531d9@google.com>

On Wed, 9 Mar 2022, Vlastimil Babka wrote:
> On 3/8/22 22:32, Hugh Dickins wrote:
> > On Tue, 8 Mar 2022, Liam Howlett wrote:
> >>
> >> I must be missing something.  If mpol_equal() isn't sufficient to ensure
> >> we don't need to set_policy(), then why are the other vma_merge() cases
> >> okay - such as madvise_update_vma() and mlock_fixup()?  Won't the mem
> >> policy change in the same way in these cases?
> >
> > mlock provides a good example to compare.
> >
> > Mlocking pages is the business of mlock(), and mlock_fixup() needs to
> > attend to mm->locked_vm, and calling something to mark as PageMlocked
> > those pages already in the area now covered by mlock.  But it doesn't
> > need to worry about set_policy(), that's not its business, and is
> > unaffected by mlock changes (though merging of vmas needs mpol_equal()
> > to check that policy is the same, and merging and splitting of vmas
> > need to maintain the refcount of the shared policy if any).
> >
> > Whereas NUMA mempolicy is the business of mbind(), and mbind_range()
> > needs to attend to vma->vm_policy, and if it's a mapping of something
> > supporting a shared set_policy(), call that to establish the new range
> > on the object mapped.  But it doesn't need to worry about mm->locked_vm
> > or whether pages are Mlocked, that's not its business, and is unaffected
> > by mbind changes (though merging of vmas needs to check VM_LOCKED among
> > other flags to check that they are the same before it can merge).
>
> So if I understand correctly, we have case 8 of vma_merge():
>
>     AAAA
> PPPPNNNNXXXX
> becomes
> PPPPXXXXXXXX 8
>
> N is vma with some old policy different from new_pol
> A is the range where we change to new policy new_pol, which happens to be
> the same as existing policy of X
> Thus vma_merge() extends vma X to include range A - the vma N
> vma_merge() succeeds because it's passed new_pol to do the compatibility
> checks (although N still has the previous policy)

I *think* you have it the wrong way round there: my reading is that
this vma_merge() case 8 was correctly handled before, because in its
case !mpol_equal(vma_policy(vma), new_pol): I think case 8 was being
handled correctly, but the other cases were not.  Or was the comment
even correct to reference case 8 especially?
I'm afraid bringing it all back to mind is a bit of an effort: I won't
stake my life on it, perhaps I'm the one who has it the wrong way round.

>
> Before Hugh's patch we would then realize "oh X already has new_pol, nothing
> to do". Note that this AFAICS doesn't affect actual pages migration between
> nodes, because that happens outside of mbind_range(). But it causes us to
> skip vma_replace_policy(), which causes us to skip vm_ops->set_policy, where
> tmpfs does something important (we could maybe argue that Hugh didn't
> specify the user visible effects of this exactly enough :) what is "leaving
> the new mbind unenforced" - are pages not migrated in this case?).

Went back to check the original (internal) report: mbind MPOL_BIND on
tmpfs can result in allocations on the wrong node.  And it was a genuine
practical case, though the finder was kind enough to distil it down to a
minimal sequence (and correctly suggest the fix).

The user visible effect was that the pages got allocated on the local
node (happened to be 0), after the mbind() caller had specifically
asked for them to be allocated on node 1.

There was not any page migration involved in the case reported: the
pages simply got allocated on the wrong node.

And yes, on this patch I should have asked for a Cc:

>
> HTH (if I'm right),
> Vlastimil
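
For anyone trying to picture the failure discussed in this thread, here
is a small userspace toy model.  It is only a sketch, not kernel code:
toy_object, toy_vma, toy_set_policy() and toy_mbind_range() are names
invented for illustration, the "policy" is reduced to a single node
number, and the skip_when_equal flag is an assumed stand-in for the
mpol_equal() shortcut being debated.  It deliberately takes no position
on which vma_merge() case is involved.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NPAGES 12

/* Hypothetical stand-in for a shared (tmpfs-like) object: the node that
 * each page offset would be allocated on, as recorded on the object. */
struct toy_object {
    int node[TOY_NPAGES];
};

/* Hypothetical stand-in for a vma: page range [start, end) plus the node
 * its own policy asks for. */
struct toy_vma {
    int start, end;
    int node;
};

/* Plays the role of vm_ops->set_policy(): tell the object about a range. */
void toy_set_policy(struct toy_object *obj, int start, int end, int node)
{
    assert(0 <= start && start < end && end <= TOY_NPAGES);
    for (int i = start; i < end; i++)
        obj->node[i] = node;
}

/*
 * Toy mbind over the whole of "cur".  If the adjacent "prev" already has
 * the requested node, the two are merged.  With skip_when_equal set, the
 * merged vma is then judged to need nothing further, so the object is
 * never told about the newly bound range.
 */
void toy_mbind_range(struct toy_object *obj, struct toy_vma *prev,
                     struct toy_vma *cur, int new_node, bool skip_when_equal)
{
    int start = cur->start, end = cur->end;

    if (prev->end == start && prev->node == new_node) {
        /* Toy "merge": prev grows over cur, cur becomes empty. */
        prev->end = end;
        cur->start = end;
        if (skip_when_equal)
            return;    /* vma looks right, object is left stale */
    } else {
        cur->node = new_node;
    }
    toy_set_policy(obj, start, end, new_node);
}
```

Starting from prev = [0,4) on node 1 and cur = [4,8) on node 0, a toy
mbind of cur to node 1 with skip_when_equal set leaves the object still
recording node 0 for pages 4..7, although the merged vma says node 1.
With the flag clear, the object records node 1 for those pages, which
is the behaviour the mbind() caller asked for.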