Date: Thu, 25 Aug 2022 10:40:21 -0400
From: Peter Xu
To: Alistair Popple
Cc: "Huang, Ying", Nadav Amit, huang ying, Linux MM, Andrew Morton, LKML,
	"Sierra Guiza, Alejandro (Alex)", Felix Kuehling, Jason Gunthorpe,
	John Hubbard, David Hildenbrand, Ralph Campbell, Matthew Wilcox,
	Karol Herbst, Lyude Paul, Ben Skeggs, Logan Gunthorpe,
	paulus@ozlabs.org, linuxppc-dev@lists.ozlabs.org, stable@vger.kernel.org
Subject: Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
References: <87tu6bbaq7.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<1D2FB37E-831B-445E-ADDC-C1D3FF0425C1@gmail.com>
	<87czcyawl6.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<874jy9aqts.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<87czcqiecd.fsf@nvdebian.thelocal>
	<87o7w9f7dp.fsf@nvdebian.thelocal>
In-Reply-To: <87o7w9f7dp.fsf@nvdebian.thelocal>

On Thu, Aug 25, 2022 at 10:42:41AM +1000, Alistair Popple wrote:
>
> Peter Xu writes:
>
> > On Wed, Aug 24, 2022 at 04:25:44PM -0400, Peter Xu wrote:
> >> On Wed, Aug 24, 2022 at 11:56:25AM +1000, Alistair Popple wrote:
> >> > >> Still I don't know whether there'll be any side effect of having
> >> > >> stale TLBs in !present ptes because I'm not familiar enough with
> >> > >> the private dev swap migration code. But I think having them will
> >> > >> be safe, even if redundant.
> >> >
> >> > What side-effect were you thinking of? I don't see any issue with not
> >> > TLB flushing stale device-private TLBs prior to the migration because
> >> > they're not accessible anyway and shouldn't be in any TLB.
> >>
> >> Sorry to be misleading, I never meant we must add them. As I said it's
> >> just that I don't know the code well, so I don't know whether it's safe
> >> to not have it.
> >>
> >> IIUC it's about whether a stale system-RAM TLB entry on another
> >> processor would matter or not here. E.g. some none pte that this code
> >> collected (bumping both "cpages" and "npages" for a none pte) could
> >> still have a stale TLB entry on other cores that makes the page
> >> writable there.
> >
> > For this one, let me give a more detailed example.
>
> Thanks, I would have been completely lost about what you were talking
> about without this :-)
>
> > It's about whether the below could happen:
> >
> >     thread 1                thread 2                thread 3
> >     --------                --------                --------
> >                             write to page P (data=P1)
> >                             (cached TLB writable)
> >     zap_pte_range()
> >       pgtable lock
> >       clear pte for page P
> >       pgtable unlock
> >       ...
> >                                                     migrate_vma_collect
> >                                                       pte none, npages++, cpages++
> >                                                       allocate device page
> >                                                       copy data (with P1)
> >                                                       map pte as device swap
> >                             write to page P again
> >                             (data updated from P1->P2)
> >     flush tlb
> >
> > Then at the end, from the processor side P should have data P2, but
> > from device memory it's still P1. Data corruption.
>
> In the above scenario migrate_vma_collect_pmd() will observe pte_none.
> This will mark the src_pfn[] array as needing a new zero page, which will
> be installed by migrate_vma_pages()->migrate_vma_insert_page().
>
> So there is no data to be copied, hence there can't be any data
> corruption. Remember these are private anonymous pages, so any
> zap_pte_range() indicates the data is no longer needed (e.g.
> MADV_DONTNEED).
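For reference, the path described above looks roughly like this (a
paraphrased, abridged sketch of mm/migrate_device.c from around this
time, not the exact source):

	/* migrate_vma_collect_pmd(): a pte_none hole in an anonymous
	 * VMA is still collected as migratable, but with no source
	 * page attached to src[]. */
	if (pte_none(pte)) {
		if (vma_is_anonymous(vma)) {
			mpfn = MIGRATE_PFN_MIGRATE;
			migrate->cpages++;
		}
		goto next;
	}

	/* migrate_vma_pages(): an entry collected without a source page
	 * takes the insert path, which maps the new page directly
	 * rather than copying anything from system RAM. */
	if (!page) {
		migrate_vma_insert_page(migrate, addr, newpage,
					&migrate->src[i]);
		continue;
	}

i.e. a hole collected from a zapped anonymous pte never copies from the
old page, so there is no stale data to carry over.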
My bad to have provided an invalid example. :)

So if the trylock in the function is the only way to migrate this page,
then I agree a stale TLB is fine.

>
> >> When I said I'm not familiar with the code, it's mainly about one thing
> >> I never figured out myself: migrate_vma_collect_pmd() has this
> >> optimization to trylock the page and collect it if that succeeds:
> >>
> >> 	/*
> >> 	 * Optimize for the common case where page is only mapped once
> >> 	 * in one process. If we can lock the page, then we can safely
> >> 	 * set up a special migration page table entry now.
> >> 	 */
> >> 	if (trylock_page(page)) {
> >> 		...
> >> 	} else {
> >> 		put_page(page);
> >> 		mpfn = 0;
> >> 	}
> >>
> >> But it goes somewhat against a pure "optimization" in that if the
> >> trylock fails, we clear the mpfn so src[i] ends up zero. Will we then
> >> simply give up on this page, or will we try lock_page() again
> >> somewhere?
>
> That comment is outdated. We used to try locking the page again, but
> that was removed by ab09243aa95a ("mm/migrate.c: remove
> MIGRATE_PFN_LOCKED"). See
> https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
>
> Will post a clean-up for it.

That'll help, thanks.

-- 
Peter Xu