From:   "Huang, Ying" <ying.huang@intel.com>
To:     Hugh Dickins <hughd@google.com>
Cc:     Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org, "Xu, Pengfei" <pengfei.xu@intel.com>,
        Christoph Hellwig <hch@lst.de>,
        Stefan Roesch <shr@devkernel.io>, Tejun Heo <tj@kernel.org>,
        Xin Hao <xhao@linux.alibaba.com>, Zi Yan <ziy@nvidia.com>,
        Yang Shi <shy828301@gmail.com>,
        Baolin Wang <baolin.wang@linux.alibaba.com>,
        Matthew Wilcox <willy@infradead.org>,
        Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH 1/3] migrate_pages: fix deadlock in batched migration
References: <20230224141145.96814-1-ying.huang@intel.com>
        <20230224141145.96814-2-ying.huang@intel.com>
        <ea4dc95a-e6b2-ff6-62df-1590b93269f@google.com>
        <87h6v6b6er.fsf@yhuang6-desk2.ccr.corp.intel.com>
        <da5ba36a-dba-f44-926a-c5c912148b@google.com>
Date:   Wed, 01 Mar 2023 09:17:50 +0800
In-Reply-To: <da5ba36a-dba-f44-926a-c5c912148b@google.com> (Hugh Dickins's
        message of "Tue, 28 Feb 2023 13:07:41 -0800 (PST)")
Message-ID: <878rghb77l.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Precedence: bulk

Hugh Dickins <hughd@google.com> writes:

> On Tue, 28 Feb 2023, Huang, Ying wrote:
>> Hugh Dickins <hughd@google.com> writes:
>> > On Fri, 24 Feb 2023, Huang Ying wrote:
>> >> @@ -1247,7 +1236,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>> >>  		/* Establish migration ptes */
>> >>  		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>> >>  			       !folio_test_ksm(src) && !anon_vma, src);
>> >> -		try_to_migrate(src, TTU_BATCH_FLUSH);
>> >> +		try_to_migrate(src, mode == MIGRATE_ASYNC ? TTU_BATCH_FLUSH : 0);
>> >
>> > Why that change, I wonder? The TTU_BATCH_FLUSH can still be useful for
>> > gathering multiple cross-CPU TLB flushes into one, even when it's only
>> > a single page in the batch.
>> 
>> Firstly, I had thought that we have no opportunity to batch the TLB
>> flushing now.  But as you pointed out, it is still possible to batch
>> if mapcount > 1.  Secondly, without TTU_BATCH_FLUSH, we may flush the
>> TLB for a single page (with the invlpg instruction); otherwise, we
>> will flush the TLB for all pages.  The former is faster and will not
>> influence other TLB entries of the process.
>> 
>> Or should we use TTU_BATCH_FLUSH only if mapcount > 1?
>
> I had not thought at all of the "invlpg" advantage (which I imagine
> some other architectures than x86 share) of not delaying the TLB flush
> of a single PTE.
>
> Frankly, I just don't have any feeling for the tradeoff between
> multiple remote invlpgs versus one remote batched TLB flush of all.
> Which presumably depends on number of CPUs, size of TLBs, etc etc.
>
> Your "mapcount > 1" idea might be good, but I cannot tell: I'd say
> now that there's no reason to change your "mode == MIGRATE_ASYNC ?
> TTU_BATCH_FLUSH : 0" without much more thought, or a quick insight
> from someone else.  Some other time maybe.

Yes.  I think that this is reasonable.  We can revisit this later.
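
For reference when we do revisit it, the two policies discussed above
can be compared with a small standalone model.  This is only a sketch,
not kernel code: the enums are stand-ins with illustrative values, both
helper names are hypothetical, and I am assuming that "mapcount > 1"
would only matter for the non-async modes, with MIGRATE_ASYNC still
batching unconditionally.

```c
#include <assert.h>

/*
 * Stand-ins for the kernel's enum migrate_mode and enum ttu_flags.
 * The numeric values are illustrative only.
 */
enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT, MIGRATE_SYNC };
enum ttu_flags { TTU_BATCH_FLUSH = 0x40 };

/* Policy in this patch: defer (batch) the TLB flush only in async mode. */
unsigned int ttu_flags_by_mode(enum migrate_mode mode)
{
	return mode == MIGRATE_ASYNC ? TTU_BATCH_FLUSH : 0;
}

/*
 * Hypothetical alternative: in the non-async modes, batch only when the
 * folio is mapped more than once, so that a singly mapped folio can
 * still get an immediate single-page flush.
 */
unsigned int ttu_flags_by_mapcount(enum migrate_mode mode, int mapcount)
{
	assert(mapcount >= 0);

	if (mode == MIGRATE_ASYNC)
		return TTU_BATCH_FLUSH;

	return mapcount > 1 ? TTU_BATCH_FLUSH : 0;
}
```

The two helpers differ only for a sync-mode folio with mapcount > 1,
which is exactly the case where the tradeoff between several remote
single-page flushes and one batched flush is unclear.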

Best Regards,
Huang, Ying