Date: Fri, 18 Aug 2017 15:46:22 +0100
From: Mel Gorman <mgorman@techsingularity.net>
To: "Liang, Kan" <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Mel Gorman <mgorman@suse.de>,
        "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@elte.hu>,
        Andi Kleen <ak@linux.intel.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Johannes Weiner <hannes@cmpxchg.org>, Jan Kara <jack@suse.cz>,
        linux-mm <linux-mm@kvack.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] sched/wait: Break up long wake list walk
Message-ID: <20170818144622.oabozle26hasg5yo@techsingularity.net>
References: <84c7f26182b7f4723c0fe3b34ba912a9de92b8b7.1502758114.git.tim.c.chen@linux.intel.com>
 <CA+55aFznC1wqBSfYr8=92LGqz5-F6fHMzdXoqM4aOYx8sT1Dhg@mail.gmail.com>
 <37D7C6CF3E00A74B8858931C1DB2F07753786CE9@SHSMSX103.ccr.corp.intel.com>
 <CA+55aFwzTMrZwh7TE_VeZt8gx5Syoop-kA=Xqs56=FkyakrM6g@mail.gmail.com>
 <37D7C6CF3E00A74B8858931C1DB2F0775378761B@SHSMSX103.ccr.corp.intel.com>
 <CA+55aFy_RNx5TQ8esjPPOKuW-o+fXbZgWapau2MHyexcAZtqsw@mail.gmail.com>
 <20170818122339.24grcbzyhnzmr4qw@techsingularity.net>
 <37D7C6CF3E00A74B8858931C1DB2F077537879BB@SHSMSX103.ccr.corp.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F077537879BB@SHSMSX103.ccr.corp.intel.com>
User-Agent: NeoMutt/20170421 (1.8.2)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1020
Lines: 25

On Fri, Aug 18, 2017 at 02:20:38PM +0000, Liang, Kan wrote:
> > Nothing fancy other than needing a comment if it works.
> > 
> 
> No, the patch doesn't work.
> 

That indicates that it may be a hot page and it's possible that the page is
locked for a short time but waiters accumulate.  What happens if you leave
NUMA balancing enabled but disable THP?  Waiting on migration entries also
uses wait_on_page_locked so it would be interesting to know if the problem
is specific to THP.

Can you tell me what this workload is doing? I want to see if it's something
like many threads pounding on a limited number of pages very quickly. If
it's many threads working on private data, it would also be important to
know how each buffers threads are aligned, particularly if the buffers
are smaller than a THP or base page size. For example, if each thread is
operating on a base page sized buffer then disabling THP would side-step
the problem but THP would be false sharing between multiple threads.


-- 
Mel Gorman
SUSE Labs