Reply-To: "Hillf Danton"
From: "Hillf Danton"
To: "'Johannes Weiner'" , "'Vlastimil Babka'"
Cc: "'Andrew Morton'" , "'Mel Gorman'" , , ,
References: <20161210172658.5182-1-hannes@cmpxchg.org> <5cc0eb6f-bede-a34a-522b-e30d06723ffa@suse.cz> <20161212155552.GA7148@cmpxchg.org> <20161214210017.GA1465@cmpxchg.org>
In-Reply-To: <20161214210017.GA1465@cmpxchg.org>
Subject: Re: [PATCH v2] mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED
Date: Thu, 15 Dec 2016 12:09:07 +0800
Message-ID: <04a301d25688$fbb8f7f0$f32ae7d0$@alibaba-inc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thursday, December 15, 2016 5:00 AM Johannes Weiner wrote:
> When FADV_DONTNEED cannot drop all pages in the range, it observes
> that some pages might still be on per-cpu LRU caches after recent
> instantiation and so initiates remote calls to all CPUs to flush their
> local caches.
> However, in most cases, the fadvise happens from the
> same context that instantiated the pages, and any pre-LRU pages in the
> specified range are most likely sitting on the local CPU's LRU cache,
> and so in many cases this results in unnecessary remote calls, which,
> in a loaded system, can hold up the fadvise() call significantly.
>
> [ I didn't record it in the extreme case we observed at Facebook,
>   unfortunately. We had a slow-to-respond system and noticed it
>   lru_add_drain_all() leading the profile during fadvise calls. This
>   patch came out of thinking about the code and how we commonly call
>   FADV_DONTNEED.
>
>   FWIW, I wrote a silly directory tree walker/searcher that recurses
>   through /usr to read and FADV_DONTNEED each file it finds. On a 2
>   socket 40 ht machine, over 1% is spent in lru_add_drain_all(). With
>   the patch, that cost is gone; the local drain cost shows at 0.09%. ]
>
> Try to avoid the remote call by flushing the local LRU cache before
> even attempting to invalidate anything. It's a cheap operation, and
> the local LRU cache is the most likely to hold any pre-LRU pages in
> the specified fadvise range.
>
> Signed-off-by: Johannes Weiner
> Acked-by: Vlastimil Babka
> Acked-by: Mel Gorman
> ---

Acked-by: Hillf Danton

>  mm/fadvise.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index 6c707bfe02fd..a43013112581 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -139,7 +139,20 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
>  		}
>
>  		if (end_index >= start_index) {
> -			unsigned long count = invalidate_mapping_pages(mapping,
> +			unsigned long count;
> +
> +			/*
> +			 * It's common to FADV_DONTNEED right after
> +			 * the read or write that instantiates the
> +			 * pages, in which case there will be some
> +			 * sitting on the local LRU cache. Try to
> +			 * avoid the expensive remote drain and the
> +			 * second cache tree walk below by flushing
> +			 * them out right away.
> +			 */
> +			lru_add_drain();
> +
> +			count = invalidate_mapping_pages(mapping,
>  						start_index, end_index);
>
>  			/*
> --
> 2.10.2
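
For anyone following along, the userspace pattern the changelog describes
(read a file, then FADV_DONTNEED it from the same context) might look
roughly like the sketch below. This is only an illustration, not the
test program mentioned above: read_and_drop() is a hypothetical name, and
the 64 KiB buffer size is an arbitrary choice. The point is that the
fadvise call directly follows the reads that instantiated the pages, so
any pages not yet on the LRU are most likely in the calling CPU's local
cache.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for posix_fadvise() */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): read a file to the end,
 * then tell the kernel its cached pages are no longer needed.
 * Returns the number of bytes read, or -1 on any failure.
 */
long read_and_drop(const char *path)
{
	char buf[65536];	/* arbitrary buffer size for the sketch */
	long total = 0;
	ssize_t n;
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The reads instantiate page cache pages for this file. */
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;
	if (n < 0) {
		close(fd);
		return -1;
	}

	/*
	 * offset 0 with len 0 covers the whole file. posix_fadvise()
	 * returns 0 on success or an error number; it does not set errno.
	 */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);

	return err ? -1 : total;
}
```

A directory walker like the one described would call such a helper once
per regular file it finds, which is the case where the local drain added
by this patch should make the remote drain unnecessary.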