From: Vlastimil Babka
To: Alex Shi
Cc: Konstantin Khlebnikov, Andrew Morton, Hugh Dickins, Yu Zhao,
 Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add
Date: Wed, 25 Nov 2020 16:38:18 +0100
In-Reply-To: <1605860847-47445-1-git-send-email-alex.shi@linux.alibaba.com>

On 11/20/20 9:27 AM, Alex Shi wrote:
> The current relock logic switches the lru_lock whenever it finds a new
> lruvec, so if 2 memcgs are reading files or allocating pages at the
> same time, they can end up holding the lru_lock alternately and
> waiting for each other, due to the fairness attribute of ticket
> spinlocks.
>
> This patch sorts the pages by lruvec so that each lru_lock is taken
> only once in the above scenario, which reduces the fairness-induced
> waiting to retake the lock. With it,
> vm-scalability/case-lru-file-readtwice gets a ~5% performance gain on
> my 2P*20core*HT machine.

Hm, once you sort the pages like this, it's a shame not to splice them
instead of more list_del() + list_add() iterations. update_lru_size()
could also be called once? (There's a rough, untested sketch of what I
mean at the end of this mail.)

> Suggested-by: Konstantin Khlebnikov
> Signed-off-by: Alex Shi
> Cc: Konstantin Khlebnikov
> Cc: Andrew Morton
> Cc: Hugh Dickins
> Cc: Yu Zhao
> Cc: Michal Hocko
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/swap.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 490553f3f9ef..c787b38bf9c0 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -1009,24 +1009,65 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
>          trace_mm_lru_insertion(page, lru);
>  }
>
> +struct lruvecs {
> +        struct list_head lists[PAGEVEC_SIZE];
> +        struct lruvec *vecs[PAGEVEC_SIZE];
> +};
> +
> +/* Sort pvec pages on their lruvec */
> +int sort_page_lruvec(struct lruvecs *lruvecs, struct pagevec *pvec)
> +{
> +        int i, j, nr_lruvec;
> +        struct page *page;
> +        struct lruvec *lruvec = NULL;
> +
> +        lruvecs->vecs[0] = NULL;
> +        for (i = nr_lruvec = 0; i < pagevec_count(pvec); i++) {
> +                page = pvec->pages[i];
> +                lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> +
> +                /* Try to find a same lruvec */
> +                for (j = 0; j <= nr_lruvec; j++)
> +                        if (lruvec == lruvecs->vecs[j])
> +                                break;
> +
> +                /* A new lruvec */
> +                if (j > nr_lruvec) {
> +                        INIT_LIST_HEAD(&lruvecs->lists[nr_lruvec]);
> +                        lruvecs->vecs[nr_lruvec] = lruvec;
> +                        j = nr_lruvec++;
> +                        lruvecs->vecs[nr_lruvec] = 0;
> +                }
> +
> +                list_add_tail(&page->lru, &lruvecs->lists[j]);
> +        }
> +
> +        return nr_lruvec;
> +}
> +
>  /*
>   * Add the passed pages to the LRU, then drop the caller's refcount
>   * on them. Reinitialises the caller's pagevec.
>   */
>  void __pagevec_lru_add(struct pagevec *pvec)
>  {
> -        int i;
> -        struct lruvec *lruvec = NULL;
> +        int i, nr_lruvec;
>          unsigned long flags = 0;
> +        struct page *page;
> +        struct lruvecs lruvecs;
>
> -        for (i = 0; i < pagevec_count(pvec); i++) {
> -                struct page *page = pvec->pages[i];
> +        nr_lruvec = sort_page_lruvec(&lruvecs, pvec);
>
> -                lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
> -                __pagevec_lru_add_fn(page, lruvec);
> +        for (i = 0; i < nr_lruvec; i++) {
> +                spin_lock_irqsave(&lruvecs.vecs[i]->lru_lock, flags);
> +                while (!list_empty(&lruvecs.lists[i])) {
> +                        page = lru_to_page(&lruvecs.lists[i]);
> +                        list_del(&page->lru);
> +                        __pagevec_lru_add_fn(page, lruvecs.vecs[i]);
> +                }
> +                spin_unlock_irqrestore(&lruvecs.vecs[i]->lru_lock, flags);
>          }
> -        if (lruvec)
> -                unlock_page_lruvec_irqrestore(lruvec, flags);
> +
>          release_pages(pvec->pages, pvec->nr);
>          pagevec_reinit(pvec);
>  }
>
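For the splice direction, here's the rough and untested sketch I
mentioned above, just to illustrate the idea. lru_add_splice() is a
name I made up, and it assumes the sorting step is extended to bucket
pages per (lruvec, lru) pair - page_lru() can differ between pages of
a single pagevec, so splicing a mixed batch onto one list would be
wrong. It also leaves out the unevictable accounting and the
tracepoint that __pagevec_lru_add_fn() does per page:

/*
 * Untested sketch, not the actual patch: splice a pre-sorted batch
 * onto its LRU list instead of per-page list_del() + list_add().
 * Caller holds lruvec->lru_lock, and every page on @list must belong
 * to @lruvec and map to @lru.
 */
static void lru_add_splice(struct lruvec *lruvec, struct list_head *list,
                           enum lru_list lru)
{
        int nr_zone_pages[MAX_NR_ZONES] = { 0 };
        struct page *page;
        int zid;

        /* One pass for the per-page flag and the per-zone page counts. */
        list_for_each_entry(page, list, lru) {
                SetPageLRU(page);
                nr_zone_pages[page_zonenum(page)] += thp_nr_pages(page);
        }

        /* One splice instead of a list_add() per page... */
        list_splice_init(list, &lruvec->lists[lru]);

        /* ...and one update_lru_size() per zone instead of per page. */
        for (zid = 0; zid < MAX_NR_ZONES; zid++) {
                if (nr_zone_pages[zid])
                        update_lru_size(lruvec, lru, zid, nr_zone_pages[zid]);
        }
}

That would keep the lock hold time down to the splice plus the
accounting instead of a per-page loop. Whether the extra bucketing in
the sort step pays for itself is of course something the numbers would
have to show.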