Date: Wed, 21 Mar 2018 09:52:23 +0800
From: Aaron Lu
To: "Figo.zhang"
Cc: Linux MM, LKML, Andrew Morton, Huang Ying, Dave Hansen, Kemi Wang,
    Tim Chen, Andi Kleen, Michal Hocko, Vlastimil Babka, Mel Gorman,
    Matthew Wilcox, Daniel Jordan
Subject: Re: [RFC PATCH v2 3/4] mm/rmqueue_bulk: alloc without touching individual page structure
Message-ID: <20180321015223.GA28705@intel.com>
References: <20180320085452.24641-1-aaron.lu@intel.com>
    <20180320085452.24641-4-aaron.lu@intel.com>
On Tue, Mar 20, 2018 at 03:29:33PM -0700, Figo.zhang wrote:
> 2018-03-20 1:54 GMT-07:00 Aaron Lu:
>
> > Profiling on an Intel Skylake server shows that the most time consuming
> > part under zone->lock on the allocation path is accessing the
> > to-be-returned pages' "struct page" on the free_list inside zone->lock.
> > One explanation is that different CPUs are releasing pages to the head
> > of the free_list and those pages' 'struct page' may very well be cache
> > cold for the allocating CPU when it grabs these pages from the
> > free_list's head. The purpose here is to avoid touching these pages one
> > by one inside zone->lock.
> >
> > One idea is to just take the requested number of pages off the
> > free_list with something like list_cut_position() and adjust nr_free of
> > the free_area accordingly inside zone->lock, while other operations,
> > like clearing the PageBuddy flag for these pages, are done outside of
> > zone->lock.
>
> sounds good!
> your idea is to reduce the lock contention in rmqueue_bulk() by

Right, the idea is to reduce the time the lock is held.

> splitting the order-0 freelist into two lists, one used without
> zone->lock and the other needing zone->lock?

But not by splitting the freelist into two lists; I didn't do that.
I moved part of the work previously done inside the lock outside of it,
i.e. clearing the PageBuddy flag etc. is now done outside, so that we do
not take the penalty of cache misses on those "struct page"s inside the
lock while all other CPUs are waiting.

> it seems that holding zone->lock across rmqueue_bulk() is a rather big
> lock granularity; why not change it like this?

Frequently taking and dropping the lock is believed to be worse than
taking it once, doing everything that is needed, and then dropping it.

> static int rmqueue_bulk(struct zone *zone, unsigned int order,
> 			unsigned long count, struct list_head *list,
> 			int migratetype, bool cold)
> {
> 	for (i = 0; i < count; ++i) {
> 		spin_lock(&zone->lock);
> 		struct page *page = __rmqueue(zone, order, migratetype);
> 		spin_unlock(&zone->lock);
> 		...
> 	}

In this case, spin_lock() and spin_unlock() should be outside the loop.

> 	__mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
>
> 	return i;
> }
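For what it's worth, here is a rough, untested sketch of the structure both
replies above describe; it is not the actual patch. pick_cut_point() is a
hypothetical placeholder for the bookkeeping that decides where to cut the
free_list (doing that without walking cache-cold pages under the lock is the
part the real patch has to solve), and rmv_page_order() is the existing
mm/page_alloc.c helper that clears PageBuddy:

static unsigned long rmqueue_bulk_sketch(struct zone *zone, unsigned int order,
					 unsigned long count,
					 struct list_head *list, int migratetype)
{
	struct free_area *area = &zone->free_area[order];
	struct list_head *cut;
	struct page *page;
	unsigned long taken;

	/* Assumes 'list' comes in empty, as it does when refilling an
	 * empty pcp list; list_cut_position() reinitializes it. */
	spin_lock(&zone->lock);

	/* Only list pointers and counters are touched under the lock:
	 * detach up to 'count' pages from the free_list in one splice. */
	cut = pick_cut_point(&area->free_list[migratetype], count, &taken);
	list_cut_position(list, &area->free_list[migratetype], cut);
	area->nr_free -= taken;
	__mod_zone_page_state(zone, NR_FREE_PAGES, -((long)taken << order));

	spin_unlock(&zone->lock);

	/* The cache-cold "struct page" accesses (clearing PageBuddy and
	 * resetting page private via rmv_page_order()) happen here,
	 * outside zone->lock, so other CPUs are not held up by them. */
	list_for_each_entry(page, list, lru)
		rmv_page_order(page);

	return taken;
}

The point of this shape is that the zone->lock hold time scales with one list
splice and a counter update instead of with 'count' cache-cold struct page
touches.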