Date: Tue, 23 Oct 2018 10:19:30 +0800
From: Aaron Lu
To: Vlastimil Babka
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
	Huang Ying, Dave Hansen, Kemi Wang, Tim Chen, Andi Kleen,
	Michal Hocko, Mel Gorman, Matthew Wilcox, Daniel Jordan,
	Tariq Toukan, Jesper Dangaard Brouer
Subject: Re: [RFC v4 PATCH 3/5] mm/rmqueue_bulk: alloc without touching
	individual page structure
Message-ID: <20181023021930.GA20069@intel.com>
References: <20181017063330.15384-1-aaron.lu@intel.com>
	<20181017063330.15384-4-aaron.lu@intel.com>

On Mon, Oct 22, 2018 at 11:37:53AM +0200, Vlastimil Babka wrote:
> On 10/17/18 8:33 AM, Aaron Lu wrote:
> > Profile on an Intel Skylake server shows the most time-consuming part
> > under zone->lock on the allocation path is accessing the to-be-returned
> > pages' "struct page" on the free_list inside zone->lock. One explanation
> > is that different CPUs are releasing pages to the head of free_list and
> > those pages' 'struct page' may very well be cache cold for the allocating
> > CPU when it grabs these pages from free_list's head. The purpose here
> > is to avoid touching these pages one by one inside zone->lock.
>
> What about making the pages cache-hot first, without zone->lock, by
> traversing via page->lru. It would need some safety checks obviously
> (maybe based on page_to_pfn + pfn_valid, or something) to make sure we
> only read from real struct pages in case there's some update racing. The
> worst case would be not populating enough due to the race, and thus not
> gaining the performance when doing the actual rmqueueing under lock.

Yes, those are the two potential problems you have pointed out:

1. We may end up prefetching something that isn't a page, because
   page->lru can be reused for different things in different scenarios;
2. We may not be able to prefetch much, because other CPUs may be
   allocating inside the lock at the same time; we could end up
   prefetching pages that are already on another CPU's pcp list.

Considering these two problems, prefetching outside the lock feels a
little risky and troublesome to me (a sketch of how I understand your
suggestion is at the end of this mail).

The allocation path is the hard part of improving page allocator
performance: on the free path we can prefetch pages safely outside the
lock, and we can even pre-merge them outside the lock to reduce the
pressure on zone->lock; but on the allocation path there is almost
nothing we can do before acquiring the lock, except taking the risk of
prefetching without the lock as you mention here.

We can come back to this if the 'address space range' lock doesn't
work out.
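
To make the discussion concrete, here is a rough, untested sketch of
how I understand the suggestion - the function name and the exact
checks below are made up for illustration, this is not code from the
patch:

#include <linux/mm.h>
#include <linux/mmzone.h>
#include <linux/prefetch.h>

/*
 * Walk a free_list via page->lru WITHOUT zone->lock held and prefetch
 * the struct pages, so they are cache-hot when rmqueue_bulk() later
 * takes them under the lock. pfn_valid() guards against chasing a
 * pointer whose page->lru has been reused for something else; a racing
 * allocation can still make us prefetch pages that end up on another
 * CPU's pcp list, which only wastes the prefetch.
 */
static void speculative_prefetch_free_list(struct zone *zone,
					   unsigned int order,
					   int migratetype,
					   unsigned int count)
{
	struct free_area *area = &zone->free_area[order];
	struct list_head *head = &area->free_list[migratetype];
	struct list_head *pos = READ_ONCE(head->next);

	while (count-- && pos != head) {
		struct page *page = list_entry(pos, struct page, lru);

		/* page_to_pfn() is arithmetic; validate before any use */
		if (!pfn_valid(page_to_pfn(page)))
			break;

		prefetch(page);
		pos = READ_ONCE(pos->next);
	}
}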
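
For contrast, the reason the free path is safe: a page being freed is
already ours, and its buddy's pfn is plain arithmetic on its own pfn,
so the buddy's struct page can be prefetched before zone->lock is
taken. Roughly (a from-memory sketch in the spirit of the
prefetch_buddy() helper used on the free path; not quoted verbatim):

static inline void prefetch_buddy(struct page *page)
{
	unsigned long pfn = page_to_pfn(page);
	unsigned long buddy_pfn = __find_buddy_pfn(pfn, 0);
	struct page *buddy = page + (buddy_pfn - pfn);

	prefetch(buddy);
}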