Date: Wed, 17 Oct 2018 12:20:42 +0100
From: Mel Gorman
To: Aaron Lu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
 Huang Ying, Dave Hansen, Kemi Wang, Tim Chen, Andi Kleen, Michal Hocko,
 Vlastimil Babka, Matthew Wilcox, Daniel Jordan, Tariq Toukan,
 Jesper Dangaard Brouer
Subject: Re: [RFC v4 PATCH 3/5] mm/rmqueue_bulk: alloc without touching individual page structure
Message-ID: <20181017112042.GK5819@techsingularity.net>
References: <20181017063330.15384-1-aaron.lu@intel.com>
 <20181017063330.15384-4-aaron.lu@intel.com>
In-Reply-To: <20181017063330.15384-4-aaron.lu@intel.com>

On Wed, Oct 17, 2018 at 02:33:28PM +0800, Aaron Lu wrote:
> Profile on Intel Skylake server shows the most time consuming part
> under zone->lock on allocation path is accessing those to-be-returned
> page's "struct page" on the free_list inside zone->lock. One explanation
> is, different CPUs are releasing pages to the head of free_list and
> those page's 'struct page' may very well be cache cold for the allocating
> CPU when it grabs these pages from free_list' head. The purpose here
> is to avoid touching these pages one by one inside zone->lock.
>

I didn't read this one in depth because it's somewhat orthogonal to the
lazy buddy merging, which I think would benefit from being finalised and
from ensuring that there are no reductions in high-order allocation
success rates. Pages being allocated on one CPU and freed on another is
not that unusual -- ping-pong workloads or things like netperf used to
exhibit this sort of pattern.

However, this part stuck out:

> +static inline void zone_wait_cluster_alloc(struct zone *zone)
> +{
> +	while (atomic_read(&zone->cluster.in_progress))
> +		cpu_relax();
> +}
> +

RT has had problems with cpu_relax in the past but, more importantly,
because this delays parallel compactions and allocations of contiguous
ranges, we could be stuck here for very long periods of time with
interrupts disabled. It gets even worse if it happens from interrupt
context, such as a jumbo frame allocation or a high-order slab
allocation that is atomic. Such potentially long periods with interrupts
disabled are very hazardous.

It may be necessary to consider instead minimising the number of struct
page updates when merging to the PCP and then either increasing the size
of the PCP or allowing it to exceed pcp->high for short periods of time
to batch the struct page updates.

I didn't read the rest of the series as it builds upon this patch.

--
Mel Gorman
SUSE Labs
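
To make the batching idea in the last suggestion concrete, here is a
rough, self-contained userspace sketch. It is not kernel code and not
code from this series: the names pcp_cache, pcp_free and pcp_drain and
the PCP_HIGH/PCP_SLACK thresholds are invented for illustration, and the
per-item addition under zone_lock merely stands in for the struct page
and free-list updates normally done under zone->lock. The point it
models is letting the per-CPU cache overshoot its high watermark briefly
so that the expensive per-item work is done in fewer, larger batches.

/*
 * Toy model: a per-CPU free cache that may overshoot its "high"
 * watermark by PCP_SLACK items before draining the excess in one
 * batch under the shared lock.  All names are hypothetical.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define PCP_HIGH   64          /* normal watermark */
#define PCP_SLACK  64          /* temporary overshoot allowed before draining */

struct pcp_cache {
        int *items;
        int count;
};

static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;
static long long zone_pages;   /* stands in for the zone's free lists */

/* The expensive part: per-item updates done under the shared "zone" lock. */
static void pcp_drain(struct pcp_cache *pcp, int to_free)
{
        pthread_mutex_lock(&zone_lock);
        while (to_free-- > 0)
                zone_pages += pcp->items[--pcp->count];
        pthread_mutex_unlock(&zone_lock);
}

/* Free one item to the cache; drain only once well past the watermark. */
static void pcp_free(struct pcp_cache *pcp, int item)
{
        pcp->items[pcp->count++] = item;
        if (pcp->count >= PCP_HIGH + PCP_SLACK)
                pcp_drain(pcp, pcp->count - PCP_HIGH);
}

int main(void)
{
        struct pcp_cache pcp = {
                .items = malloc(sizeof(int) * (PCP_HIGH + PCP_SLACK)),
                .count = 0,
        };

        for (int i = 0; i < 1000; i++)
                pcp_free(&pcp, 1);
        pcp_drain(&pcp, pcp.count);     /* final flush */

        printf("pages returned to zone: %lld\n", zone_pages);
        free(pcp.items);
        return 0;
}

The trade-off modelled here is that the cache may sit above its high
watermark for a short window, in exchange for taking the shared lock
less often and doing the per-item updates in larger batches while it is
held.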