From: Vlastimil Babka
To: Jann Horn
Cc: Christoph Lameter, Bharata B Rao, Vincent Guittot, linux-kernel,
 Linux-MM, David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
 Shakeel Butt, Johannes Weiner, aneesh.kumar@linux.ibm.com, Michal Hocko
Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order
Date: Fri, 22 Jan 2021 16:27:12 +0100
References: <20201118082759.1413056-1-bharata@linux.ibm.com>
 <20210121053003.GB2587010@in.ibm.com>

On 1/22/21 2:05 PM, Jann Horn wrote:
> On Thu, Jan 21, 2021 at 7:19 PM Vlastimil Babka wrote:
>> On 1/21/21 11:01 AM, Christoph Lameter wrote:
>> > On Thu, 21 Jan 2021, Bharata B Rao wrote:
>> >
>> >> > The problem is that calculate_order() is called a number of times
>> >> > before secondary CPUs are booted, and it returns 1 instead of 224.
>> >> > This makes the use of num_online_cpus() irrelevant for those cases.
>> >> >
>> >> > After adding "slub_min_objects=36" to my command line, which equals
>> >> > 4 * (fls(num_online_cpus()) + 1) with a correct num_online_cpus == 224,
>> >> > the regression disappears:
>> >> >
>> >> > 9 iterations of hackbench -l 16000 -g 16: 3.201 sec (+/- 0.90%)
>>
>> I'm surprised that hackbench is that sensitive to slab performance,
>> anyway. It's supposed to be a scheduler benchmark? What exactly is
>> going on?
>
> Uuuh, I think powerpc doesn't have cmpxchg_double?

The benchmark was done by Vincent on arm64, AFAICS. PowerPC (ppc64) is what
Bharata had used to demonstrate the order calculation change in his patch.
There seems to be some implementation dependency on CONFIG_ARM64_LSE_ATOMICS,
but AFAICS that doesn't determine whether cmpxchg_double is provided.

> "vgrep cmpxchg_double arch/" just spits out arm64, s390 and x86? And
> says under "POWERPC": "no DW LL/SC"

Interesting find in any case.

> So powerpc is probably hitting the page-bitlock-based implementation
> all the time for stuff like __slab_free()? Do you have detailed
> profiling results from "perf top" or something like that?
>
> (I actually have some WIP patches and a design document for getting
> rid of cmpxchg_double in struct page that I hacked together in the
> last couple of days; I'm currently in the process of sending them over
> to some other folks in the company who hopefully have cycles to
> review/polish/benchmark them so that they can be upstreamed, assuming
> that those folks think they're important enough. I don't have the
> cycles for it...)

I'm curious, so I hope this works out :)
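
For reference, the min_objects heuristic quoted above can be sketched in
userspace C. fls() is reimplemented here for illustration, and the sketch
deliberately omits the further clamping the kernel's calculate_order() does
against slub_max_order and slab object size; it only shows why the numbers
36 (224 CPUs) and a much smaller value (1 CPU, before secondaries boot)
come out of the formula:

```c
#include <assert.h>

/* Userspace stand-in for the kernel's fls(): position of the highest set
 * bit, 1-based, with fls(0) == 0. Illustrative, not the kernel's code. */
static int fls(unsigned int x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* The heuristic under discussion: min_objects = 4 * (fls(nr_cpus) + 1).
 * With 224 online CPUs, fls(224) == 8, so this yields 36 -- matching the
 * slub_min_objects=36 boot parameter quoted above. With nr_cpus == 1
 * (secondary CPUs not yet online) it yields only 8, hence the much lower
 * computed slub page order early in boot. */
static unsigned int min_objects_for(unsigned int nr_cpus)
{
	return 4 * (fls(nr_cpus) + 1);
}
```

Only the object-count sizing step is modeled here; the real order
calculation then searches for the smallest page order that packs at least
that many objects per slab.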