From: Alexander Duyck
Date: Mon, 4 May 2020 17:40:19 -0700
Subject: Re: [PATCH 6/7] mm: parallelize deferred_init_memmap()
To: Josh Triplett
Cc: Daniel Jordan, Andrew Morton, Herbert Xu, Steffen Klassert,
    Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
    David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Kirill Tkhai,
    Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
    Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
    linux-crypto@vger.kernel.org, linux-mm, LKML
In-Reply-To: <3C3C62BE-6363-41C3-834C-C3124EB3FFAB@joshtriplett.org>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
    <20200430201125.532129-7-daniel.m.jordan@oracle.com>
    <3C3C62BE-6363-41C3-834C-C3124EB3FFAB@joshtriplett.org>

On Mon, May 4, 2020 at 4:44 PM Josh Triplett wrote:
>
> On May 4, 2020 3:33:58 PM PDT, Alexander Duyck wrote:
> >On Thu, Apr 30, 2020 at 1:12 PM Daniel Jordan wrote:
> >>         /*
> >> -        * Initialize and free pages in MAX_ORDER sized increments so
> >> -        * that we can avoid introducing any issues with the buddy
> >> -        * allocator.
> >> +        * More CPUs always led to greater speedups on tested systems, up to
> >> +        * all the nodes' CPUs.  Use all since the system is otherwise idle now.
> >>          */
> >
> >I would be curious about your data. That isn't what I have seen in the
> >past. Typically only up to about 8 or 10 CPUs gives you any benefit,
> >beyond that I was usually cache/memory bandwidth bound.
>
> I've found pretty much linear performance up to memory bandwidth, and
> on the systems I was testing, I didn't saturate memory bandwidth until
> about the full number of physical cores. From number of cores up to
> number of threads, the performance stayed about flat; it didn't get
> any better or worse.

That doesn't sound right though, based on the numbers you provided. The
system you had was 192GB spread over 2 nodes with 48 threads/24 cores
per node, correct? Your numbers went from ~290ms to ~28ms, so a 10x
decrease. That doesn't sound linear when you spread the work over 24
cores to get there.

I agree that the numbers largely stay flat once you hit the peak; I
have seen similar behavior when I was working on the deferred init code
previously. One concern I have though is that we may end up seeing
better performance with a subset of cores instead of running all of the
cores/threads, especially if features such as turbo come into play. In
addition, we are talking x86 only so far. I would be interested in
seeing if this has benefits or not for other architectures.

Also, what is the penalty that is being paid in order to break up the
work beforehand and set it up for the parallel work? I would be
interested in seeing what the cost is on a system with fewer cores per
node, maybe even down to 1. That would tell us how much additional
overhead is being added to set things up to run in parallel.

If I get a chance tomorrow I might try applying the patches and doing
some testing myself.

Thanks.

- Alex
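
For reference, the scaling objection in the reply above comes down to a
little arithmetic. The sketch below is not from the thread; the function
names are made up for illustration, and it assumes that the ~290ms figure
is a single-threaded baseline and that the ~28ms figure used the 24
physical cores of one node. If either assumption is off, the efficiency
number changes accordingly.

```python
def speedup(serial_ms: float, parallel_ms: float) -> float:
    """Ratio of the baseline time to the multi-threaded time."""
    if serial_ms <= 0 or parallel_ms <= 0:
        raise ValueError("timings must be positive")
    return serial_ms / parallel_ms


def parallel_efficiency(serial_ms: float, parallel_ms: float, workers: int) -> float:
    """Speedup divided by worker count; 1.0 would be perfectly linear scaling."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup(serial_ms, parallel_ms) / workers


if __name__ == "__main__":
    # Figures quoted in the thread: ~290ms before, ~28ms after.
    # Assumption (not confirmed in the thread): 24 workers, one per physical core.
    s = speedup(290.0, 28.0)
    e = parallel_efficiency(290.0, 28.0, 24)
    print(f"speedup: {s:.1f}x, efficiency over 24 cores: {e:.0%}")
```

Under those assumptions the quoted timings work out to roughly a 10x
speedup at about 43% efficiency, which is the gap between "10x" and
"linear over 24 cores" that the reply points at.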