Date: Sat, 16 Sep 2023 08:45:51 +0200
From: Andrew Jones
To: Evan Green
Cc: Palmer Dabbelt, David Laight, Jisheng Zhang, Albert Ou, Anup Patel,
 Conor Dooley, Greentime Hu, Heiko Stuebner, Ley Foon Tan, Marc Zyngier,
 Palmer Dabbelt, Paul Walmsley, Sunil V L, linux-kernel@vger.kernel.org,
 linux-riscv@lists.infradead.org
Subject: Re: [PATCH] RISC-V: Probe misaligned access speed in parallel
Message-ID: <20230916-ab31c90dd56c99d36d5fce6c@orel>
References: <20230915184904.1976183-1-evan@rivosinc.com>
In-Reply-To: <20230915184904.1976183-1-evan@rivosinc.com>

On Fri, Sep 15, 2023 at 11:49:03AM -0700, Evan Green wrote:
> Probing for misaligned access speed takes about 0.06 seconds. On a
> system with 64 cores, doing this in smp_callin() means it's done
> serially, extending boot time by 3.8 seconds. That's a lot of boot time.
>
> Instead of measuring each CPU serially, let's do the measurements on
> all CPUs in parallel. If we disable preemption on all CPUs, the
> jiffies stop ticking, so we can do this in stages of 1) everybody
> except core 0, then 2) core 0.
>
> The measurement call in smp_callin() stays around, but is now
> conditionalized to only run if a new CPU shows up after the round of
> in-parallel measurements has run. The goal is to have the measurement
> call not run during boot or suspend/resume, but only on a hotplug
> addition.

Yay! I had just recently tested suspend/resume and wanted to report the
probe as an issue, but I hadn't gotten around to it. This patch resolves
the issue, so

Tested-by: Andrew Jones

>
> Signed-off-by: Evan Green
>
> ---
>
> Jisheng, I didn't add your Tested-by tag since the patch evolved from
> the one you tested. Hopefully this one brings you the same result.
>
> ---
>  arch/riscv/include/asm/cpufeature.h |  3 ++-
>  arch/riscv/kernel/cpufeature.c      | 28 +++++++++++++++++++++++-----
>  arch/riscv/kernel/smpboot.c         | 11 ++++++++++-
>  3 files changed, 35 insertions(+), 7 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
> index d0345bd659c9..19e7817eba10 100644
> --- a/arch/riscv/include/asm/cpufeature.h
> +++ b/arch/riscv/include/asm/cpufeature.h
> @@ -30,6 +30,7 @@ DECLARE_PER_CPU(long, misaligned_access_speed);
>  /* Per-cpu ISA extensions. */
>  extern struct riscv_isainfo hart_isa[NR_CPUS];
>
> -void check_unaligned_access(int cpu);
> +extern bool misaligned_speed_measured;

Do we need this new state or could we just always check the boot cpu's
state to get the same information?

	per_cpu(misaligned_access_speed, 0) != RISCV_HWPROBE_MISALIGNED_UNKNOWN

> +int check_unaligned_access(void *unused);
>
>  #endif
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 1cfbba65d11a..8eb36e1dfb95 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -42,6 +42,9 @@ struct riscv_isainfo hart_isa[NR_CPUS];
>  /* Performance information */
>  DEFINE_PER_CPU(long, misaligned_access_speed);
>
> +/* Boot-time in-parallel unaligned access measurement has occurred. */
> +bool misaligned_speed_measured;
> +
>  /**
>   * riscv_isa_extension_base() - Get base extension word
>   *
> @@ -556,8 +559,9 @@ unsigned long riscv_get_elf_hwcap(void)
>  	return hwcap;
>  }
>
> -void check_unaligned_access(int cpu)
> +int check_unaligned_access(void *unused)
>  {
> +	int cpu = smp_processor_id();
>  	u64 start_cycles, end_cycles;
>  	u64 word_cycles;
>  	u64 byte_cycles;
> @@ -571,7 +575,7 @@ void check_unaligned_access(int cpu)
>  	page = alloc_pages(GFP_NOWAIT, get_order(MISALIGNED_BUFFER_SIZE));
>  	if (!page) {
>  		pr_warn("Can't alloc pages to measure memcpy performance");
> -		return;
> +		return 0;
>  	}
>
>  	/* Make an unaligned destination buffer. */
> @@ -643,15 +647,29 @@ void check_unaligned_access(int cpu)
>
>  out:
>  	__free_pages(page, get_order(MISALIGNED_BUFFER_SIZE));
> +	return 0;
> +}
> +
> +static void check_unaligned_access_nonboot_cpu(void *param)
> +{
> +	if (smp_processor_id() != 0)
> +		check_unaligned_access(param);
>  }
>
> -static int check_unaligned_access_boot_cpu(void)
> +static int check_unaligned_access_all_cpus(void)
>  {
> -	check_unaligned_access(0);
> +	/* Check everybody except 0, who stays behind to tend jiffies. */
> +	on_each_cpu(check_unaligned_access_nonboot_cpu, NULL, 1);
> +
> +	/* Check core 0. */
> +	smp_call_on_cpu(0, check_unaligned_access, NULL, true);
> +
> +	/* Boot-time measurements are complete. */
> +	misaligned_speed_measured = true;
>  	return 0;
>  }
>
> -arch_initcall(check_unaligned_access_boot_cpu);
> +arch_initcall(check_unaligned_access_all_cpus);
>
>  #ifdef CONFIG_RISCV_ALTERNATIVE
>  /*
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index 1b8da4e40a4d..39322ae20a75 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -27,6 +27,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -246,7 +247,15 @@ asmlinkage __visible void smp_callin(void)
>
>  	numa_add_cpu(curr_cpuid);
>  	set_cpu_online(curr_cpuid, 1);
> -	check_unaligned_access(curr_cpuid);
> +
> +	/*
> +	 * Boot-time misaligned access speed measurements are done in parallel
> +	 * in an initcall. Only measure here for hotplug.
> +	 */
> +	if (misaligned_speed_measured &&
> +	    (per_cpu(misaligned_access_speed, curr_cpuid) == RISCV_HWPROBE_MISALIGNED_UNKNOWN)) {
> +		check_unaligned_access(NULL);
> +	}
>
>  	if (has_vector()) {
>  		if (riscv_v_setup_vsize())
> --
> 2.34.1
>

Besides my reluctance to add another global variable, this looks good to
me.

Reviewed-by: Andrew Jones

Thanks,
drew
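
For illustration, the alternative raised in the review above (deriving
"the boot-time round has finished" from the boot CPU's own per-CPU value
instead of a new global) can be modelled in plain userspace C. This is
only a sketch under assumptions: the demo_* names, the array standing in
for the per-CPU variable, and the enum values are all hypothetical and
are not kernel API. It relies on the ordering in the quoted patch, where
CPU 0 is measured last.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's per-CPU data; not the real API. */
#define DEMO_NR_CPUS 4

enum demo_speed {
	DEMO_MISALIGNED_UNKNOWN = 0,	/* models RISCV_HWPROBE_MISALIGNED_UNKNOWN; value chosen for the demo */
	DEMO_MISALIGNED_SLOW,
	DEMO_MISALIGNED_FAST,
};

/* Models DEFINE_PER_CPU(long, misaligned_access_speed) as a plain array. */
static long demo_speed[DEMO_NR_CPUS];

/*
 * The boot CPU is probed last in the quoted patch, so a known value on
 * CPU 0 implies the in-parallel round has already run.
 */
static bool demo_boot_round_done(void)
{
	return demo_speed[0] != DEMO_MISALIGNED_UNKNOWN;
}

/*
 * Models the smp_callin() condition: probe only a CPU that shows up
 * after the boot-time round and has no result yet.
 */
static bool demo_should_probe_on_callin(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_boot_round_done() &&
	       demo_speed[cpu] == DEMO_MISALIGNED_UNKNOWN;
}

/* Models the initcall ordering: everybody except CPU 0 first, then CPU 0. */
static void demo_boot_round(int online_cpus)
{
	assert(online_cpus >= 1 && online_cpus <= DEMO_NR_CPUS);
	for (int cpu = 1; cpu < online_cpus; cpu++)
		demo_speed[cpu] = DEMO_MISALIGNED_FAST;
	demo_speed[0] = DEMO_MISALIGNED_FAST;
}
```

One caveat with this model: if the probe on CPU 0 were to bail out
without recording a result, the check would stay false and a later
hotplugged CPU would never be measured, which a separate flag avoids.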