From: Anup Patel
Date: Mon, 31 Jan 2022 14:05:03 +0530
Subject: Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
To: Atish Patra
Cc: Geert Uytterhoeven, Jessica Clarke, Atish Patra, Linux Kernel Mailing List, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley, Rob Herring
References: <20220120090918.2646626-1-atishp@rivosinc.com> <20220120090918.2646626-7-atishp@rivosinc.com> <1AA3005C-E9C8-4E4B-900D-DD48B37CEA41@jrtc27.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 27, 2022 at 6:32 AM Atish Patra wrote:
>
> On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven wrote:
> >
> > Hi Atish,
> >
> > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven wrote:
> > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra wrote:
> > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke wrote:
> > > > > On 20 Jan 2022, at 09:09, Atish Patra wrote:
> > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > > > cpumask. Cpumask data structure can hold up to NR_CPUs value.
> > > > > > Thus, it is not the correct data structure for hartids as it can be
> > > > > > higher than NR_CPUs for platforms with sparse or discontiguous hartids.
> > > > > >
> > > > > > Remove all association between hartid mask and struct cpumask.
> > > > > >
> > > > > > Reviewed-by: Anup Patel (For Linux RISC-V changes)
> > > > > > Acked-by: Anup Patel (For KVM RISC-V changes)
> > > > > > Signed-off-by: Atish Patra
> > >
> > > > I am yet to reproduce it on my end.
> > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> > >
> > > Unfortunately it doesn't fix the issue for me.
> > >
> > > /me debugging...
> >
> > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> > hbase = 0.
> >
> > cpuid 1 maps to hartid 0
> > cpuid 0 maps to hartid 1
> >
> > __sbi_rfence_v02:364: cpuid 1 hartid 0
> > __sbi_rfence_v02:377: hartid 0 hbase 1
> > hmask |= 1UL << (hartid - hbase);
> >
> > oops
> >
> > __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask 8000000000000001 hbase 1
> >
>
> Ahh yes. hmask will be incorrect if the bootcpu(cpu 0) is a higher
> hartid and it is trying to do a remote tlb flush/IPI
> to lower the hartid. We should generate the hartid array before the loop.
>
> Can you try this diff ? It seems to work for me during multiple boot
> cycle on the unleashed.
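To make the failure Geert describes concrete, here is a minimal
user-space sketch, assuming a 64-bit long, that replays the pre-patch
mask accumulation for a given hartid visiting order. The helper name
buggy_first_hmask() and the uint64_t types are hypothetical stand-ins,
not kernel code. The shift count is masked to 6 bits to mimic what an
RV64 shift instruction presumably does with an out-of-range count; in C
itself such a shift is undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG 64

/*
 * Hypothetical replay of the unsorted accumulation loop: builds the first
 * (hmask, hbase) batch from hartids[] in the order given.
 */
static uint64_t buggy_first_hmask(const uint64_t *hartids, size_t n,
                                  uint64_t *hbase_out)
{
    uint64_t hmask = 0, hbase = 0;

    assert(hbase_out != NULL);

    for (size_t i = 0; i < n; i++) {
        uint64_t hartid = hartids[i];

        /* The kernel would issue an ecall here and start a new batch. */
        if (hmask && (hbase + BITS_PER_LONG) <= hartid)
            break;
        if (!hmask)
            hbase = hartid;
        /*
         * If hartid < hbase, (hartid - hbase) wraps around. Masking the
         * count to the low 6 bits emulates an RV64 shift and keeps this
         * sketch free of undefined behaviour.
         */
        hmask |= UINT64_C(1) << ((hartid - hbase) & (BITS_PER_LONG - 1));
    }

    *hbase_out = hbase;
    return hmask;
}
```

Visiting the hartids as {1, 0} gives hmask = 0x8000000000000001 with
hbase = 1, while visiting them as {0, 1} gives hmask = 3 with hbase = 0,
which matches the values reported above.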
>
> You can find the patch here as well
> https://github.com/atishp04/linux/commits/v5.17-rc1
>
> --------------------------------------------------------------------------------------------------------------------------------
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index f72527fcb347..4ebeb5813edc 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -8,6 +8,8 @@
>  #include
>  #include
>  #include
> +#include
> +
>  #include
>  #include
>
> @@ -85,7 +87,7 @@ static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
>  			pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
>  			break;
>  		}
> -		hmask |= 1 << hartid;
> +		hmask |= 1UL << hartid;
>  	}
>
>  	return hmask;
> @@ -160,7 +162,7 @@ static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
>  {
>  	unsigned long hart_mask;
>
> -	if (!cpu_mask)
> +	if (!cpu_mask || cpumask_empty(cpu_mask))
>  		cpu_mask = cpu_online_mask;
>  	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -176,7 +178,7 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
>  	int result = 0;
>  	unsigned long hart_mask;
>
> -	if (!cpu_mask)
> +	if (!cpu_mask || cpumask_empty(cpu_mask))
>  		cpu_mask = cpu_online_mask;
>  	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -236,6 +238,18 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
>  static void sbi_set_power_off(void) {}
>  #endif /* CONFIG_RISCV_SBI_V01 */
>
> +static int cmp_ulong(const void *A, const void *B)
> +{
> +	const unsigned long *a = A, *b = B;
> +
> +	if (*a < *b)
> +		return -1;
> +	else if (*a > *b)
> +		return 1;
> +	else
> +		return 0;
> +}
> +
>  static void __sbi_set_timer_v02(uint64_t stime_value)
>  {
>  #if __riscv_xlen == 32
> @@ -251,13 +265,22 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
>  {
>  	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>  	struct sbiret ret = {0};
> -	int result;
> +	int result, index = 0, max_index = 0;
> +	unsigned long hartid_arr[NR_CPUS] = {0};

No need to clear hartid_arr[] because "index" and "max_index" already
tell us the number of entries.

>
> -	if (!cpu_mask)
> +	if (!cpu_mask || cpumask_empty(cpu_mask))
>  		cpu_mask = cpu_online_mask;
>
>  	for_each_cpu(cpuid, cpu_mask) {
>  		hartid = cpuid_to_hartid_map(cpuid);
> +		hartid_arr[index] = hartid;

You can create a sorted array on the fly instead of calling sort().

> +		index++;
> +	}
> +
> +	max_index = index;
> +	sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> +	for (index = 0; index < max_index; index++) {
> +		hartid = hartid_arr[index];
>  		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>  			ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
>  					hmask, hbase, 0, 0, 0, 0);
> @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
>  			    unsigned long arg4, unsigned long arg5)
>  {
>  	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> -	int result;
> +	int result, index = 0, max_index = 0;
> +	unsigned long hartid_arr[NR_CPUS] = {0};
>
> -	if (!cpu_mask)
> +	if (!cpu_mask || cpumask_empty(cpu_mask))
>  		cpu_mask = cpu_online_mask;
>
>  	for_each_cpu(cpuid, cpu_mask) {
>  		hartid = cpuid_to_hartid_map(cpuid);
> +		hartid_arr[index] = hartid;
> +		index++;
> +	}
> +	max_index = index;
> +	sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> +	for (index = 0; index < max_index; index++) {
> +		hartid = hartid_arr[index];
>  		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>  			result = __sbi_rfence_v02_call(fid, hmask, hbase,
>  						       start, size, arg4, arg5);
>
> --------------------------------------------------------------------------------------------------------------------------------
>
> > Gr{oetje,eeting}s,
> >
> > Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> >
> > In personal conversations with technical people, I call myself a hacker.
> > But when I'm talking to journalists I just say "programmer" or something like that.
> > -- Linus Torvalds
> >
>
> --
> Regards,
> Atish

My main concern is the size of hartid_arr[] on the stack. Using kmalloc()
would only slow this path down further. Also, for small systems with
fewer HARTs, the sorting will be an unnecessary overhead.

Regards,
Anup
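
One possible reading of the "sorted array on the fly" suggestion, as a
user-space sketch rather than kernel code: each hartid is placed at its
sorted position while the mask is being walked, so neither a separate
sort() pass nor zero-initialization of the array is needed. MAX_HARTS,
insert_sorted() and collect_sorted() are hypothetical names (MAX_HARTS
stands in for NR_CPUS). The array still lives on the caller's stack, so
this does not by itself address the stack-size concern.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HARTS 8 /* hypothetical stand-in for NR_CPUS */

/*
 * Insert hartid into arr[0..n), which is kept in ascending order, by
 * shifting larger entries one slot to the right. Returns the new count.
 */
static size_t insert_sorted(unsigned long *arr, size_t n, unsigned long hartid)
{
    size_t i = n;

    assert(n < MAX_HARTS);

    while (i > 0 && arr[i - 1] > hartid) {
        arr[i] = arr[i - 1];
        i--;
    }
    arr[i] = hartid;
    return n + 1;
}

/*
 * Stand-in for the for_each_cpu() loop: hartids[] plays the role of
 * cpuid_to_hartid_map(cpuid) for each cpu in the mask. Only the first
 * `count` entries of out[] are ever read, so out[] needs no clearing.
 */
static size_t collect_sorted(const unsigned long *hartids, size_t n,
                             unsigned long *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        count = insert_sorted(out, count, hartids[i]);
    return count;
}
```

The insertion is quadratic in the worst case, but with only a handful of
HARTs the inner loop runs a few iterations at most, and hartids that
already arrive in ascending order are appended without any shifting.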