From: Geert Uytterhoeven
Date: Tue, 26 Oct 2021 11:03:16 +0200
Subject: Re: Out-of-bounds access when hartid >= NR_CPUS
To: Atish Patra
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Linux Kernel Mailing List
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Atish,

On Tue, Oct 26, 2021 at 10:55 AM Atish Patra wrote:
> On Mon, Oct 25, 2021 at 8:54 AM Geert Uytterhoeven wrote:
> > When booting a kernel with CONFIG_NR_CPUS=4 on Microchip PolarFire,
> > the 4th CPU either fails to come online, or the system crashes.
> >
> > This happens because PolarFire has 5 CPU cores: hart 0 is an e51,
> > and harts 1-4 are u54s, with the latter becoming CPUs 0-3 in Linux:
> >   - unused core has hartid 0 (sifive,e51),
> >   - processor 0 has hartid 1 (sifive,u74-mc),
> >   - processor 1 has hartid 2 (sifive,u74-mc),
> >   - processor 2 has hartid 3 (sifive,u74-mc),
> >   - processor 3 has hartid 4 (sifive,u74-mc).
> >
> > I assume the same issue is present on the SiFive fu540 and fu740
> > SoCs, but I don't have access to these.  The issue is not present
> > on StarFive JH7100, as processor 0 has hartid 1, and processor 1 has
> > hartid 0.
> >
> > arch/riscv/kernel/cpu_ops.c has:
> >
> >     void *__cpu_up_stack_pointer[NR_CPUS] __section(".data");
> >     void *__cpu_up_task_pointer[NR_CPUS] __section(".data");
> >
> >     void cpu_update_secondary_bootdata(unsigned int cpuid,
> >                                        struct task_struct *tidle)
> >     {
> >             int hartid = cpuid_to_hartid_map(cpuid);
> >
> >             /* Make sure tidle is updated */
> >             smp_mb();
> >             WRITE_ONCE(__cpu_up_stack_pointer[hartid],
> >                        task_stack_page(tidle) + THREAD_SIZE);
> >             WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
> >
> > The above two writes cause out-of-bound accesses beyond
> > __cpu_up_{stack,pointer}_pointer[] if hartid >= CONFIG_NR_CPUS.
> >
> >     }
> >
>
> Thanks for reporting this. We need to fix this and definitely shouldn't hide it
> using configs. I guess I never tested with lower values (2 or 4) for
> CONFIG_NR_CPUS which explains how this bug was not noticed until now.
> > How to fix this?
> >
> > We could skip hartids >= NR_CPUS, but that feels strange to me, as
> > you need NR_CPUS to be larger (much larger if the first usable hartid
> > is a large number) than the number of CPUs used.
> >
> > We could store the minimum hartid, and always subtract that when
> > accessing __cpu_up_{stack,pointer}_pointer[] (also in
> > arch/riscv/kernel/head.S), but that means unused cores cannot be in the
> > middle of the hartid range.
>
> Yeah. Both of the above proposed solutions are not ideal.
>
> >
> > Are hartids guaranteed to be continuous?  If not, we have no choice but
> > to index __cpu_up_{stack,pointer}_pointer[] by cpuid instead, which
> > needs a more expensive conversion in arch/riscv/kernel/head.S.
>
> This will work for ordered booting with SBI HSM extension. However, it may
> fail for spinwait booting because cpuid_to_hartid_map might not have setup
> depending on when secondary harts are jumping to linux.
>
> Ideally, the size of the __cpu_up_{stack,task}_pointer[] should be the maximum
> hartid possible. How about adding a config for that ?

(reading more RISC-V specs)

Hart IDs can use up to XLEN (32, 64, or 128) bits.  So creative sparse
multi-level encodings, like those used in MPIDR on ARM[1], make using a
simple array infeasible.

[1] arch/arm{,64}/include/asm/cputype.h

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
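
For illustration only, here is a minimal userspace sketch of the "index by
cpuid instead of hartid" option discussed above. It is not the kernel code
and not a proposed patch. The names cpuid_to_hartid, hartid_index_in_bounds(),
hartid_to_cpuid(), set_boot_stack() and INVALID_CPUID are hypothetical, and
the table hard-codes the PolarFire layout from the report (cpuid i runs on
hart i + 1). It also assumes the cpuid-to-hartid table is already populated,
which, as noted in the thread, may not hold for spinwait booting.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS        4     /* mirrors CONFIG_NR_CPUS=4 from the report */
#define INVALID_CPUID  (-1)  /* hypothetical "no such CPU" marker */

/* Assumed layout from the report: hart 0 is unused, harts 1-4 are CPUs 0-3. */
const unsigned long cpuid_to_hartid[NR_CPUS] = { 1, 2, 3, 4 };

/* Boot data indexed by cpuid, so NR_CPUS entries are always enough. */
void *cpu_up_stack_pointer[NR_CPUS];

/* Nonzero if using hartid directly as an index stays inside the array. */
int hartid_index_in_bounds(unsigned long hartid)
{
	return hartid < NR_CPUS;
}

/* Linear reverse lookup; works for sparse or very large hart IDs. */
int hartid_to_cpuid(unsigned long hartid)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpuid_to_hartid[cpu] == hartid)
			return cpu;
	return INVALID_CPUID;
}

/* Store a boot stack for the given hart; returns its cpuid or -1. */
int set_boot_stack(unsigned long hartid, void *stack)
{
	int cpu = hartid_to_cpuid(hartid);

	if (cpu == INVALID_CPUID)
		return -1;
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_up_stack_pointer[cpu] = stack;
	return cpu;
}
```

With this layout, hartid 4 fails the direct-index bounds check (4 >= NR_CPUS)
but maps cleanly to cpuid 3. The cost is the reverse lookup, which is the
"more expensive conversion" that head.S would need.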