From: Puranjay Mohan
To: Mark Rutland
Cc: Catalin Marinas, Will Deacon, Sumit Garg, Stephen Boyd, Douglas Anderson,
 "Peter Zijlstra (Intel)", Thomas Gleixner,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, Ard Biesheuvel
Subject: Re: [PATCH] arm64: implement raw_smp_processor_id() using thread_info
References:
 <20240501154236.10236-1-puranjay@kernel.org>
Date: Wed, 01 May 2024 17:12:52 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Mark Rutland writes:

> Hi Puranjay,
>
> On Wed, May 01, 2024 at 03:42:36PM +0000, Puranjay Mohan wrote:
>> ARM64 defines THREAD_INFO_IN_TASK which means the cpu id can be found
>> from current_thread_info()->cpu.
>
> Nice!
>
> This is something that we'd wanted to do, but there were some historical
> reasons that prevented that. I think it'd be worth describing that in the
> commit message, e.g.
>
> | Historically, arm64 implemented raw_smp_processor_id() as a read of
> | current_thread_info()->cpu. This changed when arm64 moved thread_info into
> | task struct, as at the time CONFIG_THREAD_INFO_IN_TASK made core code use
> | thread_struct::cpu for the cpu number, and due to header dependencies
> | prevented using this in raw_smp_processor_id(). As a workaround, we moved to
> | using a percpu variable in commit:
> |
> |   57c82954e77fa12c ("arm64: make cpu number a percpu variable")
> |
> | Since then, thread_info::cpu was reintroduced, and core code was made to use
> | this in commits:
> |
> |   001430c1910df65a ("arm64: add CPU field to struct thread_info")
> |   bcf9033e5449bdca ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y")
> |
> | Consequently it is possible to use current_thread_info()->cpu again.
>
>> Implement raw_smp_processor_id() using the above. This decreases the
>> number of emitted instructions like in the following example:
>>
>> Dump of assembler code for function bpf_get_smp_processor_id:
>>    0xffff8000802cd608 <+0>:     nop
>>    0xffff8000802cd60c <+4>:     nop
>>    0xffff8000802cd610 <+8>:     adrp    x0, 0xffff800082138000
>>    0xffff8000802cd614 <+12>:    mrs     x1, tpidr_el1
>>    0xffff8000802cd618 <+16>:    add     x0, x0, #0x8
>>    0xffff8000802cd61c <+20>:    ldrsw   x0, [x0, x1]
>>    0xffff8000802cd620 <+24>:    ret
>>
>> After this patch:
>>
>> Dump of assembler code for function bpf_get_smp_processor_id:
>>    0xffff8000802c9130 <+0>:     nop
>>    0xffff8000802c9134 <+4>:     nop
>>    0xffff8000802c9138 <+8>:     mrs     x0, sp_el0
>>    0xffff8000802c913c <+12>:    ldr     w0, [x0, #24]
>>    0xffff8000802c9140 <+16>:    ret
>>
>> A microbenchmark[1] was built to measure the performance improvement
>> provided by this change. It calls the following function given number of
>> times and finds the runtime overhead:
>>
>> static noinline int get_cpu_id(void)
>> {
>>         return smp_processor_id();
>> }
>>
>> Run the benchmark like:
>>  modprobe smp_processor_id nr_function_calls=1000000000
>>
>> +--------------------------+------------------------+
>> |        | Number of Calls |       Time taken       |
>> +--------+-----------------+------------------------+
>> | Before |   1000000000    |      1602888401ns      |
>> +--------+-----------------+------------------------+
>> | After  |   1000000000    |      1206212658ns      |
>> +--------+-----------------+------------------------+
>> |  Difference (decrease)   |  396675743ns (24.74%)  |
>> +---------------------------------------------------+
>>
>> This improvement is in this very specific microbenchmark but it proves
>> the point.
>>
>> The percpu variable cpu_number is left as it is because it is used in
>> set_smp_ipi_range()
>>
>> [1] https://github.com/puranjaymohan/linux/commit/77d3fdd
>>
>> Signed-off-by: Puranjay Mohan
>> ---
>>  arch/arm64/include/asm/smp.h | 8 ++------
>>  1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
>> index efb13112b408..88fd2ab805ec 100644
>> --- a/arch/arm64/include/asm/smp.h
>> +++ b/arch/arm64/include/asm/smp.h
>> @@ -34,13 +34,9 @@
>>  DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
>>
>>  /*
>> - * We don't use this_cpu_read(cpu_number) as that has implicit writes to
>> - * preempt_count, and associated (compiler) barriers, that we'd like to avoid
>> - * the expense of. If we're preemptible, the value can be stale at use anyway.
>> - * And we can't use this_cpu_ptr() either, as that winds up recursing back
>> - * here under CONFIG_DEBUG_PREEMPT=y.
>> + * This relies on THREAD_INFO_IN_TASK, but arm64 defines that unconditionally.
>>   */
>> -#define raw_smp_processor_id() (*raw_cpu_ptr(&cpu_number))
>> +#define raw_smp_processor_id() (current_thread_info()->cpu)
>
> I think we can (and should) delete the comment entirely.

Sure, I will add the information to the commit message and remove this
comment in the next version.

I think it would be useful to remove the cpu_number percpu variable as
well. We can use &irq_stat in place of &cpu_number in
set_smp_ipi_range() in the calls to request_percpu_nmi/irq() as this is
just a dummy value and ipi_handler() doesn't use it. There are no other
users of cpu_number.

Thanks,
Puranjay
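
As a cross-check of the benchmark figures quoted above, here is a small
user-space C sketch that recomputes the difference, the relative decrease,
and the average saving per call from the two timings in the table. It is not
part of the patch, and the helper names (saved_ns, saved_percent,
saved_ns_per_call) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Figures copied from the benchmark table in the quoted patch. */
#define NR_CALLS  1000000000ULL
#define BEFORE_NS 1602888401ULL
#define AFTER_NS  1206212658ULL

/* Absolute saving over the whole run, in nanoseconds. */
static uint64_t saved_ns(uint64_t before, uint64_t after)
{
	/* Guard against unsigned underflow if the "after" run was slower. */
	assert(before >= after);
	return before - after;
}

/* Relative saving as a percentage of the "before" time. */
static double saved_percent(uint64_t before, uint64_t after)
{
	return 100.0 * (double)saved_ns(before, after) / (double)before;
}

/* Average saving per call, in nanoseconds. */
static double saved_ns_per_call(uint64_t before, uint64_t after,
				uint64_t calls)
{
	return (double)saved_ns(before, after) / (double)calls;
}
```

Called with BEFORE_NS and AFTER_NS, saved_ns() gives 396675743 ns and
saved_percent() gives about 24.75%, which matches the table's 24.74% if that
figure was truncated rather than rounded. Over NR_CALLS calls this works out
to roughly 0.4 ns saved per call.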