Message-ID: <06ee47e0-99e0-4b6a-ab67-239fccf2777d@efficios.com>
Date: Fri, 19 May
 2023 10:15:11 -0400
Subject: Re: [RFC PATCH 1/4] rseq: Add sched_state field to struct rseq
From: Mathieu Desnoyers
To: Boqun Feng
Cc: Peter Zijlstra, linux-kernel@vger.kernel.org, Thomas Gleixner,
 "Paul E . McKenney", "H . Peter Anvin", Paul Turner,
 linux-api@vger.kernel.org, Christian Brauner, Florian Weimer,
 David.Laight@aculab.com, carlos@redhat.com, Peter Oskolkov,
 Alexander Mikhalitsyn, Chris Kennelly, Ingo Molnar, Darren Hart,
 Davidlohr Bueso, André Almeida, libc-alpha@sourceware.org,
 Steven Rostedt, Jonathan Corbet, Florian Weimer
References: <20230517152654.7193-1-mathieu.desnoyers@efficios.com>
 <20230517152654.7193-2-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023-05-18 17:49, Boqun Feng wrote:
> On Wed, May 17, 2023 at 11:26:51AM -0400, Mathieu Desnoyers wrote:
[...]
>> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
>> index c233aae5eac9..c6d8537e23ca 100644
>> --- a/include/uapi/linux/rseq.h
>> +++ b/include/uapi/linux/rseq.h
>> @@ -37,6 +37,13 @@ enum rseq_cs_flags {
>>  		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
>>  };
>>
>> +enum rseq_sched_state {
>> +	/*
>> +	 * Task is currently running on a CPU if bit is set.
>> +	 */
>> +	RSEQ_SCHED_STATE_ON_CPU		= (1U << 0),
>> +};
[...]
>>
>> +	/*
>> +	 * Restartable sequences sched_state field. Updated by the kernel. Read
>> +	 * by user-space with single-copy atomicity semantics. This fields can
>> +	 * be read by any userspace thread. Aligned on 32-bit. Contains a
>
> Maybe this is a premature optimization, but since most of the time the
> bit would be read by another thread, does it make sense putting the
> "sched_state" into a different cache line to avoid false sharing?

I'm puzzled by your optimization proposal, so I'll say it outright: I'm
probably missing something.

I agree that false sharing would be an issue if various threads
contended to update any field within this cache line.

But the only thread responsible for updating this cache line's fields is
the current thread, either from userspace (stores to rseq_abi->rseq_cs)
or from the kernel (usually on return to userspace, except for this new
ON_CPU bit, which is cleared on context switch).

The other threads busy-waiting on the content of this sched_state field
will only load from it, never store. And they will only busy-wait on it
as long as the current task runs. When that task gets preempted, other
threads will notice the flag change and use sys_futex instead.

So the very worst pattern I can think of in terms of cache coherency
traffic caused by false sharing is a lock owner that also happens to
execute lots of rseq critical sections, causing it to repeatedly store
to the rseq_abi->rseq_cs field, which is in the same cache line.

But even then I'm wondering if this really matters, because each of
those stores to rseq_cs would only slow down loads from other threads,
which will need to retry the busy-wait anyway if the on-cpu flag is
still set.

So, what am I missing? Is this heavy use of rseq critical sections while
the lock is held the scenario you are concerned about?

Note that the heavy cache-line bouncing in my test case happens on the
lock structure (cmpxchg expecting NULL, setting it to the current
thread's rseq_get_abi() pointer on success). There are probably better
ways to implement that part; it is currently just a simple prototype
showcasing the approach.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
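
For readers following the thread, here is a minimal sketch of the waiter-side pattern described in the reply above: cmpxchg the lock word from NULL to the caller's own area, and on failure only load the owner's sched_state to decide between spinning and sleeping. It is not the prototype from the patch series. `struct owner_abi`, `struct adaptive_lock`, `enum lock_step` and `adaptive_lock_step()` are hypothetical names made up for illustration; only the `RSEQ_SCHED_STATE_ON_CPU` bit value is taken from the quoted diff. In the proposal the kernel writes sched_state; here it is an ordinary atomic so the logic can be exercised in userspace.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Bit value as proposed in the quoted patch. */
#define RSEQ_SCHED_STATE_ON_CPU	(1U << 0)

/* Hypothetical stand-in for a thread's rseq area (one field only). */
struct owner_abi {
	_Atomic uint32_t sched_state;	/* kernel-updated in the proposal */
};

/* Hypothetical lock: holds the owner's area pointer, NULL when free. */
struct adaptive_lock {
	_Atomic(struct owner_abi *) owner;
};

enum lock_step {
	LOCK_ACQUIRED,	/* cmpxchg from NULL succeeded */
	KEEP_SPINNING,	/* owner is on-CPU: retry shortly */
	SHOULD_SLEEP,	/* owner is off-CPU: block, e.g. with futex */
};

/*
 * One iteration of the waiter loop. Waiters store only to the lock
 * word; the owner's sched_state is loaded, never stored.
 * Caveat: the lifetime of the owner's area is not handled here.
 */
static enum lock_step adaptive_lock_step(struct adaptive_lock *lock,
					 struct owner_abi *self)
{
	struct owner_abi *expected = NULL;

	if (atomic_compare_exchange_strong_explicit(&lock->owner, &expected,
						    self,
						    memory_order_acquire,
						    memory_order_relaxed))
		return LOCK_ACQUIRED;

	/* On failure, expected holds the current (non-NULL) owner. */
	uint32_t state = atomic_load_explicit(&expected->sched_state,
					      memory_order_relaxed);

	return (state & RSEQ_SCHED_STATE_ON_CPU) ? KEEP_SPINNING
						 : SHOULD_SLEEP;
}
```

A caller would loop on adaptive_lock_step() while it returns KEEP_SPINNING and fall back to a blocking wait on SHOULD_SLEEP. Splitting the loop into single steps is a choice made here only to keep the decision logic easy to test.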