References: <20220816070311.89186-1-marcan@marcan.st>
        <20220816140423.GC11202@willie-the-truck>
        <20220816173654.GA11766@willie-the-truck>
From: Jon Nettleton
Date: Wed, 17 Aug 2022 07:40:29 +0200
Subject: Re: [PATCH] locking/atomic: Make test_and_*_bit() ordered on failure
To: Linus Torvalds
Cc: Will Deacon, Hector Martin, Peter Zijlstra, Arnd Bergmann,
        Ingo Molnar, Alan Stern, Andrea Parri, Boqun Feng, Nicholas Piggin,
        David Howells, Jade Alglave, Luc Maranget, "Paul E. McKenney",
        Akira Yokosawa, Daniel Lustig, Joel Fernandes, Mark Rutland,
        Jonathan Corbet, Tejun Heo, jirislaby@kernel.org, Marc Zyngier,
        Catalin Marinas, Oliver Neukum, linux-kernel@vger.kernel.org,
        linux-arch@vger.kernel.org, linux-doc@vger.kernel.org,
        linux-arm-kernel@lists.infradead.org, Asahi Linux,
        stable@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 16, 2022 at 8:02 PM Linus Torvalds wrote:
>
> On Tue, Aug 16, 2022 at 10:49 AM Jon Nettleton wrote:
> >
> > It is moot if Linus has already taken the patch, but with a stock
> > kernel config I am still seeing a slight performance dip but only
> > ~1-2% in the specific tests I was running.
>
> It would be interesting to hear if you can pinpoint in the profiles
> where the time is spent.
>
> It might be some random place that really doesn't care about ordering
> at all, and then we could easily rewrite _that_ particular case to do
> the unordered test explicitly, ie something like
>
> -  if (test_and_set_bit()) ...
> +  if (test_bit() || test_and_set_bit()) ...
>
> or even introduce an explicitly unordered "test_and_set_bit_relaxed()" thing.
>
>               Linus

This is very interesting: the additional performance overhead doesn't
seem to be coming from within the kernel but from userspace. Comparing
patched and unpatched kernels, I am seeing more cycles being taken up
by glibc atomics like __aarch64_cas4_acq and __aarch64_ldadd4_acq_rel.

I need to test further to see if there is less effect on a system with
fewer cores. This is a 16-core Cortex-A72; it is possible this is less
of an issue on 4-core A72s and A53s.

-Jon
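
For anyone following along, the "test_bit() || test_and_set_bit()" rewrite
quoted above can be sketched in userspace with C11 atomics. This is only an
illustration under stated assumptions: the sketch_* names are hypothetical
stand-ins, not the kernel's bitops, and memory_order_seq_cst is used here as
a rough analogue of a fully ordered read-modify-write, not as an exact match
for the kernel's semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an unordered test_bit(): a relaxed load. */
static bool sketch_test_bit(unsigned int nr, atomic_ulong *word)
{
    unsigned long val = atomic_load_explicit(word, memory_order_relaxed);

    return (val >> nr) & 1UL;
}

/* Hypothetical stand-in for an ordered test_and_set_bit(): returns the old bit. */
static bool sketch_test_and_set_bit(unsigned int nr, atomic_ulong *word)
{
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_or_explicit(word, mask, memory_order_seq_cst);

    return (old & mask) != 0;
}

/*
 * Returns true only if this caller is the one that set the bit.
 * When the bit is already set, the cheap relaxed load short-circuits
 * the || and the ordered read-modify-write is never issued.
 */
static bool sketch_try_claim(unsigned int nr, atomic_ulong *word)
{
    if (sketch_test_bit(nr, word) || sketch_test_and_set_bit(nr, word))
        return false;
    return true;
}

/* Small usage example: the first claim wins, the second takes the fast path. */
static void sketch_demo(void)
{
    atomic_ulong flags = 0;

    assert(sketch_try_claim(3, &flags));
    assert(!sketch_try_claim(3, &flags));
    assert(atomic_load(&flags) == (1UL << 3));
}
```

The trade-off is that the already-set path carries no ordering guarantee at
all, so this form only suits call sites that do not rely on ordering when
the bit was already set.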