Received: by 2002:a05:7412:8d10:b0:f3:1519:9f41 with SMTP id bj16csp6123837rdb; Thu, 14 Dec 2023 08:55:47 -0800 (PST) X-Google-Smtp-Source: AGHT+IHOBUJPa7f+RuR3MOhd5W9CWeQXtvzKyPQzgyG+KezQDymGjAYComwZjLfswnrrYnpgJzaE X-Received: by 2002:a17:902:eb82:b0:1d2:eb8e:8557 with SMTP id q2-20020a170902eb8200b001d2eb8e8557mr6222342plg.10.1702572947645; Thu, 14 Dec 2023 08:55:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702572947; cv=none; d=google.com; s=arc-20160816; b=apjtmsaBSKyn1yo9qwrDPDcrfGJKQOmYNnaAW9ILLKrnPXEbmwQj7Ncb7wRtks9bm0 nLsjZ+bnM5Iq7M3ka8IxNyCjPSRbwRhnomOQUUAESE0F3aFXcYdd/kOnb/V0+0GVcPws akrOK6xs/ypDuEWsD1xOJy22OdO1jkAtYHT3HIw7gtU8HbGhclR2YZgczZyOI9FAUvnI tclqp4t22dpMntbO+6++PKpDiDMa4Em/gQTxHollnFX3aGJKBk61gYLb+492sjhmpqfm Sr9EKRG1RSgM1GJeyO4RMTf+y3+RibNodWjJ8vgX66WteDVdRM7pBIYUSgv61P3p3cEX 8twQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=Uz1wBP3XGG8Hh/kDt9IpLLnwnguDNkFC0CHy+H9mro0=; fh=JnmPWmgzBVXFNCVT5PLF0Zk+bQCYgGE3k9ZGPWXgV7w=; b=j1P/9RDBqoQAkkdM+emhYShTIhpI3H1acQt0hS2v46jOKiUwXRG3/1K3KweRPNyusE TGKkIVb9I6369DIm1qh0wiHXQ5WKErh0SGg5fuugumBoqjycLIVftwPPx3tpg8HHWqtf c7EAonT7HnjM/KngdzphBeun4cqY3PCr3IKH5zt2clZyvNWtc5v7X6DDr5YAOx5Gw2Fh R4EtXpObP5L5jpwsk6SHC17BHwiVZEl9pUUXat3HWur3y2KYAT43jKEajMn2bTSvk3MN 2cS+D22S9b7cvA9kizV0it8F+vr+rnRWOVksFxFoz3TMMI8f3XPacd/r6+NsiFDmk4yq x43g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id u4-20020a170902e5c400b001d360f8b5ddsi2144141plf.293.2023.12.14.08.55.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Dec 2023 08:55:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 9D50E806227C; Thu, 14 Dec 2023 08:55:43 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230252AbjLNQz0 (ORCPT + 99 others); Thu, 14 Dec 2023 11:55:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229532AbjLNQzX (ORCPT ); Thu, 14 Dec 2023 11:55:23 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D901118 for ; Thu, 14 Dec 2023 08:55:30 -0800 (PST) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DA510C433C8; Thu, 14 Dec 2023 16:55:28 +0000 (UTC) Date: Thu, 14 Dec 2023 11:56:14 -0500 From: Steven Rostedt To: Linus Torvalds Cc: "linux-arch@vger.kernel.org" , LKML , Linux Trace Kernel , Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers Subject: Re: [PATCH] ring-buffer: Remove 32bit timestamp logic Message-ID: <20231214115614.2cf5a40e@gandalf.local.home> In-Reply-To: References: <20231213211126.24f8c1dd@gandalf.local.home> <20231213214632.15047c40@gandalf.local.home> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Thu, 14 Dec 2023 08:55:43 -0800 (PST) On Wed, 13 Dec 2023 22:53:19 -0800 Linus Torvalds wrote: > On Wed, 13 Dec 2023 at 18:45, Steven Rostedt wrote: > > > > tl;dr; The ring-buffer timestamp requires a 64-bit cmpxchg to keep the > > timestamps in sync (only in the slow paths). I was told that 64-bit cmpxchg > > can be extremely slow on 32-bit architectures. So I created a rb_time_t > > that on 64-bit was a normal local64_t type, and on 32-bit it's represented > > by 3 32-bit words and a counter for synchronization. But this now requires > > three 32-bit cmpxchgs for where one simple 64-bit cmpxchg would do. > > It's not that a 64-bit cmpxchg is even slow. It doesn't EXIST AT ALL > on older 32-bit x86 machines. > > Which is why we have > > arch/x86/lib/cmpxchg8b_emu.S > > which emulates it on machines that don't have the CX8 capability > ("CX8" being the x86 capability flag name for the cmpxchg8b > instruction, aka 64-bit cmpxchg). > > Which only works because those older 32-bit cpu's also don't do SMP, > so there are no SMP cache coherency issues, only interrupt atomicity > issues. > > IOW, the way to do an atomic 64-bit cmpxchg on the affected hardware > is to simply disable interrupts. > > In other words - it's not just slow. It's *really* slow. As in 10x > slower, not "slightly slower". Ah, I'm starting to remember this for the rationale in doing it. I should have read up on the LWN article I even wrote about it! https://lwn.net/Articles/831892/ "I mentioned that I used the local64 variants of operations like local_read/cmpxchg/etc. operations. Desnoyers went on to argue that the local64 operations on 32-bit machines were horrible in performance, and worse, some require that interrupts be disabled, meaning that they could not be used in NMI context." And yes, this does get called in NMI context. > > > We started discussing how much time this is actually saving to be worth the > > complexity, and actually found some hardware to test. One Atom processor. > > That atom processor won't actually show the issue. It's much too > recent. So your "test" is actually worthless. > > And you probably did this all with a kernel config that had > CONFIG_X86_CMPXCHG64 set anyway, which wouldn't even boot on a i486 > machine. > > So in fact your test was probably doubly broken, in that not only > didn't you test the slow case, you tested something that wouldn't even > have worked in the environment where the slow case happened. > > Now, the real question is if anybody cares about CPUs that don't have > cmpxchg8b support. > > IBecause in practice, it's really just old 486-class machines (and a > couple of clone manufacturers who _claimed_ to be Pentium class, but > weren't - there was also some odd thing with Windows breaking if you > had CPUID claiming to support CX8 > > We dropped support for the original 80386 some time ago. I'd actually > be willing to drop support for ll pre-cmpxchg8b machines, and get rid > of the emulation. > > I also suspect that from a perf angle, none of this matters. The > emulation being slow probably is a non-issue, simply because even if > you run on an old i486 machine, you probably won't be doing perf or > tracing on it. Thanks for the background. I had a patch that added: + /* ring buffer does cmpxchg, make sure it is safe in NMI context */ + if (!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && + (unlikely(in_nmi()))) { + return NULL; + } But for ripping out this code, I should probably change that to: if ((!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) || (IS_ENABLED(X86_32) && !IS_ENABLED(X86_CMPXCHG64))) && unlikely(in_nmi())) { return NULL; } Not sure if there's other architectures that are affected by this (hence why I Cc'd linux-arch). I don't think anyone actually cares about the performance overhead of 486 doing 64-bit cmpxchg by disabling interrupts. Especially since this only happens in the slow path (if an event interrupts the processing of another event). If someone complains, we can always add back this code. Now back to my original question. Are you OK with me sending this to you now, or should I send you just the subtle fixes to the 32-bit rb_time_* code and keep this patch for the merge window? My preference is to get it in now and label it as stable, but I'm fine either way. -- Steve