Date: Tue, 31 Oct 2023 16:16:45 +0100
From: Peter Zijlstra
To: Michael Matz
Cc: "Paul E. McKenney", Frederic Weisbecker, LKML, Boqun Feng,
	Joel Fernandes, Josh Triplett, Mathieu Desnoyers, Neeraj Upadhyay,
	Steven Rostedt, Uladzislau Rezki, rcu, Zqiang, "Liam R. Howlett",
	ubizjak@gmail.com
Subject: Re: [PATCH 2/4] rcu/tasks: Handle new PF_IDLE semantics
Message-ID: <20231031151645.GB15024@noisy.programming.kicks-ass.net>
References: <20231027144050.110601-1-frederic@kernel.org>
	<20231027144050.110601-3-frederic@kernel.org>
	<20231027192026.GG26550@noisy.programming.kicks-ass.net>
	<2a0d52a5-5c28-498a-8df7-789f020e36ed@paulmck-laptop>
	<20231027224628.GI26550@noisy.programming.kicks-ass.net>
	<200c57ce-90a7-418b-9527-602dbf64231f@paulmck-laptop>
	<20231030082138.GJ26550@noisy.programming.kicks-ass.net>
	<622438a5-4d20-4bc9-86b9-f3de55ca6cda@paulmck-laptop>
	<20231031095202.GC35651@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 31, 2023 at 02:16:34PM +0000, Michael Matz wrote:

> > Mostly my problem is that GCC generates such utter shite when you
> > mention volatile. See, the below patch changes the perfectly fine and
> > non-broken:
> >
> >   0148   1d8:	49 83 06 01	addq   $0x1,(%r14)
>
> What is non-broken here that is ...
>
> > into:
> >
> >   0148   1d8:	49 8b 06	mov    (%r14),%rax
> >   014b   1db:	48 83 c0 01	add    $0x1,%rax
> >   014f   1df:	49 89 06	mov    %rax,(%r14)
>
> ... broken here?  (Sure code size and additional register use, but I
> don't think you mean this with broken).

The point was that the code was perfectly fine without adding the
volatile, and adding volatile makes it worse.

> > For absolutely no reason :-(
>
> The reason is simple (and should be obvious): to adhere to the abstract
> machine regarding volatile.  When x is volatile then x++ consists of a
> read and a write, in this order.  The easiest way to ensure this is to
> actually generate a read and a write instruction.  Anything else is an
> optimization, and for each such optimization you need to actively find
> an argument why this optimization is correct to start with (and then
> whether it's an optimization at all).  In this case the argument needs
> to somehow involve arguing that an rmw instruction on x86 is in fact
> completely equivalent to the separate instructions, from read cycle to
> write cycle over all pipeline stages, on all implementations of x86.
> I.e. that a rmw instruction is spec'ed to be equivalent.
>
> You most probably can make that argument in this specific case, I'll
> give you that.  But why bother to start with, in a piece of software
> that is already fairly complex (the compiler)?  It's much easier to
> just not do much of anything with volatile accesses at all and be
> guaranteed correct.  Even more so as the software author, when using
> volatile, most likely is much more interested in correct code (even
> from an abstract machine perspective) than in micro-optimizations.

There's a pile of situations where an RmW instruction is actively
different from a load-store split, especially for volatile variables
that are explicitly expected to change asynchronously.

The original RmW instruction is IRQ-safe, while the load-store version
is not. If an interrupt lands between the load and the store and also
modifies the variable, then the store after interrupt-return will
overwrite said modification. These are not equivalent.

In this case that's not relevant, because the increment happens to
happen with IRQs disabled. But the point is that these forms are very
much not equivalent.

> > At least clang doesn't do this, it stays:
> >
> >   0403   413:	49 ff 45 00	incq   0x0(%r13)
> >
> > irrespective of the volatile.
>
> And, are you 100% sure that this is correct?  Even for x86 CPU
> pipeline implementations that you aren't intimately knowing about? ;-)

It so happens that the x86 architecture does guarantee RmW ops are
IRQ-safe, or locally atomic. SMP/concurrent loads will observe either
the pre or the post state, but no intermediate state, as well.

> But all that seems to be a side-track anyway, what's your real worry
> with the code sequence generated by GCC?

In this case it's sub-optimal code, both larger and possibly slower for
having two memops.

The reason to have volatile is that it's what Linux uses to disallow
store-tearing, something that doesn't happen in this case. A suitably
insane but conforming compiler could compile a non-volatile memory
increment into something insane like:

	load	byte-0, r1
	increment r1
	store	r1, byte-0
	jno	done
	load	byte-1, r1
	increment r1
	store	r1, byte-1
	jno	done
	...
done:

We want to explicitly disallow this.

I know C has recently (2011) grown this _Atomic thing, but that has
other problems.