References: <20230830151623.3900-1-ubizjak@gmail.com> <20230830151623.3900-2-ubizjak@gmail.com>
From: Uros Bizjak
Date: Fri, 15 Sep 2023 14:01:59 +0200
Subject: Re: [PATCH 2/2] x86/percpu: Use raw_cpu_try_cmpxchg in preempt_count_set
To: Ingo Molnar
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Peter Zijlstra,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
    "H. Peter Anvin"

On Fri, Sep 15, 2023 at 11:47 AM Ingo Molnar wrote:
>
>
> * Uros Bizjak wrote:
>
> > Use raw_cpu_try_cmpxchg instead of raw_cpu_cmpxchg (*ptr, old, new) == old.
> > x86 CMPXCHG instruction returns success in ZF flag, so this change saves a
> > compare after cmpxchg (and related move instruction in front of cmpxchg).
> >
> > Also, raw_cpu_try_cmpxchg implicitly assigns old *ptr value to "old" when
> > cmpxchg fails. There is no need to re-read the value in the loop.
> >
> > No functional change intended.
> >
> > Cc: Peter Zijlstra
> > Cc: Thomas Gleixner
> > Cc: Ingo Molnar
> > Cc: Borislav Petkov
> > Cc: Dave Hansen
> > Cc: "H. Peter Anvin"
> > Signed-off-by: Uros Bizjak
> > ---
> >  arch/x86/include/asm/preempt.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
> > index 2d13f25b1bd8..4527e1430c6d 100644
> > --- a/arch/x86/include/asm/preempt.h
> > +++ b/arch/x86/include/asm/preempt.h
> > @@ -31,11 +31,11 @@ static __always_inline void preempt_count_set(int pc)
> >  {
> >       int old, new;
> >
> > +     old = raw_cpu_read_4(pcpu_hot.preempt_count);
> >       do {
> > -             old = raw_cpu_read_4(pcpu_hot.preempt_count);
> >               new = (old & PREEMPT_NEED_RESCHED) |
> >                       (pc & ~PREEMPT_NEED_RESCHED);
> > -     } while (raw_cpu_cmpxchg_4(pcpu_hot.preempt_count, old, new) != old);
> > +     } while (!raw_cpu_try_cmpxchg_4(pcpu_hot.preempt_count, &old, new));
>
> It would be really nice to have a before/after comparison of generated
> assembly code in the changelog, to demonstrate the effectiveness of this
> optimization.

The assembly code improvements are in line with other try_cmpxchg
conversions, but for reference, finish_task_switch() from
kernel/sched/core.c, which inlines preempt_count_set(), improves from:

    5bad:       65 8b 0d 00 00 00 00    mov    %gs:0x0(%rip),%ecx
    5bb4:       89 ca                   mov    %ecx,%edx
    5bb6:       89 c8                   mov    %ecx,%eax
    5bb8:       81 e2 00 00 00 80       and    $0x80000000,%edx
    5bbe:       83 ca 02                or     $0x2,%edx
    5bc1:       65 0f b1 15 00 00 00    cmpxchg %edx,%gs:0x0(%rip)
    5bc8:       00
    5bc9:       39 c1                   cmp    %eax,%ecx
    5bcb:       75 e0                   jne    5bad <...>
    5bcd:       e9 5a fe ff ff          jmpq   5a2c <...>
    5bd2:

to:

    5bad:       65 8b 05 00 00 00 00    mov    %gs:0x0(%rip),%eax
    5bb4:       89 c2                   mov    %eax,%edx
    5bb6:       81 e2 00 00 00 80       and    $0x80000000,%edx
    5bbc:       83 ca 02                or     $0x2,%edx
    5bbf:       65 0f b1 15 00 00 00    cmpxchg %edx,%gs:0x0(%rip)
    5bc6:       00
    5bc7:       0f 84 5f fe ff ff       je     5a2c <...>
    5bcd:       eb e5                   jmp    5bb4 <...>
    5bcf:

Please note the missing cmp (and mov), the loop without an extra memory
load from %gs:0x0(%rip), and the better predicted jump in the latter case.

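For readers who want to try the two loop shapes outside the kernel, here
is a minimal userspace sketch. It is an analogy only, not the kernel
code: it assumes C11 <stdatomic.h>, where atomic_compare_exchange_strong()
returns a success flag and writes the observed value back into its
"expected" argument on failure, much like try_cmpxchg does. The names
preempt_count_var, count_set_cmpxchg() and count_set_try_cmpxchg() are
made up for illustration, and the C11 operation is a locked atomic,
unlike the non-locked per-CPU raw_cpu_* operations.

```c
#include <stdatomic.h>

/* Same bit as the kernel's flag; redefined here for the sketch only. */
#define PREEMPT_NEED_RESCHED 0x80000000u

/* Hypothetical stand-in for pcpu_hot.preempt_count. */
static _Atomic unsigned int preempt_count_var;

/* Old shape: reload every iteration, then compare the observed value. */
static void count_set_cmpxchg(unsigned int pc)
{
	unsigned int old, new, seen;

	do {
		old = atomic_load_explicit(&preempt_count_var,
					   memory_order_relaxed);
		new = (old & PREEMPT_NEED_RESCHED) |
			(pc & ~PREEMPT_NEED_RESCHED);
		seen = old;
		/* On failure, 'seen' receives the current value. */
		atomic_compare_exchange_strong(&preempt_count_var,
					       &seen, new);
	} while (seen != old);	/* the extra compare */
}

/* New shape: load once, let a failed exchange refresh 'old'. */
static void count_set_try_cmpxchg(unsigned int pc)
{
	unsigned int old, new;

	old = atomic_load_explicit(&preempt_count_var, memory_order_relaxed);
	do {
		new = (old & PREEMPT_NEED_RESCHED) |
			(pc & ~PREEMPT_NEED_RESCHED);
		/* Success flag drives the loop; no separate compare. */
	} while (!atomic_compare_exchange_strong(&preempt_count_var,
						 &old, new));
}
```

Both functions leave the same value in the variable; the second one
simply gives the compiler the success flag directly, which is what
allows the cmp and the reload to disappear from the loop.
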
The improvements with {raw,this}_cpu_try_cmpxchg_128 in the third patch
are even more noticeable, because an __int128 value lives in a register
pair, so the comparison needs three separate machine instructions, in
addition to a move of the register pair.

Thanks,
Uros.