From: Uros Bizjak
Date: Fri, 13 May 2022 12:20:57 +0200
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64
To: Peter Zijlstra
Cc: X86 ML, LKML, kvm@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, "H. Peter Anvin", Will Deacon,
	Boqun Feng, Mark Rutland, "Paul E. McKenney", Marco Elver
In-Reply-To: <20220513091034.GH76023@worktop.programming.kicks-ass.net>
References: <20220510154217.5216-1-ubizjak@gmail.com>
	<20220513091034.GH76023@worktop.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org
On Fri, May 13, 2022 at 11:10 AM Peter Zijlstra wrote:
>
> On Tue, May 10, 2022 at 05:42:17PM +0200, Uros Bizjak wrote:
>
> For the Changelog I would focus on the 64bit improvement and leave 32bit
> as a side-note.

Thanks, I will rephrase the ChangeLog.

> > ---
> >  arch/x86/include/asm/cmpxchg_32.h          | 43 ++++++++++++++++++++++
> >  arch/x86/include/asm/cmpxchg_64.h          |  6 +++
> >  include/linux/atomic/atomic-instrumented.h | 40 +++++++++++++++++++-
> >  scripts/atomic/gen-atomic-instrumented.sh  |  2 +-
> >  4 files changed, 89 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
> > index 0a7fe0321613..e874ff7f7529 100644
> > --- a/arch/x86/include/asm/cmpxchg_32.h
> > +++ b/arch/x86/include/asm/cmpxchg_32.h
> > @@ -42,6 +42,9 @@ static inline void set_64bit(volatile u64 *ptr, u64 value)
> >  #define arch_cmpxchg64_local(ptr, o, n)                                 \
> >          ((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \
> >                                                 (unsigned long long)(n)))
> > +#define arch_try_cmpxchg64(ptr, po, n)                                  \
> > +        ((__typeof__(*(ptr)))__try_cmpxchg64((ptr), (unsigned long long *)(po), \
> > +                                             (unsigned long long)(n)))
> >  #endif
> >
> >  static inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
> > @@ -70,6 +73,25 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
> >          return prev;
> >  }
> >
> > +static inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold, u64 new)
> > +{
> > +        bool success;
> > +        u64 prev;
> > +        asm volatile(LOCK_PREFIX "cmpxchg8b %2"
> > +                     CC_SET(z)
> > +                     : CC_OUT(z) (success),
> > +                       "=A" (prev),
> > +                       "+m" (*ptr)
> > +                     : "b" ((u32)new),
> > +                       "c" ((u32)(new >> 32)),
> > +                       "1" (*pold)
> > +                     : "memory");
> > +
> > +        if (unlikely(!success))
> > +                *pold = prev;
>
> I would prefer this be more like the existing try_cmpxchg code,
> perhaps:
>
>         u64 old = *pold;
>
>         asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]"
>                      CC_SET(z)
>                      : CC_OUT(z) (success),
>                        [ptr] "+m" (*ptr),
>                        "+A" (old)
>                      : "b" ((u32)new),
>                        "c" ((u32)(new >> 32))
>                      : "memory");
>
>         if (unlikely(!success))
>                 *pold = old;
>
> The existing 32bit cmpxchg code is a 'bit' crusty.

I was trying to follow the existing __cmpxchg64 as much as possible,
with the intention of a follow-up patch that would modernize
everything in cmpxchg_32.h. I can surely go the other way and submit
modernized new code.
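(Aside, in case the payoff is not obvious from the diff alone: cmpxchg8b
already leaves the observed old value in edx:eax and reports success in
ZF, so the try_cmpxchg form hands the caller both the boolean result and
the refreshed expected value for free. A minimal sketch of the usual
retry loop; inc64() is an illustrative name, not kernel API:

        /* sketch only: bump a u64 with the try_cmpxchg64 pattern */
        static __always_inline void inc64(u64 *ptr)
        {
                u64 old = *ptr;

                /* on failure, try_cmpxchg64() refreshes 'old' from memory */
                while (!try_cmpxchg64(ptr, &old, old + 1))
                        ;
        }

No separate re-read of *ptr is needed before retrying, which is where
the code-size and branch win over a plain cmpxchg64() loop comes from.)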
> > +        return success;
> > +}
> > +
> >  #ifndef CONFIG_X86_CMPXCHG64
> >  /*
> >   * Building a kernel capable running on 80386 and 80486. It may be necessary
> > @@ -108,6 +130,27 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
> >                         : "memory");                             \
> >          __ret; })
> >
> > +#define arch_try_cmpxchg64(ptr, po, n)                          \
> > +({                                                              \
> > +        bool success;                                           \
> > +        __typeof__(*(ptr)) __prev;                              \
> > +        __typeof__(ptr) _old = (__typeof__(ptr))(po);           \
> > +        __typeof__(*(ptr)) __old = *_old;                       \
> > +        __typeof__(*(ptr)) __new = (n);                         \
> > +        alternative_io(LOCK_PREFIX_HERE                         \
> > +                        "call cmpxchg8b_emu",                   \
> > +                        "lock; cmpxchg8b (%%esi)" ,             \
> > +                        X86_FEATURE_CX8,                        \
> > +                        "=A" (__prev),                          \
> > +                        "S" ((ptr)), "0" (__old),               \
> > +                        "b" ((unsigned int)__new),              \
> > +                        "c" ((unsigned int)(__new>>32))         \
> > +                        : "memory");                            \
> > +        success = (__prev == __old);                            \
> > +        if (unlikely(!success))                                 \
> > +                *_old = __prev;                                 \
> > +        likely(success);                                        \
> > +})
>
> Wouldn't this be better written like the normal fallback wrapper?
>
> static __always_inline bool
> arch_try_cmpxchg64(u64 *v, u64 *old, u64 new)
> {
>         u64 r, o = *old;
>
>         r = arch_cmpxchg64(v, o, new);
>         if (unlikely(r != o))
>                 *old = r;
>         return likely(r == o);
> }
>
> Less magical, same exact code.

Also, I was trying to follow the existing #defines here. I will
improve the code according to your suggestion.

Thanks,
Uros.
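P.S. For anyone who wants to experiment with the semantics outside the
kernel: the contract under discussion (a strong CAS that returns a bool
and, on failure, writes the observed value back through the 'old'
pointer) is the same one GCC and Clang expose as
__atomic_compare_exchange_n. A self-contained userspace sketch, not
kernel code; try_cmpxchg64 below is a local stand-in, not the kernel
primitive:

        #include <inttypes.h>
        #include <stdbool.h>
        #include <stdio.h>

        /* userspace stand-in mirroring the kernel try_cmpxchg64() contract */
        static bool try_cmpxchg64(uint64_t *ptr, uint64_t *old, uint64_t new)
        {
                /* strong CAS; on failure, *old is refreshed from *ptr */
                return __atomic_compare_exchange_n(ptr, old, new, false,
                                                   __ATOMIC_SEQ_CST,
                                                   __ATOMIC_SEQ_CST);
        }

        int main(void)
        {
                uint64_t v = 41, expected = 41;

                if (try_cmpxchg64(&v, &expected, 42))
                        printf("swapped: v = %" PRIu64 "\n", v);

                expected = 0;   /* deliberately stale */
                if (!try_cmpxchg64(&v, &expected, 43))
                        printf("failed: expected refreshed to %" PRIu64 "\n",
                               expected);

                return 0;
        }

On 32-bit x86 a compiler has to reach for cmpxchg8b to implement this,
and on pre-CX8 CPUs the kernel's cmpxchg8b_emu path quoted above steps
in instead.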