Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp3625937ybb; Mon, 23 Mar 2020 04:45:06 -0700 (PDT) X-Google-Smtp-Source: ADFU+vuXF6PLNeKQ58Em1+NuShesgCszPOyXZkPj5+auVwJktVz+2dsL98YFKXtYl214oZQl4X0x X-Received: by 2002:aca:1812:: with SMTP id h18mr9017710oih.107.1584963906255; Mon, 23 Mar 2020 04:45:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584963906; cv=none; d=google.com; s=arc-20160816; b=pMpHJx2t2yzuvVuUpgQlBmdtFoXg9wqTThzrPhUQIyvmXn88hK8hL8hgZgG6QasHtW p1xcsdzUKK+GcHSdjZEsiAV7v+y3CU7vaGe3IuZd4cgp0vOAovFeCiYsK15z60UvZtH8 eH4HDfdUpHJCkziddeWHTnUGEPujo9cF/5Q5mCkNtZU7eoG+PuCNknBVPQBX01jJhG2V +jDPvdPtKHDVSiN28+ix2oNpchgWnGGNE3QDXm5W9TQKIgfF7uam9Bo9BtjYxgDgskXi rUqA1SDWMqcDZ9bi8GZx1A2xsjVH2XsninoLG1b8t5KJYChqameW8MQgmm/5QaAHB8ys IBhA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:cc:from:subject:mime-version :message-id:date:dkim-signature; bh=lSE+DSc9amXP//IBtR/r3PMKZC9AUFcm/JzIAvbwz4Y=; b=WVJRkm0jOtvD0M4jn1slRGFLsns2MVBEETtmUi+swcUWrbwYduHdo664Z8VKpF604L QLieK13Ff/xA24a4PgHR0Eel6cNFXOaMzP357sSXoRopRmRLRCmtbX6r6j/u3vcp9irw 44IT4A/baPcb4dtmQSAmC/Iu4TkiIGCtmDMEOfnNfEO6+MyhvSI+zEZCr0zYHpedkhSQ TlfO2dcW2bbL6FsmulX0ht+/dLXTVL0PStOflzIoGFslpjcvdCUcKYI9sY/WRPm1LkoC pGONo3UrfEnUTEQZ18ZwjG126uyxHZ/OHCa2seJbr3wHX9p6KE+P1sNaC22l5S8lNq/7 l/lw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=B+SMEnEm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h9si7534851oti.155.2020.03.23.04.44.54; Mon, 23 Mar 2020 04:45:06 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=B+SMEnEm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728192AbgCWLnI (ORCPT + 99 others); Mon, 23 Mar 2020 07:43:08 -0400 Received: from mail-wr1-f73.google.com ([209.85.221.73]:45027 "EHLO mail-wr1-f73.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728151AbgCWLnI (ORCPT ); Mon, 23 Mar 2020 07:43:08 -0400 Received: by mail-wr1-f73.google.com with SMTP id u18so7198249wrn.11 for ; Mon, 23 Mar 2020 04:43:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:message-id:mime-version:subject:from:cc; bh=lSE+DSc9amXP//IBtR/r3PMKZC9AUFcm/JzIAvbwz4Y=; b=B+SMEnEmy0Cylz0cvWADVv2wy8XltTOm0SEKG3nrST2UcLXoLlZZ66WHVv8h6/y1dy g4G+JbQlyafqSZ+lE0Ey6y/mNIq/FnJnCKsA4SdDDGb0mS9OT1zJrwpNxHZHCH/w/rNT D16FUGfn33ZoZku0PmD4UYQpSZ7LWx/okUpZzVXhvJ13KojJB3f4z5P6erulhmRdZj0M +5emuTsRtho9Vviafk35ULhwb2iubD/ytpcVC3KpFPQTwLTKH7adD90TouY5BbaIMZyq tDoN/pdQ7rlPrsWvP1h8gkL484boxRo7luZJt+fqyGNmNTFs6sLzSghFUbijVJQDxbks SN1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:mime-version:subject:from:cc; bh=lSE+DSc9amXP//IBtR/r3PMKZC9AUFcm/JzIAvbwz4Y=; b=KH+PD39Bwgf2MdtfnOzFBJUNbiikWPBDbDn4ksn+FyiSM7M//1fEQUU1SX41QcczLv QYbSeF2Xton6QsoIUpzRrLl7q6ohPlSwDOZoC0ZIMr8FFBf5CmMOaMutVi3Mn1/+t3hj VugGqCI+OBApmduFpyAPZnoW6XUCZPBhiz4v5ml8lshVJNFmElTBowFBUoAVE1JOnuA0 9qtAMx/nbW4LMgEppYdKXMVTbkLiMcwAiNTtjNfCyy3DP6pUujNYr2XGaGyKNils6qqt DtNK29fxT3UaubnjbdVZZsieRhlKQJkTI/QPGHwhGi/DCqe/mq49/5/JdP7kGiRKZuxA vlpg== X-Gm-Message-State: ANhLgQ3XUdDtJ5PHx6zajjqDe/rWHXScbqHhVH3UoiqVE1nx6GhwcLsc 76GLl0l1CNbWUyxtznble1uxmbXe6BC5 X-Received: by 2002:adf:dfce:: with SMTP id q14mr30519861wrn.326.1584963785709; Mon, 23 Mar 2020 04:43:05 -0700 (PDT) Date: Mon, 23 Mar 2020 12:42:06 +0100 Message-Id: <20200323114207.222412-1-courbet@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.25.1.696.g5e7596f4ac-goog Subject: [PATCH] x86: Alias memset to __builtin_memset. From: Clement Courbet Cc: Clement Courbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, clang-built-linux@googlegroups.com Content-Type: text/plain; charset="UTF-8" To: unlisted-recipients:; (no To-header on input) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Recent compilers know the meaning of builtins (`memset`, `memcpy`, ...) and can replace calls by inline code when deemed better. For example, `memset(p, 0, 4)` will be lowered to a four-byte zero store. When using -ffreestanding (this is the case e.g. building on clang), these optimizations are disabled. This means that **all** memsets, including those with small, constant sizes, will result in an actual call to memset. We have identified several spots where we have high CPU usage because of this. For example, a single one of these memsets is responsible for about 0.3% of our total CPU usage in the kernel. Aliasing `memset` to `__builtin_memset` allows the compiler to perform this optimization even when -ffreestanding is used. There is no change when -ffreestanding is not used. Below is a diff (clang) for `update_sg_lb_stats()`, which includes the aforementionned hot memset: memset(sgs, 0, sizeof(*sgs)); Diff: movq %rsi, %rbx ~~~> movq $0x0, 0x40(%r8) movq %rdi, %r15 ~~~> movq $0x0, 0x38(%r8) movl $0x48, %edx ~~~> movq $0x0, 0x30(%r8) movq %r8, %rdi ~~~> movq $0x0, 0x28(%r8) xorl %esi, %esi ~~~> movq $0x0, 0x20(%r8) callq ~~~> movq $0x0, 0x18(%r8) ~~~> movq $0x0, 0x10(%r8) ~~~> movq $0x0, 0x8(%r8) ~~~> movq $0x0, (%r8) In terms of code size, this grows the clang-built kernel a bit (+0.022%): 440285512 vmlinux.clang.after 440383608 vmlinux.clang.before Signed-off-by: Clement Courbet --- arch/x86/include/asm/string_64.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h index 75314c3dbe47..7073c25aa4a3 100644 --- a/arch/x86/include/asm/string_64.h +++ b/arch/x86/include/asm/string_64.h @@ -18,6 +18,15 @@ extern void *__memcpy(void *to, const void *from, size_t len); void *memset(void *s, int c, size_t n); void *__memset(void *s, int c, size_t n); +/* Recent compilers can generate much better code for known size and/or + * fill values, and will fallback on `memset` if they fail. + * We alias `memset` to `__builtin_memset` explicitly to inform the compiler to + * perform this optimization even when -ffreestanding is used. + */ +#if (__GNUC__ >= 4) +#define memset(s, c, count) __builtin_memset(s, c, count) +#endif + #define __HAVE_ARCH_MEMSET16 static inline void *memset16(uint16_t *s, uint16_t v, size_t n) { @@ -74,6 +83,7 @@ int strcmp(const char *cs, const char *ct); #undef memcpy #define memcpy(dst, src, len) __memcpy(dst, src, len) #define memmove(dst, src, len) __memmove(dst, src, len) +#undef memset #define memset(s, c, n) __memset(s, c, n) #ifndef __NO_FORTIFY -- 2.25.1.696.g5e7596f4ac-goog