From: Linus Torvalds
Date: Mon, 16 Sep 2019 14:29:52 -0700
Subject: Re: [RFC] Improve memset
To: Andy Lutomirski
Cc: Rasmus Villemoes, Borislav Petkov, Rasmus Villemoes, x86-ml, Josh Poimboeuf, lkml
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 16, 2019 at 10:41 AM Andy Lutomirski wrote:
>
> After some experimentation, I think y'all are just doing it wrong.
> GCC is very clever about this as long as it's given the chance.  This
> test, for example, generates excellent code:
>
> #include
>
> __THROW __nonnull ((1)) __attribute__((always_inline)) void
> *memset(void *s, int c, size_t n)
> {
>     asm volatile ("nop");
>     return s;
> }
>
> /* generates 'nop' */
> void zero(void *dest, size_t size)
> {
>     __builtin_memset(dest, 0, size);
> }

I think the point was that we'd like to get the default memset (for
when __builtin_memset() doesn't generate inline code) also inlined
into just "rep stosb", instead of that tail-call "jmp memset".

> So I'm thinking maybe compiler.h should actually do something like:
>
> #define memset __builtin_memset
>
> and we should have some appropriate magic so that the memset inline is
> exempt from the macro.

That "appropriate magic" is easy enough: make sure the memset inline
shows up before the macro definition.

However, gcc never actually inlines the memset() for me, always doing
that "jmp memset".

> FWIW, this obviously wrong code:
>
> __THROW __nonnull ((1)) __attribute__((always_inline)) void
> *memset(void *s, int c, size_t n)
> {
>     __builtin_memset(s, c, n);
>     return s;
> }
>
> generates 'jmp memset'.  It's not entirely clear to me exactly what's
> happening here.

I think calling memset() is always the default fallback for
__builtin_memset, and because it can't be recursively inlined, it's
done as a call. Which is then turned into a tailcall because the
calling conventions match, thus the "jmp memset".

But as mentioned, the example you claim generates excellent code
doesn't actually inline memset() at all for me, and they are all doing
"jmp memset" except for the cases that get turned into direct stores.

E.g. (removing the cfi annotations etc. stuff):

  zero:
        movq    %rsi, %rdx
        xorl    %esi, %esi
        jmp     memset

rather than that "nop" showing up inside the zero function.

But I agree that when __builtin_memset() generates manual inline code,
it does the right thing, i.e.

  memset_a_bit:
        movl    $0, (%rdi)
        ret

is clearly the right thing to do. We knew that.

              Linus
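
For readers following along, the ordering trick mentioned above (define the inline first, the macro second) can be sketched roughly like this. It is a minimal userspace sketch, not kernel code: the names kmemset and fallback_fill are hypothetical, chosen so the example does not clash with libc's memset, and the byte loop only stands in for whatever the real inline would do (e.g. "rep stosb"). Note that with a renamed function the compiler's out-of-line fallback is still libc's memset, so this shows the macro ordering only, not the inlining behaviour discussed in the thread.

```c
#include <stddef.h>

/*
 * Hypothetical fallback, defined BEFORE the macro below. The preprocessor
 * only rewrites tokens that appear after a #define, so this definition
 * keeps its name and body untouched.
 */
static inline void *kmemset(void *s, int c, size_t n)
{
    unsigned char *p = s;

    while (n--)
        *p++ = (unsigned char)c;
    return s;
}

/* From here on, kmemset(...) call sites go to the compiler builtin. */
#define kmemset(s, c, n) __builtin_memset((s), (c), (n))

/* Small constant size: the builtin can typically become a direct store. */
void memset_a_bit(int *p)
{
    kmemset(p, 0, sizeof(*p));
}

/* Variable size: the builtin may fall back to an out-of-line memset call. */
void zero(void *dest, size_t size)
{
    kmemset(dest, 0, size);
}

/*
 * Parenthesizing the name suppresses function-like macro expansion, so
 * this calls the inline function defined above rather than the builtin.
 */
void fallback_fill(void *dest, int c, size_t size)
{
    (kmemset)(dest, c, size);
}
```

Compiling this with "gcc -O2 -S" and comparing memset_a_bit against zero should show the difference between the direct-store case and the call case, though the exact output depends on compiler version and flags.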