Subject: Re: [RFC] Improve memset
To: Linus Torvalds, Borislav Petkov
Cc: x86-ml, Andy Lutomirski, Josh Poimboeuf, lkml
References: <20190913072237.GA12381@zn.tnic>
From: Rasmus Villemoes
Message-ID: <9dc9f1e6-5d19-167c-793d-2f4a5ebee097@rasmusvillemoes.dk>
Date: Fri, 13 Sep 2019 11:18:00 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 13/09/2019 11.00, Linus Torvalds wrote:
> On Fri, Sep 13, 2019 at 8:22 AM Borislav Petkov wrote:
>>
>> since the merge window is closing in and y'all are on a conference, I
>> thought I should take another stab at it. It being something which Ingo,
>> Linus and Peter have suggested in the past at least once.
>>
>> Instead of calling memset:
>>
>> ffffffff8100cd8d: e8 0e 15 7a 00 callq ffffffff817ae2a0 <__memset>
>>
>> and having a JMP inside it depending on the feature supported, let's simply
>> have the REP; STOSB directly in the code:
>
> That's probably fine for when the memset *is* a call, but:
>
>> The result is this:
>>
>> static __always_inline void *memset(void *dest, int c, size_t n)
>> {
>>         void *ret, *dummy;
>>
>>         asm volatile(ALTERNATIVE_2_REVERSE("rep; stosb",
>
> Forcing this code means that if you do
>
>         struct { long hi, low; } a;
>         memset(&a, 0, sizeof(a));
>
> you force that "rep stosb". Which is HORRID.
>
> The compiler should turn it into just one single 8-byte store. But
> because you took over all of memset(), now that doesn't happen.

OK, that answers my question.

> So we do need to have gcc do the __builtin_memset() for the simple cases..

Something like

	if (__builtin_constant_p(c) && __builtin_constant_p(n) && n <= 32)
		return __builtin_memset(dest, c, n);

might be enough? Of course it would be sad if 32 was so high that this
turned into a memset() call, but there's -mmemset-strategy= if one wants
complete control. Though that's of course build-time, so can't consider
differences between cpu models.

Rasmus
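
For illustration, the check suggested above could be tried out in userspace
roughly as follows. This is only a minimal sketch, assuming GCC or Clang:
sketch_memset() and fallback_memset() are hypothetical names, and the plain
byte loop merely stands in for the patched "rep; stosb" path, which is not
reproduced here. Note that GCC documents __builtin_constant_p() as not
returning 1 for an inline function's argument unless optimization is enabled,
so at -O0 every call would be expected to take the fallback.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the out-of-line / "rep; stosb" path.
 * A plain byte loop, used here only so the sketch is self-contained.
 */
static void *fallback_memset(void *dest, int c, size_t n)
{
	unsigned char *p = dest;

	while (n--)
		*p++ = (unsigned char)c;
	return dest;
}

/*
 * Assumed wrapper: small memsets with compile-time-constant fill value
 * and size are handed to the compiler builtin, so it is free to expand
 * them into a few plain stores. Everything else takes the fallback.
 */
static inline __attribute__((always_inline))
void *sketch_memset(void *dest, int c, size_t n)
{
	if (__builtin_constant_p(c) && __builtin_constant_p(n) && n <= 32)
		return __builtin_memset(dest, c, n);
	return fallback_memset(dest, c, n);
}

/* Both paths must produce the same result as an ordinary memset(). */
static void sketch_selftest(void)
{
	struct { long hi, low; } a = { 1, 2 };
	unsigned char buf[64];
	volatile size_t len = sizeof(buf);	/* volatile: size not constant */

	sketch_memset(&a, 0, sizeof(a));	/* candidate for the builtin */
	assert(a.hi == 0 && a.low == 0);

	sketch_memset(buf, 0x5a, len);		/* takes the fallback */
	assert(buf[0] == 0x5a && buf[63] == 0x5a);
}
```

Comparing the generated code for the two calls in sketch_selftest() at -O2
(for example with objdump -d) should show whether the constant-size case
still collapses into plain stores, and whether the chosen threshold ever
makes the builtin emit a real memset() call.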