From: Linus Torvalds
Date: Fri, 13 Sep 2019 10:00:13 +0100
Subject: Re: [RFC] Improve memset
To: Borislav Petkov
Cc: x86-ml, Andy Lutomirski, Josh Poimboeuf, lkml
In-Reply-To: <20190913072237.GA12381@zn.tnic>

On Fri, Sep 13, 2019 at 8:22 AM Borislav Petkov wrote:
>
> since the merge window is closing in and y'all are on a conference, I
> thought I should take another stab at it. It being something which Ingo,
> Linus and Peter have suggested in the past at least once.
>
> Instead of calling memset:
>
> ffffffff8100cd8d:       e8 0e 15 7a 00          callq  ffffffff817ae2a0 <__memset>
>
> and having a JMP inside it depending on the feature supported, let's simply
> have the REP; STOSB directly in the code:

That's probably fine for when the memset *is* a call, but:

> The result is this:
>
> static __always_inline void *memset(void *dest, int c, size_t n)
> {
>         void *ret, *dummy;
>
>         asm volatile(ALTERNATIVE_2_REVERSE("rep; stosb",

Forcing this code means that if you do

     struct { long hi, low; } a;
     memset(&a, 0, sizeof(a));

you force that "rep stosb". Which is HORRID.

The compiler should turn it into just one single 8-byte store. But
because you took over all of memset(), now that doesn't happen.

In fact, the compiler should be able to keep a structure like that in
registers if the use of it is fairly simple. Which again wouldn't
happen due to forcing that inline asm.

And "rep stosb" is ok for variable-sized memsets (well, honestly,
generally only when the size is sufficient, but it's been getting
progressively better).

But "rep stosb" is absolutely disastrous for small constant-sized
memset() calls. It serializes the pipeline, it takes tens of cycles
etc - for something that can take one single cycle and be easily
hidden in the instruction stream among other changes.

And we do have a number of small structs etc in the kernel.

So we do need to have gcc do the __builtin_memset() for the simple cases.

              Linus
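
For illustration only, here is a rough sketch of the split argued for above: leave small constant-sized calls to the compiler's __builtin_memset(), and use the string instruction only when the size is not a small compile-time constant. Nothing in it comes from the patch under discussion. The names memset_sketch, memset_rep_stosb, SMALL_MEMSET_MAX, struct pair, clear_pair and fill_buf are hypothetical, the 64-byte cutoff is an arbitrary assumption, and the kernel's ALTERNATIVE patching is left out entirely. It assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff, chosen arbitrarily for this sketch. */
#define SMALL_MEMSET_MAX 64

/*
 * Hypothetical stand-in for the inlined "rep; stosb" variant.
 * On non-x86 targets it simply falls back to the library memset().
 */
static inline void *memset_rep_stosb(void *dest, int c, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
	void *d = dest;

	/* Stores AL to [(R|E)DI], (R|E)CX times; both registers are updated. */
	__asm__ __volatile__("rep stosb"
			     : "+D" (d), "+c" (n)
			     : "a" (c)
			     : "memory");
	return dest;
#else
	return memset(dest, c, n);
#endif
}

/*
 * Small compile-time-constant sizes go to __builtin_memset(), which the
 * compiler is free to turn into plain stores or drop altogether.
 * Everything else goes to the string instruction.
 * Note that the macro evaluates its arguments more than once.
 */
#define memset_sketch(dest, c, n)					\
	((__builtin_constant_p(n) && (n) <= SMALL_MEMSET_MAX)		\
		? __builtin_memset((dest), (c), (n))			\
		: memset_rep_stosb((dest), (c), (n)))

struct pair { long hi, low; };

/* Constant size: takes the __builtin_memset() branch. */
static inline void clear_pair(struct pair *p)
{
	(void)memset_sketch(p, 0, sizeof(*p));
}

/* Size not known at compile time (unless inlined with a constant). */
static inline void fill_buf(unsigned char *buf, int c, size_t len)
{
	(void)memset_sketch(buf, c, len);
	assert(len == 0 || buf[len - 1] == (unsigned char)c);
}
```

Which branch fill_buf() ends up on depends on inlining and optimization level, since __builtin_constant_p() is evaluated by the compiler; both branches write the same bytes, so only the generated code differs.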