Date: Tue, 14 Sep 2021 18:46:54 +0200
From: Willy Tarreau
To: David Laight
Cc: Douglas Gilbert, LKML
Subject: Re: how many memset(,0,) calls in kernel ?
Message-ID: <20210914164654.GC10488@1wt.eu>
References: <1c4a94df-fc2f-1bb2-8bce-2d71f9f1f5df@interlog.com>
 <20210912045608.GB16216@1wt.eu>
 <88976a40175c491fb5e3349f6686ad67@AcuMS.aculab.com>
 <20210913160945.GA2456@1wt.eu>
 <15cd0a8e72b3460db939060db25dd59a@AcuMS.aculab.com>
In-Reply-To: <15cd0a8e72b3460db939060db25dd59a@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 14, 2021 at 08:23:40AM +0000, David Laight wrote:
> > The exact point is, here it's up to the compiler to decide thanks to
> > its builtin what it considers best for the target CPU. It already
> > knows the fixed size and the code is emitted accordingly.
> > It may very well be a call to the memset() function when the size
> > is large and a power of two because it knows alternate variants are
> > available for example.
> >
> > The compiler might even decide to shrink that area if other bytes
> > are written just after the memset(), leaving only holes touched by
> > memset().
>
> You might think the compiler will make sane choices for the target CPU.
> But it often makes a complete pig's breakfast of it.
> I'm pretty sure 6 'rep stos' is slower than 6 writes on absolutely
> everything - with the possible exception of an 8088.

It can be suboptimal (especially with the moderate latencies required
for small areas), but my point is that in plenty of cases the memset()
call will be totally eliminated. Example:

The file:

  #include <string.h>

  int f(int a, int b)
  {
          struct {
                  int n1;
                  int n2;
                  int n3;
                  int n4;
          } s;

          memset(&s, 0, sizeof(s));
          s.n2 = a;
          s.n3 = b;
          return s.n1 + s.n2 + s.n3 + s.n4;
  }

gives:

  0000000000000000 <f>:
     0:   8d 04 37                lea    (%rdi,%rsi,1),%eax
     3:   c3                      retq

See? The builtin allowed the compiler to *know* that these areas were
zeroes and could optimize them away.

More importantly, this can save some reads from being performed, with
the data being only written into:

  #include <string.h>

  struct {
          int n1;
          int n2;
  } s;

  void f(int a, int b)
  {
          memset(&s, 0, sizeof(s));
          s.n1 |= a;
          s.n2 |= b;
  }

Gives:

  0000000000000000 <f>:
     0:   89 3d 00 00 00 00       mov    %edi,0x0(%rip)        # 6 <f+0x6>
     6:   89 35 00 00 00 00       mov    %esi,0x0(%rip)        # c <f+0xc>
     c:   c3                      retq

See? Just plain writes, no read-modify-write of the memory area.
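
To reproduce the two listings above, something like the following
sketch should be enough. The function names (sum_after_memset,
sum_direct, or_after_memset, check_equivalence), the struct tags and the
build commands in the comments are only assumptions made for
illustration; they are not from the original test files.

```c
/*
 * Sketch only. Suggested way to look at the generated code (the file
 * name is hypothetical):
 *     gcc -O2 -c memset_demo.c && objdump -d memset_demo.o
 * Exact output depends on the compiler, its version and the target.
 */
#include <assert.h>
#include <string.h>

struct four { int n1, n2, n3, n4; };
struct pair { int n1, n2; };

/* Global, so the stores into it cannot simply be dropped. */
struct pair g;

/* Local struct: zero it, overwrite two fields, sum all four. */
int sum_after_memset(int a, int b)
{
	struct four s;

	memset(&s, 0, sizeof(s));
	s.n2 = a;
	s.n3 = b;
	return s.n1 + s.n2 + s.n3 + s.n4;
}

/* What the first function is expected to boil down to. */
int sum_direct(int a, int b)
{
	return a + b;
}

/* Global struct: zero it, then OR values into both fields. */
void or_after_memset(int a, int b)
{
	memset(&g, 0, sizeof(g));
	g.n1 |= a;
	g.n2 |= b;
}

/* Behavioural check: both forms must give the same results. */
void check_equivalence(void)
{
	assert(sum_after_memset(3, 4) == sum_direct(3, 4));

	g.n1 = -1;
	g.n2 = -1;
	or_after_memset(5, 6);
	assert(g.n1 == 5 && g.n2 == 6);
}
```

Calling check_equivalence() from a small main() only confirms the
behaviour; the interesting part is comparing the disassembly of
sum_after_memset() and sum_direct() at -O2.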

If you called an external memset() function, you'd instantly lose all
these possibilities:

  0000000000000000 <f>:
     0:   55                      push   %rbp
     1:   ba 08 00 00 00          mov    $0x8,%edx
     6:   89 fd                   mov    %edi,%ebp
     8:   bf 00 00 00 00          mov    $0x0,%edi
     d:   53                      push   %rbx
     e:   89 f3                   mov    %esi,%ebx
    10:   31 f6                   xor    %esi,%esi
    12:   48 83 ec 08             sub    $0x8,%rsp
    16:   e8 00 00 00 00          callq  1b <f+0x1b>
    1b:   09 2d 00 00 00 00       or     %ebp,0x0(%rip)        # 21 <f+0x21>
    21:   09 1d 00 00 00 00       or     %ebx,0x0(%rip)        # 27 <f+0x27>
    27:   48 83 c4 08             add    $0x8,%rsp
    2b:   5b                      pop    %rbx
    2c:   5d                      pop    %rbp
    2d:   c3                      retq

Thus the fact that the compiler has knowledge of the memset() is useful.

Willy
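
PS: one possible way to see the external-call case side by side with the
builtin one, without touching compiler flags, is to hide memset() behind
a volatile function pointer so the compiler can no longer assume what
the callee does. This is only a sketch: the names opaque_memset,
f_builtin and f_opaque are made up for the illustration, and it is not
necessarily how the listing above was produced (building with gcc's
-fno-builtin-memset would be another option).

```c
#include <assert.h>
#include <string.h>

struct pair { int n1, n2; };
struct pair s;

/*
 * The pointer is volatile, so it has to be reloaded at each call and
 * the compiler cannot assume it still points to memset().
 */
static void *(*volatile opaque_memset)(void *, int, size_t) = memset;

/* The compiler sees memset() and knows s is all zeroes afterwards. */
void f_builtin(int a, int b)
{
	memset(&s, 0, sizeof(s));
	s.n1 |= a;
	s.n2 |= b;
}

/* Opaque call: s must be read back before being ORed. */
void f_opaque(int a, int b)
{
	opaque_memset(&s, 0, sizeof(s));
	s.n1 |= a;
	s.n2 |= b;
}
```

Both functions leave the same values in s. At -O2, f_builtin() would be
expected to compile to plain stores while f_opaque() keeps the call and
the read-modify-write, though the exact code depends on the compiler and
its version.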