Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp7333123rwl; Thu, 23 Mar 2023 02:56:01 -0700 (PDT) X-Google-Smtp-Source: AK7set8Q2hicC6DgTSBTkNx8kaC6VR9A+AcGUuWVztGva2rbeqWZYgZon1uWrUgQOgXkapE0FRkc X-Received: by 2002:a17:906:fcc3:b0:8b2:c2fc:178e with SMTP id qx3-20020a170906fcc300b008b2c2fc178emr8947065ejb.74.1679565361293; Thu, 23 Mar 2023 02:56:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679565361; cv=none; d=google.com; s=arc-20160816; b=VVyxjYKW1b+PObryBN/aANddtAJaEFQ+a6IX7RSaGye5BvzRKB3b9qOM8DjbjxPoet 4qJa1iapP22fIgLEVZd0a7CtHQSWyMPJduw95Q01pJMFya1xW+wNMB3TYdEZUOIygIGr n7k4J6WSVcjQx5bPRpUzhvngvcyJuEtE8hvMteqqwrsVVG0xVQgXi/i13aNZa3nTrge/ n6ACnonTUcdGkhYQwm7kdYKyrzx5CaznxHilgPXKqmcU3JjUJ4mra0sGTT2SOFgP4Pmh nqoruDLpFHI5yi4AW4Me1ZHOH2oVn5OXwDYiHU4pgbhlL38hmLpA/SoH2T0khpRJQkSY B0Ag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent:references:message-id :in-reply-to:subject:cc:to:from:date:dkim-signature; bh=7C4N7pzqpOU5qbTYS9u7Db3EY7Oi7m72SrUzYIss0A8=; b=Aj2XowoMVYsGkwInOHHZl7yefCkGUlW44fUbvD0WrUmO0O31CetcUUQUXg6/PpxKoq G0qd4A5wv9uTU06KGGAIph6HV6s2ETV2EU1ZWTR4PpTcBX/btwz7aR/zUnLfRtCNWl3G ZJcUprvgjR8vlyAACbwv0Ub8sxuQKrjR7nXd4S0UpWO46YgTVE9nXUBzF6K11FjtfNDH M43EcxqTDEGcUn9/lE/pP1pDUj9CMUt4cEeED7Uddl8MF+qCfTp2dKjYOurbPUgfXUdb hB5OZkDcM6urDpKsuf+wIpnTSf1xg13F7AG1dBZFX9dqWYz1euFpQx6ygk86IbQNo3k4 SNIQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@inria.fr header.s=dc header.b=WMQBevmu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=inria.fr Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id oz39-20020a170906cd2700b008f668e7dc24si2737358ejb.244.2023.03.23.02.55.37; Thu, 23 Mar 2023 02:56:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@inria.fr header.s=dc header.b=WMQBevmu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=inria.fr Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229941AbjCWJxq (ORCPT + 99 others); Thu, 23 Mar 2023 05:53:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39814 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229463AbjCWJxB (ORCPT ); Thu, 23 Mar 2023 05:53:01 -0400 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3123166CF for ; Thu, 23 Mar 2023 02:52:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=inria.fr; s=dc; h=date:from:to:cc:subject:in-reply-to:message-id: references:mime-version; bh=7C4N7pzqpOU5qbTYS9u7Db3EY7Oi7m72SrUzYIss0A8=; b=WMQBevmug2c9SA/rC1jTHNzQSGZsWBweFGNBRv30Id9UA25e25kWNOKK cgJrGiYWKekrU1eRV6ERzGBikupg2vXLHD6LZQ2fbXBhuTadW18nQrTdj MAD0TVfJDxasUnpiPTfw/V59Md5K1IyhxcMw+MLZpfeNfOUpefpNWXqGy U=; Authentication-Results: mail3-relais-sop.national.inria.fr; dkim=none (message not signed) header.i=none; spf=SoftFail smtp.mailfrom=julia.lawall@inria.fr; dmarc=fail (p=none dis=none) d=inria.fr X-IronPort-AV: E=Sophos;i="5.98,283,1673910000"; d="scan'208";a="51015617" Received: from 231.85.89.92.rev.sfr.net (HELO hadrien) ([92.89.85.231]) by mail3-relais-sop.national.inria.fr with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Mar 2023 10:52:36 +0100 Date: Thu, 23 Mar 2023 10:52:35 +0100 (CET) From: Julia Lawall X-X-Sender: jll@hadrien To: Greg KH cc: Alex Elder , Menna Mahmoud , outreachy@lists.linux.dev, johan@kernel.org, elder@kernel.org, linux-kernel@vger.kernel.org, linux-staging@lists.linux.dev Subject: Re: [PATCH v2] staging: greybus: use inline function for macros In-Reply-To: Message-ID: References: <20230321183456.10385-1-eng.mennamahmoud.mm@gmail.com> <2e869677-2693-6419-ea25-f0cc2efcf3dd@ieee.org> <5efa6e6d-8573-31de-639a-d15b2e9deca0@ieee.org> <48674d8f-9753-780c-f37c-f83ea2855ae6@ieee.org> User-Agent: Alpine 2.22 (DEB 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 23 Mar 2023, Greg KH wrote: > On Wed, Mar 22, 2023 at 11:00:41AM +0100, Julia Lawall wrote: > > Greg raised the question of whether the inline function is really as > > efficient as a macro. > > > > I tried the following definitions: > > > > #define to_gbphy_dev(d) container_of(d, struct gbphy_device, dev) > > > > static inline struct gbphy_device *extra_to_gbphy_dev(const struct device *_dev) > > { > > return container_of(_dev, struct gbphy_device, dev); > > } > > > > And the following uses: > > > > ssize_t macro_protocol_id_show(struct device *dev, > > struct device_attribute *attr, char *buf) > > { > > struct gbphy_device *gbphy_dev = to_gbphy_dev(dev); > > > > return sprintf(buf, "%c macro 0x%02x\n", *buf, gbphy_dev->cport_desc->protocol_id); > > } > > ssize_t extra_protocol_id_show(struct device *dev, > > struct device_attribute *attr, char *buf) > > { > > struct gbphy_device *gbphy_dev = extra_to_gbphy_dev(dev); > > > > return sprintf(buf, "extra 0x%02x %c\n", gbphy_dev->cport_desc->protocol_id, *buf); > > } > > > > They are a little bit different to avoid too much compiler optimization. > > > > After doing make drivers/staging/greybus/gbphy.s, I get similar looking > > code in both cases: > > > > Macro version: > > > > .type macro_protocol_id_show, @function > > macro_protocol_id_show: > > endbr64 > > 1: call __fentry__ > > .section __mcount_loc, "a",@progbits > > .quad 1b > > .previous > > pushq %rbp # > > movq %rdx, %rbp # tmp96, buf > > pushq %rbx # > > # drivers/staging/greybus/gbphy.c:40: { > > movq %rdi, %rbx # tmp95, dev > > # drivers/staging/greybus/gbphy.c:43: return sprintf(buf, "%c macro 0x%02x\n", *buf, gbphy_dev->cport_desc->protocol_id); > > call __sanitizer_cov_trace_pc # > > # drivers/staging/greybus/gbphy.c:43: return sprintf(buf, "%c macro 0x%02x\n", *buf, gbphy_dev->cport_desc->protocol_id); > > movq -32(%rbx), %rax # MEM[(struct gbphy_device *)dev_7(D) + -40B].cport_desc, MEM[(struct gbphy_device *)dev_7(D) + -40B].cport_desc > > # drivers/staging/greybus/gbphy.c:43: return sprintf(buf, "%c macro 0x%02x\n", *buf, gbphy_dev->cport_desc->protocol_id); > > movzbl 0(%rbp), %edx # *buf_9(D), *buf_9(D) > > movq %rbp, %rdi # buf, > > movq $.LC18, %rsi #, > > movzbl 3(%rax), %ecx # _1->protocol_id, _1->protocol_id > > call sprintf # > > # drivers/staging/greybus/gbphy.c:44: } > > movl $13, %eax #, > > popq %rbx # > > popq %rbp # > > jmp __x86_return_thunk > > .size macro_protocol_id_show, .-macro_protocol_id_show > > > > Function version: > > > > .type extra_protocol_id_show, @function > > extra_protocol_id_show: > > endbr64 > > 1: call __fentry__ > > .section __mcount_loc, "a",@progbits > > .quad 1b > > .previous > > pushq %rbp # > > movq %rdx, %rbp # tmp96, buf > > pushq %rbx # > > # drivers/staging/greybus/gbphy.c:47: { > > movq %rdi, %rbx # tmp95, dev > > # drivers/staging/greybus/gbphy.c:50: return sprintf(buf, "extra 0x%02x %c\n", gbphy_dev->cport_desc->protocol_id, *buf); > > call __sanitizer_cov_trace_pc # > > # drivers/staging/greybus/gbphy.c:50: return sprintf(buf, "extra 0x%02x %c\n", gbphy_dev->cport_desc->protocol_id, *buf); > > movq -32(%rbx), %rax # MEM[(struct gbphy_device *)dev_8(D) + -40B].cport_desc, MEM[(struct gbphy_device *)dev_8(D) + -40B].cport_desc > > # drivers/staging/greybus/gbphy.c:50: return sprintf(buf, "extra 0x%02x %c\n", gbphy_dev->cport_desc->protocol_id, *buf); > > movzbl 0(%rbp), %ecx # *buf_9(D), *buf_9(D) > > movq %rbp, %rdi # buf, > > movq $.LC19, %rsi #, > > movzbl 3(%rax), %edx # _3->protocol_id, _3->protocol_id > > call sprintf # > > # drivers/staging/greybus/gbphy.c:51: } > > movl $13, %eax #, > > popq %rbx # > > popq %rbp # > > jmp __x86_return_thunk > > .size extra_protocol_id_show, .-extra_protocol_id_show > > > > Both seem to access the memory directly. Maybe the example is too simple, > > and the compiler is more likely to optimize aggressively? > > Nice, that shows that it is the same both ways as the compiler version > you are using is smart enough > > Which compiler and version is this? Does it work the same for all of > the supported versions we have to support (i.e. really old gcc?) gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 I got a similar result for gcc-5: macro_protocol_id_show: 1: call __fentry__ .section __mcount_loc, "a",@progbits .quad 1b .previous movq %rdx, %rax # buf, buf movq -32(%rdi), %rdx # MEM[(struct gbphy_device *)dev_1(D) + -40B].cport_desc, MEM[(struct gbphy_device *)dev_1(D) + -40B].cport_desc movq $.LC19, %rsi #, movq %rax, %rdi # buf, movzbl 3(%rdx), %ecx # _3->protocol_id, D.44996 movzbl (%rax), %edx # *buf_6(D), D.44996 call sprintf # cltq jmp __x86_return_thunk .size macro_protocol_id_show, .-macro_protocol_id_show .section .text.unlikely .LCOLDE20: .text .LHOTE20: .section .rodata.str1.1 .LC21: .string "extra 0x%02x %c\n" .section .text.unlikely .LCOLDB22: .text .LHOTB22: .p2align 6,,63 .globl extra_protocol_id_show .type extra_protocol_id_show, @function extra_protocol_id_show: 1: call __fentry__ .section __mcount_loc, "a",@progbits .quad 1b .previous movq %rdx, %rax # buf, buf movzbl (%rdx), %ecx # *buf_3(D), D.45003 movq -32(%rdi), %rdx # MEM[(struct gbphy_device *)dev_2(D) + -40B].cport_desc, MEM[(struct gbphy_device *)dev_2(D) + -40B].cport_desc movq $.LC21, %rsi #, movq %rax, %rdi # buf, movzbl 3(%rdx), %edx # _6->protocol_id, D.45003 call sprintf # cltq jmp __x86_return_thunk .size extra_protocol_id_show, .-extra_protocol_id_show .section .text.unlikely > > For the most part, sysfs files are not on any sort of "fast path" so a > function call is fine, but as I mentioned before, sometimes we are > forced to move calls to container_of() to container_of_const() and that > can not be an inline function, but must remain a macro :( It seems that this is because there is not a unique return type, but not a performance issue? julia