Date: Sat, 22 Oct 2022 08:06:21 +0200
From: Gabriel Paubert
To: Linus Torvalds
Cc: Segher Boessenkool, "Jason A. Donenfeld", linux-kernel@vger.kernel.org,
    linux-kbuild@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-toolchains@vger.kernel.org, Masahiro Yamada, Kees Cook,
    Andrew Morton, Andy Shevchenko, Greg Kroah-Hartman
Subject: Re: [PATCH] kbuild: treat char as always signed

On Fri, Oct 21, 2022 at 03:46:01PM -0700, Linus Torvalds wrote:
> On Thu, Oct 20, 2022 at 3:41 AM Gabriel Paubert wrote:
> >
> > I must miss something, the strcmp man page says:
> >
> > "The comparison is done using unsigned characters."
>
> You're not missing anything, I just hadn't looked at strcmp() in forever.
>
> Yeah, strcmp clearly doesn't care about the signedness of 'char', and
> arguably an unsigned char argument makes more sense considering the
> semantics of the function.
>
> > But it's not for this that I wrote this message. Has anybody considered
> > using transparent unions?
>
> I don't love the transparent union-as-argument syntax, but you're
> right, that would fix the warning.

I'm not in love with the syntax either.

>
> Except it then doesn't actually *work* very well.
>
> Try this:
>
>     #include
>
>     #if USE_UNION
>     typedef union {
>         const char *a;
>         const signed char *b;
>         const unsigned char *c;
>     } conststring_arg __attribute__ ((__transparent_union__));
>     size_t strlen(conststring_arg);
>     #else
>     size_t strlen(const char *);
>     #endif
>
>     int test(char *a, unsigned char *b)
>     {
>         return strlen(a)+strlen(b);
>     }
>
>     int test2(void)
>     {
>         return strlen("hello");
>     }
>
> and now compile it both ways with
>
>     gcc -DUSE_UNION -Wall -O2 -S t.c
>     gcc -Wall -O2 -S t.c
>

Ok, I've just tried it, except that I had something slightly different
in mind, but perhaps should have been clearer in my first post.

I have changed your code to the following:

    #include

    #if USE_UNION
    typedef union {
        const char *a;
        const signed char *b;
        const unsigned char *c;
    } conststring_arg __attribute__ ((__transparent_union__));

    static inline size_t strlen(conststring_arg p)
    {
        return __builtin_strlen(p.a);
    }
    #else
    size_t strlen(const char *);
    #endif

    int test(char *a, unsigned char *b)
    {
        return strlen(a)+strlen(b);
    }

    int test2(void)
    {
        return strlen("hello");
    }

> and notice how yes, the "-DUSE_UNION" one silences the warning about
> using 'unsigned char *' for strlen. So it seems to work fine.
>
> But then look at the code it generates for 'test2()' in the two cases.

Now test2 looks properly optimized.

This is somewhat exploiting a compiler loophole: it calls an external
function which has been defined with the same name! Depending on how
you look at it, it's either disgusting or clever.

I don't have clang installed, so I don't know whether it would swallow
this code or react with a strong allergy.

	Gabriel

>
> The transparent union version actually generates a function call to an
> external 'strlen()' function.
>
> The regular version uses the compiler builtin, and just compiles
> test2() to return the constant value 5.
>
> So playing games with anonymous union arguments ends up also disabling
> all the compiler optimizations we do want, because apparently gcc then
> decides "ok, I'm not going to warn about you declaring this
> differently, but I'm also not going to use the regular one because you
> declared it differently".
>
> This, btw, is also the reason why we don't use --freestanding in the
> kernel. We do want the basic things to just DTRT.
>
> For the sockaddr_in games, the above isn't an issue. For strlen() and
> friends, it very much is.
>
>               Linus
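
For anyone who wants to try the wrapper idea without redefining the
library's strlen, here is a minimal standalone sketch. The names
string_arg, str_len, total_len and literal_len are hypothetical and do
not appear in the code above, and the <stddef.h> include is an
assumption made only to get size_t. It relies on the GCC
transparent_union attribute and on __builtin_strlen, so it is a GCC C
sketch, not portable C.

```c
#include <stddef.h>	/* assumption: only needed for size_t */

/* Hypothetical type name: accepts any of the three char pointer flavours. */
typedef union {
	const char *c;
	const signed char *sc;
	const unsigned char *uc;
} string_arg __attribute__ ((__transparent_union__));

/*
 * Hypothetical wrapper, named str_len so that it does not clash with the
 * library strlen. It forwards to the compiler builtin, so the compiler
 * still sees an ordinary strlen call on a const char pointer.
 */
static inline size_t str_len(string_arg s)
{
	return __builtin_strlen(s.c);
}

/* Mixed signedness: no cast needed at either call site. */
size_t total_len(char *a, unsigned char *b)
{
	return str_len(a) + str_len(b);
}

/* Literal argument: the builtin is free to evaluate this at compile time. */
size_t literal_len(void)
{
	return str_len("hello");
}
```

Compiling with "gcc -Wall -O2 -S" and reading the assembly for
literal_len() is the way to check whether the builtin was actually
used. Using a separate name sidesteps the same-name trick discussed
above, at the cost of not being a drop-in replacement for strlen().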