Date: Fri, 25 Mar 2022 10:12:38 -0500
From: Segher Boessenkool
To: Peter Zijlstra
Cc: Mark Rutland, Nick Desaulniers, Borislav Petkov, Nathan Chancellor, x86-ml, lkml, llvm@lists.linux.dev, Josh Poimboeuf, linux-toolchains@vger.kernel.org
Subject: Re: clang memcpy calls
Message-ID: <20220325151238.GB614@gate.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi!

On Fri, Mar 25, 2022 at 03:13:36PM +0100, Peter Zijlstra wrote:
>
> +linux-toolchains
>
> On Fri, Mar 25, 2022 at 12:15:28PM +0000, Mark Rutland wrote:
> > On Thu, Mar 24, 2022 at 11:43:46AM -0700, Nick Desaulniers wrote:
> > > On Thu, Mar 24, 2022 at 4:19 AM Borislav Petkov wrote:
> > > > The issue is that clang generates a memcpy() call when a struct copy
> > > > happens:
> > > >
> > > >	if (regs != eregs)
> > > >		*regs = *eregs;
> > >
> > > Specifically, this is copying one struct pt_regs to another. It looks
> > > like the sizeof struct pt_regs is just large enough to have clang emit
> > > the libcall.
> > > https://godbolt.org/z/scx6aa8jq
> > > Otherwise clang will also use rep; movsq; when -mno-sse -O2 is set and
> > > the structs are below ARBITRARY_THRESHOLD. Should ARBITRARY_THRESHOLD
> > > be raised so that we continue to inline the memcpy?

*shrug*

I won't speak for LLVM, of course... Everything I write here assumes
that LLVM copied the GCC requirement that memcpy is the standard
function, even when freestanding (and likewise memmove, memset, memcmp).

It is valid to replace any call to memcpy with some open-coded machine
code, or conversely, to insert calls to memcpy wherever its semantics
are wanted.

> > > As Mark said in the sibling reply; I don't know of general ways to
> > > inhibit libcall optimizations on the level you're looking for, short
> > > of heavy handy methods of disabling optimizations entirely.
> > > There's games that can be played with -fno-builtin-*, but they're not
> > > super portable, and I think there's a handful of *blessed* functions
> > > that must exist in any env, freestanding or not: memcpy, memmove,
> > > memset, and memcmp for which you cannot yet express "these do not
> > > exist."

The easy, fool-proof, and correct way to prevent a function from ending
in a sibling call is to simply not let it end in a call at all. The
best way I know to do that is to insert
	asm("");
right before the end of the function.

> > a) The compiler expects the out-of-line implementations of functions
> >    ARE NOT instrumented by address-sanitizer.
> >
> >    If this is the case, then it's legitimate for the compiler to call
> >    these functions anywhere, and we should NOT instrument the kernel
> >    implementations of these. If the compiler wants those instrumented it
> >    needs to add the instrumentation in the caller.

The compiler isn't assuming anything about asan. The compiler generates
its code without any consideration of what asan will or will not do.
The burden of making things work is on asan.

It is legitimate to call (or not call!) memcpy anywhere. memcpy is
always __builtin_memcpy, which may or may not result in a function call.

> > AFAICT The two options for the compiler here are:
> >
> > 1) Always inline an uninstrumented form of the function in this case
> >
> > 2) Have distinct instrumented/uninstrumented out-of-line
> >    implementations, and call the uninstrumented form in this case.

The compiler should not do anything differently here if it uses asan.
The address sanitizer and the memcpy function implementation perhaps
have to cooperate somehow, or asan needs more smarts. This needs to
happen no matter what, to support other things calling memcpy, say,
assembler code.

> > So from those examples it seems GCC falls into bucket (a), and assumes the
> > blessed functions ARE NOT instrumented.

No, it doesn't show GCC assumes anything.
No testing of this kind can show anything of the sort.

> > I think something has to change on the compiler side here (e.g. as per
> > options above), and we should align GCC and clang on the same
> > approach...

GCC *requires* memcpy to be the standard memcpy always (i.e. to have the
standard-specified semantics). This means that it will always have the
same semantics as __builtin_memcpy, and that it may or may not end up as
a call to an external function. GCC can also create calls to it out of
thin air.

All of this has been true for thirty years, and it won't change today
either.


Segher
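
For illustration only, here is a minimal sketch in C of the two points made in the mail: a struct assignment has the same semantics as a memcpy, and an empty asm placed right before the end of a function keeps the function from ending in a call. The names struct regs_like, copy_regs, copy_regs_memcpy and check_copies are hypothetical and not taken from the kernel, the 21-word layout is just a stand-in for a structure about the size of struct pt_regs, and a GNU-compatible compiler (GCC or clang) is assumed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a large structure such as struct pt_regs. */
struct regs_like {
	unsigned long r[21];
};

/*
 * Struct assignment: the compiler may expand this inline or emit a call
 * to memcpy; both are valid.  The empty asm after the copy follows the
 * suggestion in the mail: the function then does not end in a call.
 * __asm__ is the GNU C spelling of asm that is also accepted in strict
 * ISO modes.
 */
void copy_regs(struct regs_like *regs, const struct regs_like *eregs)
{
	if (regs != eregs)
		*regs = *eregs;
	__asm__("");
}

/* The same copy written as an explicit memcpy call. */
void copy_regs_memcpy(struct regs_like *regs, const struct regs_like *eregs)
{
	if (regs != eregs)
		memcpy(regs, eregs, sizeof(*regs));
}

/* Returns 0 when both variants leave identical bytes in the destination. */
int check_copies(void)
{
	struct regs_like src, a, b;
	size_t i;

	for (i = 0; i < sizeof(src.r) / sizeof(src.r[0]); i++)
		src.r[i] = i * 3 + 1;
	memset(&a, 0, sizeof(a));
	memset(&b, 0xff, sizeof(b));

	copy_regs(&a, &src);
	copy_regs_memcpy(&b, &src);
	assert(memcmp(&a, &src, sizeof(a)) == 0);
	return memcmp(&a, &b, sizeof(a));
}
```

Whether the assignment in copy_regs becomes inline code or a libcall depends on the compiler, the target and the options used; comparing the generated assembly (for example with -O2 -S) with and without the asm line is the way to see what a particular toolchain does.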