From: Michael Ellerman
To: "Eric W. Biederman", "Naveen N. Rao"
Cc: Baoquan He, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] kexec_file: Drop pr_err in weak implementations of arch_kexec_apply_relocations[_add]
In-Reply-To: <8735h8b2f1.fsf@email.froward.int.ebiederm.org>
References: <20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com> <1652782155.56t7mah8ib.naveen@linux.ibm.com> <8735h8b2f1.fsf@email.froward.int.ebiederm.org>
Date: Wed, 18 May 2022 12:26:15 +1000
Message-ID: <87v8u3o9tk.fsf@mpe.ellerman.id.au>

"Eric W. Biederman" writes:
> Looking at this the pr_err is absolutely needed.  If an unsupported case
> winds up in the purgatory blob and the code can't handle it things
> will fail silently much worse later.

It won't fail later, it will fail the syscall.

sys_kexec_file_load()
  kimage_file_alloc_init()
    kimage_file_prepare_segments()
      arch_kexec_kernel_image_load()
        kexec_image_load_default()
          image->fops->load()
            elf64_load()        # powerpc
            bzImage64_load()    # x86
              kexec_load_purgatory()
                kexec_apply_relocations()

Which does:

	if (relsec->sh_type == SHT_RELA)
		ret = arch_kexec_apply_relocations_add(pi, section,
						       relsec, symtab);
	else if (relsec->sh_type == SHT_REL)
		ret = arch_kexec_apply_relocations(pi, section,
						   relsec, symtab);
	if (ret)
		return ret;

And that error is bubbled all the way back up. So as long as
arch_kexec_apply_relocations() returns an error the syscall will fail
back to userspace and there'll be an error message at that level.

It's true that having nothing printed in dmesg makes it harder to work
out why the syscall failed. But it's a kernel bug if there are unhandled
relocations in the kernel-supplied purgatory code, so a user really has
no way to do anything about the error even if it is printed.

> "Naveen N. Rao" writes:
>
>> Baoquan He wrote:
>>> On 04/25/22 at 11:11pm, Naveen N. Rao wrote:
>>>> kexec_load_purgatory() can fail for many reasons - there is no need to
>>>> print an error when encountering unsupported relocations.
>>>> This solves a build issue on powerpc with binutils v2.36 and newer [1].
>>>> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>>>> symbols") [2], binutils started dropping section symbols that it thought
>>> I am not familiar with binutils, while wondering if this exists in other
>>> ARCHes except of ppc. Arm64 doesn't have the ARCH override either, do we
>>> have problem with it?
>>
>> I'm not aware of this specific file causing a problem on other architectures -
>> perhaps the config options differ enough. There are however more reports of
>> similar issues affecting other architectures with the llvm integrated assembler:
>> https://github.com/ClangBuiltLinux/linux/issues/981
>>
>>>
>>>> were unused. This isn't an issue in general, but with kexec_file.c, gcc
>>>> is placing kexec_arch_apply_relocations[_add] into a separate
>>>> .text.unlikely section and the section symbol ".text.unlikely" is being
>>>> dropped. Due to this, recordmcount is unable to find a non-weak symbol
>>> But arch_kexec_apply_relocations_add is weak symbol on ppc.
>>
>> Yes. Note that it is just the section symbol that gets dropped. The section is
>> still present and will continue to hold the symbols for the functions
>> themselves.
>
> So we have a case where binutils thinks it is doing something useful
> and our kernel specific tool gets tripped up by it.

It's not just binutils; the LLVM assembler has the same behavior.

> Reading the recordmcount code it looks like it is finding any symbol
> within a section but ignoring weak symbols.  So I suspect the only
> remaining symbol in the section is __weak and that confuses
> recordmcount.
>
> Does removing the __weak annotation on those functions fix the build
> error?  If so we can restructure the kexec code to simply not use __weak
> symbols.
>
> Otherwise the fix needs to be in recordmcount or binutils, and we should
> loop whoever maintains recordmcount in to see what they can do.

It seems that recordmcount is not really maintained anymore now that x86
uses objtool?

There've been several threads about fixing recordmcount, but none of
them seem to have led to a solution.

These weak symbol vs recordmcount problems have been worked around going
back as far as 2020:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/elfcore.h?id=6e7b64b9dd6d96537d816ea07ec26b7dedd397b9

cheers