Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp236634iob; Wed, 18 May 2022 00:44:03 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyqlz9vVSnqk3E1LJ5Sot0B/tCxoVJYjN3DazieIuH2SahtDDRWlI0cZ5El1x53F5O5X/fg X-Received: by 2002:a17:902:c2c7:b0:159:9f9:85f3 with SMTP id c7-20020a170902c2c700b0015909f985f3mr25839327pla.18.1652859843294; Wed, 18 May 2022 00:44:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1652859843; cv=none; d=google.com; s=arc-20160816; b=FfIaUUx4JHw/v8QRUnJE8Jpa1DdFi3+6Dsl6iZG45idQfuib08JsNUrHC5mI+x/zw1 G4Hn2qaWcRCmHWKcjlXCpcONhhBPfoBXa7LnVtM+HU1h4lUgeR4JARqs6jPG7pdlRqhS ecwWxv8CdPOTfFCmDtxVwEzvNQpE35ht01X3bqPkWMv9JRLVRWFpBxC/ho++gYJ5UVgk cN7fUzud1bYraG/0lpvzHXIU1dhzbvg2fLFUYVIx+DJ0vgFmWfjW+1gJBgmgDe9gnx5+ +1yRZzgKLXldhbG/N3dSahtLGXJlV15dowb3qeQoe0jlC/OffcNRfMQhGzIzQHSg6BGH v8mg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=77+MIQsxJm14/HmUlAwrMcbVJVMpzlGQUGBhYiPRONM=; b=M3WgswsTSpLs9qrNFfmPHPnLWNv6D1ZTzYVul+Pfsog3fGBqqLP8/VEpIf8JPo4QYY ziKp96y3JP+GgR8N+70Z8Syj0LoA7szh5KkddaH/06qErQvQ1+I3h+NB0ppw+9BJ+5Ch Uf3RJlS13msImXI4KzcBPzlhWyPGxYIg4YDVVFR0jtv+En+BBfrELYFI9s9vbc8CCzJK HMRqvxXzhAlgMfDY3f5riGFRhBwX1QLYkfptSWOfmnF/KJwJXx4gfPJG0h4bCkFzI5vE LVGqWyhByUN929BUDX2Lxrnk0yyvXiNxluwTicW69ofHf5WT5h3deYcP564VYm31jOSt eSFg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b="QI4/JN6C"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id c25-20020aa781d9000000b0050dfc410ab6si2021602pfn.142.2022.05.18.00.44.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 May 2022 00:44:03 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b="QI4/JN6C"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id CCE3E119924; Wed, 18 May 2022 00:42:21 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232441AbiERHmM (ORCPT + 99 others); Wed, 18 May 2022 03:42:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232388AbiERHmE (ORCPT ); Wed, 18 May 2022 03:42:04 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C3B44119066 for ; Wed, 18 May 2022 00:42:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=77+MIQsxJm14/HmUlAwrMcbVJVMpzlGQUGBhYiPRONM=; b=QI4/JN6CEpVh71pz+R+HatJOlV 4R58hFbJ5B8WDY/eWaAMKua3I6Exfy79FNHNFhb0g8cdEbdSmoMI01rlPFyHLi7rueJsZbV8bZSWI Yeym6UDqqfoTOdGSNF74r7JMceQBlYmeUdEV+5aQ4RZwSLtpuLZwcdbF3PXbnUk6y1jALRtoFK//x bdo3ILMT1yXKr0bm3S0HiqIPEHmUdKA/0yVme441pioh2y9lqvor9uAd5hpY5AwZFDP8WwvVBG2pB ff2F9RkKQpa5kg49p71Znrr6PChm5iQzmxs4pdwqSG0L4kp9I6XB9YUYdXEpMDzA3P47n1CrA4Ko/ De46MzlQ==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=worktop.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1nrEJe-00BYyb-M7; Wed, 18 May 2022 07:41:54 +0000 Received: by worktop.programming.kicks-ass.net (Postfix, from userid 1000) id 62B4298119B; Wed, 18 May 2022 09:41:52 +0200 (CEST) Date: Wed, 18 May 2022 09:41:52 +0200 From: Peter Zijlstra To: Josh Poimboeuf Cc: Nathan Chancellor , Nick Desaulniers , llvm@lists.linux.dev, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com Subject: [PATCH] objtool: Fix symbol creation Message-ID: <20220518074152.GB10117@worktop.programming.kicks-ass.net> References: <20220516214005.GQ76023@worktop.programming.kicks-ass.net> <20220518012429.4zqzarvwsraxivux@treble> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220518012429.4zqzarvwsraxivux@treble> X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Subject: objtool: Fix symbol creation From: Peter Zijlstra Date: Tue, 17 May 2022 17:42:04 +0200 Nathan reported objtool failing with the following messages: warning: objtool: no non-local symbols !? warning: objtool: gelf_update_symshndx: invalid section index The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols") failing to consider the case where an object would have no non-local symbols. The problem that commit tries to address is adding a STB_LOCAL symbol to the symbol table in light of the ELF spec's requirement that: In each symbol table, all symbols with STB_LOCAL binding preced the weak and global symbols. As ``Sections'' above describes, a symbol table section's sh_info section header member holds the symbol table index for the first non-local symbol. The approach taken is to find this first non-local symbol, move that to the end and then re-use the freed spot to insert a new local symbol and increment sh_info. Except it never considered the case of object files without global symbols and got a whole bunch of details wrong -- so many in fact that it is a wonder it ever worked :/ Specifically: - It failed to re-hash the symbol on the new index, so a subsequent find_symbol_by_index() would not find it at the new location and a query for the old location would now return a non-deterministic choice between the old and new symbol. - It failed to appreciate that the GElf wrappers are not a valid disk format (it works because GElf is basically Elf64 and we only support x86_64 atm.) - It failed to fully appreciate how horrible the libelf API really is and got the gelf_update_symshndx() call pretty much completely wrong; with the direct consequence that if inserting a second STB_LOCAL symbol would require moving the same STB_GLOBAL symbol again it would completely come unstuck. Write a new elf_update_symbol() function that wraps all the magic required to update or create a new symbol at a given index. Specifically, gelf_update_sym*() require an @ndx argument that is relative to the @data argument; this means you have to manually iterate the section data descriptor list and update @ndx. Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols") Reported-by: Nathan Chancellor Signed-off-by: Peter Zijlstra (Intel) Tested-by: Nathan Chancellor --- tools/objtool/elf.c | 198 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 129 insertions(+), 69 deletions(-) --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *e struct list_head *entry; struct rb_node *pnode; + INIT_LIST_HEAD(&sym->pv_target); + sym->alias = sym; + sym->type = GELF_ST_TYPE(sym->sym.st_info); sym->bind = GELF_ST_BIND(sym->sym.st_info); @@ -438,8 +441,6 @@ static int read_symbols(struct elf *elf) return -1; } memset(sym, 0, sizeof(*sym)); - INIT_LIST_HEAD(&sym->pv_target); - sym->alias = sym; sym->idx = i; @@ -603,24 +604,21 @@ static void elf_dirty_reloc_sym(struct e } /* - * Move the first global symbol, as per sh_info, into a new, higher symbol - * index. This fees up the shndx for a new local symbol. + * The libelf API is terrible; gelf_update_sym*() takes a data block relative + * index value, *NOT* the symbol index. As such, iterate the data blocks and + * adjust index until it fits. + * + * If no data block is found, allow adding a new data block provided the index + * is only one past the end. */ -static int elf_move_global_symbol(struct elf *elf, struct section *symtab, - struct section *symtab_shndx) +static int elf_update_symbol(struct elf *elf, struct section *symtab, + struct section *symtab_shndx, struct symbol *sym) { - Elf_Data *data, *shndx_data = NULL; - Elf32_Word first_non_local; - struct symbol *sym; - Elf_Scn *s; - - first_non_local = symtab->sh.sh_info; - - sym = find_symbol_by_index(elf, first_non_local); - if (!sym) { - WARN("no non-local symbols !?"); - return first_non_local; - } + Elf_Data *symtab_data = NULL, *shndx_data = NULL; + Elf64_Xword entsize = symtab->sh.sh_entsize; + Elf32_Word shndx = sym->sec->idx; + Elf_Scn *s, *t = NULL; + int max_idx, idx = sym->idx; s = elf_getscn(elf->elf, symtab->idx); if (!s) { @@ -628,79 +626,124 @@ static int elf_move_global_symbol(struct return -1; } - data = elf_newdata(s); - if (!data) { - WARN_ELF("elf_newdata"); - return -1; + if (symtab_shndx) { + t = elf_getscn(elf->elf, symtab_shndx->idx); + if (!t) { + WARN_ELF("elf_getscn"); + return -1; + } } - data->d_buf = &sym->sym; - data->d_size = sizeof(sym->sym); - data->d_align = 1; - data->d_type = ELF_T_SYM; + for (;;) { + /* get next data descriptor for the relevant sections */ + symtab_data = elf_getdata(s, symtab_data); + if (t) + shndx_data = elf_getdata(t, shndx_data); + + /* end-of-list */ + if (!symtab_data) { + /* if @idx == 0, it's the next contiguous entry, create it */ + if (!idx) { + void *buf; + + symtab_data = elf_newdata(s); + if (t) + shndx_data = elf_newdata(t); + + buf = calloc(1, entsize); + if (!buf) { + WARN("malloc"); + return -1; + } + + symtab_data->d_buf = buf; + symtab_data->d_size = entsize; + symtab_data->d_align = 1; + symtab_data->d_type = ELF_T_SYM; + + symtab->sh.sh_size += entsize; + symtab->changed = true; + + if (t) { + shndx_data->d_buf = &sym->sec->idx; + shndx_data->d_size = sizeof(Elf32_Word); + shndx_data->d_align = sizeof(Elf32_Word); + shndx_data->d_type = ELF_T_WORD; + + symtab_shndx->sh.sh_size += sizeof(Elf32_Word); + symtab_shndx->changed = true; + } - sym->idx = symtab->sh.sh_size / sizeof(sym->sym); - elf_dirty_reloc_sym(elf, sym); + break; + } - symtab->sh.sh_info += 1; - symtab->sh.sh_size += data->d_size; - symtab->changed = true; + /* we don't do holes in symbol tables */ + WARN("index out of range"); + return -1; + } - if (symtab_shndx) { - s = elf_getscn(elf->elf, symtab_shndx->idx); - if (!s) { - WARN_ELF("elf_getscn"); + /* empty blocks should not happen */ + if (!symtab_data->d_size) { + WARN("zero size data"); return -1; } - shndx_data = elf_newdata(s); + /* is this the right block? */ + max_idx = symtab_data->d_size / entsize; + if (idx < max_idx) + break; + + /* adjust index and try again */ + idx -= max_idx; + } + + /* something went side-ways */ + if (idx < 0) { + WARN("negative index"); + return -1; + } + + /* setup extended section index magic and write the symbol */ + if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) { + sym->sym.st_shndx = shndx; + if (!shndx_data) + shndx = 0; + } else { + sym->sym.st_shndx = SHN_XINDEX; if (!shndx_data) { - WARN_ELF("elf_newshndx_data"); + WARN("no .symtab_shndx"); return -1; } + } - shndx_data->d_buf = &sym->sec->idx; - shndx_data->d_size = sizeof(Elf32_Word); - shndx_data->d_align = 4; - shndx_data->d_type = ELF_T_WORD; - - symtab_shndx->sh.sh_size += 4; - symtab_shndx->changed = true; + if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) { + WARN_ELF("gelf_update_symshndx"); + return -1; } - return first_non_local; + return 0; } static struct symbol * elf_create_section_symbol(struct elf *elf, struct section *sec) { struct section *symtab, *symtab_shndx; - Elf_Data *shndx_data = NULL; - struct symbol *sym; - Elf32_Word shndx; + Elf32_Word first_non_local, new_idx; + struct symbol *sym, *old; symtab = find_section_by_name(elf, ".symtab"); if (symtab) { symtab_shndx = find_section_by_name(elf, ".symtab_shndx"); - if (symtab_shndx) - shndx_data = symtab_shndx->data; } else { WARN("no .symtab"); return NULL; } - sym = malloc(sizeof(*sym)); + sym = calloc(1, sizeof(*sym)); if (!sym) { perror("malloc"); return NULL; } - memset(sym, 0, sizeof(*sym)); - - sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx); - if (sym->idx < 0) { - WARN("elf_move_global_symbol"); - return NULL; - } sym->name = sec->name; sym->sec = sec; @@ -710,24 +753,41 @@ elf_create_section_symbol(struct elf *el // st_other 0 // st_value 0 // st_size 0 - shndx = sec->idx; - if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) { - sym->sym.st_shndx = shndx; - if (!shndx_data) - shndx = 0; - } else { - sym->sym.st_shndx = SHN_XINDEX; - if (!shndx_data) { - WARN("no .symtab_shndx"); + + /* + * Move the first global symbol, as per sh_info, into a new, higher + * symbol index. This fees up a spot for a new local symbol. + */ + first_non_local = symtab->sh.sh_info; + new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize; + old = find_symbol_by_index(elf, first_non_local); + if (old) { + old->idx = new_idx; + + hlist_del(&old->hash); + elf_hash_add(symbol, &old->hash, old->idx); + + elf_dirty_reloc_sym(elf, old); + + if (elf_update_symbol(elf, symtab, symtab_shndx, old)) { + WARN("elf_update_symbol move"); return NULL; } + + new_idx = first_non_local; } - if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) { - WARN_ELF("gelf_update_symshndx"); + sym->idx = new_idx; + if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) { + WARN("elf_update_symbol"); return NULL; } + /* + * Either way, we added a LOCAL symbol. + */ + symtab->sh.sh_info += 1; + elf_add_symbol(elf, sym); return sym;