Date: Mon, 11 Dec 2023 17:51:06 -0800
From: Charlie Jenkins
To: Maxim Kochetkov
Cc: linux-riscv@lists.infradead.org, bigunclemax@gmail.com, Amma Lee, Paul Walmsley, Palmer Dabbelt, Albert Ou, Conor Dooley, Andrew Jones, Jisheng Zhang, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/1] riscv: optimize ELF relocation function in riscv
References: <20230913130501.287250-1-fido_max@inbox.ru>

On Thu, Dec 07, 2023 at 05:02:16PM -0800, Charlie Jenkins wrote:
> On Wed, Sep 13, 2023 at 04:05:00PM +0300, Maxim Kochetkov wrote:
> > The patch can optimize the running times of insmod command by modify ELF
> > relocation function.
> > In the 5.10 and latest kernel, when install the riscv ELF drivers which
> > contains multiple symbol table items to be relocated, kernel takes a lot
> > of time to execute the relocation. For example, we install a 3+MB driver
> > need 180+s.
> > We focus on the riscv architecture handle R_RISCV_HI20 and R_RISCV_LO20
> > type items relocation function in the arch\riscv\kernel\module.c and
> > find that there are two-loops in the function. If we modify the begin
> > number in the second for-loops iteration, we could save significant time
> > for installation. We install the same 3+MB driver could just need 2s.
> >
> > Signed-off-by: Amma Lee
> > Signed-off-by: Maxim Kochetkov
> > ---
> > Changes in v4:
> > - use 'while' loop instead of 'for' loop to avoid code duplicate
> > ---
> >  arch/riscv/kernel/module.c | 20 ++++++++++++++++----
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > index 7c651d55fcbd..8c9b644ebfdb 100644
> > --- a/arch/riscv/kernel/module.c
> > +++ b/arch/riscv/kernel/module.c
> > @@ -346,6 +346,7 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >  	Elf_Sym *sym;
> >  	u32 *location;
> >  	unsigned int i, type;
> > +	unsigned int j_idx = 0;
> >  	Elf_Addr v;
> >  	int res;
> >
> > @@ -384,9 +385,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >  		v = sym->st_value + rel[i].r_addend;
> >
> >  		if (type == R_RISCV_PCREL_LO12_I || type == R_RISCV_PCREL_LO12_S) {
> > -			unsigned int j;
> > +			unsigned int j = j_idx;
> > +			bool found = false;
> >
> > -			for (j = 0; j < sechdrs[relsec].sh_size / sizeof(*rel); j++) {
> > +			do {
> >  				unsigned long hi20_loc =
> >  					sechdrs[sechdrs[relsec].sh_info].sh_addr
> >  					+ rel[j].r_offset;
> > @@ -415,16 +417,26 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> >  					hi20 = (offset + 0x800) & 0xfffff000;
> >  					lo12 = offset - hi20;
> >  					v = lo12;
> > +					found = true;
> >
> >  					break;
> >  				}
> > -			}
> > -			if (j == sechdrs[relsec].sh_size / sizeof(*rel)) {
> > +
> > +				j++;
> > +				if (j > sechdrs[relsec].sh_size / sizeof(*rel))
> > +					j = 0;
> Very interesting algorithm here. Assuming the hi relocation is after the
> previous one seems to be a good heuristic. However I think we can do
> better. In GNU ld, a hashmap of all of the hi relocations is stored and
> a list of all of the lo relocations. After all of the other relocations
> have been parsed, it iterates through all of the lo relocations and
> looks up the associated hi relocation in the hashmap.
>
> There is more memory overhead here but I suspect it will be faster. I
> had started to mock up a hashmap implementation to see if it was faster
> but decided I should mention it here first in case somebody had some
> additional insight.

Turns out this is a fantastic heuristic. Using a hashmap is
significantly faster than the default implementation, but the algorithm
above is significantly faster than the hashmap.

Using the amdgpu driver (which is actually a collection of drivers and
is about 469M in size), I found that the hashmap implementation is about
30% faster than the current implementation, but this patch is 50% faster
than the current implementation. It is probably possible to write an ELF
header with the relocations sufficiently scrambled to make the hashmap
faster, but I suspect that for all "normal" programs this algorithm is
faster. I also tried a couple of other smaller modules and it was faster
than or around the same as the hashmap in all of them.

A lot of code has changed in this file since this patch was submitted;
can you rebase onto 6.7-rc1? Otherwise this patch is great.

Reviewed-by: Charlie Jenkins

>
> - Charlie
>
> > +
> > +			} while (j_idx != j);
> > +
> > +			if (!found) {
> >  				pr_err(
> >  				  "%s: Can not find HI20 relocation information\n",
> >  				  me->name);
> >  				return -EINVAL;
> >  			}
> > +
> > +			/* Record the previous j-loop end index */
> > +			j_idx = j;
> >  		}
> >
> >  		res = handler(me, location, v);
> > --
> > 2.40.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
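
For reference, the heuristic discussed above (resume the HI20 search at
the index of the previous match and wrap around, rather than restarting
at 0 for every LO12 relocation) can be sketched in standalone C. This is
only an illustration under simplifying assumptions, not the kernel code:
struct reloc, find_hi20() and the hint parameter are hypothetical names,
and a plain offset comparison stands in for the real hi20_loc and
relocation-type checks in apply_relocate_add().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one relocation entry. */
struct reloc {
	unsigned long offset;	/* location the relocation applies to */
	bool is_hi20;		/* true for a PCREL_HI20-style entry */
};

/*
 * Find the HI20 entry whose offset equals target, scanning circularly
 * starting at *hint. On success, store the matching index in *hint and
 * return it; return -1 if no entry matches.
 */
static long find_hi20(const struct reloc *rel, size_t n,
		      unsigned long target, size_t *hint)
{
	size_t j, steps;

	if (n == 0)
		return -1;

	j = *hint % n;
	for (steps = 0; steps < n; steps++) {
		if (rel[j].is_hi20 && rel[j].offset == target) {
			*hint = j;	/* next search resumes here */
			return (long)j;
		}
		if (++j == n)		/* wrap before indexing past the end */
			j = 0;
	}
	return -1;
}
```

When the matching HI20 entries appear in roughly the same order as the
LO12 entries that refer to them, each lookup ends after a few steps; in
the worst case it still visits every entry exactly once. The sketch wraps
as soon as j reaches n, so rel[j] is never read out of bounds.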