Date: Thu, 9 May 2019 15:53:34 +0200
From: Jessica Yu
To: Prarit Bhargava
Cc: linux-kernel@vger.kernel.org, Heiko Carstens, David Arcari
Subject: Re: [PATCH v2] modules: Only return -EEXIST for modules that have finished loading
Message-ID: <20190509135333.GA9337@linux-8ccs>
References: <20190507145413.16297-1-prarit@redhat.com>
In-Reply-To: <20190507145413.16297-1-prarit@redhat.com>

+++ Prarit Bhargava [07/05/19 10:54 -0400]:
>Heiko, it would still be good to get a test of this patch from you. I
>tested this here at Red Hat on some System Z machines. Without the
>modification made here in v2, the systems failed to boot ~10% of the time.
>After the modification I do not see any boot failures. I also was
>able to reproduce the boot issue with the acpi_cpufreq driver on a very
>large & fast x86 system which had closer to 100% failure rate without
>the changes in v2. After the modification in v2 the system has rebooted
>all weekend without any issues.
>
>P.
>
>---8<---
>
>Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and
>linux guests boot with repeated errors:
>
>amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
>amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
>amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
>amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
>amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
>amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
>
>The warnings occur because the module code erroneously returns -EEXIST
>for modules that have failed to load and are in the process of being
>removed from the module list.
>
>module amd64_edac_mod has a dependency on module edac_mce_amd. Using
>modules.dep, systemd will load edac_mce_amd for every request of
>amd64_edac_mod. When the edac_mce_amd module loads, the module has
>state MODULE_STATE_UNFORMED and once the module load fails the state
>becomes MODULE_STATE_GOING. Another request for edac_mce_amd module
>executes and add_unformed_module() will erroneously return -EEXIST even
>though the previous instance of edac_mce_amd has MODULE_STATE_GOING.
>Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which
>fails because of unknown symbols from edac_mce_amd.
>
>add_unformed_module() must wait to return for any case other than
>MODULE_STATE_LIVE to prevent a race between multiple loads of
>dependent modules.
>
>v2: The initial (old->state != MODULE_STATE_LIVE) change exposed an
>additional issue in the code. wait_event_interruptible() puts each thread
>to sleep until a module finishes loading and executes the module_wq
>workqueue. The result is a long delay during the boot. Switching to
>wait_event_interruptible_timeout() resolves the sleep problem.
>
>Signed-off-by: Prarit Bhargava
>Cc: Jessica Yu
>Cc: Heiko Carstens
>Cc: David Arcari

Hi Prarit,

Thanks a lot for the revised patch. I'll queue this up right after the
merge window is over.

Thanks!

Jessica

>---
> kernel/module.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/kernel/module.c b/kernel/module.c
>index 1c429d8d2d74..6c868aabaf37 100644
>--- a/kernel/module.c
>+++ b/kernel/module.c
>@@ -3568,12 +3568,12 @@ static int add_unformed_module(struct module *mod)
> 	mutex_lock(&module_mutex);
> 	old = find_module_all(mod->name, strlen(mod->name), true);
> 	if (old != NULL) {
>-		if (old->state == MODULE_STATE_COMING
>-		    || old->state == MODULE_STATE_UNFORMED) {
>+		if (old->state != MODULE_STATE_LIVE) {
> 			/* Wait in case it fails to load. */
> 			mutex_unlock(&module_mutex);
>-			err = wait_event_interruptible(module_wq,
>-					       finished_loading(mod->name));
>+			err = wait_event_interruptible_timeout(module_wq,
>+					       finished_loading(mod->name),
>+					       HZ/1000);
> 			if (err)
> 				goto out_unlocked;
> 			goto again;
>--
>2.18.1
>
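
For reference, the effect of the changed state check in the hunk above
can be sketched in plain userspace C. This is only an illustrative model
under stated assumptions, not kernel code: the enum mirrors the kernel's
module states, while the function names waits_before_patch(),
waits_after_patch() and self_check() are hypothetical and exist only for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace copy of the kernel's module states, for illustration only. */
enum module_state {
	MODULE_STATE_LIVE,     /* Normal state. */
	MODULE_STATE_COMING,   /* Fully formed, running module_init. */
	MODULE_STATE_GOING,    /* Going away. */
	MODULE_STATE_UNFORMED, /* Still setting it up. */
};

/* Hypothetical helper: the check before the patch.
 * Only COMING and UNFORMED cause the caller to wait; GOING falls
 * through to the -EEXIST path. */
bool waits_before_patch(enum module_state s)
{
	return s == MODULE_STATE_COMING || s == MODULE_STATE_UNFORMED;
}

/* Hypothetical helper: the check after the patch.
 * Every state other than LIVE causes the caller to wait. */
bool waits_after_patch(enum module_state s)
{
	return s != MODULE_STATE_LIVE;
}

/* The two checks differ only for MODULE_STATE_GOING. */
void self_check(void)
{
	assert(waits_before_patch(MODULE_STATE_COMING));
	assert(waits_before_patch(MODULE_STATE_UNFORMED));
	assert(!waits_before_patch(MODULE_STATE_GOING));
	assert(!waits_before_patch(MODULE_STATE_LIVE));

	assert(waits_after_patch(MODULE_STATE_COMING));
	assert(waits_after_patch(MODULE_STATE_UNFORMED));
	assert(waits_after_patch(MODULE_STATE_GOING));
	assert(!waits_after_patch(MODULE_STATE_LIVE));
}
```

Calling self_check() from a small main() exercises all four states. The
only state whose result changes is MODULE_STATE_GOING, which is the case
the commit message describes: a failed edac_mce_amd load that is still
on the module list.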