Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"
To: Heiko Carstens
Cc: Jessica Yu, linux-next@vger.kernel.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, Cathy Avery
References: <20190426130736.GB8646@osiris> <20190426150741.GD8646@osiris> <20190426160956.GA3827@linux-8ccs>
            <2e047a7e-bf08-be8c-bdd0-429464fa133d@redhat.com> <52c293e9-ddfa-426a-a8f1-2106e250e78d@redhat.com> <20190427102440.GA28889@osiris>
From: Prarit Bhargava
Message-ID: <3ceb784f-89ec-d208-a5b1-6306f0894217@redhat.com>
Date: Sat, 27 Apr 2019 06:35:06 -0400
In-Reply-To: <20190427102440.GA28889@osiris>

On 4/27/19 6:24 AM, Heiko Carstens wrote:
> On Fri, Apr 26, 2019 at 08:20:52PM -0400, Prarit Bhargava wrote:
>> Heiko and Jessica,
>>
>> The issue doesn't appear to be with my patch AFAICT. The s390_trng module
>> fails to load and then the kernel occasionally hangs (as Heiko mentioned)
>> calling synchronize_rcu().
>>
>> The call sequence is:
>>
>>   load_module()
>>     do_init_module()
>>       do_one_initcall(mod->init)
>>
>> which fails.
>>
>> The failure path in do_one_initcall() is entered and we start executing
>> code at kernel/module.c:3541:
>>
>> fail_free_freeinit:
>> 	kfree(freeinit);
>> fail:
>> 	/* Try to protect us from buggy refcounters. */
>> 	mod->state = MODULE_STATE_GOING;
>> 	synchronize_rcu();
>>
>> ^^^ the kernel hangs here. Sometimes the wait is very short and other
>> times it seems to hang indefinitely. I've left systems that appeared to
>> be hung and come back after 10 minutes to find that they somehow made it
>> through this call.
>>
>> Is there a known issue with RCU on s390 that is making this occur?
>
> No, there is no known issue with RCU on s390. The reason that
> synchronize_rcu() doesn't finish is that a different cpu is in an
> endless loop in add_unformed_module(), just like Jessica suspected.
>
> Note: the kernel is compiled with CONFIG_PREEMPT off - there is no
> kernel preemption that would ever make the looping cpu go through
> schedule() and subsequently let synchronize_rcu() finish.
>
> To confirm Jessica's theory - looking into the dump we have:
>
> crash> bt 742
> PID: 742   TASK: 1efa6c000   CPU: 7   COMMAND: "systemd-udevd"
>  #0 [3e0043aba30] __schedule at abb25e
>  #1 [3e0043abaa0] schedule at abb6a2
>  #2 [3e0043abac8] schedule_timeout at abf49a
>  #3 [3e0043abb60] wait_for_common at abc396
>  #4 [3e0043abbf0] __wait_rcu_gp at 1c0136
>  #5 [3e0043abc48] synchronize_rcu at 1c72ea
>  #6 [3e0043abc98] do_init_module at 1f10be
>  #7 [3e0043abcf0] load_module at 1f3594
>  #8 [3e0043abdd0] __se_sys_init_module at 1f3af0
>  #9 [3e0043abea8] system_call at ac0766
>
> which is the process waiting for synchronize_rcu() to finish. Wading
> through the stack frames gives me this struct module:
>
> struct module {
>   state = MODULE_STATE_GOING,
>   list = {
>     next = 0x3ff80394508,
>     prev = 0xe25090
>   },
>   name = "s390_trng\000...
>   ...
>
> Then we have the looping task/cpu:
>
> PID: 731   TASK: 1e79ba000   CPU: 7   COMMAND: "systemd-udevd"
>  LOWCORE INFO:
>  -psw      : 0x0704c00180000000 0x0000000000ab666a
>  -function : memcmp at ab666a
>  ...
>  -general registers:
>     0x0000000000000009 0x0000000000000009
>     0x000003ff80347321 000000000000000000
>     0x000003ff8034f321 000000000000000000
>     0x000000000000001e 0x000003ff8c592708
>     0x000003e0047da900 0x000003ff8034f318
>     0x0000000000000001 0x0000000000000009
>     0x000003ff80347300 0x0000000000ad81b8
>     0x00000000001ee062 0x000003e004357cb0
>  #0 [3e004357cf0] load_module at 1f1eb0
>  #1 [3e004357dd0] __se_sys_init_module at 1f3af0
>  #2 [3e004357ea8] system_call at ac0766
>
> which is find_module_all() calling memcmp() with this string:
>
> 3ff80347318: 733339305f74726e 6700000000000000   s390_trng.......
>
> So it all seems to fit. A simple cond_resched() call, which enforces an
> RCU quiescent state for the calling cpu, fixes the problem for me
> (patch on top of linux-next 20190424 -- c392798a85ab):
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 410eeb7e4f1d..48748cfec991 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3585,6 +3585,7 @@ again:
>  					finished_loading(mod->name));
>  		if (err)
>  			goto out_unlocked;
> +		cond_resched();
>  		goto again;
>  	}
>  	err = -EEXIST;

You just beat me to posting this :). I am testing the same patch in a
loop on a few systems here in the lab. Will let you know the results on
Monday.

P.
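
P.S.: For anyone following the thread, here is my rough paraphrase of
the add_unformed_module() loop the cond_resched() lands in. This is a
sketch from memory, not a copy of the source - check kernel/module.c in
linux-next for the real thing - but it shows why a second loader of a
module stuck in MODULE_STATE_GOING spins without ever hitting a
scheduling point:

static int add_unformed_module(struct module *mod)
{
	int err;
	struct module *old;

	mod->state = MODULE_STATE_UNFORMED;

again:
	mutex_lock(&module_mutex);
	old = find_module_all(mod->name, strlen(mod->name), true);
	if (old != NULL) {
		if (old->state != MODULE_STATE_LIVE) {
			/* Wait in case it fails to load. */
			mutex_unlock(&module_mutex);
			err = wait_event_interruptible(module_wq,
					finished_loading(mod->name));
			if (err)
				goto out_unlocked;
			/*
			 * finished_loading() counts MODULE_STATE_GOING
			 * as "finished", so for the dying s390_trng
			 * instance the wait returns immediately, we
			 * goto again, still find the old module, and
			 * spin. With CONFIG_PREEMPT=n nothing in this
			 * loop yields the cpu, so it never passes a
			 * quiescent state and the first loader's
			 * synchronize_rcu() cannot complete - hence
			 * Heiko's cond_resched() here.
			 */
			goto again;
		}
		err = -EEXIST;
		goto out;
	}
	/* ... mod_update_bounds(), list_add_rcu(&mod->list, &modules) ... */
	err = 0;
out:
	mutex_unlock(&module_mutex);
out_unlocked:
	return err;
}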