Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp1269738pxk; Fri, 25 Sep 2020 10:17:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwRnpUHA7TSfjCnjn3XaWu1Z0MVb1/UK5ughgvmMXCI5e5WpSb2mWBee5UzOAgjv3JBlptt X-Received: by 2002:a05:6402:1805:: with SMTP id g5mr2351098edy.135.1601054260646; Fri, 25 Sep 2020 10:17:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1601054260; cv=none; d=google.com; s=arc-20160816; b=0IgztsqCyvLZ7A7fu/wosWwHDofj3/gEcAJKWxPgOOCfOBy8kPUiE8kLdMjM57x7Pk uJTZFaqsBFqBkCl/YpP4a6H5p33X8ITqJIMW2+olg3gbUZAkCJ97sEHOO2i5fuwgcnn6 3Javay6PBJQGPiiZk8RnwvTA6O3ZmR71neHanoIsUmnjJRKkR/BkN67dtxLM5MztpdQn yAxs5rBP32Upmidf5kCdOaQ4KpujJ6iYpnSh9+RNo3Qk7j13Nc0GpNpxLU2JXy7Ua/K/ ECycc1p5QNtzEuukjqzmFJ0izGKwa6vahkwSyIuI3VGcYI/WUaz42JT5lKJLVr5my78P jpng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=u3BNj/cMQkGxC9OMOhHTfZe4RQ+Q6ui6faxL839ANp8=; b=Q84qtj8d9JaMy9EkqAv2CHyBI4jM/XmJBa8IdbZ2wahAe4tjvuuqECfIjv6X2CiZxy SZl2M1o3ApLCgos3BdCiDe46nPlN4qfi9MnrFUAXeiuvfQwylm9AOfu3IG3Bx/BuJVBf UDW+bVRsUvdkvbMwioUZv25kxEo0A6sMJIW5nvqx1I1lU2AzdvSv9vHW0jzIozqaSpoV 02bXPrZWJ55ccvUEbHJYn00kglExTwoEwITGnBxU7FhfZwQQc3OBRa2JkXCFjtB2OK4O na0CqcwTuIsH4iA8BHJ85GLoyY5v0ng6LdWkl6+UJR0J0p/geVT4i/4kEEZMe+vbK5Jh NgmA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=PNyjcsNP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dv18si2216306ejb.135.2020.09.25.10.17.17; Fri, 25 Sep 2020 10:17:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=PNyjcsNP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729535AbgIYRNd (ORCPT + 99 others); Fri, 25 Sep 2020 13:13:33 -0400 Received: from mail.kernel.org ([198.145.29.99]:33366 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727749AbgIYRNc (ORCPT ); Fri, 25 Sep 2020 13:13:32 -0400 Received: from paulmck-ThinkPad-P72.home (unknown [50.45.173.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id CE36E20936; Fri, 25 Sep 2020 17:13:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601054011; bh=ti22SxcurzOVwjbQ0OSmV+nIlJOiyn0KdaZY90nJPlU=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=PNyjcsNPyihicpZ9GUOulMgzb6BJEnIW3Tu3F6s1CknkkVwbVCoYrGZ7UVh8kvjZq pHJaij6QK10Yq5ZkH2QsTw6Y/mvKcDzxD8JkDNDg7t+3oWNOz1r7L8bQibhqUXvR8M KYtvF7f2rohEMaxOmXwoFTqh49vkk3mDu+MLpFIg= Received: by paulmck-ThinkPad-P72.home (Postfix, from userid 1000) id 84A0435207C1; Fri, 25 Sep 2020 10:13:31 -0700 (PDT) Date: Fri, 25 Sep 2020 10:13:31 -0700 From: "Paul E. McKenney" To: Hou Tao Cc: Davidlohr Bueso , Josh Triplett , linux-kernel@vger.kernel.org, rcu@vger.kernel.org Subject: Re: [PATCH v2 2/2] locktorture: call percpu_free_rwsem() to do percpu-rwsem cleanup Message-ID: <20200925171331.GF29330@paulmck-ThinkPad-P72> Reply-To: paulmck@kernel.org References: <20200923035137.GN29330@paulmck-ThinkPad-P72> <20200924141854.134835-1-houtao1@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200924141854.134835-1-houtao1@huawei.com> User-Agent: Mutt/1.9.4 (2018-02-28) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 24, 2020 at 10:18:54PM +0800, Hou Tao wrote: > When do percpu-rwsem writer lock torture, the RCU callback > rcu_sync_func() may still be pending after locktorture module > is removed, and it will lead to the following Oops: > > BUG: unable to handle page fault for address: ffffffffc00eb920 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 6500a067 P4D 6500a067 PUD 6500c067 PMD 13a36c067 PTE 800000013691c163 > Oops: 0000 [#1] PREEMPT SMP > CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #4 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) > RIP: 0010:rcu_cblist_dequeue+0x12/0x30 > Call Trace: > > rcu_core+0x1b1/0x860 > __do_softirq+0xfe/0x326 > asm_call_on_stack+0x12/0x20 > > do_softirq_own_stack+0x5f/0x80 > irq_exit_rcu+0xaf/0xc0 > sysvec_apic_timer_interrupt+0x2e/0xb0 > asm_sysvec_apic_timer_interrupt+0x12/0x20 > > Fix it by adding an exit hook in lock_torture_ops and > use it to call percpu_free_rwsem() for percpu rwsem torture > before the module is removed, so we can ensure rcu_sync_func() > completes before module exits. > > Also needs to call exit hook if lock_torture_init() fails half-way, > so add init_called field in lock_torture_cxt to indicate that > init hook has been called. > > Signed-off-by: Hou Tao Nice, thank you! I have queued this for v5.11. As always, I could not resist reworking the commit log as below, and as always, please check to make sure that I didn't mess anything up. Thanx, Paul ------------------------------------------------------------------------ commit f794b408207152995aefdf41ee9a3f5c974f622a Author: Hou Tao Date: Thu Sep 24 22:18:54 2020 +0800 locktorture: Invoke percpu_free_rwsem() to do percpu-rwsem cleanup When executing the LOCK06 locktorture scenario featuring percpu-rwsem, the RCU callback rcu_sync_func() may still be pending after locktorture module is removed. This can in turn lead to the following Oops: BUG: unable to handle page fault for address: ffffffffc00eb920 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 6500a067 P4D 6500a067 PUD 6500c067 PMD 13a36c067 PTE 800000013691c163 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:rcu_cblist_dequeue+0x12/0x30 Call Trace: rcu_core+0x1b1/0x860 __do_softirq+0xfe/0x326 asm_call_on_stack+0x12/0x20 do_softirq_own_stack+0x5f/0x80 irq_exit_rcu+0xaf/0xc0 sysvec_apic_timer_interrupt+0x2e/0xb0 asm_sysvec_apic_timer_interrupt+0x12/0x20 This commit avoids tis problem by adding an exit hook in lock_torture_ops and using it to call percpu_free_rwsem() for percpu rwsem torture during the module-cleanup function, thus ensuring that rcu_sync_func() completes before module exits. It is also necessary to call the exit hook if lock_torture_init() fails half-way, so this commit also adds an ->init_called field in lock_torture_cxt to indicate that exit hook, if present, must be called. Signed-off-by: Hou Tao Signed-off-by: Paul E. McKenney diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 79fbd97..fd838ce 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -76,6 +76,7 @@ static void lock_torture_cleanup(void); */ struct lock_torture_ops { void (*init)(void); + void (*exit)(void); int (*writelock)(void); void (*write_delay)(struct torture_random_state *trsp); void (*task_boost)(struct torture_random_state *trsp); @@ -92,12 +93,13 @@ struct lock_torture_cxt { int nrealwriters_stress; int nrealreaders_stress; bool debug_lock; + bool init_called; atomic_t n_lock_torture_errors; struct lock_torture_ops *cur_ops; struct lock_stress_stats *lwsa; /* writer statistics */ struct lock_stress_stats *lrsa; /* reader statistics */ }; -static struct lock_torture_cxt cxt = { 0, 0, false, +static struct lock_torture_cxt cxt = { 0, 0, false, false, ATOMIC_INIT(0), NULL, NULL}; /* @@ -573,6 +575,11 @@ static void torture_percpu_rwsem_init(void) BUG_ON(percpu_init_rwsem(&pcpu_rwsem)); } +static void torture_percpu_rwsem_exit(void) +{ + percpu_free_rwsem(&pcpu_rwsem); +} + static int torture_percpu_rwsem_down_write(void) __acquires(pcpu_rwsem) { percpu_down_write(&pcpu_rwsem); @@ -597,6 +604,7 @@ static void torture_percpu_rwsem_up_read(void) __releases(pcpu_rwsem) static struct lock_torture_ops percpu_rwsem_lock_ops = { .init = torture_percpu_rwsem_init, + .exit = torture_percpu_rwsem_exit, .writelock = torture_percpu_rwsem_down_write, .write_delay = torture_rwsem_write_delay, .task_boost = torture_boost_dummy, @@ -789,9 +797,10 @@ static void lock_torture_cleanup(void) /* * Indicates early cleanup, meaning that the test has not run, - * such as when passing bogus args when loading the module. As - * such, only perform the underlying torture-specific cleanups, - * and avoid anything related to locktorture. + * such as when passing bogus args when loading the module. + * However cxt->cur_ops.init() may have been invoked, so beside + * perform the underlying torture-specific cleanups, cur_ops.exit() + * will be invoked if needed. */ if (!cxt.lwsa && !cxt.lrsa) goto end; @@ -831,6 +840,11 @@ static void lock_torture_cleanup(void) cxt.lrsa = NULL; end: + if (cxt.init_called) { + if (cxt.cur_ops->exit) + cxt.cur_ops->exit(); + cxt.init_called = false; + } torture_cleanup_end(); } @@ -878,8 +892,10 @@ static int __init lock_torture_init(void) goto unwind; } - if (cxt.cur_ops->init) + if (cxt.cur_ops->init) { cxt.cur_ops->init(); + cxt.init_called = true; + } if (nwriters_stress >= 0) cxt.nrealwriters_stress = nwriters_stress;