From: Sasha Levin
To: "stable@vger.kernel.org", "linux-kernel@vger.kernel.org"
CC: Roy Shterman, Sagi Grimberg, Christoph Hellwig, Sasha Levin
Subject: [PATCH AUTOSEL for 4.15 004/189] nvme: host delete_work and reset_work on separate workqueues
Date: Mon, 9 Apr 2018 00:16:46 +0000
Message-ID: <20180409001637.162453-4-alexander.levin@microsoft.com>
References: <20180409001637.162453-1-alexander.levin@microsoft.com>
In-Reply-To: <20180409001637.162453-1-alexander.levin@microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Roy Shterman

[ Upstream commit b227c59b9b5b8ae52639c8980af853d2f654f90a ]

We need to ensure that delete_work will be hosted on a different
workqueue than all the works we flush or cancel from it.
Otherwise we may hit a circular dependency warning [1].

Also, given that delete_work flushes reset_work, host reset_work
on nvme_reset_wq and delete_work on nvme_delete_wq. In addition,
fix the flushing in the individual drivers to flush nvme_delete_wq
when draining queued deletes.

[1]:
[ 178.491942] =============================================
[ 178.492718] [ INFO: possible recursive locking detected ]
[ 178.493495] 4.9.0-rc4-c844263313a8-lb #3 Tainted: G OE
[ 178.494382] ---------------------------------------------
[ 178.495160] kworker/5:1/135 is trying to acquire lock:
[ 178.495894] (
[ 178.496120] "nvme-wq"
[ 178.496471] ){++++.+}
[ 178.496599] , at:
[ 178.496921] [] flush_work+0x1a6/0x2d0
[ 178.497670] but task is already holding lock:
[ 178.498499] (
[ 178.498724] "nvme-wq"
[ 178.499074] ){++++.+}
[ 178.499202] , at:
[ 178.499520] [] process_one_work+0x162/0x6a0
[ 178.500343] other info that might help us debug this:
[ 178.501269] Possible unsafe locking scenario:
[ 178.502113] CPU0
[ 178.502472] ----
[ 178.502829] lock(
[ 178.503115] "nvme-wq"
[ 178.503467] );
[ 178.503716] lock(
[ 178.504001] "nvme-wq"
[ 178.504353] );
[ 178.504601] *** DEADLOCK ***
[ 178.505441] May be due to missing lock nesting notation
[ 178.506453] 2 locks held by kworker/5:1/135:
[ 178.507068] #0:
[ 178.507330] (
[ 178.507598] "nvme-wq"
[ 178.507726] ){++++.+}
[ 178.508079] , at:
[ 178.508173] [] process_one_work+0x162/0x6a0
[ 178.509004] #1:
[ 178.509265] (
[ 178.509532] (&ctrl->delete_work)
[ 178.509795] ){+.+.+.}
[ 178.510145] , at:
[ 178.510239] [] process_one_work+0x162/0x6a0
[ 178.511070] stack backtrace: :
[ 178.511693] CPU: 5 PID: 135 Comm: kworker/5:1 Tainted: G OE 4.9.0-rc4-c844263313a8-lb #3
[ 178.512974] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
[ 178.514247] Workqueue: nvme-wq nvme_del_ctrl_work [nvme_tcp]
[ 178.515071] ffffc2668175bae0 ffffffffa7450823 ffffffffa88abd80 ffffffffa88abd80
[ 178.516195] ffffc2668175bb98 ffffffffa70eb012 ffffffffa8d8d90d ffff9c472e9ea700
[ 178.517318] ffff9c472e9ea700 ffff9c4700000000 ffff9c4700007200 ab83be61bec0d50e
[ 178.518443] Call Trace:
[ 178.518807] [] dump_stack+0x85/0xc2
[ 178.519542] [] __lock_acquire+0x17d2/0x18f0
[ 178.520377] [] ? serial8250_console_putchar+0x27/0x30
[ 178.521330] [] ? wait_for_xmitr+0xa0/0xa0
[ 178.522174] [] ? flush_work+0x18b/0x2d0
[ 178.522975] [] lock_acquire+0x11b/0x220
[ 178.523753] [] ? flush_work+0x1a6/0x2d0
[ 178.524535] [] flush_work+0x1c9/0x2d0
[ 178.525291] [] ? flush_work+0x1a6/0x2d0
[ 178.526077] [] ? flush_workqueue_prep_pwqs+0x220/0x220
[ 178.527040] [] __cancel_work_timer+0x10f/0x1d0
[ 178.527907] [] ? vprintk_default+0x29/0x40
[ 178.528726] [] ? printk+0x48/0x50
[ 178.529434] [] cancel_delayed_work_sync+0x13/0x20
[ 178.530381] [] nvme_stop_ctrl+0x5b/0x70 [nvme_core]
[ 178.531314] [] nvme_del_ctrl_work+0x2c/0x50 [nvme_tcp]
[ 178.532271] [] process_one_work+0x1e1/0x6a0
[ 178.533101] [] ? process_one_work+0x162/0x6a0
[ 178.533954] [] worker_thread+0x4e/0x490
[ 178.534735] [] ? process_one_work+0x6a0/0x6a0
[ 178.535588] [] ? process_one_work+0x6a0/0x6a0
[ 178.536441] [] kthread+0xff/0x120
[ 178.537149] [] ? kthread_park+0x60/0x60
[ 178.538094] [] ? kthread_park+0x60/0x60
[ 178.538900] [] ret_from_fork+0x2a/0x40

Signed-off-by: Roy Shterman
Signed-off-by: Sagi Grimberg
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 drivers/nvme/host/core.c   | 44 +++++++++++++++++++++++++++++++++++++++-----
 drivers/nvme/host/nvme.h   |  2 ++
 drivers/nvme/host/rdma.c   |  2 +-
 drivers/nvme/target/loop.c |  2 +-
 4 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 935593032123..93a4fa053e7f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -65,9 +65,26 @@ static bool streams;
 module_param(streams, bool, 0644);
 MODULE_PARM_DESC(streams, "turn on support for Streams write directives");
 
+/*
+ * nvme_wq - hosts nvme related works that are not reset or delete
+ * nvme_reset_wq - hosts nvme reset works
+ * nvme_delete_wq - hosts nvme delete works
+ *
+ * nvme_wq will host works such are scan, aen handling, fw activation,
+ * keep-alive error recovery, periodic reconnects etc. nvme_reset_wq
+ * runs reset works which also flush works hosted on nvme_wq for
+ * serialization purposes. nvme_delete_wq host controller deletion
+ * works which flush reset works for serialization.
+ */
 struct workqueue_struct *nvme_wq;
 EXPORT_SYMBOL_GPL(nvme_wq);
 
+struct workqueue_struct *nvme_reset_wq;
+EXPORT_SYMBOL_GPL(nvme_reset_wq);
+
+struct workqueue_struct *nvme_delete_wq;
+EXPORT_SYMBOL_GPL(nvme_delete_wq);
+
 static DEFINE_IDA(nvme_subsystems_ida);
 static LIST_HEAD(nvme_subsystems);
 static DEFINE_MUTEX(nvme_subsystems_lock);
@@ -89,7 +106,7 @@ int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
 {
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
 		return -EBUSY;
-	if (!queue_work(nvme_wq, &ctrl->reset_work))
+	if (!queue_work(nvme_reset_wq, &ctrl->reset_work))
 		return -EBUSY;
 	return 0;
 }
@@ -122,7 +139,7 @@ int nvme_delete_ctrl(struct nvme_ctrl *ctrl)
 {
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
 		return -EBUSY;
-	if (!queue_work(nvme_wq, &ctrl->delete_work))
+	if (!queue_work(nvme_delete_wq, &ctrl->delete_work))
 		return -EBUSY;
 	return 0;
 }
@@ -3491,16 +3508,26 @@ EXPORT_SYMBOL_GPL(nvme_reinit_tagset);
 
 int __init nvme_core_init(void)
 {
-	int result;
+	int result = -ENOMEM;
 
 	nvme_wq = alloc_workqueue("nvme-wq",
 			WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
 	if (!nvme_wq)
-		return -ENOMEM;
+		goto out;
+
+	nvme_reset_wq = alloc_workqueue("nvme-reset-wq",
+			WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
+	if (!nvme_reset_wq)
+		goto destroy_wq;
+
+	nvme_delete_wq = alloc_workqueue("nvme-delete-wq",
+			WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
+	if (!nvme_delete_wq)
+		goto destroy_reset_wq;
 
 	result = alloc_chrdev_region(&nvme_chr_devt, 0, NVME_MINORS, "nvme");
 	if (result < 0)
-		goto destroy_wq;
+		goto destroy_delete_wq;
 
 	nvme_class = class_create(THIS_MODULE, "nvme");
 	if (IS_ERR(nvme_class)) {
@@ -3519,8 +3546,13 @@ destroy_class:
 	class_destroy(nvme_class);
 unregister_chrdev:
 	unregister_chrdev_region(nvme_chr_devt, NVME_MINORS);
+destroy_delete_wq:
+	destroy_workqueue(nvme_delete_wq);
+destroy_reset_wq:
+	destroy_workqueue(nvme_reset_wq);
 destroy_wq:
 	destroy_workqueue(nvme_wq);
+out:
 	return result;
 }
 
@@ -3530,6 +3562,8 @@ void nvme_core_exit(void)
 	class_destroy(nvme_subsys_class);
 	class_destroy(nvme_class);
 	unregister_chrdev_region(nvme_chr_devt, NVME_MINORS);
+	destroy_workqueue(nvme_delete_wq);
+	destroy_workqueue(nvme_reset_wq);
 	destroy_workqueue(nvme_wq);
 }
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 55c49a1aa231..d320df9ab588 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -32,6 +32,8 @@ extern unsigned int admin_timeout;
 #define NVME_KATO_GRACE 10
 
 extern struct workqueue_struct *nvme_wq;
+extern struct workqueue_struct *nvme_reset_wq;
+extern struct workqueue_struct *nvme_delete_wq;
 
 enum {
 	NVME_NS_LBA = 0,
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 2a0bba7f50cf..145aa97067d4 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2040,7 +2040,7 @@ static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 	}
 	mutex_unlock(&nvme_rdma_ctrl_mutex);
 
-	flush_workqueue(nvme_wq);
+	flush_workqueue(nvme_delete_wq);
 }
 
 static struct ib_client nvme_rdma_ib_client = {
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 1e21b286f299..5d8054f9a412 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -716,7 +716,7 @@ static void __exit nvme_loop_cleanup_module(void)
 		nvme_delete_ctrl(&ctrl->ctrl);
 	mutex_unlock(&nvme_loop_ctrl_mutex);
 
-	flush_workqueue(nvme_wq);
+	flush_workqueue(nvme_delete_wq);
 }
 
 module_init(nvme_loop_init_module);
-- 
2.15.1