From: Akinobu Mita
To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Akinobu Mita, Johannes Berg, Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
Subject: [PATCH 4/4] nvme-pci: trigger device coredump before resetting controller
Date: Thu, 2 May 2019 17:59:21 +0900
Message-Id: <1556787561-5113-5-git-send-email-akinobu.mita@gmail.com>
In-Reply-To: <1556787561-5113-1-git-send-email-akinobu.mita@gmail.com>
References: <1556787561-5113-1-git-send-email-akinobu.mita@gmail.com>

This enables the nvme driver to trigger a device coredump before
disabling or resetting the controller after an I/O timeout. The device
coredump helps diagnose and debug issues.

Cc: Johannes Berg
Cc: Keith Busch
Cc: Jens Axboe
Cc: Christoph Hellwig
Cc: Sagi Grimberg
Signed-off-by: Akinobu Mita
---
 drivers/nvme/host/pci.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7f3077c..584c2aa 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -87,7 +87,7 @@ MODULE_PARM_DESC(poll_queues, "Number of queues to use for polled IO.");
 struct nvme_dev;
 struct nvme_queue;
 
-static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
+static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown, bool dump);
 static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode);
 
 /*
@@ -1286,7 +1286,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	 */
 	if (nvme_should_reset(dev, csts)) {
 		nvme_warn_reset(dev, csts);
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, false, true);
 		nvme_reset_ctrl(&dev->ctrl);
 		return BLK_EH_DONE;
 	}
@@ -1313,7 +1313,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		dev_warn_ratelimited(dev->ctrl.device,
 			 "I/O %d QID %d timeout, disable controller\n",
 			 req->tag, nvmeq->qid);
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, false, true);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
 		return BLK_EH_DONE;
 	default:
@@ -1329,7 +1329,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, reset controller\n",
 			 req->tag, nvmeq->qid);
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, false, true);
 		nvme_reset_ctrl(&dev->ctrl);
 
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
@@ -2396,7 +2396,9 @@ static void nvme_pci_disable(struct nvme_dev *dev)
 	}
 }
 
-static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
+static void nvme_coredump(struct device *dev);
+
+static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown, bool dump)
 {
 	bool dead = true;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
@@ -2421,6 +2423,9 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
 	}
 
+	if (dump)
+		nvme_coredump(dev->dev);
+
 	nvme_stop_queues(&dev->ctrl);
 
 	if (!dead && dev->ctrl.queue_count > 0) {
@@ -2488,7 +2493,7 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev, int status)
 	dev_warn(dev->ctrl.device, "Removing after probe failure status: %d\n", status);
 
 	nvme_get_ctrl(&dev->ctrl);
-	nvme_dev_disable(dev, false);
+	nvme_dev_disable(dev, false, false);
 	nvme_kill_queues(&dev->ctrl);
 	if (!queue_work(nvme_wq, &dev->remove_work))
 		nvme_put_ctrl(&dev->ctrl);
@@ -2510,7 +2515,7 @@ static void nvme_reset_work(struct work_struct *work)
 	 * moving on.
 	 */
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, false, false);
 
 	mutex_lock(&dev->shutdown_lock);
 	result = nvme_pci_enable(dev);
@@ -2799,7 +2804,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 static void nvme_reset_prepare(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
-	nvme_dev_disable(dev, false);
+	nvme_dev_disable(dev, false, false);
 }
 
 static void nvme_reset_done(struct pci_dev *pdev)
@@ -2811,7 +2816,7 @@ static void nvme_reset_done(struct pci_dev *pdev)
 static void nvme_shutdown(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
-	nvme_dev_disable(dev, true);
+	nvme_dev_disable(dev, true, false);
 }
 
 /*
@@ -2828,14 +2833,14 @@ static void nvme_remove(struct pci_dev *pdev)
 
 	if (!pci_device_is_present(pdev)) {
 		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
-		nvme_dev_disable(dev, true);
+		nvme_dev_disable(dev, true, false);
 		nvme_dev_remove_admin(dev);
 	}
 
 	flush_work(&dev->ctrl.reset_work);
 	nvme_stop_ctrl(&dev->ctrl);
 	nvme_remove_namespaces(&dev->ctrl);
-	nvme_dev_disable(dev, true);
+	nvme_dev_disable(dev, true, false);
 	nvme_release_cmb(dev);
 	nvme_free_host_mem(dev);
 	nvme_dev_remove_admin(dev);
@@ -2852,7 +2857,7 @@ static int nvme_suspend(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct nvme_dev *ndev = pci_get_drvdata(pdev);
 
-	nvme_dev_disable(ndev, true);
+	nvme_dev_disable(ndev, true, false);
 	return 0;
 }
 
@@ -3103,7 +3108,7 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 	case pci_channel_io_frozen:
 		dev_warn(dev->ctrl.device,
 			"frozen state error detected, reset controller\n");
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, false, false);
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
 		dev_warn(dev->ctrl.device,
-- 
2.7.4
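
For readers following the ordering, here is a minimal userspace sketch of the call pattern this patch sets up; it is not driver code. It assumes only what the diff shows: the dump is taken inside the disable path, before the queues are stopped, and only the I/O timeout callers pass dump=true. Every mock_* name, the step enum, and the step log are hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical teardown steps, recorded so the ordering can be inspected. */
enum mock_step { MOCK_COREDUMP, MOCK_STOP_QUEUES, MOCK_RESET };

#define MOCK_MAX_STEPS 8

/* Hypothetical stand-in for struct nvme_dev; it only keeps a step log. */
struct mock_dev {
	enum mock_step log[MOCK_MAX_STEPS];
	size_t nr_steps;
};

static void mock_record(struct mock_dev *dev, enum mock_step step)
{
	assert(dev->nr_steps < MOCK_MAX_STEPS);
	dev->log[dev->nr_steps++] = step;
}

/* Stand-in for nvme_coredump(): a real driver would snapshot device state. */
static void mock_coredump(struct mock_dev *dev)
{
	mock_record(dev, MOCK_COREDUMP);
}

/*
 * Same shape as nvme_dev_disable(dev, shutdown, dump) in the patch:
 * the optional dump runs before the queues are stopped.
 */
static void mock_dev_disable(struct mock_dev *dev, bool shutdown, bool dump)
{
	(void)shutdown;	/* shutdown handling is out of scope for this sketch */

	if (dump)
		mock_coredump(dev);

	mock_record(dev, MOCK_STOP_QUEUES);
}

/* I/O timeout path: dump first, then disable, then reset. */
static void mock_timeout_reset(struct mock_dev *dev)
{
	mock_dev_disable(dev, false, true);
	mock_record(dev, MOCK_RESET);
}

/* Orderly shutdown path: no dump is requested. */
static void mock_shutdown(struct mock_dev *dev)
{
	mock_dev_disable(dev, true, false);
}
```

Calling mock_timeout_reset() on a zero-initialized struct mock_dev records coredump, stop-queues, reset in that order, while mock_shutdown() records only stop-queues. That mirrors why the non-timeout callers in the diff pass false for the new argument.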