Received: by 2002:a17:90a:1609:0:0:0:0 with SMTP id n9csp848443pja; Wed, 1 Apr 2020 09:47:30 -0700 (PDT) X-Google-Smtp-Source: ADFU+vvT+meuZGmUs0+gZnuwFaJquDjwwBtzu5HWjTowBmJ9jHA2mPDHrdXas9xfwXnBdOn0rZ0R X-Received: by 2002:a9d:404b:: with SMTP id o11mr17882609oti.120.1585759650479; Wed, 01 Apr 2020 09:47:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585759650; cv=none; d=google.com; s=arc-20160816; b=a8gmEkeN1i1kQ9NEnUWa/p+2+4yWR9u0DPKsCGWB9XsKWOtQb42yE1gcsn3DC2I3SG NRRRdAaaX8lxt3kqZ1LEHJneISzGRi4+3r/9jQ7jQf5q0yKul0dr9LQ260kgVGhIzxuP SXYDUcYD9vfUGsXjiwJ7/1Kw4xiAi1S627xib1oxAdXFWQago9QMJ/v2w5qgCQaIMsbO 0mx/7qfsXzhMI4oDesYh3wOj1CuR0DJEhvs1OW7I8/Mw06mqvUJoKrUvuUnpml0opGVa Klpq1TYtjpzFGRiwKr6GxyAekhBl/g0pncP+bRjNciTfbtLvQMTQxoUH9UWNCKMgR2NV wn6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=sU2G/VplsWzh9tk4aJrDWLSuLXA2MqPkPxpcXjriSmQ=; b=vpOPhXgGP5y92kL/pmvxBXRS0qlxBbCe6DK7Et4KK3r9nkZ0pallxPC86SaMjMIzqT O1ECAfCMQaKmD1oUrPjwXdWITDXM0e0qqC6FcP6cUDb+M5mM3Jsh/+uc8ir2G8uiWGon ltN4eCWdnVyzVXRhLbdSm92SUdDJGM7kEkTSZTncEt0Nk60t7Cm4wb+tuQMFby6mz1d+ Dc1dWlGJIfpJJf7Lb8S1KKW4gmT1HnH9j/dkiuwOo4KAhJlpZpNSSUxRhThbxhzltY/h MJQNtzhzb9u1ujCNcODpfGp9/T5uhGrt/wMwMP7x+fUtjgk26GVAMljqAucMpqLnI4hT DZyA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=d8HuDyiy; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z9si1165099otj.157.2020.04.01.09.47.18; Wed, 01 Apr 2020 09:47:30 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=d8HuDyiy; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387484AbgDAQqO (ORCPT + 99 others); Wed, 1 Apr 2020 12:46:14 -0400 Received: from mail.kernel.org ([198.145.29.99]:47490 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389899AbgDAQqN (ORCPT ); Wed, 1 Apr 2020 12:46:13 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 010A02071A; Wed, 1 Apr 2020 16:46:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1585759572; bh=4/QgQEeTLg2cONn6lXbKFlYdnrgyJb7/r7FCCAFagzs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=d8HuDyiyvr1zLa/ZfwEXyNsUyW/hgT7wE9bwlt/yZp9TIQvlAha7SzWf+pwsI/Hlv vWuJ1EPYsSqdRxMznzWBVP30YZGihPn9jUtR1s51okqaWxpLoLNE7gVJV8MHbOv/MU Z+XNpGFrrEBWyGn2YBA/iF4d6U9GVXAluDnFmg4w= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Wen Xiong , "Martin K. Petersen" , Sasha Levin Subject: [PATCH 4.14 081/148] scsi: ipr: Fix softlockup when rescanning devices in petitboot Date: Wed, 1 Apr 2020 18:17:53 +0200 Message-Id: <20200401161600.989273159@linuxfoundation.org> X-Mailer: git-send-email 2.26.0 In-Reply-To: <20200401161552.245876366@linuxfoundation.org> References: <20200401161552.245876366@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Wen Xiong [ Upstream commit 394b61711f3ce33f75bf70a3e22938464a13b3ee ] When trying to rescan disks in petitboot shell, we hit the following softlockup stacktrace: Kernel panic - not syncing: System is deadlocked on memory [ 241.223394] CPU: 32 PID: 693 Comm: sh Not tainted 5.4.16-openpower1 #1 [ 241.223406] Call Trace: [ 241.223415] [c0000003f07c3180] [c000000000493fc4] dump_stack+0xa4/0xd8 (unreliable) [ 241.223432] [c0000003f07c31c0] [c00000000007d4ac] panic+0x148/0x3cc [ 241.223446] [c0000003f07c3260] [c000000000114b10] out_of_memory+0x468/0x4c4 [ 241.223461] [c0000003f07c3300] [c0000000001472b0] __alloc_pages_slowpath+0x594/0x6d8 [ 241.223476] [c0000003f07c3420] [c00000000014757c] __alloc_pages_nodemask+0x188/0x1a4 [ 241.223492] [c0000003f07c34a0] [c000000000153e10] alloc_pages_current+0xcc/0xd8 [ 241.223508] [c0000003f07c34e0] [c0000000001577ac] alloc_slab_page+0x30/0x98 [ 241.223524] [c0000003f07c3520] [c0000000001597fc] new_slab+0x138/0x40c [ 241.223538] [c0000003f07c35f0] [c00000000015b204] ___slab_alloc+0x1e4/0x404 [ 241.223552] [c0000003f07c36c0] [c00000000015b450] __slab_alloc+0x2c/0x48 [ 241.223566] [c0000003f07c36f0] [c00000000015b754] kmem_cache_alloc_node+0x9c/0x1b4 [ 241.223582] [c0000003f07c3760] [c000000000218c48] blk_alloc_queue_node+0x34/0x270 [ 241.223599] [c0000003f07c37b0] [c000000000226574] blk_mq_init_queue+0x2c/0x78 [ 241.223615] [c0000003f07c37e0] [c0000000002ff710] scsi_mq_alloc_queue+0x28/0x70 [ 241.223631] [c0000003f07c3810] [c0000000003005b8] scsi_alloc_sdev+0x184/0x264 [ 241.223647] [c0000003f07c38a0] [c000000000300ba0] scsi_probe_and_add_lun+0x288/0xa3c [ 241.223663] [c0000003f07c3a00] [c000000000301768] __scsi_scan_target+0xcc/0x478 [ 241.223679] [c0000003f07c3b20] [c000000000301c64] scsi_scan_channel.part.9+0x74/0x7c [ 241.223696] [c0000003f07c3b70] [c000000000301df4] scsi_scan_host_selected+0xe0/0x158 [ 241.223712] [c0000003f07c3bd0] [c000000000303f04] store_scan+0x104/0x114 [ 241.223727] [c0000003f07c3cb0] [c0000000002d5ac4] dev_attr_store+0x30/0x4c [ 241.223741] [c0000003f07c3cd0] [c0000000001dbc34] sysfs_kf_write+0x64/0x78 [ 241.223756] [c0000003f07c3cf0] [c0000000001da858] kernfs_fop_write+0x170/0x1b8 [ 241.223773] [c0000003f07c3d40] [c0000000001621fc] __vfs_write+0x34/0x60 [ 241.223787] [c0000003f07c3d60] [c000000000163c2c] vfs_write+0xa8/0xcc [ 241.223802] [c0000003f07c3db0] [c000000000163df4] ksys_write+0x70/0xbc [ 241.223816] [c0000003f07c3e20] [c00000000000b40c] system_call+0x5c/0x68 As a part of the scan process Linux will allocate and configure a scsi_device for each target to be scanned. If the device is not present, then the scsi_device is torn down. As a part of scsi_device teardown a workqueue item will be scheduled and the lockups we see are because there are 250k workqueue items to be processed. Accoding to the specification of SIS-64 sas controller, max_channel should be decreased on SIS-64 adapters to 4. The patch fixes softlockup issue. Thanks for Oliver Halloran's help with debugging and explanation! Link: https://lore.kernel.org/r/1583510248-23672-1-git-send-email-wenxiong@linux.vnet.ibm.com Signed-off-by: Wen Xiong Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ipr.c | 3 ++- drivers/scsi/ipr.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c index 35d54ee1c5c74..b172f0a020834 100644 --- a/drivers/scsi/ipr.c +++ b/drivers/scsi/ipr.c @@ -9962,6 +9962,7 @@ static void ipr_init_ioa_cfg(struct ipr_ioa_cfg *ioa_cfg, ioa_cfg->max_devs_supported = ipr_max_devs; if (ioa_cfg->sis64) { + host->max_channel = IPR_MAX_SIS64_BUSES; host->max_id = IPR_MAX_SIS64_TARGETS_PER_BUS; host->max_lun = IPR_MAX_SIS64_LUNS_PER_TARGET; if (ipr_max_devs > IPR_MAX_SIS64_DEVS) @@ -9970,6 +9971,7 @@ static void ipr_init_ioa_cfg(struct ipr_ioa_cfg *ioa_cfg, + ((sizeof(struct ipr_config_table_entry64) * ioa_cfg->max_devs_supported))); } else { + host->max_channel = IPR_VSET_BUS; host->max_id = IPR_MAX_NUM_TARGETS_PER_BUS; host->max_lun = IPR_MAX_NUM_LUNS_PER_TARGET; if (ipr_max_devs > IPR_MAX_PHYSICAL_DEVS) @@ -9979,7 +9981,6 @@ static void ipr_init_ioa_cfg(struct ipr_ioa_cfg *ioa_cfg, * ioa_cfg->max_devs_supported))); } - host->max_channel = IPR_VSET_BUS; host->unique_id = host->host_no; host->max_cmd_len = IPR_MAX_CDB_LEN; host->can_queue = ioa_cfg->max_cmds; diff --git a/drivers/scsi/ipr.h b/drivers/scsi/ipr.h index 085e6c90f9e6a..89b36987ff309 100644 --- a/drivers/scsi/ipr.h +++ b/drivers/scsi/ipr.h @@ -1306,6 +1306,7 @@ struct ipr_resource_entry { #define IPR_ARRAY_VIRTUAL_BUS 0x1 #define IPR_VSET_VIRTUAL_BUS 0x2 #define IPR_IOAFP_VIRTUAL_BUS 0x3 +#define IPR_MAX_SIS64_BUSES 0x4 #define IPR_GET_RES_PHYS_LOC(res) \ (((res)->bus << 24) | ((res)->target << 8) | (res)->lun) -- 2.20.1