Received: by 2002:a05:7412:6592:b0:d7:7d3a:4fe2 with SMTP id m18csp2111345rdg; Sun, 13 Aug 2023 09:39:06 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE9Gcfa6Gql52GPfhXJG+VEN1/WF52RmnYTciOqMKlUXvtpJ6kfv17FebZ8taygEGBXvlU4 X-Received: by 2002:a05:6402:430d:b0:522:b723:11bd with SMTP id m13-20020a056402430d00b00522b72311bdmr12955245edc.4.1691944745910; Sun, 13 Aug 2023 09:39:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691944745; cv=none; d=google.com; s=arc-20160816; b=xrHo2+euxZUA3iWcWtKKfXlPqg7vQSZp5aYajGHbgZynCl2KijqlWwLA9rLYOced7r aJ4XBR4i5D+tX+mbsAyRTA3c3BDQAfMqb33eKW5z4Mna7IP2yvvNqzJ96qafn0jPllew EZ0q9lw5EKm0YiucM72DxBESGMhjH+H8dmO5GwTXKnstNp3YPUfSF+gNi2x/uqfleo5P inotU0ZU7N1t9QqtpsydfbAb7IlWP3V9pQJ9FwS335TACr33I0AGan8pvrd1YfSX2H3c Vl5YsQeCNHuWRPUDAwBKss+V6wr0PB27yQH9t1Xx5E58EK+Lq3bTA8pD1la0tdqRDkW8 yATA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=F/zet4S0/l3pT4SwTCUcLC2lAbccs3afb1JPJJ9Q7z8=; fh=vKGX40lLR3vhFtgU2W4yJ601dZPPBQTcM9c2AlAaUoo=; b=mnKDwVrFnsqBPOcSWDln4Hm/MgReATcPozNewwsJf1q0YfPu29nQ6ZdXrJt+3xy9vk WcxFkQbYUb/ubQpeY2dV6WTqnOgVh8NueBCoA5zz5yxRvoEetAJqkQaE2yif3cyd11ll 8SforRyL5OntP4icFx7v7g/k3Z9Azjxncck1Gyv5kO6N7bfGDDHHy16IXUAdNP2SrwB2 PpEm5/tvayhcTzUTL6Tn9FCJ1Y3JjPwiTtL5LeDkWk/1+T12ebdov5pW13tqZxnx1IwU jfvRX8N7tPFYAPRoUst3fQUC1CjHdr5Pe4HXYnX7YQ3T6Eo6TVHVRdMxmECTnlyImkOf r5VQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=jesiPw6b; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e16-20020a50ec90000000b00522f060f8desi6862988edr.326.2023.08.13.09.38.40; Sun, 13 Aug 2023 09:39:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=jesiPw6b; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232748AbjHMQJJ (ORCPT + 99 others); Sun, 13 Aug 2023 12:09:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232705AbjHMQIn (ORCPT ); Sun, 13 Aug 2023 12:08:43 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1540C19A3; Sun, 13 Aug 2023 09:08:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CD15C6389F; Sun, 13 Aug 2023 16:07:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 48DC2C433C7; Sun, 13 Aug 2023 16:07:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1691942849; bh=2LUK9BBS55WPwsYdRDj0TEiA5roMVehiiOcUVJYA/Zg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jesiPw6b60fxmodDE25pqy+AxThD47zw6tAkHdaEa+E/tTmYPgZcez4iSAMIb/nMg /PaMDoh4gE9oYepb1qr2OQsa6Nnc8CnLE/Nny3vvkJrGhNB+6YQIbXP/PDpg5KJXGx 2TYKAtOrDS2VBhdWRYCJWIxm1cOLfXQ4RjWziZI97gujcE3k15shT79bD3C55QOJSC qT9OyLeB5TgWFyAK5ybNTQpZFwD/1zFfvOCpwzsekas0d9YZ3Xc3cE+14pYjVWhqYj 5yj6ecKHRj+CSUYriKkJssAEz+6Z3JnWJJlG+dHy4i/lDvcpMuQGTrSfRXhFHfRSX6 iaQTmt3ruVauA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Stefan Haberland , Jan Hoeppner , Jens Axboe , Sasha Levin , hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, linux-s390@vger.kernel.org Subject: [PATCH AUTOSEL 5.15 14/31] s390/dasd: fix hanging device after request requeue Date: Sun, 13 Aug 2023 12:05:47 -0400 Message-Id: <20230813160605.1080385-14-sashal@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230813160605.1080385-1-sashal@kernel.org> References: <20230813160605.1080385-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 5.15.126 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Stefan Haberland [ Upstream commit 8a2278ce9c25048d999fe1a3561def75d963f471 ] The DASD device driver has a function to requeue requests to the blocklayer. This function is used in various cases when basic settings for the device have to be changed like High Performance Ficon related parameters or copy pair settings. The functions iterates over the device->ccw_queue and also removes the requests from the block->ccw_queue. In case the device is started on an alias device instead of the base device it might be removed from the block->ccw_queue without having it canceled properly before. This might lead to a hanging device since the request is no longer on a queue and can not be handled properly. Fix by iterating over the block->ccw_queue instead of the device->ccw_queue. This will take care of all blocklayer related requests and handle them on all associated DASD devices. Signed-off-by: Stefan Haberland Reviewed-by: Jan Hoeppner Link: https://lore.kernel.org/r/20230721193647.3889634-4-sth@linux.ibm.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- drivers/s390/block/dasd.c | 125 +++++++++++++++----------------------- 1 file changed, 48 insertions(+), 77 deletions(-) diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c index ed897dc499ff6..0c6ab288201e5 100644 --- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -2948,41 +2948,32 @@ static void _dasd_wake_block_flush_cb(struct dasd_ccw_req *cqr, void *data) * Requeue a request back to the block request queue * only works for block requests */ -static int _dasd_requeue_request(struct dasd_ccw_req *cqr) +static void _dasd_requeue_request(struct dasd_ccw_req *cqr) { - struct dasd_block *block = cqr->block; struct request *req; - if (!block) - return -EINVAL; /* * If the request is an ERP request there is nothing to requeue. * This will be done with the remaining original request. */ if (cqr->refers) - return 0; + return; spin_lock_irq(&cqr->dq->lock); req = (struct request *) cqr->callback_data; blk_mq_requeue_request(req, true); spin_unlock_irq(&cqr->dq->lock); - return 0; + return; } -/* - * Go through all request on the dasd_block request queue, cancel them - * on the respective dasd_device, and return them to the generic - * block layer. - */ -static int dasd_flush_block_queue(struct dasd_block *block) +static int _dasd_requests_to_flushqueue(struct dasd_block *block, + struct list_head *flush_queue) { struct dasd_ccw_req *cqr, *n; - int rc, i; - struct list_head flush_queue; unsigned long flags; + int rc, i; - INIT_LIST_HEAD(&flush_queue); - spin_lock_bh(&block->queue_lock); + spin_lock_irqsave(&block->queue_lock, flags); rc = 0; restart: list_for_each_entry_safe(cqr, n, &block->ccw_queue, blocklist) { @@ -2997,13 +2988,32 @@ static int dasd_flush_block_queue(struct dasd_block *block) * is returned from the dasd_device layer. */ cqr->callback = _dasd_wake_block_flush_cb; - for (i = 0; cqr != NULL; cqr = cqr->refers, i++) - list_move_tail(&cqr->blocklist, &flush_queue); + for (i = 0; cqr; cqr = cqr->refers, i++) + list_move_tail(&cqr->blocklist, flush_queue); if (i > 1) /* moved more than one request - need to restart */ goto restart; } - spin_unlock_bh(&block->queue_lock); + spin_unlock_irqrestore(&block->queue_lock, flags); + + return rc; +} + +/* + * Go through all request on the dasd_block request queue, cancel them + * on the respective dasd_device, and return them to the generic + * block layer. + */ +static int dasd_flush_block_queue(struct dasd_block *block) +{ + struct dasd_ccw_req *cqr, *n; + struct list_head flush_queue; + unsigned long flags; + int rc; + + INIT_LIST_HEAD(&flush_queue); + rc = _dasd_requests_to_flushqueue(block, &flush_queue); + /* Now call the callback function of flushed requests */ restart_cb: list_for_each_entry_safe(cqr, n, &flush_queue, blocklist) { @@ -3926,75 +3936,36 @@ EXPORT_SYMBOL_GPL(dasd_generic_space_avail); */ static int dasd_generic_requeue_all_requests(struct dasd_device *device) { + struct dasd_block *block = device->block; struct list_head requeue_queue; struct dasd_ccw_req *cqr, *n; - struct dasd_ccw_req *refers; int rc; - INIT_LIST_HEAD(&requeue_queue); - spin_lock_irq(get_ccwdev_lock(device->cdev)); - rc = 0; - list_for_each_entry_safe(cqr, n, &device->ccw_queue, devlist) { - /* Check status and move request to flush_queue */ - if (cqr->status == DASD_CQR_IN_IO) { - rc = device->discipline->term_IO(cqr); - if (rc) { - /* unable to terminate requeust */ - dev_err(&device->cdev->dev, - "Unable to terminate request %p " - "on suspend\n", cqr); - spin_unlock_irq(get_ccwdev_lock(device->cdev)); - dasd_put_device(device); - return rc; - } - } - list_move_tail(&cqr->devlist, &requeue_queue); - } - spin_unlock_irq(get_ccwdev_lock(device->cdev)); - - list_for_each_entry_safe(cqr, n, &requeue_queue, devlist) { - wait_event(dasd_flush_wq, - (cqr->status != DASD_CQR_CLEAR_PENDING)); + if (!block) + return 0; - /* - * requeue requests to blocklayer will only work - * for block device requests - */ - if (_dasd_requeue_request(cqr)) - continue; + INIT_LIST_HEAD(&requeue_queue); + rc = _dasd_requests_to_flushqueue(block, &requeue_queue); - /* remove requests from device and block queue */ - list_del_init(&cqr->devlist); - while (cqr->refers != NULL) { - refers = cqr->refers; - /* remove the request from the block queue */ - list_del(&cqr->blocklist); - /* free the finished erp request */ - dasd_free_erp_request(cqr, cqr->memdev); - cqr = refers; + /* Now call the callback function of flushed requests */ +restart_cb: + list_for_each_entry_safe(cqr, n, &requeue_queue, blocklist) { + wait_event(dasd_flush_wq, (cqr->status < DASD_CQR_QUEUED)); + /* Process finished ERP request. */ + if (cqr->refers) { + spin_lock_bh(&block->queue_lock); + __dasd_process_erp(block->base, cqr); + spin_unlock_bh(&block->queue_lock); + /* restart list_for_xx loop since dasd_process_erp + * might remove multiple elements + */ + goto restart_cb; } - - /* - * _dasd_requeue_request already checked for a valid - * blockdevice, no need to check again - * all erp requests (cqr->refers) have a cqr->block - * pointer copy from the original cqr - */ + _dasd_requeue_request(cqr); list_del_init(&cqr->blocklist); cqr->block->base->discipline->free_cp( cqr, (struct request *) cqr->callback_data); } - - /* - * if requests remain then they are internal request - * and go back to the device queue - */ - if (!list_empty(&requeue_queue)) { - /* move freeze_queue to start of the ccw_queue */ - spin_lock_irq(get_ccwdev_lock(device->cdev)); - list_splice_tail(&requeue_queue, &device->ccw_queue); - spin_unlock_irq(get_ccwdev_lock(device->cdev)); - } dasd_schedule_device_bh(device); return rc; } -- 2.40.1