Date: Wed, 30 Mar 2022 11:38:02 +0100
Subject: Re: filesystem corruption with "scsi: core: Reallocate device's budget map on queue depth change"
To: Andrea Righi, Ming Lei, Martin Wilck
CC: Bart Van Assche, "James E.J. Bottomley", "Martin K. Petersen"
From: John Garry
X-Mailing-List: linux-kernel@vger.kernel.org

On 30/03/2022 11:11, Andrea Righi wrote:
> Hello,
>
> after this commit I'm experiencing some filesystem corruptions at boot
> on a power9 box with an aacraid controller.
>
> At the moment I'm running a 5.15.30 kernel; when the filesystem is
> mounted at boot I see the following errors in the console:
>
> Begin: Will now check root file system ... fsck from util-linux 2.36.1
> [/usr/sbin/fsck.ext4 (1) -- /dev/sda2] fsck.ext4 -a -C0 /dev/sda2
> root: clean, 99646/122101760 files, 11187342/488376336 blocks
> done.
> [ 4.636613] sd 0:2:0:0: [sda] tag#257 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.636655] sd 0:2:0:0: [sda] tag#257 CDB: Read(10) 28 00 00 00 4c 10 00 00 08 00
> [ 4.636689] blk_update_request: I/O error, dev sda, sector 19472 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.636734] sd 0:2:0:0: [sda] tag#258 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.636772] sd 0:2:0:0: [sda] tag#258 CDB: Read(10) 28 00 00 00 4c 18 00 00 08 00
> [ 4.636796] blk_update_request: I/O error, dev sda, sector 19480 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.636840] sd 0:2:0:0: [sda] tag#260 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.636877] sd 0:2:0:0: [sda] tag#260 CDB: Read(10) 28 00 00 00 4c 28 00 00 08 00
> [ 4.636901] blk_update_request: I/O error, dev sda, sector 19496 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.636944] sd 0:2:0:0: [sda] tag#259 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.636971] sd 0:2:0:0: [sda] tag#259 CDB: Read(10) 28 00 00 00 4c 20 00 00 08 00
> [ 4.637005] blk_update_request: I/O error, dev sda, sector 19488 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.637049] sd 0:2:0:0: [sda] tag#262 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.637085] sd 0:2:0:0: [sda] tag#262 CDB: Read(10) 28 00 00 00 4c 38 00 00 08 00
> [ 4.637118] blk_update_request: I/O error, dev sda, sector 19512 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.637161] sd 0:2:0:0: [sda] tag#264 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.637197] sd 0:2:0:0: [sda] tag#264 CDB: Read(10) 28 00 00 00 4c 48 00 00 08 00
> [ 4.637221] blk_update_request: I/O error, dev sda, sector 19528 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.637270] sd 0:2:0:0: [sda] tag#284 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.637306] sd 0:2:0:0: [sda] tag#284 CDB: Read(10) 28 00 00 00 4c e8 00 00 08 00
> [ 4.637332] blk_update_request: I/O error, dev sda, sector 19688 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.637375] sd 0:2:0:0: [sda] tag#286 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.637411] sd 0:2:0:0: [sda] tag#286 CDB: Read(10) 28 00 00 00 4c f8 00 00 08 00
> [ 4.637444] blk_update_request: I/O error, dev sda, sector 19704 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.637481] blk_update_request: I/O error, dev sda, sector 19664 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.637485] sd 0:2:0:0: [sda] tag#282 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.637487] sd 0:2:0:0: [sda] tag#287 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
> [ 4.637491] sd 0:2:0:0: [sda] tag#287 CDB: Read(10) 28 00 00 00 4d 00 00 00 08 00
> [ 4.637491] sd 0:2:0:0: [sda] tag#282 CDB: Read(10) 28 00 00 00 4c d8 00 00 08 00
> [ 4.637494] blk_update_request: I/O error, dev sda, sector 19672 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 4.747771] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
>
> If I reboot multiple times fsck requires a manual fix and I get dropped
> to the initramfs shell. Some times the filesystem gets corrupted and I
> need to redeploy the box.
>
> If I use the same kernel with this commit reverted I can reboot as many
> times as I want without any failure:
>
> 813c6871f76b ("scsi: core: Reallocate device's budget map on queue depth change")

I would not have expected this commit to cause corruption.

>
> For now I've just reverted the commit, but I'll try to add some
> debugging and collect more info.
>
> Let me know if there's any specific test that you want me to try.
>

Please try this:

https://lore.kernel.org/linux-scsi/yq1ee2kumrh.fsf@ca-mkp.ca.oracle.com/T/#t

It never made it into 5.17, as I had hoped it would.

Thanks,
John
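
As an aid to reading the quoted log, here is a minimal sketch of how each "CDB: Read(10)" line relates to the sector in the blk_update_request line that follows it. The helper name decode_read10 is hypothetical and not taken from the kernel or the thread. It assumes the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8), and that the LBA equals the kernel's sector number, which holds only for 512-byte logical blocks. The log is consistent with that assumption.

```python
def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Returns (lba, transfer_length_in_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: number of blocks
    return lba, length


# First failed command in the log (tag#257):
print(decode_read10("28 00 00 00 4c 10 00 00 08 00"))  # (19472, 8)
```

The decoded LBA 19472 matches "sector 19472" in the corresponding I/O error line. Under the 512-byte assumption each failed command is an 8-block (4 KiB) read.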