Date: Mon, 24 Oct 2022 13:44:56 +0100
Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling
From: John Garry
To: Niklas Cassel
CC: Damien Le Moal, "jejb@linux.ibm.com", "martin.petersen@oracle.com", "jinpu.wang@cloud.ionos.com", "linux-scsi@vger.kernel.org", "linux-kernel@vger.kernel.org", Linuxarm, yangxingui, yanaijie

Hi Niklas,

>
> For the record, I tested the pm80xx driver on a HoneyComb LX2 board
> (an arm64 board using ACPI).
>
> I tried v6.1-rc1 both with and without your series in $subject.
>
> I couldn't see any issues.

ok, thanks for the effort.

>
>
> What I tried:
> -Running fio:
> fio --name=test --filename=/dev/sdc --ioengine=io_uring --rw=randrw --direct=1 --iodepth=32 --bs=1M
> on three different HDDs simultaneously for 15+ minutes,
> without any errors in fio or dmesg.
>
> -Creating and mounting a btrfs volume, doing a huge dd to the filesystem
> without issues.
>
> -sg_sat_read_gplog -d --log=0x10 /dev/sda
> which successfully returned the log.
>
>
> It is worth mentioning that this arm64 board has reserved memory regions,
> but does not yet have a firmware that supplies a IORT RMR (reserved memory
> regions) revision E.d node, which means that in order to get this board to
> boot successfully, we need to supply:
> "arm-smmu.disable_bypass=0 iommu.passthrough=1"
> on the kernel command line.

hmmm... that's interesting. I can try again with the IOMMU turned off,
but, as I recall, it did not make a difference before.

I think that requiring reserved memory regions would totally bust the
driver (if not present) with IOMMU enabled. As I recall, sas 3008 card
would not work without RMR for us.

It's also interesting that this LX2 board has A72 cores. For my system,
we have newer custom arm v8 cores with quite weak memory ordering
implementation. With that same system, I have detected a couple of other
driver memory ordering bugs which we did not see on our A72-based
platforms.

I always suspected that this issue was a memory ordering issue, but
since the hang so reliably occurs I ruled it out. Maybe it is...

thanks,
John