Subject: Re: ATA failure regression in kernel 4.2
To: Alex Deucher
Cc: Tejun Heo, LKML
From: Jiang Liu
Organization: Intel
Message-ID: <55B65CBF.6010007@linux.intel.com>
References: <20150723183548.GS15934@mtj.duckdns.org> <55B59F22.2010102@linux.intel.com>
Date: Tue, 28 Jul 2015 00:30:55 +0800

On 2015/7/27 23:21, Alex Deucher wrote:
> On Sun, Jul 26, 2015 at 11:01 PM, Jiang Liu wrote:
>> On 2015/7/25 1:38, Alex Deucher wrote:
>>> On Thu, Jul 23, 2015 at 2:44 PM, Alex Deucher wrote:
>>>> On Thu, Jul 23, 2015 at 2:35 PM, Tejun Heo wrote:
>>>>> Hello,
>>>>>
>>>>> On Thu, Jul 23, 2015 at 01:48:24PM -0400, Alex Deucher wrote:
>>>>>> Something new in kernel 4.2 seems to have broken one of my hard drives
>>>>>> (ssd) in kernel 4.2. 4.1 and older kernels work fine. Here are the
>>>>>> relevant logs.
>>>>>>
>>>>> ...
>>>>>> [ 6.547628] ata2.00: qc timeout (cmd 0xec)
>>>>>> [ 6.547721] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>>>>>> [ 7.007213] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>>> [ 16.997819] ata2.00: qc timeout (cmd 0xec)
>>>>>> [ 16.997910] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>>>>>> [ 16.997995] ata2: limiting SATA link speed to 3.0 Gbps
>>>>>> [ 17.457400] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
>>>>>> [ 47.429257] ata2.00: qc timeout (cmd 0xec)
>>>>>> [ 47.429349] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>>>>>> [ 47.888822] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
>>>>>
>>>>> Nothing really rings a bell. Timeouts on IDENTIFY. Could be IRQ
>>>>> related. Which controller is it (lspci -nn)? Also, can you try to
>>>>> bisect the issue?
>>>>
>>>> 00:11.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH
>>>> SATA Controller [AHCI mode] [1022:7801] (rev 40)
>>>> 00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD] FCH
>>>> IDE Controller [1022:780c]
>>>>
>>>> I can take a look at bisecting later this week.
>>>
>>> You were right about the interrupts. This is an AMD Kaveri APU system.
>> Hi Alex,
>> Could you please help to provide more information about the
>> system so we could identify the issue? Dmesg and /proc/interrupts
>> from good and bad kernels are welcomed.
>> Thanks!
>> Gerry
>
> See attached. Thanks!

Hi Alex,
	Thanks for the info. It seems something is wrong with multiple-MSI
support. To narrow down the scope, could you please:

1) Apply the small patch below and retest:

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 7e62751abfac..35f524cc23b7 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1345,6 +1345,7 @@ static int ahci_init_msi(struct pci_dev *pdev, unsigned int n_ports,
 	if (nvec < 0)
 		return nvec;
 
+	nvec = 1;
 	/*
 	 * If number of MSIs is less than number of ports then Sharing Last
 	 * Message mode could be enforced. In this case assume that advantage

2) Disable interrupt remapping with the kernel parameter "nointremap"
   and retest.

Thanks!
Gerry
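
For reference, here is a minimal sketch of what the test patch in 1) is meant to exercise. It is only a simplified model, not the code in drivers/ata/ahci.c: the names pick_irq_mode, irq_mode and force_single are hypothetical, and the fallback to a single MSI when there are fewer vectors than ports is my reading of the comment in the hunk, whose end is cut off in the context lines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; this is not kernel code. */
enum irq_mode { IRQ_MODE_SINGLE_MSI, IRQ_MODE_MULTI_MSI };

/*
 * Simplified model of the decision hinted at by the comment in the hunk:
 * with fewer vectors than ports, per-port vectors are assumed to bring
 * no benefit, so one shared MSI is used instead (assumption).
 */
static enum irq_mode pick_irq_mode(int nvec, int n_ports, bool force_single)
{
	assert(nvec > 0 && n_ports > 0);

	if (force_single)
		nvec = 1;	/* mirrors the "+ nvec = 1;" line of the test patch */

	return nvec < n_ports ? IRQ_MODE_SINGLE_MSI : IRQ_MODE_MULTI_MSI;
}
```

With more than one port, forcing nvec to 1 always selects the single-MSI path in this model. If the drive is detected again with the patch applied, that points at the multiple-MSI path rather than at MSI delivery in general.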