Date: Wed, 28 Jun 2017 14:10:29 +1000 (AEST)
From: Finn Thain
To: Ondrej Zary
cc: "James E.J. Bottomley", "Martin K. Petersen",
    linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
    Michael Schmitz
Subject: Re: [PATCH v3 0/4] g_NCR5380: PDMA fixes and cleanup
In-Reply-To: <201706271806.05004.linux@rainbow-software.org>
References: <201706270828.40336.linux@rainbow-software.org>
    <201706271806.05004.linux@rainbow-software.org>

On Tue, 27 Jun 2017, Ondrej Zary wrote:

> On Tuesday 27 June 2017 14:42:29 Finn Thain wrote:
>
> > > ... it triggers sometimes: the value is 1 instead of 0. As we use
> > > only 16-bit writes, I don't see how the value could ever be odd.
> > > Looks like a bug in the chip. The index register corrupts during the
> > > transfer, not after IRQ or timeout. The same check at beginning of
> > > pwrite() did not trigger.
> >
> > Are you reading this register at the right moment? Have you tried
> > waiting for it to reach zero, as in,
> >
> > if (NCR5380_poll_politely(hostdata, 13, 0xff, 0, HZ / 64) < 0)
> >         /* printk, reset etc */;
>
> I have not but will try (expecting that it will not change by itself).
>

Now that I know that it is the byte at the beginning of the block that
went missing, I agree that there's no point waiting for the byte count
to change. I've included a patch with your 512 B limit in v4. Thanks.

> > Even if this is a reliable way to detect a short transfer, it would be
> > nice to know the root cause. But I'm being unrealistic: the DTC436
> > vendor never responded to my requests for technical documentation.
>
> According to the data corruption observed, it's not a short transfer.
> The corruption is always the same: one byte missing at the beginning of
> a 128 B block. It happens only with slow Quantum LPS 240 drive, not with
> faster IBM DORS-32160.
>
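
For readers unfamiliar with the "wait for the register to reach zero"
idea quoted above, here is a minimal user-space sketch of a masked
poll with a timeout. It is not the driver's code: poll_reg(),
read_reg_fn, struct fake_dev and fake_read() are hypothetical names
made up for illustration, and the timeout is a plain read count rather
than the jiffies-based timeout that the quoted NCR5380_poll_politely()
call takes (HZ / 64). The register number 13 and mask 0xff are only
borrowed from the quoted snippet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register accessor; stands in for the driver's I/O read. */
typedef uint8_t (*read_reg_fn)(void *ctx, unsigned int reg);

/*
 * Read 'reg' until (value & mask) == expected, giving up after
 * max_polls reads. Returns 0 on a match and -1 on timeout.
 * Assumption: a read count replaces the time-based timeout a real
 * driver would use.
 */
static int poll_reg(read_reg_fn read_reg, void *ctx, unsigned int reg,
                    uint8_t mask, uint8_t expected, unsigned int max_polls)
{
        for (unsigned int i = 0; i < max_polls; i++) {
                if ((read_reg(ctx, reg) & mask) == expected)
                        return 0;
        }
        return -1;
}

/* Simulated device: a down-counter that may be stuck at its value. */
struct fake_dev {
        uint8_t remaining;
        bool stuck;
};

/* Each read lets the simulated counter advance by one, unless stuck. */
static uint8_t fake_read(void *ctx, unsigned int reg)
{
        struct fake_dev *dev = ctx;

        (void)reg;      /* the simulation models a single register */
        if (!dev->stuck && dev->remaining > 0)
                dev->remaining--;
        return dev->remaining;
}
```

With a counter that keeps draining, poll_reg(fake_read, &dev, 13, 0xff,
0, 10) returns 0 once the value reaches zero. With a stuck counter it
returns -1 after ten reads, which is the case where the caller would
log and reset. As noted in the thread, such a wait cannot help when
the lost byte is at the start of the block rather than the end.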