Subject: Re: Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9
From: James Bottomley
To: Tejun Heo
Cc: Jens Axboe, linux-scsi, IDE/ATA development list, Linux Kernel
In-Reply-To: <49043C7C.8050207@kernel.org>
References: <49043C7C.8050207@kernel.org>
Date: Sun, 26 Oct 2008 10:15:05 -0500
Message-Id: <1225034105.3958.4.camel@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 2008-10-26 at 18:46 +0900, Tejun Heo wrote:
> Hello, Jens.
>
> Commit 242f9dcb8ba6f68fcd217a119a7648a4f69290e9 introduces a strange
> regression for libata. The second timeout gives puts different
> pointer from the issued command onto eh_cmd_q breaking libata EH
> command matching which triggers WARN_ON() in ata_eh_finish() and hangs
> command processing or causes oops later depending on circumstances.
>
> Here are logs with induced timeouts (patch attached). In commit
> 242f9dcb8, the XXX messages for the second timeout shows different
> scsi_cmd pointers for eh_cmd_q and qc->scmd which is initialized by
> ata_scsi_qc_new() during command translation.

I can't see a way we could be getting a different command passed in
from the actual one, since the only way to lose the command from the
request is to go through the command completion routines which free it
(and end the request).

However, since the WARN_ON is specifically comparing the command with
the one found by the active tag, could this actually be a problem
caused by block tags? I note that libata still uses its own array of
outstanding tags (ap->qcmd[tag]) instead of finding them using
blk_queue_find_tag() (or scsi_find_tag()).

James
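
To make the question about block tags concrete, here is a minimal
userspace sketch of how a driver-private table indexed by tag could
disagree with a separate tag map if the two are not updated together.
This is only a model and not kernel code: struct cmd, struct block_tags,
struct driver_tags and every function below are hypothetical names
invented for the illustration, and nothing here shows that this is what
actually happens in libata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TAGS 4

/* Hypothetical stand-in for a command; NOT the kernel's struct scsi_cmnd. */
struct cmd {
	int id;
};

/* Hypothetical model of a block-layer tag map: tag -> active command. */
struct block_tags {
	struct cmd *active[MAX_TAGS];
};

/* Hypothetical model of a driver-private table indexed by the same tag. */
struct driver_tags {
	struct cmd *issued[MAX_TAGS];
};

/* Issue a command: both tables record it under the same tag. */
static void issue(struct block_tags *bt, struct driver_tags *dt,
		  int tag, struct cmd *c)
{
	assert(tag >= 0 && tag < MAX_TAGS);
	bt->active[tag] = c;
	dt->issued[tag] = c;
}

/* Assumed failure mode: the tag map changes, the private table does not. */
static void block_reassign(struct block_tags *bt, int tag, struct cmd *c)
{
	assert(tag >= 0 && tag < MAX_TAGS);
	bt->active[tag] = c;
}

/* The kind of pointer comparison a WARN_ON-style check would perform. */
static bool tables_agree(const struct block_tags *bt,
			 const struct driver_tags *dt, int tag)
{
	return bt->active[tag] == dt->issued[tag];
}

/* Returns 1 if the two tables end up disagreeing, 0 if not, -1 on error. */
int demo_tag_mismatch(void)
{
	struct block_tags bt = { { NULL } };
	struct driver_tags dt = { { NULL } };
	struct cmd a = { .id = 1 };
	struct cmd b = { .id = 2 };

	issue(&bt, &dt, 0, &a);
	if (!tables_agree(&bt, &dt, 0))
		return -1;

	/* Only the tag map is updated; the private table goes stale. */
	block_reassign(&bt, 0, &b);
	return tables_agree(&bt, &dt, 0) ? 0 : 1;
}
```

Looking the command up through a single table for both purposes would
remove this class of disagreement by construction, which is the
attraction of a lookup such as blk_queue_find_tag() over a second,
privately maintained array.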