Message-ID: <1431959798.19977.4.camel@memnix.com>
Subject: Re: Regression: Disk corruption with dm-crypt and kernels >= 4.0
From: Abelardo Ricart III
To: Brandon Smith
Cc: Mike Snitzer, dm-devel@redhat.com, mpatocka@redhat.com,
    linux-kernel@vger.kernel.org
Date: Mon, 18 May 2015 10:36:38 -0400
In-Reply-To: <20150515150442.GB35834@hank.reardencode.com>
References: <1430455027.7012.32.camel@memnix.com>
    <20150501211703.GA15030@redhat.com>
    <1430519090.5537.4.camel@memnix.com>
    <1430523735.5352.1.camel@memnix.com>
    <20150515150442.GB35834@hank.reardencode.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2015-05-15 at 08:04 -0700, Brandon Smith wrote:
> On 2015-05-01 (Fri) at 19:42:15 -0400, Abelardo Ricart III wrote:
> > > > The patchset in question was tested quite heavily so this is a
> > > > surprising report. I'm noticing you are opting in to dm-crypt discard
> > > > support. Have you tested without discards enabled?
> > >
> > > I've disabled discards universally and rebuilt a vanilla kernel.
> > > After running my heavy read-write-sync scripts, everything seems to be
> > > working fine now. I suppose this could be something that used to fail
> > > silently before, but now produces bad behavior? I seem to remember
> > > having something in my message log about "discards not supported on
> > > this device" when running with it enabled before.
> >
> > Forgive me, but I spoke too soon. The corruption and libata errors are
> > still there, as was evidenced when I went to reboot and got treated to
> > an eye full of "read-only filesystem" and ata errors.
> >
> > So no, disabling discards unfortunately did nothing to help.
>
> I've been experiencing the same problem. Vanilla 4.0 series kernels,
> dm-crypt, with/or without discards, on a ThinkPad X1 Carbon with a
> LiteOn LGT-256M6G SSD.
>
> After some of googling around, I found some chatter relating to changes
> in NCQ on SSDs in 4.0. Been running w/o NCQ for a full kernel build so
> far without issue. Perhaps there's been some change in the interaction
> between dm-crypt and NCQ?
>
> Abelardo, can you try w/o NCQ and see if that helps your situation?
>
> Best,
>
> --Brandon

I've been running with NCQ disabled and stress testing for a while, and
the issue is indeed gone. Thanks for the workaround!

So it seems the issue is somehow related to the combination of NCQ,
dm-crypt, and possibly (some?) SSDs.
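
For anyone wanting to try the same workaround: the thread does not say how
NCQ was disabled, so the following is only a minimal sketch of one common
approach, assuming a libata-managed disk that exposes its queue depth in
sysfs. Writing 1 to queue_depth makes the kernel issue one command at a
time, which effectively turns NCQ off until the next reboot. The device
name "sda" and the helper names are hypothetical, and the sysfs_root
parameter exists only so the functions can be exercised against a scratch
directory. Booting with libata.force=noncq is the other usual route.

```python
from pathlib import Path


def _queue_depth_path(device: str, sysfs_root: str = "/sys") -> Path:
    # Assumed layout: /sys/block/<device>/device/queue_depth
    return Path(sysfs_root) / "block" / device / "device" / "queue_depth"


def ncq_queue_depth(device: str, sysfs_root: str = "/sys") -> int:
    """Return the queue depth currently set for the block device."""
    return int(_queue_depth_path(device, sysfs_root).read_text().strip())


def ncq_disabled(device: str, sysfs_root: str = "/sys") -> bool:
    """A queue depth of 1 means no commands are queued, i.e. NCQ is off."""
    return ncq_queue_depth(device, sysfs_root) == 1


def disable_ncq(device: str, sysfs_root: str = "/sys") -> None:
    """Set the queue depth to 1. Needs root on a real system; not persistent."""
    _queue_depth_path(device, sysfs_root).write_text("1\n")


# Example (as root, with a hypothetical device name):
#   disable_ncq("sda")
#   print(ncq_disabled("sda"))
```

The setting is lost on reboot, so it suits a quick test of whether the
corruption goes away; for a longer stress run the boot parameter is
probably the more convenient choice.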