Date: Thu, 12 Nov 2015 13:50:04 -0500 (EST)
From: Mikulas Patocka
To: Sami Tolvanen
cc: Mike Snitzer, Milan Broz, device-mapper development,
    Mandeep Baines, Will Drewry, Kees Cook, linux-kernel@vger.kernel.org,
    Alasdair Kergon, Mark Salyzyn
Subject: Re: [PATCH 0/4] dm verity: add support for error correction

On Mon, 9 Nov 2015, Sami Tolvanen wrote:

> > Also, the 2 other big questions from Mikulas need answering:
> > 1) why aren't you actually adjustng error codes, returning success, if
> > dm-verity was able to trap/correct the corruption?
>
> We don't see actual I/O errors very often. Most corruption we've seen
> is caused by flaky hardware that doesn't return errors. However, I can
> certainly change to code to attempt recovery in this case too.

What flash controller and chips do you use? What is the probability of an
I/O error and what is the probability of silent data corruption? Is the
silent data corruption permanent or transient? What is causing the silent
data corruption (decay in flash chips? errors on the bus?)
Why can't you ask the hardware engineers to use a controller with proper
error correction?

Without these data, it looks like you first wrote the patch and then
tried to make some excuses why it should be accepted.

I'm also a little bit concerned that the patch will increase the
prevalence of crapware on the market. When accepted, this kind of
reasoning will follow: "now we have error correction in the kernel, so we
cut down flash overprovisioning, save a dollar or two per device, and
produce crap that randomly corrupts users' data on the read-write
partition (because that partition is not protected by the error
correction)".

Mikulas