From: Dan Williams
Date: Thu, 13 Oct 2016 10:22:56 -0700
Subject: Re: [PATCH] pmem: report error on clear poison failure
To: "Kani, Toshimitsu"
Cc: "linux-kernel@vger.kernel.org", "linux-nvdimm@lists.01.org", "Verma, Vishal L"
In-Reply-To: <1476374787.20881.34.camel@hpe.com>
References: <1476374061-9080-1-git-send-email-toshi.kani@hpe.com> <1476374787.20881.34.camel@hpe.com>

On Thu, Oct 13, 2016 at 9:08 AM, Kani, Toshimitsu wrote:
> On Thu, 2016-10-13 at 09:01 -0700, Dan Williams wrote:
>> On Thu, Oct 13, 2016 at 8:54 AM, Toshi Kani wrote:
>> >
>> > ACPI Clear Uncorrectable Error DSM function may fail or may be
>> > unsupported on a platform. pmem_clear_poison() returns without
>> > clearing badblocks in such cases, which leads to a silent data
>> > corruption.
>> >
>> > Change pmem_do_bvec() and pmem_clear_poison() to return -EIO
>> > so that filesystem can log an error message.
>>
>> What's the silent data corruption scenario? If the clear poison
>> fails I'm assuming that the poison will still be notified on the next
>> read.
>
> I agree that the data is eventually read, but there is no guarantee
> that it is read soon enough, i.e. the user might not access the
> data for a long time.

...but that's the same behavior for errors that we don't yet know
about. That said, we indeed know that the write failed.

I'd feel better about this patch if the justification / impact was clearer in the changelog, because "silent data corruption" is not the impact.
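
To make the two cases in this thread concrete, here is a rough user-space sketch of the behavior under discussion. It is a toy model, not the driver code: struct model_pmem, model_clear_poison(), model_write() and model_read() are hypothetical names invented for illustration, and the clear_supported flag simply stands in for "the clear-error DSM failed or is unsupported". The model shows that a failed clear leaves the badblock entry in place, so a later read still reports the error, while returning -EIO from the write path reports the failure at the time of the write.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NSECTORS 8

/* Hypothetical user-space model of a pmem device; not the kernel's structures. */
struct model_pmem {
	bool badblock[NSECTORS];	/* sectors known to hold poison */
	bool clear_supported;		/* stand-in for a working clear-error DSM */
};

/* Try to clear poison on one sector: 0 on success, -EIO if it cannot be cleared. */
static int model_clear_poison(struct model_pmem *p, size_t sector)
{
	if (!p->clear_supported)
		return -EIO;		/* badblock entry stays; caller is told */
	p->badblock[sector] = false;
	return 0;
}

/* Write path: a write to a known-bad sector has to clear the poison first. */
static int model_write(struct model_pmem *p, size_t sector)
{
	if (sector >= NSECTORS)
		return -EINVAL;
	if (p->badblock[sector])
		return model_clear_poison(p, sector);
	return 0;
}

/* Read path: poison that is still recorded is reported on every read. */
static int model_read(const struct model_pmem *p, size_t sector)
{
	if (sector >= NSECTORS)
		return -EINVAL;
	return p->badblock[sector] ? -EIO : 0;
}
```

In this model a failed clear is never silent on the read side, since model_read() keeps returning -EIO for that sector. What the error return from model_write() adds is that the writer learns about the failure immediately instead of whenever the sector is next read.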