Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932193AbcDOTb6 (ORCPT ); Fri, 15 Apr 2016 15:31:58 -0400
Received: from g4t3426.houston.hp.com ([15.201.208.54]:44273 "EHLO
        g4t3426.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752710AbcDOTRD (ORCPT );
        Fri, 15 Apr 2016 15:17:03 -0400
Message-ID: <1460747308.4597.9.camel@hpe.com>
Subject: Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io
From: Toshi Kani
To: Dan Williams, Jeff Moyer
Cc: "axboe@fb.com", "jack@suse.cz", "david@fromorbit.com",
        "linux-kernel@vger.kernel.org", "xfs@oss.sgi.com",
        "hch@infradead.org", "linux-mm@kvack.org",
        "linux-block@vger.kernel.org", "viro@zeniv.linux.org.uk",
        "linux-nvdimm@ml01.01.org", "linux-fsdevel@vger.kernel.org",
        "akpm@linux-foundation.org", "linux-ext4@vger.kernel.org",
        "Wilcox, Matthew R"
Date: Fri, 15 Apr 2016 13:08:28 -0600
In-Reply-To: <1460746909.4597.7.camel@hpe.com>
References: <1459303190-20072-1-git-send-email-vishal.l.verma@intel.com>
        <1459303190-20072-6-git-send-email-vishal.l.verma@intel.com>
        <1460739288.3012.3.camel@intel.com>
        <1460741821.3012.11.camel@intel.com>
        <1460746909.4597.7.camel@hpe.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.18.5.2 (3.18.5.2-1.fc23)
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2016-04-15 at 13:01 -0600, Toshi Kani wrote:
> On Fri, 2016-04-15 at 11:17 -0700, Dan Williams wrote:
> >
> > On Fri, Apr 15, 2016 at 11:06 AM, Jeff Moyer wrote:
> > >
> > > Dan Williams writes:
> > >
> > > > > > There's a lot of special casing here, so you might consider
> > > > > > adding comments.
> > > > > Correct - maybe we should reconsider wrapper-izing this? :)
> > > > Another option is just to skip dax_do_io() and this special casing
> > > > fallback entirely if errors are present.  I.e. only attempt
> > > > dax_do_io when: IS_DAX() && gendisk->bb && bb->count == 0.
> > >
> > > So, if there's an error anywhere on the device, penalize all I/O (not
> > > just writes, and not just on sectors that are bad)?  I'm not sure
> > > that's a great plan, either.
> > >
> > If errors are rare how much are we actually losing in practice?
> > Moreover, we're going to do the full badblocks lookup anyway when we
> > call ->direct_access().  If we had that information earlier we can
> > avoid this fallback dance.
>
> A system running with DAX may have active data set in NVDIMM larger than
> RAM size.  In this case, falling back to non-DAX will allocate page cache
> for the data, which will saturate the system with memory pressure.

Oh, sorry, we are still in the DIO path.  Falling back to DIO should not
cause this issue.

-Toshi
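
For reference, the gating condition proposed in the quoted text
(IS_DAX() && gendisk->bb && bb->count == 0) could be modeled roughly as
below. This is only a userspace sketch, not kernel code: the structs are
simplified stand-ins that carry just the fields named in the thread, the
IS_DAX() macro here is a local substitute for the kernel one, and the
helper name can_try_dax_io() is hypothetical.

```c
/*
 * Userspace model only. These types are simplified stand-ins for the
 * kernel structures and keep just the fields named in the thread.
 */
#include <stdbool.h>
#include <stddef.h>

struct badblocks { int count; };            /* number of recorded bad ranges */
struct gendisk   { struct badblocks *bb; }; /* NULL if no badblocks list */
struct inode     { bool dax; };             /* stand-in for the DAX inode flag */

/* Local substitute for the kernel's IS_DAX() macro (assumption). */
#define IS_DAX(inode) ((inode)->dax)

/*
 * Hypothetical helper: returns true only when the inode is DAX, the disk
 * has a badblocks list, and that list records no errors. Any other case
 * would skip dax_do_io() and take the regular direct I/O path.
 */
static bool can_try_dax_io(const struct inode *inode,
                           const struct gendisk *disk)
{
    return IS_DAX(inode) && disk->bb != NULL && disk->bb->count == 0;
}
```

Note that under this rule a single recorded bad block anywhere on the
device sends all I/O down the non-DAX path, which is the cost Jeff points
out above.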