From: Vivek Goyal
To: Dave Chinner
Cc: Eric Sandeen, Mike Snitzer, axboe@kernel.dk,
    linux-kernel@vger.kernel.org, xfs@oss.sgi.com, dm-devel@redhat.com,
    linux-fsdevel@vger.kernel.org, hch@lst.de
Date: Thu, 23 Jul 2015 12:43:58 -0400
Subject: Re: [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space
Message-ID: <20150723164358.GA24562@redhat.com>
In-Reply-To: <20150723051043.GB3902@dastard>

On Thu, Jul 23, 2015 at 03:10:43PM +1000, Dave Chinner wrote:

[..]
> I don't think knowing the bdev timeout is necessary because the
> default is most likely to be "fail fast" in this case. i.e. no
> retries, just shut down. IOWs, if we describe the configs and
> actions in neutral terms, then the default configurations easy for
> users to understand.
> i.e:
>
> bdev enospc       XFS default
> -----------       -----------
> Fail slow         Fail fast
> Fail fast         Fail slow
> Fail never        Fail never, Record in log
> EOPNOTSUPP        Fail never
>
> With that in mind, I'm thinking I should drop the
> "permanent/transient" error classifications, and change it "failure
> behaviour" with the options "fast slow [never]" and only the slow
> option has retry/timeout configuration options. I think the "never"
> option still needs to "fail at unmount" config variable, but we
> enable it by default rather than hanging unmount and requiring a
> manual shutdown like we do now....

I am wondering whether, instead of 4 knobs (fast, slow, never,
retry-timeout), we can make do with just one knob per error type:
retry-timeout.

retry-timeout=0   (Fail fast)
retry-timeout=X   (Fail slow)
retry-timeout=-1  (Never give up)

Also, do we really need this timeout per error type?

It would also be nice if this timeout were configurable using a mount
option. Then we can just specify it at mount time and be done with it.

The idea of auto-tuning based on what the block device is doing sounds
reasonable, but it should not be a requirement for this patch and can
go in later. It is one of those nice-to-have features.

Thanks
Vivek
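
To make the single-knob idea above concrete, here is a rough sketch of
how one retry-timeout value could encode all three behaviours. This is
illustration only, not code from the patch: the names fail_mode,
classify_retry_timeout() and should_retry() are hypothetical, the unit
(seconds) is an assumption, and treating every negative value like -1
is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of this is from the RFC patch. */
enum fail_mode {
	FAIL_FAST,	/* retry-timeout=0:  no retries, fail immediately */
	FAIL_SLOW,	/* retry-timeout=X:  retry until X has elapsed */
	FAIL_NEVER,	/* retry-timeout=-1: never give up */
};

/* Map a retry-timeout value onto one of the three behaviours. */
static enum fail_mode classify_retry_timeout(long retry_timeout)
{
	if (retry_timeout == 0)
		return FAIL_FAST;
	if (retry_timeout < 0)	/* assumption: any negative value means "never" */
		return FAIL_NEVER;
	return FAIL_SLOW;
}

/*
 * Decide whether a failed IO should be retried, given how long
 * (assumed: in seconds) it has been failing.
 */
static bool should_retry(long retry_timeout, long elapsed)
{
	switch (classify_retry_timeout(retry_timeout)) {
	case FAIL_FAST:
		return false;
	case FAIL_NEVER:
		return true;
	case FAIL_SLOW:
	default:
		return elapsed < retry_timeout;
	}
}
```

With this encoding, should_retry(0, 0) is false, should_retry(-1, n) is
true for any n, and should_retry(30, n) is true only while n < 30, so
the separate fast/slow/never selector is not needed.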