Date: Wed, 24 Feb 2021 18:44:11 +0100
From: Jan Kara
To: Matthew Wilcox
Cc: Jan Kara, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Christoph Hellwig, Kent Overstreet
Subject: Re: [RFC] Better page cache error handling
Message-ID: <20210224174411.GH849@quack2.suse.cz>
References: <20210205161142.GI308988@casper.infradead.org>
	<20210224123848.GA27695@quack2.suse.cz>
	<20210224134115.GP2858050@casper.infradead.org>
In-Reply-To: <20210224134115.GP2858050@casper.infradead.org>

On Wed 24-02-21 13:41:15, Matthew Wilcox wrote:
> On Wed, Feb 24, 2021 at 01:38:48PM +0100, Jan Kara wrote:
> > > We allocate a page and try to read it. 29 threads pile up waiting
> > > for the page lock in filemap_update_page(). The error returned by the
> > > original I/O is shared between all 29 waiters as well as being returned
> > > to the requesting thread. The next request for index.html will send
> > > another I/O, and more waiters will pile up trying to get the page lock,
> > > but at no time will more than 30 threads be waiting for the I/O to fail.
> >
> > Interesting idea. It certainly improves the current behavior. I just
> > wonder whether this isn't a partial solution to the problem, and whether
> > a full solution wouldn't have to go in a different direction? I mean, it
> > just seems wrong that each reader (let's assume they don't overlap) has
> > to retry the failed IO and wait for the HW to figure out it's not going
> > to work. Shouldn't we cache the error state with the page? And I
> > understand that we then also have to deal with the problem of how to
> > invalidate the error state when the block might eventually become
> > readable again (for stuff like temporary IO failures). That would need
> > some signalling from the driver to the page cache, maybe in the form of
> > an error recovery sequence counter or something like that. For stuff
> > like iSCSI, multipath, or NBD it could be doable, I believe...
>
> That felt like a larger change than I wanted to make. I already have
> a few big projects on my plate!

I can understand that ;)

> Also, it's not clear to me that the host can necessarily figure out when
> a device has fixed an error -- certainly for the three cases you list
> it can be done. I think we'd want a timer to indicate that it's worth
> retrying instead of returning the error.
>
> Anyway, that seems like a lot of data to cram into a struct page. So I
> think my proposal is still worth pursuing while waiting for someone to
> come up with a perfect solution.

Yes, a timer could be a fallback. Or we could just schedule work to
discard all 'error' pages in the fs in an hour or so. Not perfect, but
more or less workable, I'd say. Also, I don't think we need to cram this
directly into struct page - I think it is perfectly fine to kmalloc() the
structure we need for caching when we hit an error, and just not cache if
the allocation fails. Then we might just reference it from an appropriate
place... I didn't put too much thought into this...
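To sketch what I mean (all the names below are made up, nothing like this
exists in the tree, and I'm glossing over locking and lifetime):

/*
 * Purely hypothetical: error state we could kmalloc() when a read fails
 * and hang off the page / address_space. If the allocation fails, we
 * simply don't cache anything and behave as today.
 */
struct page_error_state {
	int error;			/* errno from the failed read */
	unsigned int recovery_seq;	/* device recovery counter sampled
					 * when the error was cached */
	unsigned long stamp;		/* jiffies when cached, for the
					 * discard work below */
	struct list_head list;		/* per-sb list of cached errors */
};

/*
 * Return the cached error, or 0 when a retry makes sense.
 * bdev_recovery_seq() is made up - it stands for the signalling from
 * the driver (iSCSI, multipath, NBD, ...) we were talking about.
 */
static int page_cached_error(struct page_error_state *pes,
			     struct block_device *bdev)
{
	if (!pes)
		return 0;
	if (bdev_recovery_seq(bdev) != pes->recovery_seq)
		return 0;	/* device went through recovery - retry */
	return pes->error;
}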
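The discard work could then be quite dumb: once an hour, walk the cached
errors and free whatever has grown old, so the next reader retries the IO.
Again just a sketch, with a fictional per-superblock container:

/* Fictional: the sb would keep a list of pages with cached errors. */
struct sb_error_cache {
	spinlock_t lock;
	struct list_head errors;	/* page_error_state entries */
	struct delayed_work discard_work;
};

static void sb_discard_errors(struct work_struct *work)
{
	struct sb_error_cache *ec = container_of(to_delayed_work(work),
						 struct sb_error_cache,
						 discard_work);
	struct page_error_state *pes, *next;

	spin_lock(&ec->lock);
	list_for_each_entry_safe(pes, next, &ec->errors, list) {
		/* Older than an hour - drop it so reads retry the IO. */
		if (time_after(jiffies, pes->stamp + 3600 * HZ)) {
			list_del(&pes->list);
			kfree(pes);
		}
	}
	spin_unlock(&ec->lock);
	/* Re-arm for the next sweep in an hour. */
	schedule_delayed_work(&ec->discard_work, 3600 * HZ);
}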
								Honza
-- 
Jan Kara
SUSE Labs, CR