From: Nikolaus Rath
To: Maxim Patlasov
Cc: Miklos Szeredi, linux-fsdevel, LKML
Subject: Re: [fuse-devel] fuse: max_background and congestion_threshold settings
Date: Tue, 22 Nov 2016 14:45:36 -0800
Message-ID: <877f7vcewf.fsf@thinkpad.rath.org>

On Nov 16 2016, Maxim Patlasov wrote:
> On 11/16/2016 12:19 PM, Nikolaus Rath wrote:
>
>> On Nov 16 2016, Maxim Patlasov wrote:
>>> On 11/16/2016 11:19 AM, Nikolaus Rath wrote:
>>>
>>>> Hi Maxim,
>>>>
>>>> On Nov 15 2016, Maxim Patlasov wrote:
>>>>> On 11/15/2016 08:18 AM, Nikolaus Rath wrote:
>>>>>> Could someone explain to me the meaning of the max_background and
>>>>>> congestion_threshold settings of the fuse module?
>>>>>>
>>>>>> At first I assumed that max_background specifies the maximum number of
>>>>>> pending requests (i.e., requests that have been send to userspace but
>>>>>> for which no reply was received yet). But looking at fs/fuse/dev.c, it
>>>>>> looks as if not every request is included in this number.
>>>>> fuse uses max_background for cases where the total number of
>>>>> simultaneous requests of given type is not limited by some other
>>>>> natural means. AFAIU, these cases are: 1) async processing of direct
>>>>> IO; 2) read-ahead. As an example of "natural" limitation: when
>>>>> userspace process blocks on a sync direct IO read/write, the number of
>>>>> requests fuse consumed is limited by the number of such processes
>>>>> (actually their threads). In contrast, if userspace requests 1GB
>>>>> direct IO read/write, it would be unreasonable to issue 1GB/128K==8192
>>>>> fuse requests simultaneously. That's where max_background steps in.
>>>> Ah, that makes sense. Are these two cases meant as examples, or is that
>>>> an exhaustive list? Because I would have thought that other cases should
>>>> be writing of cached data (when writeback caching is enabled), and
>>>> asynchronous I/O from userspace...?
>>> I think that's exhaustive list, but I can miss something.
>>>
>>> As for writing of cached data, that definitely doesn't go through
>>> background requests. Here we rely on flusher: fuse will allocate as
>>> many requests as the flusher wants to writeback.
>>>
>>> Buffered AIO READs actually block in submit_io until fully
>>> processed. So it's just another example of "natural" limitation I told
>>> above.
>> Not sure I understand. What is it that's blocking? It can't be the
>> userspace process, because then it wouldn't be asynchronous I/O...
>
> Surprise! Alas, Linux kernel does NOT process buffered AIO reads in
> async manner.
> You can verify it yourself by strace-ing a simple
> program looping over io_submit + io_getevents: for direct IO (as
> expected) io_submit returns immediately while io_getevents waits for
> actual IO; in contrast, for buffered IO (surprisingly) io_submit waits
> for actual IO while io_getevents returns immediately. Presumably,
> people are supposed to use mmap-ed read/writes rather than buffered
> AIO.

What about buffered, asynchronous writes when the writeback cache is
disabled? It sounds as if io_submit does not block in that case (so
userspace could queue an unlimited number of requests), and the kernel
cannot coalesce them either (since writeback caching is disabled).

Thanks!
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«
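
P.S. As a sanity check on the request count in Maxim's direct IO example
quoted above, here is a minimal Python sketch of the arithmetic. The 1GB
and 128K figures are taken from his message; the helper name
fuse_requests_needed is made up for illustration and is not a kernel or
libfuse function.

```python
# Sizes from the quoted example: a 1GB direct IO split into 128K requests.
# The 128K per-request size is the figure from the thread, assumed here
# rather than read from the kernel sources.
GIB = 1024 ** 3
KIB = 1024


def fuse_requests_needed(io_bytes: int, request_bytes: int) -> int:
    """Return how many requests of request_bytes cover io_bytes (rounds up)."""
    if io_bytes < 0 or request_bytes <= 0:
        raise ValueError("io_bytes must be >= 0 and request_bytes > 0")
    # Ceiling division without floating point.
    return -(-io_bytes // request_bytes)


total = fuse_requests_needed(1 * GIB, 128 * KIB)
print(total)  # 8192
```

Without some cap, all of these requests could be outstanding at once,
which is the situation max_background is described as preventing for
async direct IO and read-ahead.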