From: Andreas Dilger
Subject: Re: Aw: Re: Ext4: Slow performance on first write after mount
Date: Tue, 21 May 2013 18:27:13 -0600
Message-ID: <5FF1E777-4701-4F9E-A2C2-48092510428C@dilger.ca>
References: <1679869241.585607.1368809483337.JavaMail.ngmail@webmail12.arcor-online.net>
 <20130519140023.GB7183@thunk.org>
 <20130520114647.GC8404@thunk.org>
 <46558986.3488364.1369159356954.JavaMail.ngmail@webmail09.arcor-online.net>
In-Reply-To: <46558986.3488364.1369159356954.JavaMail.ngmail@webmail09.arcor-online.net>
To: frankcmoeller@arcor.de
Cc: linux-ext4@vger.kernel.org

On 2013-05-21, at 12:02 PM, frankcmoeller@arcor.de wrote:
>> I like the idea of keeping the high bits of the buddy bitmap
>> in the group descriptor, instead of just the largest free order.
>> It takes the same amount of space, but provides more information.
> More informations for what?

Sorry, what I meant to write was that it provides more information
than just recording e.g. the number of blocks in the largest free
extent.

> The allocator or better the good_group function
> needs bb_largest_free_order and in some cases fragment count. Do you
> want to use the bitmap for a not 100% correct fragment count
> calculation? Or is there another use for it?

The bitmap would provide the largest_free_order value directly
(assuming the largest free extent is at least 4MB in size).

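To make the last point concrete, here is a minimal sketch of how a largest-free-order hint could be read off the top levels of a buddy bitmap. Everything in it is an assumption for illustration: the 4 KiB block size, the 32768-block group, the order range 10 to 13, and the function names are hypothetical and are not ext4's on-disk format or the mballoc code. It also simplifies a real buddy bitmap by marking every aligned chunk that is entirely free, at every order.

```python
# Illustrative sketch only: the constants and layout are assumptions,
# not ext4's on-disk format or the mballoc implementation.
BLOCKS_PER_GROUP = 32768   # assumed: 128 MiB group with 4 KiB blocks
MIN_HINT_ORDER = 10        # 2**10 blocks * 4 KiB = 4 MiB
MAX_ORDER = 13             # assumed largest tracked order (32 MiB chunks)


def top_buddy_bits(free, min_order=MIN_HINT_ORDER, max_order=MAX_ORDER):
    """Return {order: [bool, ...]} for the high orders only.

    free[i] is True when block i is free. Entry j of a given order is
    True when the aligned chunk of 2**order blocks starting at block
    j * 2**order is entirely free.
    """
    bits = {}
    for order in range(min_order, max_order + 1):
        size = 1 << order
        # Only complete, aligned chunks are considered.
        bits[order] = [all(free[start:start + size])
                       for start in range(0, len(free) - size + 1, size)]
    return bits


def largest_free_order_hint(bits):
    """Highest order with a free chunk, or None if no tracked order has one."""
    for order in sorted(bits, reverse=True):
        if any(bits[order]):
            return order
    return None


# Example: one free, aligned run of 2048 blocks (8 MiB) in a full group.
free = [False] * BLOCKS_PER_GROUP
free[2048:4096] = [True] * 2048
hint = largest_free_order_hint(top_buddy_bits(free))
```

Under these assumed sizes the four tracked orders need 4 + 8 + 16 + 32 = 60 bits per group. A result of None only says that no aligned free chunk of 4 MiB or more exists, so the full bitmap would still have to be consulted for smaller allocations.
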
Cheers, Andreas

>>> On Sun, 19 May 2013 21:36:02 +0200 (CEST) Frank C Moeller wrote:
>>>> From my point (end user) I would prefer a builtin solution. I'm also a
>>>> programmer and I can therefore understand why you don't want to change
>>>> anything.
>>>
>>> It's not that I don't want to change anything, it's that I'm very
>>> hesitant to add new mount options or new code paths that now need more
>>> testing unless there's no other way of addressing a particular use
>>> case. Another consideration is how to do it in such a way that it
>>> doesn't degrade other users' performance.
>>>
>>> Issuing readahead requests for the bitmap blocks might be good
>>> compromise; since they are readahead requests, as low priority
>>> requests they won't interfere with anything else going on, and in
>>> practice, unless you are starting your video recording **immediately**
>>> after the reboot, it should address your concern.
>>
>> Right. Some of our users do something similar in userspace to avoid
>> slowdown on first write, which doesn't _usually_ happen immediately
>> after mount, but this isn't always helpful.
>>
>>> (Also note that for
>>> most people willing to hack a DVR, adding a line to /etc/rc.local is
>>> usually considered easier than building a new kernel from sources and
>>> then after making file system format changes, requiring a reformat of
>>> their data disk!)
>>
>> I think storing the buddy bitmap top bits in the GDT could be a COMPAT
>> feature. It is just a hint that could be ignored or incorrect, since
>> the actual bitmap would be authoritative.
>>
>> Cheers, Andreas
>>
>>> So it's not that I'm against solutions that involve kernel changes or
>>> file system format changes. It's just that I want to make sure we
>>> explore the entire solution space, since there are costs in terms of
>>> testing costs, the need to do a backup-reformat-restore pass, etc,
>>> etc., to some of the solutions that have been suggested so far.
>>>
>>> Regards,
>>>
>>> - Ted
>> Cheers, Andreas
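
The userspace warm-up discussed in the quoted text can be sketched as follows. The thread does not show the actual /etc/rc.local line, so this is one possible approach, not the one Ted or Andreas had in mind. It assumes that reading /proc/fs/ext4/<device>/mb_groups makes the kernel load the allocation information for every block group. The function name and the `proc_root` parameter are hypothetical.

```python
import os


def warm_block_bitmaps(device, proc_root="/proc/fs/ext4"):
    """Read <proc_root>/<device>/mb_groups and discard the contents.

    Assumption: producing this file forces the kernel to load the
    per-group allocation data, so later writes do not pay that cost.
    Returns the number of bytes read.
    """
    path = os.path.join(proc_root, device, "mb_groups")
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(65536)
            if not chunk:
                break
            total += len(chunk)
    return total
```

A boot script could call `warm_block_bitmaps("sdX1")`, where `sdX1` is a placeholder for the real device name, ideally in the background so that it does not delay startup. Unlike kernel readahead, this runs as ordinary reads and competes with other I/O while it runs.
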