Subject: Re: THP-enabled filesystem vs. FALLOC_FL_PUNCH_HOLE
To: "Kirill A. Shutemov", linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org
Cc: "Kirill A. Shutemov", Hugh Dickins, Andrea Arcangeli, Andrew Morton, Vlastimil Babka, Christoph Lameter, Naoya Horiguchi, Jerome Marchand, Yang Shi, Sasha Levin, linux-kernel@vger.kernel.org, linux-mm@kvack.org
From: Dave Hansen
Date: Fri, 4 Mar 2016 09:40:18 -0800
Message-ID: <56D9C882.3040808@intel.com>
In-Reply-To: <20160304112603.GA9790@node.shutemov.name>
References: <1457023939-98083-1-git-send-email-kirill.shutemov@linux.intel.com> <20160304112603.GA9790@node.shutemov.name>

On 03/04/2016 03:26 AM, Kirill A. Shutemov wrote:
> On Thu, Mar 03, 2016 at 07:51:50PM +0300, Kirill A. Shutemov wrote:
>> Truncate and punch hole that only cover part of THP range is implemented
>> by zero out this part of THP.
>>
>> This have visible effect on fallocate(FALLOC_FL_PUNCH_HOLE) behaviour.
>> As we don't really create hole in this case, lseek(SEEK_HOLE) may have
>> inconsistent results depending what pages happened to be allocated.
>> Not sure if it should be considered ABI break or not.
>
> Looks like this shouldn't be a problem. man 2 fallocate:
>
>	Within the specified range, partial filesystem blocks are zeroed,
>	and whole filesystem blocks are removed from the file.  After a
>	successful call, subsequent reads from this range will return
>	zeroes.
>
> It means we effectively have 2M filesystem block size.

The question is still whether this will cause problems for apps.

Isn't 2MB quite an unusual block size?  Wouldn't some files on a tmpfs
filesystem act like they have a 2M blocksize and others like they have
4k?  Would that confuse apps?
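
To make the concern concrete, here is a rough sketch of the rule quoted from the man page: partial blocks are zeroed, whole blocks are removed. It is only arithmetic on byte ranges, not kernel code and not a description of how tmpfs actually implements punch hole. The function name punch_hole_split and its return shape are made up for illustration.

```python
def punch_hole_split(offset, length, block_size):
    """Model the man-page rule for FALLOC_FL_PUNCH_HOLE on one byte range.

    Hypothetical helper for illustration only. Returns a tuple
    (zeroed, removed): 'zeroed' is a list of (start, end) byte ranges
    that fall in partial blocks and would only be zeroed; 'removed' is
    the (start, end) range of whole blocks that would become a real
    hole, or None if no whole block is covered.
    """
    if offset < 0 or length <= 0 or block_size <= 0:
        raise ValueError("offset must be >= 0, length and block_size > 0")

    end = offset + length
    # First block boundary at or after the start of the range.
    first_whole = (offset + block_size - 1) // block_size * block_size
    # Last block boundary at or before the end of the range.
    last_whole = end // block_size * block_size

    if first_whole >= last_whole:
        # The range does not cover any whole block: everything is zeroed.
        return [(offset, end)], None

    zeroed = []
    if offset < first_whole:
        zeroed.append((offset, first_whole))
    if last_whole < end:
        zeroed.append((last_whole, end))
    return zeroed, (first_whole, last_whole)


if __name__ == "__main__":
    MiB = 1 << 20
    # Same call, two effective block sizes: 4k pages vs. a 2M huge page.
    for bs in (4096, 2 * MiB):
        print(bs, punch_hole_split(1 * MiB, 1 * MiB, bs))
```

Under this model, punching 1M at offset 1M leaves a real hole with a 4k block size, but with an effective 2M block size the same call only zeroes data. An app that follows up with lseek(SEEK_HOLE) would then see different results for two files on the same tmpfs mount, depending on which one happens to be backed by huge pages.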