Date: Wed, 6 Dec 2023 10:47:59 -0800
From: "Darrick J. Wong"
To: David Howells
Cc: fstests@vger.kernel.org, samba-technical@lists.samba.org,
    linux-cifs@vger.kernel.org, Steve French, Paulo Alcantara,
    Dave Chinner, Filipe Manana, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: Issues with FIEMAP, xfstests, Samba, ksmbd and CIFS
Message-ID: <20231206184759.GA3964019@frogsfrogsfrogs>
References: <447324.1701860432@warthog.procyon.org.uk>
In-Reply-To: <447324.1701860432@warthog.procyon.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 06, 2023 at 11:00:32AM +0000, David Howells wrote:
> Hi,
>
> I've been debugging apparent cifs failures with xfstests, in particular
> generic/009, and I'm finding that the tests are failing because FIEMAP is not
> returning exactly the expected extent map.
>
> The problem is that the FSCTL_QUERY_ALLOCATED_RANGES smb RPC op can only
> return a list of ranges that are allocated and does not return any other
> information about those allocations or the gaps between them - and thus FIEMAP
> cannot express this information to the extent that the test expects.

Perhaps that simply makes FSCTL_QUERY_ALLOCATED_RANGES -> FIEMAP
translation a poor choice? FIEMAP doesn't have a way to say "written
status unknown".

> Further, as Steve also observed, the expectation that the individual subtests
> should return exactly those ranges is flawed. The filesystem is at liberty to
> split extents, round up extents, bridge extents and automatically punch out
> blocks of zeros. xfstests/common/punch allows for some of this, but I wonder
> if it needs to be more fuzzy.
>
> I wonder if the best xfstests can be expected to check is that the data we
> have written is within the allocated regions.

I think the only expectation that generic/shared tests can have is that
file ranges they've written must not be reported as SEEK_HOLE. The
ranges reported by SEEK_DATA must include the file ranges written by
application software, but the data ranges can encompass more range than
that.

> Which brings me on to FALLOC_FL_ZERO_RANGE - is this guaranteed to result in
> an allocated region (if successful)?

Yes, that's the distinction between ZERO and PUNCH.

> Samba is translating FSCTL_SET_ZERO_DATA
> to FALLOC_FL_PUNCH_HOLE, as is ksmbd, and then there is no allocated range to

What does the FSCTL_SET_ZERO_DATA documentation say about the state of
the file range after a successful operation?

Oh. Heh. According to:
https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_set_zero_data

"If you use the FSCTL_SET_ZERO_DATA control code to write zeros (0) to a
sparse file and the zero (0) region is large enough, the file system may
not allocate disk space.

"If you use the FSCTL_SET_ZERO_DATA control code to write zeros (0) to a
non-sparse file, zeros (0) are written to the file. The system allocates
disk storage for all of the zero (0) range, which is equivalent to using
the WriteFile function to write zeros (0) to a file."

> report back (Samba and ksmbd use SEEK_HOLE/SEEK_DATA rather than FIEMAP -
> would a ZERO_RANGE even show up with that?).

That depends on the local disk's implementation of lseek and ZERO_RANGE.
XFS, for example, implements ZERO_RANGE by unmapping the entire range
and then reallocating it with an unwritten extent. There's no reason
why it couldn't also issue a WRITE_SAME to storage and change the
mapping state to written. The user-visible behavior would be the same
(reads return zeroes, space is allocated).

However. XFS' SEEK_DATA implementation (aka iomap's) skips over parts
of unwritten extents if there isn't a folio in the page cache. If some
day the implementation were adjusted to do that WRITE_SAME thing I
mentioned, then SEEK_DATA would return the entire range as data
regardless of pagecache state.

This difference between SEEK_DATA and FIEMAP has led to data corruption
problems in the past, because unwritten extents as reported by FIEMAP
can have dirty page cache sitting around. SEEK_DATA reports the dirty
pages as data; FIEMAP is silent.

> Finally, should the Linux cifs filesystem translate gaps in the result of
> FSCTL_QUERY_ALLOCATED_RANGES into 'unwritten' extents rather than leaving them
> as gaps in the list (to be reported as holes by xfs_io)? This smacks a bit of
> adjusting things for the sake of making the testsuite work when the testsuite
> isn't quite compatible with the thing being tested.

That doesn't make sense to me.

> So:
>
>  - Should Samba and ksmbd be using FALLOC_FL_ZERO_RANGE rather than
>    PUNCH_HOLE?

Probably depends on whether or not they present unix files as sparse or
non-sparse to Windows?

>  - Should Samba and ksmbd be using FIEMAP rather than SEEK_DATA/HOLE?

No.

>  - Should xfstests be less exacting in its FIEMAP analysis - or should this be
>    skipped for cifs? I don't want to skip generic/009 as it checks some
>    corner cases that need testing, but it may not be possible to make the
>    exact extent matching work.

It's a big lift but I think the generic fstests need to be reworked to
FIEMAP-check only the file ranges that they actually wrote. Those can't
be SEEK_HOLEs.

--D

>
> Thanks,
> David
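
For illustration only, here is a rough Python sketch of the relaxed check
discussed in this reply: ranges a test has written must fall inside what
lseek(SEEK_DATA)/lseek(SEEK_HOLE) report as data, while the data ranges are
free to cover more than was written. The helper names data_ranges() and
uncovered() are hypothetical and are not part of fstests; the sketch assumes
a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE (e.g. Linux).

```python
import errno
import os
import tempfile


def data_ranges(fd, size):
    """Return [(start, end), ...] that lseek() reports as data in [0, size)."""
    ranges = []
    pos = 0
    while pos < size:
        try:
            start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no data at or after pos
                break
            raise
        # End of file counts as an implicit hole, so this always succeeds.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:  # defensive: never loop forever
            break
        ranges.append((start, end))
        pos = end
    return ranges


def uncovered(written, data):
    """Return the parts of the written ranges not covered by any data range.

    Both arguments are lists of half-open (start, end) byte ranges.
    Data ranges may be larger than, or merge, the written ranges.
    """
    missing = []
    data = sorted(data)
    for w_start, w_end in written:
        pos = w_start
        for d_start, d_end in data:
            if d_end <= pos:
                continue  # data range lies entirely before our position
            if d_start >= w_end:
                break  # data range lies entirely after the written range
            if d_start > pos:
                missing.append((pos, d_start))  # gap reported as a hole
            pos = max(pos, d_end)
            if pos >= w_end:
                break
        if pos < w_end:
            missing.append((pos, w_end))
    return missing


if __name__ == "__main__":
    # Write two separated chunks, then check neither is reported as a hole.
    written = [(0, 4096), (1 << 20, (1 << 20) + 4096)]
    with tempfile.NamedTemporaryFile() as f:
        for start, end in written:
            os.pwrite(f.fileno(), b"x" * (end - start), start)
        os.fsync(f.fileno())
        size = os.fstat(f.fileno()).st_size
        print("unexpected holes:", uncovered(written, data_ranges(f.fileno(), size)))
```

A filesystem that rounds up, merges or bridges extents still passes this
check; only a written range that comes back as a hole is flagged.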