Date: Fri, 17 Sep 2021 08:27:44 -0700
From: "Darrick J. Wong"
To: Christoph Hellwig
Cc: Dan Williams, Jane Chu, Vishal L Verma, Dave Jiang, "Weiny, Ira",
	Al Viro, Matthew Wilcox, Jan Kara, Linux NVDIMM,
	Linux Kernel Mailing List, linux-fsdevel
Subject: Re: [PATCH 0/3] dax: clear poison on the fly along pwrite
Message-ID: <20210917152744.GA10250@magnolia>
References: <20210914233132.3680546-1-jane.chu@oracle.com>
	<516ecedc-38b9-1ae3-a784-289a30e5f6df@oracle.com>
	<20210915161510.GA34830@magnolia>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 17, 2021 at 01:53:33PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 16, 2021 at 11:40:28AM -0700, Dan Williams wrote:
> > > That was my gut feeling.  If everyone feels 100% comfortable with
> > > zeroing as the mechanism to clear poisoning I'll cave in.  The most
> > > important bit is that we do that through a dedicated DAX path instead
> > > of abusing the block layer even more.
> >
> > ...or just rename dax_zero_page_range() to dax_reset_page_range()?
> > Where reset == "zero + clear-poison"?
>
> I'd say that naming is more confusing than overloading zero.

How about dax_zeroinit_range()?

To go with its fallocate flag (yeah I've been too busy sorting out -rc1
regressions to repost this) FALLOC_FL_ZEROINIT_RANGE that will reset the
hardware (whatever that means) and set the contents to the known value
zero.

Userspace usage model:

void handle_media_error(int fd, loff_t pos, size_t len)
{
	int ret;

	/* yell about this for posterior's sake */

	ret = fallocate(fd, FALLOC_FL_ZEROINIT_RANGE, pos, len);

	/* yay our disk drive / pmem / stone table engraver is online */
}

> > > I'm really worried about both partitions on DAX and DM passing through
> > > DAX because they deeply bind DAX to the block layer, which is just a bad
> > > idea.  I think we also need to sort that whole story out before removing
> > > the EXPERIMENTAL tags.
> >
> > I do think it was a mistake to allow for DAX on partitions of a pmemX
> > block-device.
> >
> > DAX-reflink support may be the opportunity to start deprecating that
> > support. Only enable DAX-reflink for direct mounting on /dev/pmemX
> > without partitions (later add dax-device direct mounting),
>
> I think we need to fully or almost fully sort this out.
>
> Here is my bold suggestions:
>
>  1) drop no drop the EXPERMINTAL on the current block layer overload
>     at all

I don't understand this.

>  2) add direct mounting of the nvdimm namespaces ASAP.  Because all
>     the filesystem currently also need the /dev/pmem0 device add a way
>     to open the block device by the dax_device instead of our current
>     way of doing the reverse
>  3) deprecate DAX support through block layer mounts with a say 2 year
>     deprecation period
>  4) add DAX remapping devices as needed

What devices are needed?  linear for lvm, and maybe error so we can
actually test all this stuff?

> I'll volunteer to write the initial code for 2).  And I think we should
> not allow DAX+reflink on the block device shim at all.

/me has other questions about daxreflink, but I'll ask them on shiyang's
thread.

--D
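
For reference, the userspace usage model above could be fleshed out
roughly as follows.  This is only a sketch under stated assumptions:
FALLOC_FL_ZEROINIT_RANGE is a proposal from this thread and is not
assumed to exist in any released header, so the call is compiled in only
if the headers define it.  The zero_range_fallback() helper is
hypothetical and simply overwrites the range with zeros via pwrite();
whether such a write also clears poison depends on the kernel and is not
something this sketch guarantees.

```c
#define _GNU_SOURCE	/* for fallocate() when the flag is available */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite [pos, pos + len) with zeros via pwrite(). */
static int zero_range_fallback(int fd, off_t pos, size_t len)
{
	char buf[4096];

	memset(buf, 0, sizeof(buf));
	while (len > 0) {
		size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t n = pwrite(fd, buf, chunk, pos);

		if (n < 0 && errno == EINTR)
			continue;	/* interrupted, retry the same chunk */
		if (n <= 0) {
			if (n == 0)
				errno = EIO;	/* no progress, give up */
			return -1;
		}
		pos += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Returns 0 once the range reads back as zeros, -1 with errno set otherwise. */
int handle_media_error(int fd, off_t pos, size_t len)
{
	/* yell about this */
	fprintf(stderr, "media error at offset %lld, length %zu\n",
		(long long)pos, len);

#ifdef FALLOC_FL_ZEROINIT_RANGE
	/* ASSUMPTION: only built if the proposed flag ever reaches the headers. */
	if (fallocate(fd, FALLOC_FL_ZEROINIT_RANGE, pos, (off_t)len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != EINVAL)
		return -1;
#endif
	/* Flag missing or rejected: fall back to writing zeros. */
	return zero_range_fallback(fd, pos, len);
}
```

Unlike the void version above, this one returns a status so the caller
can tell whether the range was actually reset.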