From: Dan Williams
Date: Tue, 2 Nov 2021 09:03:55 -0700
Subject: Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag
To: Christoph Hellwig
Cc: Jane Chu, "david@fromorbit.com", "djwong@kernel.org",
    "vishal.l.verma@intel.com", "dave.jiang@intel.com", "agk@redhat.com",
    "snitzer@redhat.com", "dm-devel@redhat.com", "ira.weiny@intel.com",
    "willy@infradead.org", "vgoyal@redhat.com",
    "linux-fsdevel@vger.kernel.org", "nvdimm@lists.linux.dev",
    "linux-kernel@vger.kernel.org", "linux-xfs@vger.kernel.org"
References: <20211021001059.438843-1-jane.chu@oracle.com>
    <2102a2e6-c543-2557-28a2-8b0bdc470855@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 26, 2021 at 11:50 PM Christoph Hellwig wrote:
>
> On Fri, Oct 22, 2021 at 08:52:55PM +0000, Jane Chu wrote:
> > Thanks - I try to be honest. As far as I can tell, the argument
> > about the flag is a philosophical argument between two views.
> > One view assumes design based on perfect hardware, and media error
> > belongs to the category of brokenness.
> > Another view sees media error as a build-in hardware component
> > and make design to include dealing with such errors.
>
> No, I don't think so. Bit errors do happen in all media, which is
> why devices are built to handle them. It is just the Intel-style
> pmem interface to handle them which is completely broken.

No, any media can report checksum / parity errors. NVMe also seems to
do a poor job with multi-bit ECC errors consumed from DRAM. There is
nothing "pmem" or "Intel" specific here.

> > errors in mind from start. I guess I'm trying to articulate why
> > it is acceptable to include the RWF_DATA_RECOVERY flag to the
> > existing RWF_ flags. - this way, pwritev2 remain fast on fast path,
> > and its slow path (w/ error clearing) is faster than other alternative.
> > Other alternative being 1 system call to clear the poison, and
> > another system call to run the fast pwrite for recovery, what
> > happens if something happened in between?
>
> Well, my point is doing recovery from bit errors is by definition not
> the fast path. Which is why I'd rather keep it away from the pmem
> read/write fast path, which also happens to be the (much more important)
> non-pmem read/write path.

I would expect this interface to be useful outside of pmem as a
"failfast" or "try harder to recover" flag for reading over media
errors.
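
As a rough illustration of the single-call shape described in the quoted
text (ordinary writes carry no extra flag, and recovery is attempted only
after a failure), here is a minimal Python sketch. It assumes a write over
poisoned media surfaces as EIO. RWF_RECOVERY_DATA below is a placeholder
bit, not a real kernel ABI value, and an unpatched kernel would be expected
to reject it. write_with_recovery and its injectable pwritev parameter are
hypothetical names used only for this example.

```python
import errno
import os

# Placeholder bit for the flag proposed in this thread. This is NOT a
# real kernel ABI value; it only marks the slow path in this sketch.
RWF_RECOVERY_DATA = 1 << 30


def write_with_recovery(fd, data, offset, pwritev=None):
    """Write normally; on EIO, retry once with the recovery flag.

    `pwritev` is injectable so the control flow can be exercised
    without pmem hardware or a patched kernel. It defaults to
    os.pwritev, which takes (fd, buffers, offset, flags).
    """
    if pwritev is None:
        pwritev = os.pwritev
    try:
        # Fast path: no extra flag is passed for the common case.
        return pwritev(fd, [data], offset, 0)
    except OSError as err:
        # Only a media error (assumed to be EIO) triggers recovery.
        if err.errno != errno.EIO:
            raise
    # Slow path: a single call intended to clear the poison and write
    # the new data, so nothing can happen between the two steps.
    return pwritev(fd, [data], offset, RWF_RECOVERY_DATA)
```

The alternative mentioned in the quoted text would replace the last call
with two separate system calls (clear the poison, then write), which is
where the "what happens in between" question comes from. Whether the flag
check belongs in the common read/write path at all is the point still
being argued above; the sketch takes no position on that.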