Received: by 2002:ab2:7853:0:b0:1fa:5c73:8e2d with SMTP id m19csp55533lqp; Thu, 30 May 2024 11:49:21 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCURXsw1YDf6UZe0hhhdkBKhO/MwxFw4Ii/EgazB3yCdexEAo2LCppg8T48T0p5FvPdtOJCnUu5RwtVUvXsIkvSqT/TtgXobZZIA5cQRqg== X-Google-Smtp-Source: AGHT+IHlU15RyS3+x837cccKNJpT3kye659vqDOMQaC3DGiPTqgfJ1Cbk5UNHE95MqG75DZJXVWo X-Received: by 2002:a05:620a:2445:b0:792:bf10:3ebf with SMTP id af79cd13be357-794e9e118eamr396958685a.56.1717094961475; Thu, 30 May 2024 11:49:21 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1717094961; cv=pass; d=google.com; s=arc-20160816; b=ce+cRbrTZMpXQ9pRrK+H0UD1uJ+uiGDHaV7LPHpyMioCDvjwfE8V2rXmH//HmwkX4r gy8E9P2jKVqojr8oFpRve0s0Q7TmJnDDVLXbmtsMYemiK3Zzcil9znGDN/N0ns9X6Pbl TBZ1eKBe180+VOU6zTTaVpjIrCgILGOA6TaLV95CbNpRycSmA2fJ754QKSAolYNb+0Gt lzaXe1C7jmAJ8MLMcbQ1Ek4j8XaLbKDpLToF4iMDCIdiIHEbLjl1ybIvix01GFfkYrsM Hjq+rdlEg1qzHmC2aAuQSkguJYWMxH98TKDPcxDCFDvljHUD/UiC7G7aL7yQF1gGwk+D 3JPQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:user-agent:date:message-id:from :references:cc:to:subject; bh=Nf9pK9VieeVzrgtcsm9LjPEvQy2hLbIRIcsgLNhqEdg=; fh=3SDFeYnjQvbaSZvzHJQbaM1YQk9sZj1qi5nspBoLD8Q=; b=A5L9xRA2imHe0qSknC1WiF4yNrhEDecYeYwLPOXEoAtSQjwF+1QQ90jAqSr2EYb1O9 cNagF33R87XFEcd+V+wyOsKN64lL2+swSck1U1Ss2Xaa1samNga2tmhWWhpmvTljMctd ZprvegYn6+PeQOg32HrqT0oPwq6sis8FFFQ/+63549NLBqj6eOK9vr6cKKXzI4Nznkn6 IyDDXw09q+P5PL+/0JHyqAz26zZHhqTDmXVoZXmWk0iM0qo4+a/cO6z+BcmKpLFZTVun GXvgrN5hEid1B1cUE9dR/vsQepOJpoLEQ/shnT+ef0QiBhkv5VrqwgJHTB3809UciFOV cpyw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=easystack.cn dmarc=pass fromdomain=easystack.cn); spf=pass (google.com: domain of linux-kernel+bounces-195800-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-195800-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=easystack.cn Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id af79cd13be357-794f2f05ac7si18201185a.24.2024.05.30.11.49.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 May 2024 11:49:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-195800-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=easystack.cn dmarc=pass fromdomain=easystack.cn); spf=pass (google.com: domain of linux-kernel+bounces-195800-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-195800-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=easystack.cn Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 2FB7F1C2323E for ; Thu, 30 May 2024 18:49:21 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id CE1304EB31; Thu, 30 May 2024 18:49:14 +0000 (UTC) Received: from mail-m2411.xmail.ntesmail.com (mail-m2411.xmail.ntesmail.com [45.195.24.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E36FE4D595; Thu, 30 May 2024 18:49:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.195.24.11 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717094954; cv=none; b=bMeEfJWxs8LYMycjCMK/HP/dDuQhPtAogPa5OgfEdAD69GxpnFMM76XUNoumO/TrVuZHJ2Xv8ud5v5uHdIAVXjD2HJJZaGs6DijCFUoern4DI7CRS657P3x79Nc2AvTejEKReKnTI5trGzIQSwMahFoqxUWxSfCqUwq/VzicMEQ= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717094954; c=relaxed/simple; bh=P2bO5JWCsGZMtS8zS4ZP9PbUQbMPfPvnLfoGhj5y2VA=; h=Subject:To:Cc:References:From:Message-ID:Date:MIME-Version: In-Reply-To:Content-Type; b=rLKuO4GF0WaxwM621f0rwhq6hMIWLKQpiBNqeVkOlU6+3TWPYjTn0IlXLkOB0PCvNc/EFLEATmz9tIrcqVh8Q/cMLL7QdmyQ+uQy+XXObNhRBil+B9hcSOyvmS5ILFV9cBq8vrNjgdHGbI5ZEDO67xjhlW9yv/DwjdQudMZiKqE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=easystack.cn; spf=pass smtp.mailfrom=easystack.cn; arc=none smtp.client-ip=45.195.24.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=easystack.cn Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=easystack.cn Received: from [192.168.122.189] (unknown [218.94.118.90]) by smtp.qiye.163.com (Hmail) with ESMTPA id 3882D860313; Thu, 30 May 2024 14:59:39 +0800 (CST) Subject: Re: [PATCH RFC 0/7] block: Introduce CBD (CXL Block Device) To: Gregory Price Cc: Dan Williams , Jonathan Cameron , John Groves , axboe@kernel.dk, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev References: <20240503105245.00003676@Huawei.com> <5b7f3700-aeee-15af-59a7-8e271a89c850@easystack.cn> <20240508131125.00003d2b@Huawei.com> <20240508164417.00006c69@Huawei.com> <3d547577-e8f2-8765-0f63-07d1700fcefc@easystack.cn> <20240509132134.00000ae9@Huawei.com> <664cead8eb0b6_add32947d@dwillia2-mobl3.amr.corp.intel.com.notmuch> <8f161b2d-eacd-ad35-8959-0f44c8d132b3@easystack.cn> From: Dongsheng Yang Message-ID: <5db870de-ecb3-f127-f31c-b59443b4fbb4@easystack.cn> Date: Thu, 30 May 2024 14:59:38 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-HM-Spam-Status: e1kfGhgUHx5ZQUpXWQgPGg8OCBgUHx5ZQUlOS1dZFg8aDwILHllBWSg2Ly tZV1koWUFJQjdXWS1ZQUlXWQ8JGhUIEh9ZQVkaThpMVkxDTENJTR5NQkNCTVUZERMWGhIXJBQOD1 lXWRgSC1lBWUlKQ1VCT1VKSkNVQktZV1kWGg8SFR0UWUFZT0tIVUpNT0lMTlVKS0tVSkJLS1kG X-HM-Tid: 0a8fc84c11d1023ckunm3882d860313 X-HM-MType: 1 X-HM-Sender-Digest: e1kMHhlZQR0aFwgeV1kSHx4VD1lBWUc6OjY6DBw6MDcZPi8fExwQNy4S MQhPCx9VSlVKTEpMS05JSENLSEtLVTMWGhIXVR8UFRwIEx4VHFUCGhUcOx4aCAIIDxoYEFUYFUVZ V1kSC1lBWUlKQ1VCT1VKSkNVQktZV1kIAVlBTUlDTDcG 在 2024/5/29 星期三 下午 11:25, Gregory Price 写道: > On Wed, May 22, 2024 at 02:17:38PM +0800, Dongsheng Yang wrote: >> >> >> 在 2024/5/22 星期三 上午 2:41, Dan Williams 写道: >>> Dongsheng Yang wrote: >>> >>> What guarantees this property? How does the reader know that its local >>> cache invalidation is sufficient for reading data that has only reached >>> global visibility on the remote peer? As far as I can see, there is >>> nothing that guarantees that local global visibility translates to >>> remote visibility. In fact, the GPF feature is counter-evidence of the >>> fact that writes can be pending in buffers that are only flushed on a >>> GPF event. >> >> Sounds correct. From what I learned from GPF, ADR, and eADR, there would >> still be data in WPQ even though we perform a CPU cache line flush in the >> OS. >> >> This means we don't have a explicit method to make data puncture all caches >> and land in the media after writing. also it seems there isn't a explicit >> method to invalidate all caches along the entire path. >> >>> >>> I remain skeptical that a software managed inter-host cache-coherency >>> scheme can be made reliable with current CXL defined mechanisms. >> >> >> I got your point now, acorrding current CXL Spec, it seems software managed >> cache-coherency for inter-host shared memory is not working. Will the next >> version of CXL spec consider it? >>> > > Sorry for missing the conversation, have been out of office for a bit. > > It's not just a CXL spec issue, though that is part of it. I think the > CXL spec would have to expose some form of puncturing flush, and this > makes the assumption that such a flush doesn't cause some kind of > race/deadlock issue. Certainly this needs to be discussed. > > However, consider that the upstream processor actually has to generate > this flush. This means adding the flush to existing coherence protocols, > or at the very least a new instruction to generate the flush explicitly. > The latter seems more likely than the former. > > This flush would need to ensure the data is forced out of the local WPQ > AND all WPQs south of the PCIE complex - because what you really want to > know is that the data has actually made it back to a place where remote > viewers are capable of percieving the change. > > So this means: > 1) Spec revision with puncturing flush > 2) Buy-in from CPU vendors to generate such a flush > 3) A new instruction added to the architecture. > > Call me in a decade or so. > > > But really, I think it likely we see hardware-coherence well before this. > For this reason, I have become skeptical of all but a few memory sharing > use cases that depend on software-controlled cache-coherency. Hi Gregory, From my understanding, we actually has the same idea here. What I am saying is that we need SPEC to consider this issue, meaning we need to describe how the entire software-coherency mechanism operates, which includes the necessary hardware support. Additionally, I agree that if software-coherency also requires hardware support, it seems that hardware-coherency is the better path. > > There are some (FAMFS, for example). The coherence state of these > systems tend to be less volatile (e.g. mappings are read-only), or > they have inherent design limitations (cacheline-sized message passing > via write-ahead logging only). Can you explain more about this? I understand that if the reader in the writer-reader model is using a readonly mapping, the interaction will be much simpler. However, after the writer writes data, if we don't have a mechanism to flush and invalidate puncturing all caches, how can the readonly reader access the new data? > > ~Gregory >