Received: by 2002:a05:6602:18e:0:0:0:0 with SMTP id m14csp5780106ioo; Wed, 1 Jun 2022 12:26:06 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxMylLZUsCTvvKOq4I2M2mWgb/Le5JHp+tEn+ZGf+GwEPCsc4kyXnClMOL8pdUrGIAaOfP2 X-Received: by 2002:a17:90a:31cf:b0:1c9:f9b8:68c7 with SMTP id j15-20020a17090a31cf00b001c9f9b868c7mr36248236pjf.34.1654111566607; Wed, 01 Jun 2022 12:26:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654111566; cv=none; d=google.com; s=arc-20160816; b=QkzI+u6xU66CFkvwy4LkV5UOzho4ymK80jmw9enJXS9bLSnCGlqM5JQgonWyGhjnZ0 aNmoyfbEF29+DxCFx8nqlweL2scGVn+rxPQ3+Nnh7z6ZWZ3Oec2yG/j+XatQGiq94U8S cujodxG1/HuU3qoMw35Tvz6hME2NOjhKwNjLju2Id1PRc/WdLZzI/l0xHUvjG6nx9tpx +xZ1htUyHSFkDttI6D4msOUs6Lzlif+wZvuQbo0o5zPSp/KPEZsBta2oQkSkRxSY26zY wsPMyuNavTGZgXQOhNU4Rt8aTE0wMOnSgsd9EHVCRZ3d02kGQ5WvBEcsH8fTuQw+4mvi KZIw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=vEMZSzPYyPuppCUCpnu87IRYMlWuSEAwqWH8Kx7sIKw=; b=IlQaj3G2xK9DGgiXQ0aBZLlz+aBG6uP3hWOjOiQVI3b/+7bpyW0n/YBeLOKH3wtoOo SHX3WIWrd9HAyYr7RYFHXvegzefp9/rH3M+rtaQZjyL60ld0AxJu05C5TTTsJC7lN3z8 DwLgA462hlDZIzVV4XdcmiAnkgH6QSCW6EiFW69xGfz8BKN/9Rlmjs1o4bc/WMd8xt61 4bSUsYr4GfeTO7TD0aPefPdo3vg3+4NzARUWVRXoF5akBB8F9uBnlhiLJR/XtR9rLOF+ GaY6kb4tiWA9H3bIt9UzL8fSVAzsrDc4qimN7fvxnGt+t7mKMaceYzUJCUbPPXXaaaSI N59g== ARC-Authentication-Results: i=1; mx.google.com; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id t190-20020a6381c7000000b003fc6079bda3si3138533pgd.489.2022.06.01.12.26.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jun 2022 12:26:06 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 0288D16ABE2; Wed, 1 Jun 2022 11:58:52 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350233AbiFAHSV (ORCPT + 99 others); Wed, 1 Jun 2022 03:18:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349941AbiFAHST (ORCPT ); Wed, 1 Jun 2022 03:18:19 -0400 Received: from bmailout3.hostsharing.net (bmailout3.hostsharing.net [IPv6:2a01:4f8:150:2161:1:b009:f23e:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0EEFE30572; Wed, 1 Jun 2022 00:18:15 -0700 (PDT) Received: from h08.hostsharing.net (h08.hostsharing.net [IPv6:2a01:37:1000::53df:5f1c:0]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "*.hostsharing.net", Issuer "RapidSSL TLS DV RSA Mixed SHA256 2020 CA-1" (verified OK)) by bmailout3.hostsharing.net (Postfix) with ESMTPS id 229E71002A013; Wed, 1 Jun 2022 09:18:09 +0200 (CEST) Received: by h08.hostsharing.net (Postfix, from userid 100393) id F2FF5118737; Wed, 1 Jun 2022 09:18:08 +0200 (CEST) Date: Wed, 1 Jun 2022 09:18:08 +0200 From: Lukas Wunner To: Ira Weiny Cc: Jonathan Cameron , Dan Williams , Bjorn Helgaas , Christoph Hellwig , Alison Schofield , Vishal Verma , Ben Widawsky , linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org, linux-pci@vger.kernel.org Subject: Re: [PATCH V8 03/10] PCI: Create PCI library functions in support of DOE mailboxes. Message-ID: <20220601071808.GA19924@wunner.de> References: <20220414203237.2198665-1-ira.weiny@intel.com> <20220414203237.2198665-4-ira.weiny@intel.com> <20220530190657.GA14765@wunner.de> <20220531113350.0000421e@Huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 31, 2022 at 07:59:21PM -0700, Ira Weiny wrote: > On Tue, May 31, 2022 at 11:33:50AM +0100, Jonathan Cameron wrote: > > On Mon, 30 May 2022 21:06:57 +0200 Lukas Wunner wrote: > > > On Thu, Apr 14, 2022 at 01:32:30PM -0700, ira.weiny@intel.com wrote: > > > > + /* First 2 dwords have already been read */ > > > > + length -= 2; > > > > + /* Read the rest of the response payload */ > > > > + for (i = 0; i < min(length, task->response_pl_sz / sizeof(u32)); i++) { > > > > + pci_read_config_dword(pdev, offset + PCI_DOE_READ, > > > > + &task->response_pl[i]); > > > > + pci_write_config_dword(pdev, offset + PCI_DOE_READ, 0); > > > > + } > > > > > > You need to check the Data Object Ready bit. The device may clear the > > > bit prematurely (e.g. as a result of a concurrent FLR or Conventional > > > Reset). You'll continue reading zero dwords from the mailbox and > > > pretend success to the caller even though the response is truncated. > > > > > > If you're concerned about performance when checking the bit on every > > > loop iteration, checking it only on the last but one iteration should > > > be sufficient to detect truncation. > > > > Good catch - I hate corner cases. Thankfully this one is trivial to > > check for. > > Ok looking at the spec: Strictly speaking this needs to happen multiple > times both in doe_statemachine_work() and inside pci_doe_recv_resp(); > not just in this loop. :-( > > This is because, the check in doe_statemachine_work() only covers the > 1st dword read IIUC. The spec says "this bit indicates the DOE instance has a *data object* available to be read by system firmware/software". So, the entire object is available for reading, not just one dword. You've already got checks in place for the first two dwords which cover reading an "all zeroes" response. No need to amend them. You only need to re-check the Data Object Ready bit on the last-but-one dword in case the function was reset concurrently. Per sec. 6.30.2, "An FLR to a Function must result in the aborting of any DOE transfer in progress." > > > > +static irqreturn_t pci_doe_irq_handler(int irq, void *data) > > > > +{ > > > > + struct pci_doe_mb *doe_mb = data; > > > > + struct pci_dev *pdev = doe_mb->pdev; > > > > + int offset = doe_mb->cap_offset; > > > > + u32 val; > > > > + > > > > + pci_read_config_dword(pdev, offset + PCI_DOE_STATUS, &val); > > > > + > > > > + /* Leave the error case to be handled outside IRQ */ > > > > + if (FIELD_GET(PCI_DOE_STATUS_ERROR, val)) { > > > > + mod_delayed_work(system_wq, &doe_mb->statemachine, 0); > > > > + return IRQ_HANDLED; > > > > + } > > > > + > > > > + if (FIELD_GET(PCI_DOE_STATUS_INT_STATUS, val)) { > > > > + pci_write_config_dword(pdev, offset + PCI_DOE_STATUS, > > > > + PCI_DOE_STATUS_INT_STATUS); > > > > + mod_delayed_work(system_wq, &doe_mb->statemachine, 0); > > > > + return IRQ_HANDLED; > > > > + } > > > > + > > > > + return IRQ_NONE; > > > > +} > > > > > > PCIe 6.0, table 7-316 says that an interrupt is also raised when > > > "the DOE Busy bit has been Cleared", yet such an interrupt is > > > not handled here. It is incorrectly treated as a spurious > > > interrupt by returning IRQ_NONE. The right thing to do > > > is probably to wake the state machine in case it's polling > > > for the Busy flag to clear. > > > > Ah. I remember testing this via a lot of hacking on the QEMU code > > to inject the various races that can occur (it was really ugly to do). > > > > Guess we lost the handling at some point. I think your fix > > is the right one. > > Perhaps I am missing something but digging into this more. I disagree > that the handler fails to handle this case. If I read the spec correctly > DOE Interrupt Status must be set when an interrupt is generated. > The handler wakes the state machine in that case. The state machine > then checks for busy if there is work to be done. Right, I was mistaken, sorry for the noise. > Normally we would not even need to check for status error. But that is > special cased because clearing that status is left to the state machine. That however looks wrong because the DOE Interrupt Status bit is never cleared after a DOE Error is signaled. The state machine performs an explicit abort upon an error by setting the DOE Abort bit, but that doesn't seem to clear DOE Interrupt Status: Per section 6.30.2, "At any time, the system firmware/software is permitted to set the DOE Abort bit in the DOE Control Register, and the DOE instance must Clear the Data Object Ready bit, if not already Clear, and Clear the DOE Error bit, if already Set, in the DOE Status Register, within 1 second." No mention of the DOE Interrupt Status bit, so we cannot assume that it's cleared by a DOE Abort and we must clear it explicitly. Thanks, Lukas