Subject: Re: [PATCH v2 1/2] mwifiex: Use non-posted PCI write when setting TX ring write pointer
To: David Laight, 'Pali Rohár'
Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
    "David S. Miller", Jakub Kicinski, Tsuchiya Yuto,
    "linux-wireless@vger.kernel.org", "netdev@vger.kernel.org",
    "linux-kernel@vger.kernel.org", "linux-pci@vger.kernel.org",
    Maximilian Luz, Andy Shevchenko, Bjorn Helgaas, Heiner Kallweit,
    Johannes Berg, Brian Norris, "stable@vger.kernel.org"
References: <20210914114813.15404-1-verdre@v0yd.nl>
    <20210914114813.15404-2-verdre@v0yd.nl>
    <8f65f41a807c46d496bf1b45816077e4@AcuMS.aculab.com>
    <20210922142726.guviqler5k7wnm52@pali>
From: Jonas Dreßler
Date: Thu, 30 Sep 2021 16:27:40 +0200
X-Mailing-List: linux-wireless@vger.kernel.org

On 9/22/21 5:54 PM, David Laight wrote:
>
> From: Pali Rohár
>> Sent: 22 September 2021 15:27
>>
>> On Wednesday 22 September 2021 14:03:25 David Laight wrote:
>>> From: Jonas Dreßler
>>>> Sent: 14 September 2021 12:48
>>>>
>>>> On the 88W8897 card it's very important the TX ring write pointer is
>>>> updated correctly to its new value before setting the TX ready
>>>> interrupt, otherwise the firmware appears to crash (probably because
>>>> it's trying to DMA-read from the wrong place). The issue is present in
>>>> the latest firmware version 15.68.19.p21 of the pcie+usb card.
>>>>
>>>> Since PCI uses "posted writes" when writing to a register, it's not
>>>> guaranteed that a write will happen immediately. That means the pointer
>>>> might be outdated when setting the TX ready interrupt, leading to
>>>> firmware crashes especially when ASPM L1 and L1 substates are enabled
>>>> (because of the higher link latency, the write will probably take
>>>> longer).
>>>>
>>>> So fix those firmware crashes by always using a non-posted write for
>>>> this specific register write. We do that by simply reading back the
>>>> register after writing it, just as a few other PCI drivers do.
>>>>
>>>> This fixes a bug where during rx/tx traffic and with ASPM L1 substates
>>>> enabled (the enabled substates are platform dependent), the firmware
>>>> crashes and eventually a command timeout appears in the logs.
>>>
>>> I think you need to change your terminology.
>>> PCIe does have some non-posted write transactions - but I can't
>>> remember when they are used.
>>
>> In PCIe are all memory write requests as posted.
>>
>> Non-posted writes in PCIe are used only for IO and config requests. But
>> this is not case for proposed patch change as it access only card's
>> memory space.
>>
>> Technically this patch does not use non-posted memory write (as PCIe
>> does not support / provide it), just adds something like a barrier and
>> I'm not sure if it is really correct (you already wrote more details
>> about it, so I will let it be).
>>
>> I'm not sure what is the correct terminology, I do not know how this
>> kind of write-followed-by-read "trick" is correctly called.
>
> I think it is probably best to say:
> "flush the posted write when setting the TX ring write pointer".
>
> The write can get posted in any/all of the following places:
> 1) The cpu store buffer.
> 2) The PCIe host bridge.
> 3) Any other PCIe bridges.
> 4) The PCIe slave logic in the target.
>    There could be separate buffers for each BAR,
> 5) The actual target logic for that address block.
>    The target (probably) will look a bit like an old fashioned cpu
>    motherboard with the PCIe slave logic as the main bus master.
>
> The readback forces all the posted write buffers be flushed.
>
> In this case I suspect it is either flushing (5) or the extra
> delay of the read TLP processing that 'fixes' the problem.
>
> Note that depending on the exact code and host cpu the second
> write may not need to wait for the response to the read TLP.
> So the write, readback, write TLP may be back to back on the
> actual PCIe link.
>
> Although I don't have access to an actual PCIe monitor we
> do have the ability to trace 'data' TLP into fpga memory
> on one of our systems.
> This is near real-time but they are slightly munged.
> Watching the TLP can be illuminating!
>
>       David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>

Thanks for the detailed explanations. It does indeed look like the
read-back is not the real fix here: a simple udelay(50) before sending
the "TX ready" interrupt also does the trick.

                } else {
+                       udelay(50);
+
                        /* Send the TX ready interrupt */
                        if (mwifiex_write_reg(adapter, PCIE_CPU_INT_EVENT,
                                              CPU_INTR_DNLD_RDY)) {

I've tested that for a week now and haven't seen any firmware crashes.

Interestingly enough, it looks like the delay can also be added after
setting the "TX ready" interrupt, just not before updating the TX ring
write pointer.

I have no idea if 50 usecs is a good duration to wait here. From trying
different values I found that 10 to 20 usecs is not enough, but who
knows, maybe that's platform dependent?
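
For anyone following along, here is a toy userspace model of the buffering David describes: a write is only queued, and a later read forces everything queued before it to land first. This is a rough sketch and not driver code. The names (toy_link, toy_write, toy_read, REG_TX_WRPTR and so on) are made up for illustration, and the single queue stands in for all of the places a write can be held up. It says nothing about timing, so it cannot show why the delay helps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS  4
#define QUEUE_LEN 8

/* Hypothetical register indices, for illustration only. */
enum { REG_TX_WRPTR = 0, REG_CPU_INT_EVENT = 1 };

struct posted_write {
	unsigned int reg;
	uint32_t val;
};

struct toy_link {
	uint32_t regs[NUM_REGS];              /* what the device sees now */
	struct posted_write queue[QUEUE_LEN]; /* writes still in flight */
	size_t pending;
};

/* Posted write: returns immediately, the value is only queued. */
static void toy_write(struct toy_link *l, unsigned int reg, uint32_t val)
{
	assert(reg < NUM_REGS && l->pending < QUEUE_LEN);
	l->queue[l->pending].reg = reg;
	l->queue[l->pending].val = val;
	l->pending++;
}

/* Deliver all queued writes to the device, oldest first. */
static void toy_drain(struct toy_link *l)
{
	for (size_t i = 0; i < l->pending; i++)
		l->regs[l->queue[i].reg] = l->queue[i].val;
	l->pending = 0;
}

/* Read: earlier writes must land before the result can come back. */
static uint32_t toy_read(struct toy_link *l, unsigned int reg)
{
	assert(reg < NUM_REGS);
	toy_drain(l);
	return l->regs[reg];
}

/* Peek at the device-side value without generating any traffic. */
static uint32_t toy_device_view(const struct toy_link *l, unsigned int reg)
{
	assert(reg < NUM_REGS);
	return l->regs[reg];
}
```

After toy_write(&l, REG_TX_WRPTR, 0x40), toy_device_view() still returns the old value. The device-side register only changes once toy_read() has been called on the same link.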