Date: Sat, 1 Feb 2020 12:29:34 -0600
From: Bjorn Helgaas
To: Muni Sekhar
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: pcie: xilinx: kernel hang - ISR readl()
Message-ID: <20200201182934.GA6311@google.com>

On Sat, Feb 01, 2020 at 08:44:40AM +0530, Muni Sekhar wrote:
> On Sat, Feb 1, 2020 at 2:16 AM Bjorn Helgaas wrote:
> > On Fri, Jan 31, 2020 at 10:04:05PM +0530, Muni Sekhar wrote:
> > > On Fri, Jan 31, 2020 at 12:30 AM Bjorn Helgaas wrote:
> > > > On Thu, Jan 30, 2020 at 09:37:48PM +0530, Muni Sekhar wrote:
> > > > > On Thu, Jan 9, 2020 at 10:05 AM Bjorn Helgaas wrote:
> > > > > >
> > > > > > On Thu, Jan 09, 2020 at 08:47:51AM +0530, Muni Sekhar wrote:
> > > > > > > On Thu, Jan 9, 2020 at 1:45 AM Bjorn Helgaas wrote:
> > > > > > > > On Tue, Jan 07, 2020 at 09:45:13PM +0530, Muni Sekhar wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I have a module with a Xilinx FPGA. It implements UART(s), SPI(s),
> > > > > > > > > parallel I/O and interfaces them to the Host CPU via PCI Express bus.
> > > > > > > > > I see that my system freezes without capturing the crash dump for
> > > > > > > > > certain tests. I debugged this issue and it was tracked down to the
> > > > > > > > > below mentioned interrupt handler code.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > In ISR, first reads the Interrupt Status register using ‘readl()’ as
> > > > > > > > > given below.
> > > > > > > > > status = readl(ctrl->reg + INT_STATUS);
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > And then clears the pending interrupts using ‘writel()’ as given below.
> > > > > > > > > writel(status, ctrl->reg + INT_STATUS);
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I've noticed a kernel hang if INT_STATUS register read again after
> > > > > > > > > clearing the pending interrupts.
> > > > > > > > >
> > > > > > > > > Can someone clarify me why the kernel hangs without crash dump in case
> > > > > > > > > if I read the INT_STATUS register using readl() after clearing the
> > > > > > > > > pending bits?
> > > > > > > > >
> > > > > > > > > Can readl() block?
> > > > > > > >
> > > > > > > > readl() should not block in software. Obviously at the hardware CPU
> > > > > > > > instruction level, the read instruction has to wait for the result of
> > > > > > > > the read. Since that data is provided by the device, i.e., your FPGA,
> > > > > > > > it's possible there's a problem there.
> > > > > > >
> > > > > > > Thank you very much for your reply.
> > > > > > > Where can I find the details about what is protocol for reading the
> > > > > > > ‘memory mapped IO’? Can you point me to any useful links..
> > > > > > > I tried to locate the exact point of the kernel code where CPU waits for
> > > > > > > read instruction as given below.
> > > > > > > readl() -> __raw_readl() -> return *(const volatile u32 __force *)add
> > > > > > > Do I need to check for the assembly instructions, here?
> > > > > >
> > > > > > The C pointer dereference, e.g., "*address", will be some sort of a
> > > > > > "load" instruction in assembly. The CPU wait isn't explicit; it's
> > > > > > just that when you load a value, the CPU waits for the value.
> > > > > >
> > > > > > > > Can you tell whether the FPGA has received the Memory Read for
> > > > > > > > INT_STATUS and sent the completion?
> > > > > > >
> > > > > > > Is there a way to know this with the help of software debugging(either
> > > > > > > enabling dynamic debugging or adding new debug prints)? Can you please
> > > > > > > point some tools\hw needed to find this?
> > > > > >
> > > > > > You could learn this either via a PCIe analyzer (expensive piece of
> > > > > > hardware) or possibly some logic in the FPGA that would log PCIe
> > > > > > transactions in a buffer and make them accessible via some other
> > > > > > interface (you mentioned it had parallel and other interfaces).
> > > > > >
> > > > > > > > On the architectures I'm familiar with, if a device doesn't respond,
> > > > > > > > something would eventually time out so the CPU doesn't wait forever.
> > > > > > >
> > > > > > > What is timeout here? I mean how long CPU waits for completion? Since
> > > > > > > this code runs from interrupt context, does it cause the system to
> > > > > > > freeze if timeout is more?
> > > > > >
> > > > > > The Root Port should have a Completion Timeout. This is required by
> > > > > > the PCIe spec. The *reporting* of the timeout is somewhat
> > > > > > implementation-specific since the reporting is outside the PCIe
> > > > > > domain. I don't know the duration of the timeout, but it certainly
> > > > > > shouldn't be long enough to look like a "system freeze".
> > > > > Does the kernel write to PCIe configuration space register ‘Device
> > > > > Control 2 Register’ (Offset 0x28)? When I tried to read this register,
> > > > > I noticed bit 4 is set (which disables completion timeouts) and rest
> > > > > all other bits are zero. So, Completion Timeout detection mechanism is
> > > > > disabled, right? If so what could be the reason for this?
> > > >
> > > > To my knowledge Linux doesn't set PCI_EXP_DEVCTL2_COMP_TMOUT_DIS
> > > > except for one powerpc case. You can check yourself by using cscope
> > > > or grep to look for PCI_EXP_DEVCTL2_COMP_TMOUT_DIS or PCI_EXP_DEVCTL2.
> > > >
> > > > If you're seeing bit 4 (PCI_EXP_DEVCTL2_COMP_TMOUT_DIS) set, it's
> > > > likely because firmware set it. You can try booting with
> > > > "pci=earlydump" to see what's there before Linux starts changing
> > > > things.
> > Yes Linux doesn't set PCI_EXP_DEVCTL2_COMP_TMOUT_DIS, verified with earlydump.
> Firmware means BIOS? If so is there a way to enable the timeout detection?

Sure; you can change the kernel to turn off
PCI_EXP_DEVCTL2_COMP_TMOUT_DIS (for debugging purposes, at least), or
you can do it with setpci, e.g.,

  # setpci -s01:00.0 CAP_EXP+0x28.W=0x0000

> 01:00.0 RAM memory: PLDA Device 5555
> Subsystem: Device 4000:0000
> Flags: bus master, fast devsel, latency 0, IRQ 16
> Memory at d0400000 (32-bit, non-prefetchable) [size=4M]
> Capabilities: [40] Power Management version 3
> Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit-
> Capabilities: [60] Express Endpoint, MSI 00
> Kernel driver in use: PLDA PCI
> Kernel modules: plda_pci
>
> 00: 56 15 55 55 07 00 10 00 00 00 00 05 10 00 00 00
> 10: 00 00 40 d0 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00
> 30: 00 00 00 00 40 00 00 00 00 00 00 00 0b 01 00 00
> 40: 01 48 03 00 08 00 00 00 05 60 00 00 00 00 00 00
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 60: 10 00 02 00 c2 8f 00 00 00 28 01 00 21 f4 03 00
> 70: 01 00 21 00 00 00 00 00 00 00 00 00 00 00 00 00
> 80: 00 00 00 00 02 00 00 00 10 00 00 00 00 00 00 00
> 90: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> So, on my system, the PCI Express capability is at "[60]" and
> PCI_EXP_DEVCTL2 is at 0x88 with value 0x0010
> (PCI_EXP_DEVCTL2_COMP_TMOUT_DIS). Also this matches what lspci
> decodes:
>
> $ sudo lspci -vvs00.0 | grep -A1 DevCtl2
> DevCtl2: Completion Timeout: 50us to 50ms,
> TimeoutDis+, LTR-, OBFF Disabled
> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
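
The arithmetic in the quoted dump (capability at 0x60, DevCtl2 at 0x60 + 0x28 = 0x88, bit 4 set) can be checked offline. What follows is only a userspace sketch, not kernel code: it assumes the first 0x90 bytes of the dump have been copied into a byte array, walks the capability list starting from the pointer at offset 0x34, and tests the Completion Timeout Disable bit. The register constants carry the same names and values as the kernel's pci_regs.h; the array name example_cfg and the helpers cfg_read16(), find_capability() and completion_timeout_disabled() are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same names and values as include/uapi/linux/pci_regs.h */
#define PCI_STATUS                      0x06
#define PCI_STATUS_CAP_LIST             0x10
#define PCI_CAPABILITY_LIST             0x34
#define PCI_CAP_ID_EXP                  0x10
#define PCI_EXP_DEVCTL2                 0x28
#define PCI_EXP_DEVCTL2_COMP_TMOUT_DIS  0x0010

/* Hypothetical copy of the dump quoted above; bytes from 0x90 on are
 * left zero because the walk below never reads them. */
const uint8_t example_cfg[256] = {
	/* 00 */ 0x56, 0x15, 0x55, 0x55, 0x07, 0x00, 0x10, 0x00,
	         0x00, 0x00, 0x00, 0x05, 0x10, 0x00, 0x00, 0x00,
	/* 10 */ 0x00, 0x00, 0x40, 0xd0, 0x00, 0x00, 0x00, 0x00,
	         0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	/* 20 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	         0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00,
	/* 30 */ 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
	         0x00, 0x00, 0x00, 0x00, 0x0b, 0x01, 0x00, 0x00,
	/* 40 */ 0x01, 0x48, 0x03, 0x00, 0x08, 0x00, 0x00, 0x00,
	         0x05, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	/* 50 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	         0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	/* 60 */ 0x10, 0x00, 0x02, 0x00, 0xc2, 0x8f, 0x00, 0x00,
	         0x00, 0x28, 0x01, 0x00, 0x21, 0xf4, 0x03, 0x00,
	/* 70 */ 0x01, 0x00, 0x21, 0x00, 0x00, 0x00, 0x00, 0x00,
	         0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	/* 80 */ 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
	         0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Config space is little-endian: low byte first. */
uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Walk the capability list of a 256-byte config space image.
 * Returns the offset of capability cap_id, or 0 if it is absent. */
int find_capability(const uint8_t *cfg, uint8_t cap_id)
{
	uint8_t pos;
	int ttl;

	if (!(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
	/* ttl bounds the walk in case the list is corrupted or loops */
	for (ttl = 48; pos >= 0x40 && ttl > 0; ttl--) {
		if (cfg[pos] == cap_id)
			return pos;
		pos = cfg[pos + 1] & 0xfc;	/* next-capability pointer */
	}
	return 0;
}

/* Returns 1 if Completion Timeout Disable is set, 0 if clear,
 * -1 if there is no usable PCI Express capability. */
int completion_timeout_disabled(const uint8_t *cfg)
{
	int exp = find_capability(cfg, PCI_CAP_ID_EXP);

	if (!exp || exp + PCI_EXP_DEVCTL2 + 1 >= 256)
		return -1;
	return (cfg_read16(cfg, (size_t)exp + PCI_EXP_DEVCTL2) &
		PCI_EXP_DEVCTL2_COMP_TMOUT_DIS) != 0;
}
```

With the dump above, find_capability(example_cfg, PCI_CAP_ID_EXP) follows 0x40 (Power Management) to 0x48 (MSI) to 0x60 (Express), and completion_timeout_disabled(example_cfg) reports the bit as set, consistent with the "TimeoutDis+" that lspci prints.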