Date: Mon, 12 Feb 2018 11:43:41 -0600
From: Bjorn Helgaas
To: Arjun Vynipadath
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org,
    linux-kernel@vger.kernel.org, davem@davemloft.net,
    netdev@vger.kernel.org, leedom@chelsio.com, santosh@chelsio.com,
    ganeshgr@chelsio.com, nirranjan@chelsio.com, kumaras@chelsio.com,
    swise@opengridcomputing.com, hare@suse.de
Subject: Re: [REGRESSION, bisect] pci: cxgb4 probe fails after commit
    104daa71b3961434 ("PCI: Determine actual VPD size on first access")
Message-ID: <20180212174341.GC75542@bhelgaas-glaptop.roam.corp.google.com>
References: <1516710549-26660-1-git-send-email-arjun@chelsio.com>

On Tue, Jan 23, 2018 at 05:59:09PM +0530, Arjun Vynipadath wrote:
> Sending on behalf of "Casey Leedom"
>
> Way back on April 11, 2016 we reported a regression in Linux kernel 4.6-rc2
> brought on by kernel.org commit 104daa71b396. This commit calculates the
> size of a PCI Device's VPD area by parsing the VPD Structure at offset 0x000,
> and restricts accesses to the VPD to that computed size.
>
> Our devices have a second VPD structure which is located starting at offset
> 0x400 which is the "real" VPD[1]. The 104daa71b396 commit (plus a follow on
> commit 408641e93aa5) caused efforts to read past the end of that computed
> length of the VPD to return silently without error leaving stack junk in the
> VPD read buffers.
>
> We introduced kernel.org commit cb92148b to allow a driver to tell the
> kernel how large the VPD area really is, introducing a new API
> pci_set_vpd_size() for this purpose.
>
> Now we've discovered a new subtlety to the problem.
>
> We have a KVM Hypervisor running a 4.9.70 kernel. So it has all of the
> above commits. When we attach our Physical Function 4 to a Virtual Machine
> and attempt to run cxgb4 in that VM, we see the problem again. The issue is
> that all of the VM Guest OS's efforts to access the PCIe VPD Capability are
> trapped into the KVM 4.9.70 kernel and executed there, with the results
> routed back to the VM Guest OS. The cxgb4 driver in the VM Guest OS uses
> the new pci_set_vpd_size() to notify the OS of the true size of the VPD, but
> that information of course is never sent to the KVM 4.9.70 Hypervisor.
> (And, truth be told, if the Guest OS were older than 4.6, it wouldn't even
> know that it needed to do this.) The result is that again we get silent VPD
> read failures with random stack garbage in the VPD read buffers. (sigh)

Let me pull out one tiny piece of this problem: If the VPD read
returns failure, the caller should not look at the read buffer. But
we should *never* copy random stack garbage into the read buffer, no
matter what the VPD read returns.

I guess it's the 4.9.70 kernel that's putting garbage into the VPD
read buffer? Is this something that needs to be fixed in the current
upstream kernel?
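
To make those two rules concrete, here is a small userspace sketch. It
is not kernel code and not a proposed patch. struct vpd_model,
vpd_model_read() and vpd_model_read_exact() are made-up names for
illustration; the only detail taken from the report above is the idea
of a read at offset 0x400 landing past the size computed from the
structure at offset 0. The assumption is that a read past the computed
size reports an error instead of returning silently, and that a caller
who gets an error or a short read scrubs its buffer so stale bytes
never look like VPD data.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a VPD area; this is not a kernel structure. */
struct vpd_model {
	const unsigned char *data;	/* simulated VPD contents */
	size_t len;			/* size computed from the first VPD structure */
};

/*
 * Copy up to count bytes starting at pos into buf.
 * Returns the number of bytes copied, or -EIO if pos lies beyond the
 * computed size.  On error buf is not written at all.
 * (Sketch only: count is assumed to fit in an int.)
 */
static int vpd_model_read(const struct vpd_model *vpd, size_t pos,
			  void *buf, size_t count)
{
	assert(vpd != NULL && buf != NULL);

	if (pos >= vpd->len)
		return -EIO;		/* report the failure, do not return silently */

	if (count > vpd->len - pos)
		count = vpd->len - pos;	/* clamp to the computed size */

	memcpy(buf, vpd->data + pos, count);
	return (int)count;
}

/*
 * Caller-side wrapper: succeed only if exactly count bytes were read.
 * On any failure the buffer is zeroed, so whatever was in it before
 * the call cannot be mistaken for VPD contents.
 * Returns 0 on success or a negative errno value.
 */
static int vpd_model_read_exact(const struct vpd_model *vpd, size_t pos,
				void *buf, size_t count)
{
	int n = vpd_model_read(vpd, pos, buf, count);

	if (n < 0 || (size_t)n != count) {
		memset(buf, 0, count);
		return n < 0 ? n : -EIO;
	}
	return 0;
}
```

With a 0x20-byte model, vpd_model_read_exact(&vpd, 0x400, buf, 8)
returns -EIO and leaves buf zeroed, which is the behaviour I would
expect a caller to be able to rely on.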