Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp3780505imm; Mon, 20 Aug 2018 04:45:02 -0700 (PDT) X-Google-Smtp-Source: AA+uWPx6OYoHI1R1xB05U45+nkD1BQPcQbsr4OrqffaYGYTJgELxpNjMNtJAUQKWmZPQ7TDtREHW X-Received: by 2002:a62:5613:: with SMTP id k19-v6mr47836701pfb.212.1534765501926; Mon, 20 Aug 2018 04:45:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1534765501; cv=none; d=google.com; s=arc-20160816; b=CS4DAnI4CE1Ct4dplNIcsqbFABE98snZkkmxD3G4Wow5uPXt2/lQGC8UsEKoQJqVtu hYRPIazho8CMO6iCOyERIC7B6kNnPy0gjrM9xMBV71koc+OJPnYBPrlZlyGuViXrDHAk +Yz9r+m/4FNx9FYyowe2uogxzc17X5o/JvR0Id0ynOVU33Hrk5dUZ8FnaGArUPmp/VE9 WqWlw6MVrQw7X1zFylEalYc3n2vSDekZ50Egb8m1tJ6QuE+VN2gohDemHvRDQdDs1BoL HOyA5uH6mhAPpk70VNow/fqRaDFspHeFhnOQvkpRGyzZlr/eF/GA3Y/A62iQ85xbellM W4Lg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=1lFxuQ+rJ8f3vGxa5mrmC0/WSn5E1c4FuXgT2koIl+E=; b=a59XvfpF34Xprp42muDqwfWn3MFR4oFgAhCQo3WpfCWsyRjIFaOe4L7+UpceNEHX/q qea/qGeIHnE3tZGchyZQAaLUdqztOl9MRmrumJWdtO9W7dxsTLgDq5OjzWMRc+Og+B8b vGjURu8FRH4C4+FzL/7bwz47d8jfdB+O0gfRJnGnbVfhwcDMfJt0uNkwPflc4GaHAfBQ FKf8Enxo1/i0gDxuYKglH4TxKFIx3fCe/i5qbfD6GWvx94KSWix7XUl76cj6yXDwU7GA zvyiJdGRuN6Gj9TTYlEG8FKn384z58swH1wYJjwqFew46BGuXXzyQHmQM0FqDm3rnmmL xAQA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=eHjiXG6w; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k14-v6si9413437pga.149.2018.08.20.04.44.46; Mon, 20 Aug 2018 04:45:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=eHjiXG6w; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726620AbeHTO7A (ORCPT + 99 others); Mon, 20 Aug 2018 10:59:00 -0400 Received: from mail-oi0-f67.google.com ([209.85.218.67]:38594 "EHLO mail-oi0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726599AbeHTO67 (ORCPT ); Mon, 20 Aug 2018 10:58:59 -0400 Received: by mail-oi0-f67.google.com with SMTP id v8-v6so25228286oie.5 for ; Mon, 20 Aug 2018 04:43:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=1lFxuQ+rJ8f3vGxa5mrmC0/WSn5E1c4FuXgT2koIl+E=; b=eHjiXG6w1TIKMrWpzY/QBf79EBpVUmMzgqTI2knz/mwDbCbTcW48Ps9BKaC+hNu3dJ /Gbb2BR1lcloA78G8UPSEQ21z06QtA+lySMtgk8vCv+0z6+EDbnRIRxhnn2E2xvCxhgu dm5ixZx0VsX/1ZL3y65hH1X8FyoUmUZdJQC+4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=1lFxuQ+rJ8f3vGxa5mrmC0/WSn5E1c4FuXgT2koIl+E=; b=lT5hbnBUqJrgq4FRN4awaVsscZIp8WKvHdXQQhmaeApKcpWWmRRhYL9tGvvgJsWr5w +5TP7BNT5IbSWC6ts65fpr8CE/m83b7dIJaARfC0l85Yu/u39SVeSRtNIR49cC35bm4j HnZYvdxrgJimX38LmNc+d7XmCl0Y4MQZq8wiRRm2xDWqTXXs3ojqQMXMg+c4JC+CaOVb ZhxqktrX64HaNPUvDdDo+lCjwELW6VMwbhch5pjjG2MOBHLJ3wcOKn6islfVd/QA2Kqz wc/LLg+AXHcwr2S0iHCzKw+Q7PBui3kHGBA8inkn7CCG0JyeYNZ9IPMZ+o/CIQfSCzqv xDWQ== X-Gm-Message-State: AOUpUlElOzDuCOXzaH1p3gNVjVBgwqBgrl9KUPw0Kxc1cXAZ37Mb0LWa EmS8Pi/dMZFTMLmWYmNk9t7onKpU3CJBB63gqicPSA== X-Received: by 2002:aca:5b0b:: with SMTP id p11-v6mr14510248oib.116.1534765420989; Mon, 20 Aug 2018 04:43:40 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a4a:6a83:0:0:0:0:0 with HTTP; Mon, 20 Aug 2018 04:43:40 -0700 (PDT) In-Reply-To: <21451a6ba97eada5d4a8a49b2726edde58266817.camel@kernel.crashing.org> References: <20180817044902.31420-1-benh@kernel.crashing.org> <20180817044902.31420-2-benh@kernel.crashing.org> <20180817154431.GC128050@bhelgaas-glaptop.roam.corp.google.com> <06c1233b71dea08b1fc32334acefc48c32c46557.camel@kernel.crashing.org> <20180819022440.GG128050@bhelgaas-glaptop.roam.corp.google.com> <4d777ed8c17b479e59b16cc0b4e9a3f6537f9884.camel@kernel.crashing.org> <21451a6ba97eada5d4a8a49b2726edde58266817.camel@kernel.crashing.org> From: Hari Vyas Date: Mon, 20 Aug 2018 17:13:40 +0530 Message-ID: Subject: Re: [RFC PATCH 1/6] Revert "PCI: Fix is_added/is_busmaster race condition" To: Benjamin Herrenschmidt Cc: Bjorn Helgaas , linux-pci@vger.kernel.org, Ray Jui , Srinath Mannam , Guenter Roeck , Jens Axboe , Lukas Wunner , Konstantin Khlebnikov , Marta Rybczynska , Pierre-Yves Kerbrat , linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 20, 2018 at 4:39 PM, Benjamin Herrenschmidt wrote: > On Mon, 2018-08-20 at 11:55 +0530, Hari Vyas wrote: >> On Mon, Aug 20, 2018 at 7:40 AM, Benjamin Herrenschmidt >> wrote: >> > On Sat, 2018-08-18 at 21:24 -0500, Bjorn Helgaas wrote: >> > > On Sat, Aug 18, 2018 at 01:24:51PM +1000, Benjamin Herrenschmidt wrote: >> > > > On Fri, 2018-08-17 at 10:44 -0500, Bjorn Helgaas wrote: >> > > > > On Fri, Aug 17, 2018 at 02:48:57PM +1000, Benjamin Herrenschmidt wrote: >> > > > > > This reverts commit 44bda4b7d26e9fffed6d7152d98a2e9edaeb2a76. >> > > > > >> > > > > Just to be clear, if I understand correctly, this is a pure revert of >> > > > > 44bda4b7d26e and as such it reintroduces the problem solved by that >> > > > > commit. >> > > > > >> > > > > If your solution turns out to be better, that's great, but it would be >> > > > > nice to avoid the bisection hole of reintroducing the problem, then >> > > > > fixing it again later. >> > > > >> > > > There is no way to do that other than merging the revert and the fix >> > > > into one. That said, the race is sufficiently hard to hit that I >> > > > wouldn't worry too much about it. >> > > >> > > OK, then at least mention that in the changelog. >> > >> > Sure will do. This is just RFC at this stage :-) >> > >> > As for the race with enable, what's your take on my approach ? The >> > basic premise is that we need some kind of mutex to make the updates to >> > enable_cnt and the actual config space changes atomic. While at it we >> > can fold pci_set_master vs. is_busmaster as well as those are racy too. >> > >> > I chose to create a new mutex which we should be able to address other >> > similar races if we find them. The other solutions that I dismissed >> > were: >> > >> > - Using the device_lock. A explained previously, this is tricky, I >> > prefer not using this for anything other than locking against >> > concurrent add/remove. The main issue is that drivers will be sometimes >> > called in context where that's already held, so we can't take it inside >> > pci_enable_device() and I'd rather not add new constraints such as >> > "pci_enable_device() must be only called from probe() unless you also >> > take the device lock". It would be tricky to audit everybody... >> > >> > - Using a global mutex. We could move the bridge lock from AER to core >> > code for example, and use that. But it doesn't buy us much, and >> > slightly redecuces parallelism. It also makes it a little bit more >> > messy to walk up the bridge chain, we'd have to do a >> > pci_enable_device_unlocked or something, messy. >> > >> > So are you ok with the approach ? Do you prefer one of the above >> > regardless ? Something else ? >> > >> > Cheers, >> > Ben. >> > >> > >> >> Some concern was raised about race situation so just to be more clear >> about race condition. > > This is not what the above discussion is about. > > The race with is is_added is due to the fact that the bit is set too > later as discussed previously and in my changelog. > > The race I'm talking about here is the race related to multiple > subtrees trying to simultaneously enable a parent bridge. > >> Situation is described in Bug 200283 - PCI: Data corruption happening >> due to a race condition. >> Issue was discovered by our broadcom QA team. >> Initially commit was to use one separate lock only for avoiding race >> condition but after review, approach was slightly changed to move >> is_added to pci_dev private flags as it was completely >> internal/private variable of pci core driver. >> Powerpc legacy arch code was using is_added flag directly which looks >> bit strange so ../../ type of inclusion of pci.h was required. I know >> it looks ugly but it is being used in some legacy code still. >> Anyway, as stated earlier too, problem should be just solved in better way. > > The is_added race can be fixed with a 3 lines patch moving is_added up > to before device_attach() I believe. I yet have to find a scenario > where doing so would break something but it's possible that I missed a > case. > > At this stage, I'm more intested however in us agreeing how to fix the > other race, the one with enabling. As I wrote above, I proposed an > approach based on a mutex in pci_dev, and this is what I would like > discussed. > > This race is currently causing our large systems to randomly error out > at boot time when probing some PCIe devices. I'm putting a band-aid in > the powerpc code for now to pre-enable bridges at boot, but I'd rather > have the race fixed in the generic code. > > Ben. > > I was initially using spin lock in "PATCH v1] PCI: Data corruption happening due to race condition" and didn't face issue at-least in our environment for our race condition. Anyway, we can leave this minor is_added issue time-being and concentrate on current pci bridge concern. Could you re-share your latest patch in your environment and your first doubt where race situation could happen. May be forum can suggest some-thing good. Regard, hari.