Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp3523382imm; Sun, 19 Aug 2018 23:27:26 -0700 (PDT) X-Google-Smtp-Source: AA+uWPz77H39TKpOYW8FlD2/g4r5MMtgUOvm8D9ItuSZLR/UjimlsrxV4/8rUMJ0iKSyt6RcgsV1 X-Received: by 2002:a17:902:b111:: with SMTP id q17-v6mr7810943plr.284.1534746446830; Sun, 19 Aug 2018 23:27:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1534746446; cv=none; d=google.com; s=arc-20160816; b=TVF8L+0W/ZOuEleALKPaMNXGBJZctt4q5bk5rdeVWARlsVSnIEtvBPPoCVUi8n5bs4 xiaWgzbSfhv9e/lVJAgZ0sgpFjbCjn48n4W3jPvWrYwr/kJZ+vhg0lGpE4jEy5Hf5ln0 DlkqnN94rKrgf/Soza9IVxrVVmpz3rw7rfDMNOMrCe/W/jLEWlGw9Nzl3kjRiCrMGfah ShusGzViMynJqHeEVsZkr+LvSZlfql3iGKO2/8vmZp/FhVT7otKjuYmaivL2s6PgOWPD iIUfwqfNL+MuudDCoiJMzl3y69PBf73do25azTmk2KpLcxU2cRn+OM+YE3erHOmO6hGO awog== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=7jf+0bCo2RUt9ol4S1b7IVaFt/uy69mfCor2sPKaIXM=; b=VBkd9cr2xO0MwAoE7OvmBJFKCNxjcJykFLcdAKSriep/ovYN5qZg4MHeKraINzIHgT sxPqDDK7ImM18TnbT0VaVfub8KqNzG7itQ9ZT0coN00p5VMahJP4bnsu/d3m2eHfY27C uz2vY/pHH9oyBO9t4n7DC4rKEf7RE4z5KVuU8Ses0wTtuNU7fGmXorjDqoM5MAiX/28+ cBAPvAtCJCS1TVAXKxHTYnBXj3c8sJeSdcv+9TBKYLsPYGgHm/CSMSAwiOpLWOldyECf Z5xrEGLfw+BAzqPxmMP3YRJuriBcT4RWkX/XAROILSPi4yJlx2Vra+jVcYejpKR5/UQS mu+A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=erzCEuAx; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x19-v6si8541257pgf.477.2018.08.19.23.27.11; Sun, 19 Aug 2018 23:27:26 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=erzCEuAx; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726346AbeHTJjb (ORCPT + 99 others); Mon, 20 Aug 2018 05:39:31 -0400 Received: from mail-oi0-f68.google.com ([209.85.218.68]:33298 "EHLO mail-oi0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726026AbeHTJjb (ORCPT ); Mon, 20 Aug 2018 05:39:31 -0400 Received: by mail-oi0-f68.google.com with SMTP id 8-v6so23962920oip.0 for ; Sun, 19 Aug 2018 23:25:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=7jf+0bCo2RUt9ol4S1b7IVaFt/uy69mfCor2sPKaIXM=; b=erzCEuAxbOUb3xd3BsI+ZfE8uIm8R0T+7ipblTKodYJfZQo/JfZE2fvMkX2L59N8a0 YYWpkaxi+Cnyv2EAhF7iuuNtDF3STwXATNR0O/R/oreWC4O7SyNh1t5bWdE5Gm4VD3VU 09IWwourlTGnJFe/3FwX+LZScbyh/+4X/WO+s= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=7jf+0bCo2RUt9ol4S1b7IVaFt/uy69mfCor2sPKaIXM=; b=dkZyiveoDfKM3LNNbILsRFTDBymJEgCc+L6nXGGKYNnFZUAG5sapBFyei/74CWOZ+r i9ceYngDabhKwfPNLKPz0B9AywrmCAn/ss6ANb51URTud2O98+P9bgjNn3yC3zKG8x4b uibXGvlCiDXkmxgkHV3bwmJAQPabMpGwmLsmTkCvW0qXOtGJsH9ICPifLULmhfdDlaU6 +Xs8iQ8GXZ8jCCwUsReOAypiE0EZw0F2cuVsHcPxnYMRMDv4yXtIdU/E7JR7JWrNoVf4 w0L7rxT1sBWxbkGRhyry7G/FECLC9FI91jtIWvQgayLjUNIdm8tEbcqgdPe3rT6r3bwe iHUw== X-Gm-Message-State: AOUpUlGnbyR5F+M/Y6DImJUIT7UGbY959o1/IjnF9iNXMBMNkNOL+yIT RqoY1jqzWiOE6g5E//bScED4D2Xw1fGTyLnEsOaMww== X-Received: by 2002:aca:d4d7:: with SMTP id l206-v6mr11917533oig.52.1534746310530; Sun, 19 Aug 2018 23:25:10 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a4a:6a83:0:0:0:0:0 with HTTP; Sun, 19 Aug 2018 23:25:09 -0700 (PDT) In-Reply-To: <4d777ed8c17b479e59b16cc0b4e9a3f6537f9884.camel@kernel.crashing.org> References: <20180817044902.31420-1-benh@kernel.crashing.org> <20180817044902.31420-2-benh@kernel.crashing.org> <20180817154431.GC128050@bhelgaas-glaptop.roam.corp.google.com> <06c1233b71dea08b1fc32334acefc48c32c46557.camel@kernel.crashing.org> <20180819022440.GG128050@bhelgaas-glaptop.roam.corp.google.com> <4d777ed8c17b479e59b16cc0b4e9a3f6537f9884.camel@kernel.crashing.org> From: Hari Vyas Date: Mon, 20 Aug 2018 11:55:09 +0530 Message-ID: Subject: Re: [RFC PATCH 1/6] Revert "PCI: Fix is_added/is_busmaster race condition" To: Benjamin Herrenschmidt Cc: Bjorn Helgaas , linux-pci@vger.kernel.org, Ray Jui , Srinath Mannam , Guenter Roeck , Jens Axboe , Lukas Wunner , Konstantin Khlebnikov , Marta Rybczynska , Pierre-Yves Kerbrat , linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 20, 2018 at 7:40 AM, Benjamin Herrenschmidt wrote: > On Sat, 2018-08-18 at 21:24 -0500, Bjorn Helgaas wrote: >> On Sat, Aug 18, 2018 at 01:24:51PM +1000, Benjamin Herrenschmidt wrote: >> > On Fri, 2018-08-17 at 10:44 -0500, Bjorn Helgaas wrote: >> > > On Fri, Aug 17, 2018 at 02:48:57PM +1000, Benjamin Herrenschmidt wrote: >> > > > This reverts commit 44bda4b7d26e9fffed6d7152d98a2e9edaeb2a76. >> > > >> > > Just to be clear, if I understand correctly, this is a pure revert of >> > > 44bda4b7d26e and as such it reintroduces the problem solved by that >> > > commit. >> > > >> > > If your solution turns out to be better, that's great, but it would be >> > > nice to avoid the bisection hole of reintroducing the problem, then >> > > fixing it again later. >> > >> > There is no way to do that other than merging the revert and the fix >> > into one. That said, the race is sufficiently hard to hit that I >> > wouldn't worry too much about it. >> >> OK, then at least mention that in the changelog. > > Sure will do. This is just RFC at this stage :-) > > As for the race with enable, what's your take on my approach ? The > basic premise is that we need some kind of mutex to make the updates to > enable_cnt and the actual config space changes atomic. While at it we > can fold pci_set_master vs. is_busmaster as well as those are racy too. > > I chose to create a new mutex which we should be able to address other > similar races if we find them. The other solutions that I dismissed > were: > > - Using the device_lock. A explained previously, this is tricky, I > prefer not using this for anything other than locking against > concurrent add/remove. The main issue is that drivers will be sometimes > called in context where that's already held, so we can't take it inside > pci_enable_device() and I'd rather not add new constraints such as > "pci_enable_device() must be only called from probe() unless you also > take the device lock". It would be tricky to audit everybody... > > - Using a global mutex. We could move the bridge lock from AER to core > code for example, and use that. But it doesn't buy us much, and > slightly redecuces parallelism. It also makes it a little bit more > messy to walk up the bridge chain, we'd have to do a > pci_enable_device_unlocked or something, messy. > > So are you ok with the approach ? Do you prefer one of the above > regardless ? Something else ? > > Cheers, > Ben. > > Some concern was raised about race situation so just to be more clear about race condition. Situation is described in Bug 200283 - PCI: Data corruption happening due to a race condition. Issue was discovered by our broadcom QA team. Initially commit was to use one separate lock only for avoiding race condition but after review, approach was slightly changed to move is_added to pci_dev private flags as it was completely internal/private variable of pci core driver. Powerpc legacy arch code was using is_added flag directly which looks bit strange so ../../ type of inclusion of pci.h was required. I know it looks ugly but it is being used in some legacy code still. Anyway, as stated earlier too, problem should be just solved in better way. Regards, hari