Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp1468234imu; Sat, 26 Jan 2019 03:38:59 -0800 (PST) X-Google-Smtp-Source: ALg8bN4qBkUMbDL0mwG1S4m+gPuUYlaEI4JG4ywnG1Dj61c8Yv2mrCoKovh6blt2tgVRdqeSOcOU X-Received: by 2002:a62:30c3:: with SMTP id w186mr14696996pfw.39.1548502739508; Sat, 26 Jan 2019 03:38:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548502739; cv=none; d=google.com; s=arc-20160816; b=CRWLsts9ka2apzUewummQNa579ehsrUBu8rA56IzYPWM0jpYiNbEDLSCo1y0HUwaiD VjfB9O5LsF5uYedx3Z4pXN9bCddra59xKZFOkQMN4+Tg6pwP42/hWCID8INANNQpzLfD pwCOlD/I7SpGbR+z5xXG0ZksoEoMA9PhZWT8JYOzC7vcL7JdwRa/Nwlor+V+r8amV431 6TdffF7LWGd0TlsKa9ISgxg84/zos2kTS5NRopMNI7NZknvBYFsJvMkfr5XL/Hxngnha ujelHklEa8kXJ6mbeCWK+ZEEtUoMg772XVLUporJGJCozBOWRLrxN9O4RjXevAVaNa9T 6JIg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:organization:user-agent :references:in-reply-to:subject:cc:to:from:message-id:date; bh=/7rwRuQwcWmz3FKyhRcRZidiCqaLNyHEDa+UykoL+ls=; b=PrTGmoNeraumqLnkYvqgXUkkDeMx23ai0DOfu4PVEfwWhImKlVSbsMIBwt6DpnHmC9 CmKmS5n75ZQlCHNN1Py1Bv+8YLmwsYehv9/R6gBKbGkFNdbuIwtmxmBrEO31W8YQ7rrl kcKmCEp5o2wIvTPQKkzZtYPgKgtxg8jzyZPkNHhTJf2Gdaf/hCb2RxFwD4OSs70xA7uI TOLQbOr01gXkbphQXZ8u9D6DEhZ+YCwAqTMrU1g2wexz3yNllvJE4YqnfMmBF/Uy8Y0n RfHzq0/8dR86c3mD8OqpWjBTq+STL7AZh95sn2xtrfNpxEF/kqcGA9ivjKMFQxR2pboL OCOw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z123si22694877pfb.104.2019.01.26.03.38.41; Sat, 26 Jan 2019 03:38:59 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726361AbfAZLii (ORCPT + 99 others); Sat, 26 Jan 2019 06:38:38 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:57716 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726165AbfAZLii (ORCPT ); Sat, 26 Jan 2019 06:38:38 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2FA23A78; Sat, 26 Jan 2019 03:38:37 -0800 (PST) Received: from big-swifty.misterjones.org (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id ADBFA3F5C1; Sat, 26 Jan 2019 03:38:35 -0800 (PST) Date: Sat, 26 Jan 2019 11:38:32 +0000 Message-ID: <86bm438x8n.wl-marc.zyngier@arm.com> From: Marc Zyngier To: Zheng Xiang Cc: , , , Subject: Re: [PATCH] irqchip/gic-v3-its: Lock its device list during find and create its device In-Reply-To: <20190126061624.5260-1-zhengxiang9@huawei.com> References: <20190126061624.5260-1-zhengxiang9@huawei.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/25.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: ARM Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Zheng, On Sat, 26 Jan 2019 06:16:24 +0000, Zheng Xiang wrote: > > Currently each PCI device under a PCI Bridge shares the same device id > and ITS device. Assume there are two PCI devices call its_msi_prepare > concurrently and they are both going to find and create their ITS > device. There is a chance that the later one couldn't find ITS device > before the other one creating the ITS device. It will cause the later > one to create a different ITS device even if they have the same > device_id. Interesting finding. Is this something you've actually seen in practice with two devices being probed in parallel? Or something that you found by inspection? The whole RID aliasing is such a mess, I wish we never supported it. Anyway, comments below. > > Signed-off-by: Zheng Xiang > --- > drivers/irqchip/irq-gic-v3-its.c | 52 +++++++++++++++------------------------- > 1 file changed, 19 insertions(+), 33 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c > index db20e99..397edc8 100644 > --- a/drivers/irqchip/irq-gic-v3-its.c > +++ b/drivers/irqchip/irq-gic-v3-its.c > @@ -2205,25 +2205,6 @@ static void its_cpu_init_collections(void) > raw_spin_unlock(&its_lock); > } > > -static struct its_device *its_find_device(struct its_node *its, u32 dev_id) > -{ > - struct its_device *its_dev = NULL, *tmp; > - unsigned long flags; > - > - raw_spin_lock_irqsave(&its->lock, flags); > - > - list_for_each_entry(tmp, &its->its_device_list, entry) { > - if (tmp->device_id == dev_id) { > - its_dev = tmp; > - break; > - } > - } > - > - raw_spin_unlock_irqrestore(&its->lock, flags); > - > - return its_dev; > -} > - > static struct its_baser *its_get_baser(struct its_node *its, u32 type) > { > int i; > @@ -2321,7 +2302,7 @@ static bool its_alloc_vpe_table(u32 vpe_id) > static struct its_device *its_create_device(struct its_node *its, u32 dev_id, > int nvecs, bool alloc_lpis) > { > - struct its_device *dev; > + struct its_device *dev = NULL, *tmp; > unsigned long *lpi_map = NULL; > unsigned long flags; > u16 *col_map = NULL; > @@ -2331,6 +2312,24 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id, > int nr_ites; > int sz; > > + raw_spin_lock_irqsave(&its->lock, flags); > + list_for_each_entry(tmp, &its->its_device_list, entry) { > + if (tmp->device_id == dev_id) { > + dev = tmp; > + break; > + } > + } > + if (dev) { > + /* > + * We already have seen this ID, probably through > + * another alias (PCI bridge of some sort). No need to > + * create the device. > + */ > + pr_debug("Reusing ITT for devID %x\n", dev_id); > + raw_spin_unlock_irqrestore(&its->lock, flags); > + return dev; > + } > + > if (!its_alloc_device_table(its, dev_id)) You're now performing all sort of allocations in an atomic context, which is pretty horrible (and the kernel will shout at you for doing so). We could probably keep the current logic and wrap it around a mutex instead, which would give us the appropriate guarantees WRT allocations. Something along those lines (untested): diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index db20e992a40f..99feb62e63ba 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -97,9 +97,14 @@ struct its_device; * The ITS structure - contains most of the infrastructure, with the * top-level MSI domain, the command queue, the collections, and the * list of devices writing to it. + * + * alloc_lock has to be taken for any allocation that can happen at + * run time, while the spinlock must be taken to parse data structures + * such as the device list. */ struct its_node { raw_spinlock_t lock; + struct mutex alloc_lock; struct list_head entry; void __iomem *base; phys_addr_t phys_base; @@ -2421,6 +2426,7 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev, struct its_device *its_dev; struct msi_domain_info *msi_info; u32 dev_id; + int err = 0; /* * We ignore "dev" entierely, and rely on the dev_id that has @@ -2443,6 +2449,7 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev, return -EINVAL; } + mutex_lock(&its->alloc_lock); its_dev = its_find_device(its, dev_id); if (its_dev) { /* @@ -2455,11 +2462,14 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev, } its_dev = its_create_device(its, dev_id, nvec, true); - if (!its_dev) - return -ENOMEM; + if (!its_dev) { + err = -ENOMEM; + goto out; + } pr_debug("ITT %d entries, %d bits\n", nvec, ilog2(nvec)); out: + mutex_unlock(&its->alloc_lock); info->scratchpad[0].ptr = its_dev; return 0; } @@ -3516,6 +3526,7 @@ static int __init its_probe_one(struct resource *res, } raw_spin_lock_init(&its->lock); + mutex_init(&its->alloc_lock); INIT_LIST_HEAD(&its->entry); INIT_LIST_HEAD(&its->its_device_list); typer = gic_read_typer(its_base + GITS_TYPER); I still feel that the issue you're seeing here is much more generic. Overall, there is no guarantee that for a given MSI domain, no two allocation will take place in parallel, and maybe that's what we should enforce instead. Thanks, M. -- Jazz is not dead, it just smell funny.