Subject: Re: [PATCH v2] irqchip/gic-v3-its: fix ITS queue timeout
From: Yang Yingliang
To: Marc Zyngier
CC: Hanjun Guo
Date: Mon, 11 Jun 2018 19:23:42 +0800
Message-ID: <5B1E5BBE.9010907@huawei.com>
In-Reply-To: <0ebd8eef-1a86-3c7e-cd3b-f9580c497b5c@arm.com>
References: <1528252824-15144-1-git-send-email-yangyingliang@huawei.com> <0ebd8eef-1a86-3c7e-cd3b-f9580c497b5c@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Marc

On 2018/6/11 17:31, Marc Zyngier wrote:
> On 06/06/18 03:40, Yang Yingliang wrote:
>> When the kernel booted with maxcpus=x, 'x' is smaller
>> than actual cpu numbers, the TAs of offline cpus won't
>> be set to its->collection.
>>
>> If LPI is bind to offline cpu, sync cmd will use zero TA,
>> it leads to ITS queue timeout.  Fix this by choosing a
>> online cpu, if there is no online cpu in cpu_mask.
>>
>> Signed-off-by: Yang Yingliang
>> ---
>>  drivers/irqchip/irq-gic-v3-its.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 5416f2b..d8b9539 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -2309,7 +2309,9 @@ static int its_irq_domain_activate(struct irq_domain *domain,
>>  		cpu_mask = cpumask_of_node(its_dev->its->numa_node);
>>
>>  	/* Bind the LPI to the first possible CPU */
>> -	cpu = cpumask_first(cpu_mask);
>> +	cpu = cpumask_first_and(cpu_mask, cpu_online_mask);
>> +	if (cpu >= nr_cpu_ids)
>> +		cpu = cpumask_first(cpu_online_mask);
> I've thought about this one a bit more, and apart from breaking TX1
> in a very bad way, I think it is actually correct. It is just that
> the commit message doesn't make much sense.
>
> The way I understand it is:
> - this is a NUMA system, with at least one node not online
> - the SRAT table indicates that this ITS is local to an offline node
Yes, your description is more accurate; mine only describes how the
bug happens. I will send a v3 later with a proper commit message.

Thanks,
Yang
>
> In that case, we need to pick an online CPU, and any will do (again,
> ignoring the silly Cavium erratum). Explained like this, the above
> hunk is sensible, and just needs to handle the TX1 quirk. Something like:
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 5416f2b2ac21..21b7b5151177 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2309,7 +2309,13 @@ static int its_irq_domain_activate(struct irq_domain *domain,
>  		cpu_mask = cpumask_of_node(its_dev->its->numa_node);
>
>  	/* Bind the LPI to the first possible CPU */
> -	cpu = cpumask_first(cpu_mask);
> +	cpu = cpumask_first_and(cpu_mask, cpu_online_mask);
> +	if (cpu >= nr_cpu_ids) {
> +		if (its_dev->its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144)
> +			return -EINVAL;
> +
> +		cpu = cpumask_first(cpu_online_mask);
> +	}
>  	its_dev->event_map.col_map[event] = cpu;
>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
>
>
>>  	its_dev->event_map.col_map[event] = cpu;
>>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
>>
>> @@ -2466,7 +2468,10 @@ static int its_vpe_set_affinity(struct irq_data *d,
>>  				bool force)
>>  {
>>  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>> -	int cpu = cpumask_first(mask_val);
>> +	int cpu = cpumask_first_and(mask_val, cpu_online_mask);
>> +
>> +	if (cpu >= nr_cpu_ids)
>> +		cpu = cpumask_first(cpu_online_mask);
>>
>>  	/*
>>  	 * Changing affinity is mega expensive, so let's be as lazy as
>>
> This hunk, on the other hand, is completely useless. Look how this is
> called from vgic_v4_flush_hwstate():
>
> 	err = irq_set_affinity(irq, cpumask_of(smp_processor_id()));
>
> The mask is always that of the CPU we run on, and we're in a non-preemptible
> section. So no way we can be targeting an offline CPU.
>
> If you quickly respin this patch with a decent commit log, I'll take it.
>
> Thanks,
>
> 	M.
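
For anyone following the thread, the CPU selection discussed above can be modelled in user space. This is only a sketch under stated assumptions: CPU masks are plain 64-bit integers, NR_CPUS stands in for nr_cpu_ids, and first_cpu_and(), pick_lpi_cpu() and the node_local_only flag are hypothetical names, not kernel APIs. The flag plays the role of the ITS_FLAGS_WORKAROUND_CAVIUM_23144 check in the suggested hunk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's nr_cpu_ids. */
#define NR_CPUS 64u

/*
 * Hypothetical helper: index of the lowest bit set in both masks,
 * or NR_CPUS when the intersection is empty. This follows the
 * "result >= nr_cpu_ids means nothing found" convention used by
 * cpumask_first_and() in the hunk above.
 */
static unsigned int first_cpu_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;

	return NR_CPUS;
}

/*
 * Hypothetical model of the selection: prefer an online CPU on the
 * ITS's node. If the node has none, either refuse (node_local_only,
 * modelling the Cavium quirk) or fall back to any online CPU.
 */
static int pick_lpi_cpu(uint64_t node_mask, uint64_t online_mask,
			bool node_local_only)
{
	unsigned int cpu = first_cpu_and(node_mask, online_mask);

	if (cpu >= NR_CPUS) {
		if (node_local_only)
			return -EINVAL;

		cpu = first_cpu_and(online_mask, online_mask);
		/* The kernel always has an online CPU; this model may not. */
		if (cpu >= NR_CPUS)
			return -EINVAL;
	}

	return (int)cpu;
}
```

With a node mask of 0xF0 (CPUs 4-7) and an online mask of 0x03 (CPUs 0-1), the model falls back to CPU 0 unless node_local_only is set, in which case it returns -EINVAL.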