Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp2333844ybt; Fri, 3 Jul 2020 06:45:03 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy7pJjZq34GO1GpCSDM6WJgZBugtIaC9ICf24vjzg4jF18zHc3HwO2dXQ7qOLS3Tmc0lfHB X-Received: by 2002:a17:906:ca43:: with SMTP id jx3mr21241178ejb.208.1593783902913; Fri, 03 Jul 2020 06:45:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593783902; cv=none; d=google.com; s=arc-20160816; b=wtXFTTrD8lqW3tB70YAWu/ZkC48XPGwUGdBXYzOTUhiqzroupOsydLOPYiRSSWedl1 DfgT+ttMd0UHBoyTtPAJpypB6kJE/uWSzAarBgBSiO/Ex4Q95de2mJxssL01izru8rWL A/4UKjzFPRR6VIUgLUE2bZdYIwin/xBJBdUpuVZsdg+93fkB+r8S+yFjSatEfaZ9PSib aGQKETX7+++11bxt8C7KdDoyWx3EXHJwAkBq0Ws15rMBVh9WZOJLrSPY6CO1GYrl6wcS QMkqz6Rx8pDnI41T3ABuFoKTTi0wM+L1sgetK8vx6stu/MHb8VbpkqdG4feAVhvMGRTf jZNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=6NqLxgo4imFsbrejKhEbj34P3+4D8XQssCkmXu1TP7w=; b=uDj0H4aXHh6pgDKA8CaM/r7SWKxNhpXX/U3cYOZ7zn5V5X8mTVkO6r4yFCWBm+lshq 2oHtSBGZfI7jkHAuBY15A2bFV94qDOUVDlh0CtxT4KlXzn3SDFCqvugw+1E2ugrqOre/ lAdwOa4lz3x/jUyMrrBzjcHjoMCmeS9InACdDjkrlJu7VWkYsVaSe9LrftlZt6NSsgTO zn7BL3SoapUNEpYItmHaY2vQHtYdrpYnASDnP8ZlUJSGVpRXsVeBQHhlcZ91/7FA9SYw XxaJcK4lRpjtx56ZxvYpxZ4EHXTy12cjKxH40EuSJr8o0SKqWPc+QjgNdtcgwZA2oqPd VT8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=VDtAwYk4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dg17si7975489edb.606.2020.07.03.06.44.39; Fri, 03 Jul 2020 06:45:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=VDtAwYk4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726108AbgGCNmU (ORCPT + 99 others); Fri, 3 Jul 2020 09:42:20 -0400 Received: from mail.kernel.org ([198.145.29.99]:35710 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726048AbgGCNmT (ORCPT ); Fri, 3 Jul 2020 09:42:19 -0400 Received: from willie-the-truck (236.31.169.217.in-addr.arpa [217.169.31.236]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1CB3220772; Fri, 3 Jul 2020 13:42:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1593783738; bh=cToOqdlvOK3QskA5AoImVKSbV3a2bi08lfqB8HxpdMw=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=VDtAwYk42WX3U3gEyhndSgHS6/8jDyy3KwJdpIPfKk8iZGKi9Kl9IXfPRU5L1FgIY nmi23K+QQUCyFl/1ufWiIVQ/JGpBULwvbJlxwDGAmelesoZYvQzeiRCefWQD13Gjjp le6UmPU/+gAzllcUi/0kRykshDoqyr2vo4Ei+yJU= Date: Fri, 3 Jul 2020 14:42:14 +0100 From: Will Deacon To: Robin Murphy Cc: mark.rutland@arm.com, tuanphan@os.amperecomputing.com, john.garry@huawei.com, linux-kernel@vger.kernel.org, shameerali.kolothum.thodi@huawei.com, harb@amperecomputing.com, linux-arm-kernel@lists.infradead.org Subject: Re: [RFC PATCH] perf/smmuv3: Fix shared interrupt handling Message-ID: <20200703134213.GE18953@willie-the-truck> References: <20200624125045.GC6270@willie-the-truck> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 24, 2020 at 02:08:30PM +0100, Robin Murphy wrote: > On 2020-06-24 13:50, Will Deacon wrote: > > On Wed, Jun 24, 2020 at 12:48:14PM +0100, Robin Murphy wrote: > > > On 2020-04-08 17:49, Robin Murphy wrote: > > > > IRQF_SHARED is dangerous, since it allows other agents to retarget the > > > > IRQ's affinity without migrating PMU contexts to match, breaking the way > > > > in which perf manages mutual exclusion for accessing events. Although > > > > this means it's not realistically possible to support PMU IRQs being > > > > shared with other drivers, we *can* handle sharing between multiple PMU > > > > instances with some explicit affinity bookkeeping and manual interrupt > > > > multiplexing. > > > > > > > > RCU helps us handle interrupts efficiently without having to worry about > > > > fine-grained locking for relatively-theoretical race conditions with the > > > > probe/remove/CPU hotplug slow paths. The resulting machinery ends up > > > > looking largely generic, so it should be feasible to factor out with a > > > > "system PMU" base class for similar multi-instance drivers. > > > > > > > > Signed-off-by: Robin Murphy > > > > --- > > > > > > > > RFC because I don't have the means to test it, and if the general > > > > approach passes muster then I'd want to tackle the aforementioned > > > > factoring-out before merging anything anyway. > > > > > > Any comments on whether it's worth pursuing this? > > > > Sorry, I don't really get the problem that it's solving. Is there a crash > > log somewhere I can look at? If all the users of the IRQ are managed by > > this driver, why is IRQF_SHARED dangerous? > > Because as-is, multiple PMU instances may make different choices about which > CPU they associate with, change the shared IRQ affinity behind each others' > backs, and break the "IRQ handler runs on event->cpu" assumption that perf > core relies on for correctness. I'm not sure how likely it would be to > actually crash rather than just lead to subtle nastiness, but wither way > it's not good, and since people seem to be tempted to wire up system PMU > instances this way we could do with a general approach for dealing with it. Ok, thanks for the explanation. If we're just talking about multiple instances of the same driver, why is it not sufficient to have a static atomic_t initialised to -1 which tracks the current affinity and then just CAS that during probe()? Hotplug notifiers can just check whether or not it points to an online CPU Will