Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp21611886ybl; Mon, 6 Jan 2020 07:57:52 -0800 (PST) X-Google-Smtp-Source: APXvYqylT8cYBaCVCDvlK07RVZZMBxlfx0Sxj5GddJCVhX9/SmSLapsj2Iqt41QaSFqoMCxfDrWj X-Received: by 2002:a9d:588d:: with SMTP id x13mr110130509otg.6.1578326272755; Mon, 06 Jan 2020 07:57:52 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1578326272; cv=none; d=google.com; s=arc-20160816; b=ZZGAP8RklwjDHlOnuL8944QK5CwZUOHqAEATal99VCFSGHQdMy7RUiV1JqQcRzh+nG KVL3PBdZDUWd+Bp6R8SSDiW5ZEeWODcUJMcWncXqeY622McBMayrI0EDT9kjZZMD+2WA x8Lk+RdwtTQn0ek5l7HVQGg8IUDbucuMtTm2OLepH+d9E2PWE8qLfvl2a4Zwr22B+N6y gFgozoaWq0c9a5+Zt56ROOzvcMnlBsFroJB4LIiZe7Y/7uZ+YlfL075FnhhvctHV7puj ppg16plCIu321paQFiNrJnZb5d80JDCD/wnOOZhsWlnnhgSxMTRtFWNnXuPGQ76/5jrP 7S+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=ojDnsiQizbgA0k/L03WuKe+oQHI9i9zZ1bsvOrTeVBM=; b=PwNR1DgJHYK9Z1mUtYCDDnGEtM4FNuWZkfOSgc4opvOYVlbmF/FooQ7uDSIMfsNeN9 echucQKqrK45OC3cjSe1eD7zgeDN+gGx8yTplVCGEeVrP9ryIh5745arY1haaXUCUQnt xSnQz1VJ/X4ccvGJAJLejJW+pXHQTPRyfgaS5fzMG8Za+7kAzYemOHvAlzOGWbhN+lAk 12vU1gOKXHzGMN0W6Uizf5MqFgmjYvvbycpoiqXDB5p/pPEBZuQDeKdnnHTte3dDz5Nl GNgUiJrc/nmJ7p0X7Kv87l9mlrtIGhnueEsP7jxFZtvvoBTu23CjjxJNV4vw0N7ACSMH dxyg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a29si37569872otd.268.2020.01.06.07.57.39; Mon, 06 Jan 2020 07:57:52 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726508AbgAFP4z (ORCPT + 99 others); Mon, 6 Jan 2020 10:56:55 -0500 Received: from foss.arm.com ([217.140.110.172]:45428 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726437AbgAFP4z (ORCPT ); Mon, 6 Jan 2020 10:56:55 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C148331B; Mon, 6 Jan 2020 07:56:54 -0800 (PST) Received: from [10.1.196.105] (eglon.cambridge.arm.com [10.1.196.105]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 523CE3F6C4; Mon, 6 Jan 2020 07:56:54 -0800 (PST) Subject: Re: [PATCH 1/3] firmware: arm_sdei: fix possible deadlock To: luanshi Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org References: <1577110975-54782-1-git-send-email-zhangliguang@linux.alibaba.com> From: James Morse Message-ID: <0f122a88-e2e1-2139-1c2d-095f684a5701@arm.com> Date: Mon, 6 Jan 2020 15:56:51 +0000 User-Agent: Mozilla/5.0 (X11; Linux aarch64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <1577110975-54782-1-git-send-email-zhangliguang@linux.alibaba.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Luanshi! On 23/12/2019 14:22, luanshi wrote: > From: Liguang Zhang > > We call sdei_reregister_event() with sdei_list_lock held but > _sdei_event_register() and sdei_event_destroy() also acquires > sdei_list_lock thus creating A-A deadlock. Ooer. This was clearly never tested properly! The hibernate support got plenty of testing, but it must have been with only private events. Hibernate+SDEI with a side-order of cpuhp is a niche sport. > diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c > index a479023..b122927 100644 > --- a/drivers/firmware/arm_sdei.c > +++ b/drivers/firmware/arm_sdei.c > @@ -651,20 +651,19 @@ static int sdei_reregister_event(struct sdei_event *event) > > lockdep_assert_held(&sdei_events_lock); > > - err = _sdei_event_register(event); > + err = sdei_api_event_register(event->event_num, > + sdei_entry_point, > + event->registered, > + SDEI_EVENT_REGISTER_RM_ANY, 0); I don't like pushing these 'api' calls further out creating more of them... The root of the problem is the reregister/reenable values are protected by the same lock as the list, _sdei_event_register() needs to manipulate these, which it can't do from something that is walking the list. The list lock is a spin_lock() because the cpuhp callbacks happen too early for taking mutexes, (fairly sure). Those callbacks don't hit this because they skip shared events. As the simplest fix for stable, could we add another spin_lock inside struct sdei_event to independently protect the reregister/renable values? This would always be taken last, and removes the double-lock. Was this from inspection, or is there some tool I should be running?! (my testing obviously missed it) Thanks, James