From: Vadim Pasternak
To: dvhart@infradead.org, andy.shevchenko@gmail.com, gregkh@linuxfoundation.org
Cc: linux-kernel@vger.kernel.org, platform-driver-x86@vger.kernel.org, jiri@resnulli.us, michaelsh@mellanox.com, ivecera@redhat.com, Vadim Pasternak
Subject: [PATCH v1 3/7] platform/mellanox: mlxreg-hotplug: add extra cycle for hotplug work queue
Date: Tue, 27 Mar 2018 10:02:03 +0000
Message-Id: <1522144927-56512-4-git-send-email-vadimp@mellanox.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1522144927-56512-1-git-send-email-vadimp@mellanox.com>
References: <1522144927-56512-1-git-send-email-vadimp@mellanox.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Add the missing signal acknowledge logic by running the work queue one
extra time when a signal is received but no specific signal assertion is
detected. In theory this can happen when several units are removed or
inserted at the same time: the acknowledge for some signal can then be
missed at the top aggregation status level. The extra run allows the
handler to acknowledge the missed signal.

The interrupt handling flow performs the following steps:
(1) Enter mlxreg_hotplug_work_handler due to signal assertion.
Aggregation status register is changed, for example, from 0xff to 0xfd
(event signal group related to bit 1).
(2) Mask aggregation interrupts, read the aggregation status register
and save it (0xfd) in aggr_cache, then traverse down to handle signals
from the groups related to the changed bit.
(3) Read and mask the group related signal. Acknowledge and unmask the
group related signal (the acknowledge should clear the aggregation
status register from 0xfd back to 0xff).
(4) Re-schedule the work queue for immediate execution.
(5) Enter mlxreg_hotplug_work_handler due to re-scheduling.
Aggregation status has changed from the previous 0xfd to 0xff. Go over
steps (2) - (5) and, if no new signal assertion is detected, unmask
aggregation interrupts.

A race is possible if a new signal from the same group is asserted after
step (3) and before step (5). In that case the aggregation status
changes back from 0xff to 0xfd, so the value read from the aggregation
status register equals the value saved in aggr_cache. As a result the
handler does not traverse down and the signal stays unhandled. The fix
forces the handler to traverse down when a signal is received but no
signal assertion is detected.

Fixes: 1f976f6978bf ("platform/x86: Move Mellanox platform hotplug driver to platform/mellanox")
Signed-off-by: Vadim Pasternak
---
 drivers/platform/mellanox/mlxreg-hotplug.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/platform/mellanox/mlxreg-hotplug.c b/drivers/platform/mellanox/mlxreg-hotplug.c
index b56953a..ced81b7 100644
--- a/drivers/platform/mellanox/mlxreg-hotplug.c
+++ b/drivers/platform/mellanox/mlxreg-hotplug.c
@@ -55,6 +55,7 @@
 #define MLXREG_HOTPLUG_RST_CNTR		3
 
 #define MLXREG_HOTPLUG_ATTRS_MAX	24
+#define MLXREG_HOTPLUG_NOT_ASSERT	3
 
 /**
  * struct mlxreg_hotplug_priv_data - platform private data:
@@ -74,6 +75,7 @@
  * @mask: top aggregation interrupt common mask;
  * @aggr_cache: last value of aggregation register status;
  * @after_probe: flag indication probing completion;
+ * @not_asserted: number of entries in workqueue with no signal assertion;
  */
 struct mlxreg_hotplug_priv_data {
 	int irq;
@@ -93,6 +95,7 @@ struct mlxreg_hotplug_priv_data {
 	u32 mask;
 	u32 aggr_cache;
 	bool after_probe;
+	u8 not_asserted;
 };
 
 static int mlxreg_hotplug_device_create(struct mlxreg_hotplug_priv_data *priv,
@@ -410,6 +413,18 @@ static void mlxreg_hotplug_work_handler(struct work_struct *work)
 	aggr_asserted = priv->aggr_cache ^ regval;
 	priv->aggr_cache = regval;
 
+	/*
+	 * Handler is invoked, but no assertion is detected at top aggregation
+	 * status level. Set aggr_asserted to mask value to allow handler extra
+	 * run over all relevant signals to recover any missed signal.
+	 */
+	if (priv->not_asserted == MLXREG_HOTPLUG_NOT_ASSERT) {
+		priv->not_asserted = 0;
+		aggr_asserted = pdata->mask;
+	}
+	if (!aggr_asserted)
+		goto unmask_event;
+
 	/* Handle topology and health configuration changes. */
 	for (i = 0; i < pdata->counter; i++, item++) {
 		if (aggr_asserted & item->aggr_mask) {
@@ -441,6 +456,8 @@ static void mlxreg_hotplug_work_handler(struct work_struct *work)
 		return;
 	}
 
+unmask_event:
+	priv->not_asserted++;
 	/* Unmask aggregation event (no need acknowledge). */
 	ret = regmap_write(priv->regmap, pdata->cell +
 			   MLXREG_HOTPLUG_AGGR_MASK_OFF, pdata->mask);
-- 
2.1.4
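
For illustration only (not part of the patch): the race and the recovery
path described in the commit message can be modelled in user space. The
sketch below is a simplified assumption of how the handler decides which
groups to traverse; struct hotplug_model, groups_to_scan() and
NOT_ASSERT_LIMIT are hypothetical names that mirror aggr_cache, mask,
not_asserted and MLXREG_HOTPLUG_NOT_ASSERT from the diff. No register
access, masking or work queue scheduling is modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for MLXREG_HOTPLUG_NOT_ASSERT from the patch. */
#define NOT_ASSERT_LIMIT 3

/* Hypothetical model of the three private-data fields the fix relies on. */
struct hotplug_model {
	uint32_t mask;         /* top aggregation interrupt common mask */
	uint32_t aggr_cache;   /* last aggregation status value seen */
	uint8_t  not_asserted; /* handler entries with no change detected */
};

/*
 * Model one entry into the work handler. 'regval' is the aggregation
 * status value read on this entry. Returns the bitmap of groups the
 * handler would traverse; 0 means it goes straight to unmasking.
 */
uint32_t groups_to_scan(struct hotplug_model *m, uint32_t regval)
{
	/* Change detection: XOR of the cached and the current status. */
	uint32_t aggr_asserted = m->aggr_cache ^ regval;

	m->aggr_cache = regval;

	/* After enough "empty" entries, force a scan of every group. */
	if (m->not_asserted == NOT_ASSERT_LIMIT) {
		m->not_asserted = 0;
		aggr_asserted = m->mask;
	}

	/* Nothing to traverse: count this entry (the unmask_event path). */
	if (!aggr_asserted)
		m->not_asserted++;

	return aggr_asserted;
}
```

Starting from aggr_cache = 0xff, a first entry with status 0xfd yields
0x02 (bit 1 changed). If the status is again 0xfd on later entries
because a new signal from the same group arrived in the race window, the
XOR is 0 and nothing is traversed; once not_asserted has reached the
limit, the next entry returns the full mask, so every group is scanned.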