Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753623AbdGSJov (ORCPT ); Wed, 19 Jul 2017 05:44:51 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:36556 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753576AbdGSJos (ORCPT ); Wed, 19 Jul 2017 05:44:48 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Mohamad Haj Yahia , Moshe Shemesh , Saeed Mahameed Subject: [PATCH 4.12 03/84] net/mlx5: Cancel delayed recovery work when unloading the driver Date: Wed, 19 Jul 2017 11:43:09 +0200 Message-Id: <20170719092322.494213595@linuxfoundation.org> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20170719092322.362625377@linuxfoundation.org> References: <20170719092322.362625377@linuxfoundation.org> User-Agent: quilt/0.65 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3577 Lines: 97 4.12-stable review patch. If anyone has any objections, please let me know. ------------------ From: Mohamad Haj Yahia [ Upstream commit 2a0165a034ac024b60cca49c61e46f4afa2e4d98 ] Draining the health workqueue will ignore future health works including the one that report hardware failure and thus we can't enter error state Instead cancel the recovery flow and make sure only recovery flow won't be scheduled. Fixes: 5e44fca50470 ('net/mlx5: Only cancel recovery work when cleaning up device') Signed-off-by: Mohamad Haj Yahia Signed-off-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Signed-off-by: Greg Kroah-Hartman --- drivers/net/ethernet/mellanox/mlx5/core/health.c | 15 ++++++++++++++- drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 +- include/linux/mlx5/driver.h | 1 + 3 files changed, 16 insertions(+), 2 deletions(-) --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c @@ -67,6 +67,7 @@ enum { enum { MLX5_DROP_NEW_HEALTH_WORK, + MLX5_DROP_NEW_RECOVERY_WORK, }; static u8 get_nic_state(struct mlx5_core_dev *dev) @@ -193,7 +194,7 @@ static void health_care(struct work_stru mlx5_handle_bad_state(dev); spin_lock(&health->wq_lock); - if (!test_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags)) + if (!test_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags)) schedule_delayed_work(&health->recover_work, recover_delay); else dev_err(&dev->pdev->dev, @@ -313,6 +314,7 @@ void mlx5_start_health_poll(struct mlx5_ init_timer(&health->timer); health->sick = 0; clear_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags); + clear_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); health->health = &dev->iseg->health; health->health_counter = &dev->iseg->health_counter; @@ -335,11 +337,22 @@ void mlx5_drain_health_wq(struct mlx5_co spin_lock(&health->wq_lock); set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags); + set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); spin_unlock(&health->wq_lock); cancel_delayed_work_sync(&health->recover_work); cancel_work_sync(&health->work); } +void mlx5_drain_health_recovery(struct mlx5_core_dev *dev) +{ + struct mlx5_core_health *health = &dev->priv.health; + + spin_lock(&health->wq_lock); + set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); + spin_unlock(&health->wq_lock); + cancel_delayed_work_sync(&dev->priv.health.recover_work); +} + void mlx5_health_cleanup(struct mlx5_core_dev *dev) { struct mlx5_core_health *health = &dev->priv.health; --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1228,7 +1228,7 @@ static int mlx5_unload_one(struct mlx5_c int err = 0; if (cleanup) - mlx5_drain_health_wq(dev); + mlx5_drain_health_recovery(dev); mutex_lock(&dev->intf_state_mutex); if (test_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state)) { --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -925,6 +925,7 @@ int mlx5_health_init(struct mlx5_core_de void mlx5_start_health_poll(struct mlx5_core_dev *dev); void mlx5_stop_health_poll(struct mlx5_core_dev *dev); void mlx5_drain_health_wq(struct mlx5_core_dev *dev); +void mlx5_drain_health_recovery(struct mlx5_core_dev *dev); int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf, int node); int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf);