From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Eran Ben Elisha, Moshe Shemesh, Saeed Mahameed, Timo Rothenpieler
Subject: [PATCH 5.4 09/17] net/mlx5: Fix a race when moving command interface to events mode
Date: Fri, 20 Nov 2020 12:03:36 +0100
Message-Id: <20201120104541.518263886@linuxfoundation.org>
X-Mailer:
 git-send-email 2.29.2
In-Reply-To: <20201120104541.058449969@linuxfoundation.org>
References: <20201120104541.058449969@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Eran Ben Elisha

commit d43b7007dbd1195a5b6b83213e49b1516aaf6f5e upstream.

After the driver creates (via a FW command) an EQ for commands, it is
informed of new command completions by EQE. However, due to a race in
the driver's internal command mode metadata update, some new commands
will still be mishandled by the driver as if it were in polling mode.
Such commands can get two non-forced completions, leading to access to
an already freed command entry.

The CREATE_EQ command, which maps the EQ to the command queue, must be
posted to the command queue while it is empty, and no other command
should be posted.

Add a SW mechanism so that, once the CREATE_EQ command is about to be
executed, all other commands return an error without being sent to the
FW. Allow sending other commands only after successfully changing the
driver's internal command mode metadata.

We can safely return an error to all other commands while creating the
command EQ, as all other commands might be sent from the
user/application during driver load. The application can rerun them
later, after the driver's load has finished.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha
Signed-off-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
Cc: Timo Rothenpieler
Signed-off-by: Greg Kroah-Hartman
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |   35 +++++++++++++++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |    3 ++
 include/linux/mlx5/driver.h                   |    6 ++++
 3 files changed, 40 insertions(+), 4 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -875,6 +875,14 @@ static void free_msg(struct mlx5_core_de
 static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
 			      struct mlx5_cmd_msg *msg);
 
+static bool opcode_allowed(struct mlx5_cmd *cmd, u16 opcode)
+{
+	if (cmd->allowed_opcode == CMD_ALLOWED_OPCODE_ALL)
+		return true;
+
+	return cmd->allowed_opcode == opcode;
+}
+
 static void cmd_work_handler(struct work_struct *work)
 {
 	struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
@@ -941,7 +949,8 @@ static void cmd_work_handler(struct work
 
 	/* Skip sending command to fw if internal error */
 	if (pci_channel_offline(dev->pdev) ||
-	    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
+	    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
+	    !opcode_allowed(&dev->cmd, ent->op)) {
 		u8 status = 0;
 		u32 drv_synd;
 
@@ -1459,6 +1468,22 @@ static void create_debugfs_files(struct
 	mlx5_cmdif_debugfs_init(dev);
 }
 
+void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode)
+{
+	struct mlx5_cmd *cmd = &dev->cmd;
+	int i;
+
+	for (i = 0; i < cmd->max_reg_cmds; i++)
+		down(&cmd->sem);
+	down(&cmd->pages_sem);
+
+	cmd->allowed_opcode = opcode;
+
+	up(&cmd->pages_sem);
+	for (i = 0; i < cmd->max_reg_cmds; i++)
+		up(&cmd->sem);
+}
+
 static void mlx5_cmd_change_mod(struct mlx5_core_dev *dev, int mode)
 {
 	struct mlx5_cmd *cmd = &dev->cmd;
@@ -1751,12 +1776,13 @@ static int cmd_exec(struct mlx5_core_dev
 	int err;
 	u8 status = 0;
 	u32 drv_synd;
+	u16 opcode;
 	u8 token;
 
+	opcode = MLX5_GET(mbox_in, in, opcode);
 	if (pci_channel_offline(dev->pdev) ||
-	    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
-		u16 opcode = MLX5_GET(mbox_in, in, opcode);
-
+	    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
+	    !opcode_allowed(&dev->cmd, opcode)) {
 		err = mlx5_internal_err_ret_value(dev, opcode, &drv_synd, &status);
 		MLX5_SET(mbox_out, out, status, status);
 		MLX5_SET(mbox_out, out, syndrome, drv_synd);
@@ -2058,6 +2084,7 @@ int mlx5_cmd_init(struct mlx5_core_dev *
 	mlx5_core_dbg(dev, "descriptor at dma 0x%llx\n", (unsigned long long)(cmd->dma));
 
 	cmd->mode = CMD_MODE_POLLING;
+	cmd->allowed_opcode = CMD_ALLOWED_OPCODE_ALL;
 
 	create_msg_cache(dev);
 
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -648,11 +648,13 @@ static int create_async_eqs(struct mlx5_
 		.nent = MLX5_NUM_CMD_EQE,
 		.mask[0] = 1ull << MLX5_EVENT_TYPE_CMD,
 	};
+	mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_CREATE_EQ);
 	err = setup_async_eq(dev, &table->cmd_eq, &param, "cmd");
 	if (err)
 		goto err1;
 
 	mlx5_cmd_use_events(dev);
+	mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
 
 	param = (struct mlx5_eq_param) {
 		.irq_index = 0,
@@ -682,6 +684,7 @@ err2:
 	mlx5_cmd_use_polling(dev);
 	cleanup_async_eq(dev, &table->cmd_eq, "cmd");
 err1:
+	mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
 	mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
 	return err;
 }
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -299,6 +299,7 @@ struct mlx5_cmd {
 	struct semaphore sem;
 	struct semaphore pages_sem;
 	int mode;
+	u16 allowed_opcode;
 	struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
 	struct dma_pool *pool;
 	struct mlx5_cmd_debug dbg;
@@ -890,10 +891,15 @@ mlx5_frag_buf_get_idx_last_contig_stride
 	return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1);
 }
 
+enum {
+	CMD_ALLOWED_OPCODE_ALL,
+};
+
 int mlx5_cmd_init(struct mlx5_core_dev *dev);
 void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
 void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
 void mlx5_cmd_use_polling(struct mlx5_core_dev *dev);
+void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode);
 
 struct mlx5_async_ctx {
 	struct mlx5_core_dev *dev;
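
For readers who want to try the gating logic outside the kernel, here is
a minimal user-space sketch of the opcode gate this patch adds. It is
not driver code. Every demo_* name, the DEMO_OP_* values and the -EIO
return are assumptions made for illustration; the real driver derives
its error from mlx5_internal_err_ret_value(). The sketch also leaves out
the semaphore draining that mlx5_cmd_allowed_opcode() does before it
changes the gate, so it shows only the accept/reject decision.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors CMD_ALLOWED_OPCODE_ALL in the patch: first enumerator, so 0. */
enum { ALLOWED_OPCODE_ALL = 0 };

/* Illustrative opcode values, chosen for this sketch only. */
enum { DEMO_OP_QUERY = 0x100, DEMO_OP_CREATE_EQ = 0x301 };

/* Hypothetical stand-in for the relevant part of struct mlx5_cmd. */
struct demo_cmd {
	uint16_t allowed_opcode;
	unsigned int sent_to_fw;	/* commands that passed the gate */
};

/* Same decision as opcode_allowed() in the patch. */
static bool demo_opcode_allowed(const struct demo_cmd *cmd, uint16_t opcode)
{
	if (cmd->allowed_opcode == ALLOWED_OPCODE_ALL)
		return true;

	return cmd->allowed_opcode == opcode;
}

/* Hypothetical exec path: fail fast rather than post a blocked opcode. */
static int demo_cmd_exec(struct demo_cmd *cmd, uint16_t opcode)
{
	if (!demo_opcode_allowed(cmd, opcode))
		return -EIO;	/* assumed error code, see note above */

	cmd->sent_to_fw++;
	return 0;
}

/*
 * Walks through the window that create_async_eqs() opens and closes.
 * Returns the number of commands rejected along the way.
 */
static int demo_create_eq_window(void)
{
	struct demo_cmd cmd = { .allowed_opcode = ALLOWED_OPCODE_ALL };
	int rejected = 0;

	cmd.allowed_opcode = DEMO_OP_CREATE_EQ;		/* window opens */
	if (demo_cmd_exec(&cmd, DEMO_OP_QUERY) != 0)	/* blocked */
		rejected++;
	if (demo_cmd_exec(&cmd, DEMO_OP_CREATE_EQ) != 0)	/* allowed */
		rejected++;

	cmd.allowed_opcode = ALLOWED_OPCODE_ALL;	/* window closes */
	if (demo_cmd_exec(&cmd, DEMO_OP_QUERY) != 0)	/* allowed again */
		rejected++;

	assert(cmd.sent_to_fw == 2);
	return rejected;
}
```

In this sketch only the query issued while the gate is set to
DEMO_OP_CREATE_EQ is rejected. That matches the commit message: callers
that lose the race get an error and can retry once driver load has
finished.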