From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Karsten Graul, Guvenc Gulce, "David S. Miller", Sasha Levin
Subject: [PATCH 5.13 069/151] net/smc: fix wait on already cleared link
Date: Mon, 16 Aug 2021 15:01:39 +0200
Message-Id: <20210816125446.347373994@linuxfoundation.org>
In-Reply-To: <20210816125444.082226187@linuxfoundation.org>
References: <20210816125444.082226187@linuxfoundation.org>

From: Karsten Graul

[ Upstream commit 8f3d65c166797746455553f4eaf74a5f89f996d4 ]

There can be a race between the waiters for a tx work request buffer
and the link down processing that finally clears the link. Although
all waiters are woken up before the link is cleared, there may be
waiters that have not yet regained control and are still waiting.
This results in an access to a cleared wait queue head.

Fix this by introducing atomic reference counting around the wait
calls, and have the link clear processing wait until all waiters have
finished. Move the work request layer related calls into smc_wr.c,
and set the link state to INACTIVE before calling smcr_link_clear()
in smc_llc_srv_add_link().

Fixes: 15e1b99aadfb ("net/smc: no WR buffer wait for terminating link group")
Signed-off-by: Karsten Graul
Signed-off-by: Guvenc Gulce
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
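For illustration (not part of the upstream patch): a minimal sketch of the
refcount-guarded wait pattern the fix introduces, using the kernel's wait
queue and atomic APIs. The demo_* names are invented for the example; in
the patch itself the waiter side is smcr_tx_sndbuf_nonempty() and
smc_wr_reg_send(), and the teardown side is smc_wr_free_link().

/* demo_* names are hypothetical; this only illustrates the pattern */
#include <linux/atomic.h>
#include <linux/wait.h>

struct demo_link {
	wait_queue_head_t	wq;	/* waiters sleep here */
	atomic_t		refcnt;	/* waiters currently on the link */
};

static void demo_link_init(struct demo_link *lnk)
{
	init_waitqueue_head(&lnk->wq);
	atomic_set(&lnk->refcnt, 0);
}

/* Waiter side: pin the link for the whole wait, so the wait queue
 * head stays valid even if the link goes down in the meantime.
 */
static void demo_use_link(struct demo_link *lnk)
{
	atomic_inc(&lnk->refcnt);
	/* ... wait_event_*() on lnk->wq and use the link ... */
	if (atomic_dec_and_test(&lnk->refcnt))
		wake_up_all(&lnk->wq);	/* last one out notifies teardown */
}

/* Teardown side: wake the current waiters, then block until every
 * waiter has dropped its reference before freeing anything that
 * contains the wait queue head.
 */
static void demo_clear_link(struct demo_link *lnk)
{
	wake_up_all(&lnk->wq);
	wait_event(lnk->wq, !atomic_read(&lnk->refcnt));
	/* now safe to tear the link down */
}

Because wait_event() re-checks its condition on every wakeup, the last
waiter's wake_up_all() is what releases the teardown thread once the
refcount drains to zero.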
---
 net/smc/smc_core.h |  2 ++
 net/smc/smc_llc.c  | 10 ++++------
 net/smc/smc_tx.c   | 18 +++++++++++++++++-
 net/smc/smc_wr.c   | 10 ++++++++++
 4 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 6d6fd1397c87..64d86298e4df 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -97,6 +97,7 @@ struct smc_link {
 	unsigned long		*wr_tx_mask;	/* bit mask of used indexes */
 	u32			wr_tx_cnt;	/* number of WR send buffers */
 	wait_queue_head_t	wr_tx_wait;	/* wait for free WR send buf */
+	atomic_t		wr_tx_refcnt;	/* tx refs to link */
 
 	struct smc_wr_buf	*wr_rx_bufs;	/* WR recv payload buffers */
 	struct ib_recv_wr	*wr_rx_ibs;	/* WR recv meta data */
@@ -109,6 +110,7 @@ struct smc_link {
 
 	struct ib_reg_wr	wr_reg;		/* WR register memory region */
 	wait_queue_head_t	wr_reg_wait;	/* wait for wr_reg result */
+	atomic_t		wr_reg_refcnt;	/* reg refs to link */
 	enum smc_wr_reg_state	wr_reg_state;	/* state of wr_reg request */
 
 	u8			gid[SMC_GID_SIZE];/* gid matching used vlan id*/
diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c
index 273eaf1bfe49..2e7560eba981 100644
--- a/net/smc/smc_llc.c
+++ b/net/smc/smc_llc.c
@@ -888,6 +888,7 @@ int smc_llc_cli_add_link(struct smc_link *link, struct smc_llc_qentry *qentry)
 	if (!rc)
 		goto out;
 out_clear_lnk:
+	lnk_new->state = SMC_LNK_INACTIVE;
 	smcr_link_clear(lnk_new, false);
 out_reject:
 	smc_llc_cli_add_link_reject(qentry);
@@ -1184,6 +1185,7 @@ int smc_llc_srv_add_link(struct smc_link *link)
 		goto out_err;
 	return 0;
 out_err:
+	link_new->state = SMC_LNK_INACTIVE;
 	smcr_link_clear(link_new, false);
 	return rc;
 }
@@ -1286,10 +1288,8 @@ static void smc_llc_process_cli_delete_link(struct smc_link_group *lgr)
 	del_llc->reason = 0;
 	smc_llc_send_message(lnk, &qentry->msg); /* response */
 
-	if (smc_link_downing(&lnk_del->state)) {
-		if (smc_switch_conns(lgr, lnk_del, false))
-			smc_wr_tx_wait_no_pending_sends(lnk_del);
-	}
+	if (smc_link_downing(&lnk_del->state))
+		smc_switch_conns(lgr, lnk_del, false);
 	smcr_link_clear(lnk_del, true);
 
 	active_links = smc_llc_active_link_count(lgr);
@@ -1805,8 +1805,6 @@ void smc_llc_link_clear(struct smc_link *link, bool log)
 			    link->smcibdev->ibdev->name, link->ibport);
 	complete(&link->llc_testlink_resp);
 	cancel_delayed_work_sync(&link->llc_testlink_wrk);
-	smc_wr_wakeup_reg_wait(link);
-	smc_wr_wakeup_tx_wait(link);
 }
 
 /* register a new rtoken at the remote peer (for all links) */
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 4532c16bf85e..ff02952b3d03 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -479,7 +479,7 @@ static int smc_tx_rdma_writes(struct smc_connection *conn,
 /* Wakeup sndbuf consumers from any context (IRQ or process)
  * since there is more data to transmit; usable snd_wnd as max transmit
  */
-static int smcr_tx_sndbuf_nonempty(struct smc_connection *conn)
+static int _smcr_tx_sndbuf_nonempty(struct smc_connection *conn)
 {
 	struct smc_cdc_producer_flags *pflags = &conn->local_tx_ctrl.prod_flags;
 	struct smc_link *link = conn->lnk;
@@ -533,6 +533,22 @@ out_unlock:
 	return rc;
 }
 
+static int smcr_tx_sndbuf_nonempty(struct smc_connection *conn)
+{
+	struct smc_link *link = conn->lnk;
+	int rc = -ENOLINK;
+
+	if (!link)
+		return rc;
+
+	atomic_inc(&link->wr_tx_refcnt);
+	if (smc_link_usable(link))
+		rc = _smcr_tx_sndbuf_nonempty(conn);
+	if (atomic_dec_and_test(&link->wr_tx_refcnt))
+		wake_up_all(&link->wr_tx_wait);
+	return rc;
+}
+
 static int smcd_tx_sndbuf_nonempty(struct smc_connection *conn)
 {
 	struct smc_cdc_producer_flags *pflags = &conn->local_tx_ctrl.prod_flags;
diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
index cbc73a7e4d59..a419e9af36b9 100644
--- a/net/smc/smc_wr.c
+++ b/net/smc/smc_wr.c
@@ -322,9 +322,12 @@ int smc_wr_reg_send(struct smc_link *link, struct ib_mr *mr)
 	if (rc)
 		return rc;
 
+	atomic_inc(&link->wr_reg_refcnt);
 	rc = wait_event_interruptible_timeout(link->wr_reg_wait,
 					      (link->wr_reg_state != POSTED),
 					      SMC_WR_REG_MR_WAIT_TIME);
+	if (atomic_dec_and_test(&link->wr_reg_refcnt))
+		wake_up_all(&link->wr_reg_wait);
 	if (!rc) {
 		/* timeout - terminate link */
 		smcr_link_down_cond_sched(link);
@@ -566,10 +569,15 @@ void smc_wr_free_link(struct smc_link *lnk)
 		return;
 	ibdev = lnk->smcibdev->ibdev;
 
+	smc_wr_wakeup_reg_wait(lnk);
+	smc_wr_wakeup_tx_wait(lnk);
+
 	if (smc_wr_tx_wait_no_pending_sends(lnk))
 		memset(lnk->wr_tx_mask, 0,
 		       BITS_TO_LONGS(SMC_WR_BUF_CNT) *
 						sizeof(*lnk->wr_tx_mask));
+	wait_event(lnk->wr_reg_wait, (!atomic_read(&lnk->wr_reg_refcnt)));
+	wait_event(lnk->wr_tx_wait, (!atomic_read(&lnk->wr_tx_refcnt)));
 
 	if (lnk->wr_rx_dma_addr) {
 		ib_dma_unmap_single(ibdev, lnk->wr_rx_dma_addr,
@@ -728,7 +736,9 @@ int smc_wr_create_link(struct smc_link *lnk)
 	memset(lnk->wr_tx_mask, 0,
 	       BITS_TO_LONGS(SMC_WR_BUF_CNT) * sizeof(*lnk->wr_tx_mask));
 	init_waitqueue_head(&lnk->wr_tx_wait);
+	atomic_set(&lnk->wr_tx_refcnt, 0);
 	init_waitqueue_head(&lnk->wr_reg_wait);
+	atomic_set(&lnk->wr_reg_refcnt, 0);
 	return rc;
 
 dma_unmap:
-- 
2.30.2
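A note on the ordering in smc_wr_free_link(): the wakeups are issued first
so that any thread currently blocked on wr_reg_wait or wr_tx_wait can
return, then the two wait_event() calls block until both refcounts drop to
zero, so no waiter can still touch the wait queue heads once the link
memory is released. This is also why the wakeup calls are removed from
smc_llc_link_clear(): per the commit message, the work request layer now
handles them in its own teardown path in smc_wr.c.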