From: Wen Gu <guwen@linux.alibaba.com>
To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com,
    gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net,
    edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
    jaka@linux.ibm.com
Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com,
    alibuda@linux.alibaba.com, tonylu@linux.alibaba.com,
    guwen@linux.alibaba.com, linux-s390@vger.kernel.org,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 14/15] net/smc: introduce loopback-ism DMB data copy control
Date: Thu, 11 Jan 2024 20:00:35 +0800
Message-Id: <20240111120036.109903-15-guwen@linux.alibaba.com>
In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com>
References: <20240111120036.109903-1-guwen@linux.alibaba.com>

Add a sysfs attribute, dmb_copy, to get or set whether the loopback-ism
device merges sndbuf with the peer DMB, which eliminates the data copy
between them.

  echo 0 > /sys/devices/virtual/smc/loopback-ism/dmb_copy # merge sndbuf with peer DMB (no copy)
  echo 1 > /sys/devices/virtual/smc/loopback-ism/dmb_copy # keep sndbuf and DMB separate (copy)

A new setting takes effect only after loopback-ism is re-activated:

  echo 0 > /sys/devices/virtual/smc/loopback-ism/active
  echo 1 > /sys/devices/virtual/smc/loopback-ism/active

Re-activation flushes the link groups related to loopback-ism. The
sndbufs of subsequent connections are then merged with the peer DMB, or
kept separate from it, according to the setting.

The motivation for this control is a trade-off. Merging sndbuf with the
DMB greatly improves bandwidth. However, when a virtually contiguous DMB
is provided and merged with sndbuf, the buffer is accessed concurrently
on the Tx and Rx paths, and lock contention in find_vmap_area becomes a
bottleneck when there are many CPUs and CONFIG_HARDENED_USERCOPY is set
(see the link below). The merge is therefore made optional.

Link: https://lore.kernel.org/all/238e63cd-e0e8-4fbf-852f-bc4d5bc35d5a@linux.alibaba.com/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
---
 net/smc/smc_loopback.c | 46 ++++++++++++++++++++++++++++++++++++++++++
 net/smc/smc_loopback.h |  8 +++++++-
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 2e734f8e08f5..bfbb346ef01a 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -26,6 +26,7 @@
 
 static const char smc_lo_dev_name[] = "loopback-ism";
 static unsigned int smc_lo_dmb_type = SMC_LO_DMB_PHYS;
+static unsigned int smc_lo_dmb_copy = SMC_LO_DMB_NOCOPY;
 static struct smc_lo_dev *lo_dev;
 static struct class *smc_class;
 
@@ -167,9 +168,52 @@ static ssize_t dmb_type_store(struct device *dev,
 	return count;
 }
 static DEVICE_ATTR_RW(dmb_type);
+
+static ssize_t dmb_copy_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	struct smc_lo_dev *ldev =
+		container_of(dev, struct smc_lo_dev, dev);
+	const char *copy;
+
+	switch (ldev->dmb_copy) {
+	case SMC_LO_DMB_NOCOPY:
+		copy = "sndbuf and DMB merged and no data copied";
+		break;
+	case SMC_LO_DMB_COPY:
+		copy = "sndbuf and DMB separated and data copied";
+		break;
+	default:
+		copy = "Unknown setting";
+	}
+
+	return sysfs_emit(buf, "%d: %s\n", ldev->dmb_copy, copy);
+}
+
+static ssize_t dmb_copy_store(struct device *dev,
+			      struct device_attribute *attr,
+			      const char *buf, size_t count)
+{
+	unsigned int dmb_copy;
+	int ret;
+
+	ret = kstrtouint(buf, 0, &dmb_copy);
+	if (ret)
+		return ret;
+
+	if (dmb_copy != SMC_LO_DMB_NOCOPY &&
+	    dmb_copy != SMC_LO_DMB_COPY)
+		return -EINVAL;
+
+	smc_lo_dmb_copy = dmb_copy; /* re-activate to take effect */
+	return count;
+}
+static DEVICE_ATTR_RW(dmb_copy);
+
 static struct attribute *smc_lo_attrs[] = {
 	&dev_attr_active.attr,
 	&dev_attr_dmb_type.attr,
+	&dev_attr_dmb_copy.attr,
 	&dev_attr_xfer_bytes.attr,
 	&dev_attr_dmbs_cnt.attr,
 	NULL,
@@ -451,6 +495,7 @@ static int smcd_lo_register_dev(struct smc_lo_dev *ldev)
 	smcd->priv = ldev;
 	smc_ism_set_v2_capable();
 	ldev->dmb_type = smc_lo_dmb_type;
+	ldev->dmb_copy = smc_lo_dmb_copy;
 	mutex_lock(&smcd_dev_list.mutex);
 	list_add(&smcd->list, &smcd_dev_list.list);
 	mutex_unlock(&smcd_dev_list.mutex);
@@ -475,6 +520,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev)
 	kfree(smcd->conn);
 	kfree(smcd);
 	ldev->dmb_type = smc_lo_dmb_type;
+	ldev->dmb_copy = smc_lo_dmb_copy;
 	smc_lo_clear_stats(ldev);
 }
 
diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
index 8ee5c6805fc4..7ecb4a35eb36 100644
--- a/net/smc/smc_loopback.h
+++ b/net/smc/smc_loopback.h
@@ -28,6 +28,11 @@ enum {
 	SMC_LO_DMB_VIRT,
 };
 
+enum {
+	SMC_LO_DMB_NOCOPY,
+	SMC_LO_DMB_COPY,
+};
+
 struct smc_lo_dmb_node {
 	struct hlist_node list;
 	u64 token;
@@ -45,7 +50,8 @@ struct smc_lo_dev_stats64 {
 struct smc_lo_dev {
 	struct smcd_dev *smcd;
 	struct device dev;
-	u8 active;
+	u8 active : 1;
+	u8 dmb_copy : 1;
 	u8 dmb_type;
 	u16 chid;
 	struct smcd_gid local_gid;
-- 
2.32.0.3.g01195cf9f
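
For readers who want to see which inputs the dmb_copy attribute accepts, the validation in dmb_copy_store() can be approximated in userspace. This is only a sketch under stated assumptions: parse_dmb_copy(), DMB_NOCOPY and DMB_COPY are hypothetical names that do not appear in the patch, and strtoul() stands in for the kernel's kstrtouint(), with extra checks to reject the leading whitespace and minus sign that strtoul() would otherwise accept.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for SMC_LO_DMB_NOCOPY / SMC_LO_DMB_COPY. */
enum {
	DMB_NOCOPY = 0,	/* sndbuf merged with peer DMB, no data copy */
	DMB_COPY = 1,	/* sndbuf and DMB kept separate, data copied */
};

/*
 * Approximate the checks done by dmb_copy_store(): parse an unsigned
 * integer with automatic base detection, then accept only 0 or 1.
 * Returns 0 and stores the value in *out on success, or a negative
 * errno value on failure (in which case *out is left untouched).
 */
static int parse_dmb_copy(const char *buf, unsigned int *out)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && out != NULL);

	/* strtoul() skips whitespace and accepts '-'; reject both up front. */
	if (*buf != '+' && (*buf < '0' || *buf > '9'))
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (end == buf)
		return -EINVAL;
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')		/* tolerate the newline that echo appends */
		end++;
	if (*end != '\0')
		return -EINVAL;

	if (val != DMB_NOCOPY && val != DMB_COPY)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}
```

With this sketch, parse_dmb_copy("0x1\n", &v) returns 0 and stores 1 in v, while inputs such as "2" or "-1" return -EINVAL and leave v unchanged.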