Message-ID: <68580479-c66e-41e3-b869-b9f98e348f01@linux.ibm.com>
Date: Wed, 18 Oct 2023 21:43:40 +0200
Subject: Re: [PATCH net-next v4 00/18] net/smc: implement virtual ISM extension and loopback-ism
From: Wenjia Zhang
To: Wen Gu, Niklas Schnelle, kgraul@linux.ibm.com, jaka@linux.ibm.com,
    davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com
Cc: wintera@linux.ibm.com, gbayer@linux.ibm.com, pasic@linux.ibm.com,
    alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, dust.li@linux.alibaba.com,
    linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
References: <1695568613-125057-1-git-send-email-guwen@linux.alibaba.com>
    <49847786-9914-b615-56d6-f39fbc6e03c2@linux.alibaba.com>
In-Reply-To: <49847786-9914-b615-56d6-f39fbc6e03c2@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17.10.23 05:49, Wen Gu wrote:
>
>
> On 2023/10/8 15:19, Wen Gu wrote:
>>
>>
>> On 2023/10/5 16:21, Niklas Schnelle wrote:
>>
>>>
>>> Hi Wen Gu,
>>>
>>> I've been trying out your series with iperf3, qperf, and uperf on
>>> s390x. I'm using network namespaces with a ConnectX VF from the same
>>> card in each namespace for the initial TCP/IP connection i.e. initially
>>> it goes out to a real NIC even if that can switch internally. All of
>>> these look great for streaming workloads both in terms of performance
>>> and stability. With a Connect-Request-Response workload and uperf
>>> however I've run into issues. The test configuration I use is as
>>> follows:
>>>
>>> Client Command:
>>>
>>> # host=$ip_server ip netns exec client smc_run uperf -m tcp_crr.xml
>>>
>>> Server Command:
>>>
>>> # ip netns exec server smc_run uperf -s &> /dev/null
>>>
>>> Uperf tcp_crr.xml:
>>>
>>> [...]
>>> options="remotehost=$host protocol=tcp" />
>>> [...]
>>>
>>> The workload first runs fine but then, after about 4 GB of data
>>> transferred, fails with "Connection refused" and "Connection reset by
>>> peer" errors. The failure is not permanent however and re-running
>>> the streaming workloads works fine again (with both uperf server and
>>> client restarted). So I suspect something gets stuck in either the
>>> client or server sockets. The same workload runs fine with TCP/IP of
>>> course.
>>>
>>> Thanks,
>>> Niklas
>>>
>>>
>>
>> Hi Niklas,
>>
>> Thank you very much for the test. With the test example you provided,
>> I've reproduced the issue in my VM. And moreover, sometimes the test
>> complains with 'Error saying goodbye with '
>>
>> I'll figure out what's going on here.
>>
>> Thanks!
>> Wen Gu
>
> I think that there is a common issue for SMC-R and SMC-D. I also reproduced
> 'connection reset by peer' and 'Error saying goodbye with ' when using
> SMC-R under the same test condition. They occur at the end of the test.
>
> When the uperf test time ends, some signals are sent. At this point there
> are usually some SMC connections doing the CLC handshake. I caught some
> -EINTR (-4) on the client and -ECONNRESET (-104) on the server returned
> from smc_clc_wait_msg() (the handshake error counts increase
> correspondingly), along with TCP RST packets sent to terminate the CLC
> TCP connection (clcsock).
>
> I am not sure whether this should be considered by design or a bug in SMC.
> From an application perspective, the conn reset behavior only happens when
> using SMC.
>
> @Wenjia, could you please take a look at this?
>
> Thanks,
> Wen Gu

Hi Wen,

Do you mean a bug in smc_clc_wait_msg()? If so, I cannot see any problem
in smc_clc_wait_msg(). From your description, it looks to me like the
server should get the CLC_PROPOSAL message but finds nothing in it, while
the client keeps waiting for the CLC_ACCEPT message from the server until
the wait loop is broken out of.

Thanks,
Wenjia
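
For readers following the return codes quoted above, here is a minimal sketch of how a kernel-style negative return value such as -EINTR or -ECONNRESET can be turned back into its symbolic name from user space. It is not part of the SMC code or of this thread; the helper name describe_kernel_rc is hypothetical and only Python's standard errno and os modules are used. The numeric values are platform dependent (4 and 104 are the Linux values quoted in the thread), so the sketch looks them up rather than hardcoding them.

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Render a kernel-style return code (0 or -errno) as 'NAME (rc): message'.

    Hypothetical helper for illustration only; not an SMC or uperf API.
    """
    if rc >= 0:
        # Non-negative return codes mean success in this convention.
        return f"OK ({rc})"
    # errno.errorcode maps the positive errno number to its symbolic name.
    name = errno.errorcode.get(-rc, "UNKNOWN")
    return f"{name} ({rc}): {os.strerror(-rc)}"


if __name__ == "__main__":
    # The two codes reported from smc_clc_wait_msg() in the thread.
    for rc in (-errno.EINTR, -errno.ECONNRESET):
        print(describe_kernel_rc(rc))
```

On Linux the two printed lines should name EINTR and ECONNRESET, matching the client-side and server-side results described in the quoted message.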