From: Wen Gu
To: kgraul@linux.ibm.com, tonylu@linux.alibaba.com
Cc: davem@davemloft.net, kuba@kernel.org, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, dust.li@linux.alibaba.com, xuanzhuo@linux.alibaba.com, guwen@linux.alibaba.com
Subject: [RFC PATCH 0/2] Two RFC patches for the same SMC socket wait queue mismatch issue
Date: Wed, 10 Nov 2021 20:50:49 +0800
Message-Id: <1636548651-44649-1-git-send-email-guwen@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Karsten,

Thanks for your reply.

The previous discussion about the socket wait queue mismatch in SMC fallback can be found at:

https://lore.kernel.org/all/db9acf73-abef-209e-6ec2-8ada92e2cfbc@linux.ibm.com/

This series contains two RFC patches. Both aim to fix the same issue: the mismatch of the socket wait queue in SMC fallback.

In your last reply, you suggested that I add a complete description of the intention behind the initial patch so that readers can understand the idea. This has been done in "[RFC PATCH net v2 0/2] net/smc: Fix socket wait queue mismatch issue caused by fallback" of this mail.

Unfortunately, I later found a defect in the approach used by both the initial patch and the v2 patch mentioned above. The defect concerns fasync_list and is related to 67f562e3e14 ("net/smc: transfer fasync_list in case of fallback").

When user applications use sock_fasync() to insert entries into fasync_list, the wait queue they operate on is smc socket->wq. But in the initial patch and the v2 patch, I swapped sk->sk_wq of the smc socket and the clcsocket in smc_create(), so the sk_data_ready / sk_write_space etc. callbacks of smc end up waking clcsocket->wq. As a result, the entries added to smc socket->wq.fasync_list are never woken up before fallback.

So swapping sk->sk_wq of the smc socket and the clcsocket, as done in the initial patch and the v2 patch of this mail, seems to be a bad way to fix this issue.

Therefore, I tried another solution: moving the wait queue entries from smc socket->wq to clcsocket->wq during the fallback. It is described in "[RFC PATCH net 2/2] net/smc: Transfer remaining wait queue entries" of this mail. In our test environment, this patch fixes the fallback issue well.

I look forward to hearing your opinions.

Thank you.

Cheers,
Wen Gu

Wen Gu (2):
  net/smc: Fix socket wait queue mismatch issue caused by fallback
  net/smc: Transfer remaining wait queue entries