2022-09-05 05:54:00

by Menglong Dong

Subject: [PATCH net] net: mptcp: fix unreleased socket in accept queue

From: Menglong Dong <[email protected]>

An MPTCP socket and its subflow sockets that are still in the accept
queue can't be released after the process exits.

When an MPTCP socket in the listening state is released, the
corresponding TCP socket is released too, along with the TCP sockets
still waiting in its accept queue. However, only the initial subflow is
in the accept queue; a joined subflow is not. As a result, the joined
subflow is never released, and neither is the corresponding unaccepted
MPTCP socket.
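
To make the lifetime problem concrete, here is a minimal user-space sketch. It is not kernel code: toy_msk, toy_subflow, toy_queue_clean_old() and toy_queue_clean_fixed() are hypothetical names that only model the bookkeeping described above, where the listener's cleanup sees the initial subflow but not the joined ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SUBFLOWS 4

/* Toy stand-in for a TCP subflow socket. */
struct toy_subflow {
	bool released;
};

/* Toy stand-in for an unaccepted MPTCP socket and the subflows it owns.
 * subflows[0] is the initial subflow, the only one visible through the
 * listener's accept queue; the others are joined subflows.
 */
struct toy_msk {
	struct toy_subflow subflows[TOY_MAX_SUBFLOWS];
	size_t nr_subflows;
};

/* Behaviour described in the report: tearing down the listener releases
 * only what is queued, i.e. the initial subflow.
 */
static void toy_queue_clean_old(struct toy_msk *msk)
{
	assert(msk->nr_subflows >= 1);
	msk->subflows[0].released = true;
}

/* Idea behind the proposed fix: close the MPTCP socket itself, which
 * releases every subflow it owns.
 */
static void toy_queue_clean_fixed(struct toy_msk *msk)
{
	size_t i;

	for (i = 0; i < msk->nr_subflows; i++)
		msk->subflows[i].released = true;
}

/* Count the subflows that were never released. */
static size_t toy_leaked(const struct toy_msk *msk)
{
	size_t i, leaked = 0;

	for (i = 0; i < msk->nr_subflows; i++)
		if (!msk->subflows[i].released)
			leaked++;
	return leaked;
}
```

In this model, a socket with one initial and one joined subflow leaks one subflow with the old cleanup and none with the fixed one. The real kernel uses reference counts and socket states rather than a flag, so this only illustrates the shape of the problem.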

This can be reproduced easily with the following steps:

1. create two namespaces and a veth pair:
$ ip netns add mptcp-client
$ ip netns add mptcp-server
$ sysctl -w net.ipv4.conf.all.rp_filter=0
$ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1
$ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1
$ ip link add red-client netns mptcp-client type veth peer red-server \
netns mptcp-server
$ ip -n mptcp-server address add 10.0.0.1/24 dev red-server
$ ip -n mptcp-server address add 192.168.0.1/24 dev red-server
$ ip -n mptcp-client address add 10.0.0.2/24 dev red-client
$ ip -n mptcp-client address add 192.168.0.2/24 dev red-client
$ ip -n mptcp-server link set red-server up
$ ip -n mptcp-client link set red-client up

2. configure the endpoints and limits for the client and server:
$ ip -n mptcp-server mptcp endpoint flush
$ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2
$ ip -n mptcp-client mptcp endpoint flush
$ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2
$ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \
1 subflow

3. listen and accept on a port, such as 9999. The nc command used here
is modified to use the MPTCP protocol by default, and its default
backlog is 1:
$ ip netns exec mptcp-server nc -l -k -p 9999

4. open another *two* terminals and connect to the server with the
following command:
$ ip netns exec mptcp-client nc 10.0.0.1 9999
Input something after connecting, to trigger the connection of the
second subflow.

5. exit all the nc commands, and check the TCP sockets in the server
namespace. You will find one TCP socket stuck in the CLOSE_WAIT state
that is never released.
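
The modified nc from step 3 is not shown in this thread. As a rough idea of what such a listener does, the sketch below opens an MPTCP listening socket with a caller-chosen backlog. IPPROTO_MPTCP (262) is the real Linux protocol number; the helper name open_mptcp_listener() and its parameters are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* Linux MPTCP protocol number, for older headers */
#endif

/* Hypothetical helper: returns a listening MPTCP socket bound to
 * addr:port, or -1 with errno set (for example when the kernel lacks
 * MPTCP support or net.mptcp.enabled is 0).
 */
static int open_mptcp_listener(const char *addr, unsigned short port,
			       int backlog)
{
	struct sockaddr_in sin;
	int fd, saved;

	assert(addr != NULL);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, addr, &sin.sin_addr) != 1) {
		errno = EINVAL;
		return -1;
	}

	/* Same call as for TCP, only the protocol argument differs. */
	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
	if (fd < 0)
		return -1;

	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(fd, backlog) < 0) {
		saved = errno;	/* close() may overwrite errno */
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

In the setup above, something like open_mptcp_listener("10.0.0.1", 9999, 1) run inside the mptcp-server namespace would play the role of the listener. A connection that completes while the first one is still being served then waits in the accept queue, which is the state the reproducer relies on.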

I can think of a few possible solutions:

1. release all unaccepted MPTCP sockets with mptcp_close() when the
listening TCP socket is released in mptcp_subflow_queue_clean(). This
is what this commit does.
2. release the mptcp socket with mptcp_close() in subflow_ulp_release().
3. etc

Signed-off-by: Menglong Dong <[email protected]>
---
net/mptcp/subflow.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index c7d49fb6e7bd..e39dff5d5d84 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1770,6 +1770,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
 		msk->first = NULL;
 		msk->dl_next = NULL;
 		unlock_sock_fast(sk, slow);
+
+		/* */
+		sock_hold(sk);
+		sk->sk_prot->close(sk);
 	}
 
 	/* we are still under the listener msk socket lock */
--
2.37.2


2022-09-05 06:57:37

by kernel test robot

Subject: Re: [PATCH net] net: mptcp: fix unreleased socket in accept queue

Hi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]

url: https://github.com/intel-lab-lkp/linux/commits/menglong8-dong-gmail-com/net-mptcp-fix-unreleased-socket-in-accept-queue/20220905-130457
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git c90714017cb3f197e71c7ff1317335b96d4d19e8
config: s390-randconfig-r015-20220905
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project c55b41d5199d2394dd6cdb8f52180d8b81d809d4)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/496c680afa6c8a180858e88ba2b5a6aa6d262bed
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review menglong8-dong-gmail-com/net-mptcp-fix-unreleased-socket-in-accept-queue/20220905-130457
git checkout 496c680afa6c8a180858e88ba2b5a6aa6d262bed
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash net/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from net/mptcp/subflow.c:11:
In file included from include/linux/netdevice.h:38:
In file included from include/net/net_namespace.h:43:
In file included from include/linux/skbuff.h:31:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
^
include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
#define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
^
In file included from net/mptcp/subflow.c:11:
In file included from include/linux/netdevice.h:38:
In file included from include/net/net_namespace.h:43:
In file included from include/linux/skbuff.h:31:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
^
include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
#define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
^
In file included from net/mptcp/subflow.c:11:
In file included from include/linux/netdevice.h:38:
In file included from include/net/net_namespace.h:43:
In file included from include/linux/skbuff.h:31:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
>> net/mptcp/subflow.c:1776:24: error: too few arguments to function call, expected 2, have 1
sk->sk_prot->close(sk);
~~~~~~~~~~~~~~~~~~ ^
12 warnings and 1 error generated.
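
The error follows from the signature of the close callback in struct proto, which takes a timeout in addition to the socket: void (*close)(struct sock *sk, long timeout). The user-space sketch below mirrors only that shape. toy_sock, toy_proto, toy_close() and toy_release() are made-up names, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock;

/* Mirrors only the shape of the kernel's close callback:
 * void (*close)(struct sock *sk, long timeout);
 */
struct toy_proto {
	void (*close)(struct toy_sock *sk, long timeout);
};

struct toy_sock {
	const struct toy_proto *prot;
	int closed;
	long last_timeout;
};

/* Toy close handler: records that it ran and with which timeout. */
static void toy_close(struct toy_sock *sk, long timeout)
{
	assert(sk != NULL);
	sk->closed = 1;
	sk->last_timeout = timeout;
}

static const struct toy_proto toy_proto_ops = { .close = toy_close };

/* Calling through the pointer needs both arguments; prot->close(sk)
 * alone is rejected with "too few arguments to function call".
 */
static void toy_release(struct toy_sock *sk)
{
	sk->prot->close(sk, 0);
}
```

Which timeout the real fix should pass is not settled in this report; the 0 used here is only a placeholder for the sketch.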


vim +1776 net/mptcp/subflow.c

  1726
  1727  void mptcp_subflow_queue_clean(struct sock *listener_ssk)
  1728  {
  1729  	struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
  1730  	struct mptcp_sock *msk, *next, *head = NULL;
  1731  	struct request_sock *req;
  1732
  1733  	/* build a list of all unaccepted mptcp sockets */
  1734  	spin_lock_bh(&queue->rskq_lock);
  1735  	for (req = queue->rskq_accept_head; req; req = req->dl_next) {
  1736  		struct mptcp_subflow_context *subflow;
  1737  		struct sock *ssk = req->sk;
  1738  		struct mptcp_sock *msk;
  1739
  1740  		if (!sk_is_mptcp(ssk))
  1741  			continue;
  1742
  1743  		subflow = mptcp_subflow_ctx(ssk);
  1744  		if (!subflow || !subflow->conn)
  1745  			continue;
  1746
  1747  		/* skip if already in list */
  1748  		msk = mptcp_sk(subflow->conn);
  1749  		if (msk->dl_next || msk == head)
  1750  			continue;
  1751
  1752  		msk->dl_next = head;
  1753  		head = msk;
  1754  	}
  1755  	spin_unlock_bh(&queue->rskq_lock);
  1756  	if (!head)
  1757  		return;
  1758
  1759  	/* can't acquire the msk socket lock under the subflow one,
  1760  	 * or will cause ABBA deadlock
  1761  	 */
  1762  	release_sock(listener_ssk);
  1763
  1764  	for (msk = head; msk; msk = next) {
  1765  		struct sock *sk = (struct sock *)msk;
  1766  		bool slow;
  1767
  1768  		slow = lock_sock_fast_nested(sk);
  1769  		next = msk->dl_next;
  1770  		msk->first = NULL;
  1771  		msk->dl_next = NULL;
  1772  		unlock_sock_fast(sk, slow);
  1773
  1774  		/* */
  1775  		sock_hold(sk);
> 1776  		sk->sk_prot->close(sk);
  1777  	}
  1778
  1779  	/* we are still under the listener msk socket lock */
  1780  	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
  1781  }
  1782
1782

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (8.48 kB)
config (77.57 kB)

2022-09-05 08:29:16

by Paolo Abeni

Subject: Re: [PATCH net] net: mptcp: fix unreleased socket in accept queue

Hello,

On Mon, 2022-09-05 at 13:04 +0800, [email protected] wrote:
> From: Menglong Dong <[email protected]>
>
> The mptcp socket and its subflow sockets in accept queue can't be
> released after the process exit.
>
> While the release of a mptcp socket in listening state, the
> corresponding tcp socket will be released too. Meanwhile, the tcp
> socket in the unaccept queue will be released too. However, only init
> subflow is in the unaccept queue, and the joined subflow is not in the
> unaccept queue, which makes the joined subflow won't be released, and
> therefore the corresponding unaccepted mptcp socket will not be released
> too.
>
> This can be reproduced easily with following steps:
>
> 1. create 2 namespace and veth:
> $ ip netns add mptcp-client
> $ ip netns add mptcp-server
> $ sysctl -w net.ipv4.conf.all.rp_filter=0
> $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1
> $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1
> $ ip link add red-client netns mptcp-client type veth peer red-server \
> netns mptcp-server
> $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server
> $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server
> $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client
> $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client
> $ ip -n mptcp-server link set red-server up
> $ ip -n mptcp-client link set red-client up
>
> 2. configure the endpoint and limit for client and server:
> $ ip -n mptcp-server mptcp endpoint flush
> $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2
> $ ip -n mptcp-client mptcp endpoint flush
> $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2
> $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \
> 1 subflow
>
> 3. listen and accept on a port, such as 9999. The nc command we used
> here is modified, which makes it uses mptcp protocol by default.
> And the default backlog is 1:
> ip netns exec mptcp-server nc -l -k -p 9999
>
> 4. open another *two* terminal and connect to the server with the
> following command:
> $ ip netns exec mptcp-client nc 10.0.0.1 9999
> input something after connect, to trigger the connection of the second
> subflow
>
> 5. exit all the nc command, and check the tcp socket in server namespace.
> And you will find that there is one tcp socket in CLOSE_WAIT state
> and can't release forever.

Thank you for the report!

I have a doubt WRT the above scenario: AFAICS 'nc' will accept the
incoming sockets ASAP, so the unaccepted queue should be empty at
shutdown, but that does not fit with your description?!?

> There are some solutions that I thought:
>
> 1. release all unaccepted mptcp socket with mptcp_close() while the
> listening tcp socket release in mptcp_subflow_queue_clean(). This is
> what we do in this commit.
> 2. release the mptcp socket with mptcp_close() in subflow_ulp_release().
> 3. etc
>

Can you please point to a commit introducing the issue?

> Signed-off-by: Menglong Dong <[email protected]>
> ---
> net/mptcp/subflow.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index c7d49fb6e7bd..e39dff5d5d84 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1770,6 +1770,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
> msk->first = NULL;
> msk->dl_next = NULL;
> unlock_sock_fast(sk, slow);
> +
> + /* */
> + sock_hold(sk);
> + sk->sk_prot->close(sk);

You can call mptcp_close() directly here.

Perhaps we could as well drop the mptcp_sock_destruct() hack?

Perhaps we could even provide a __mptcp_close() variant that does not
acquire the socket lock, and move the close call inside the existing sk
socket lock section above?
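
The locked/unlocked split suggested here is a common pattern. A minimal user-space sketch of it follows, with a plain counter standing in for the socket lock. All names are hypothetical (toy_close_locked() plays the role a __mptcp_close() variant would have), and nothing here is the actual MPTCP implementation.

```c
#include <assert.h>

/* Toy socket: lock_depth stands in for the socket lock. */
struct toy_sock {
	int lock_depth;
	int detached;
	int closed;
};

static void toy_lock(struct toy_sock *sk)
{
	assert(sk->lock_depth == 0);	/* the toy lock is not recursive */
	sk->lock_depth++;
}

static void toy_unlock(struct toy_sock *sk)
{
	assert(sk->lock_depth == 1);
	sk->lock_depth--;
}

/* Variant that expects the caller to already hold the lock. */
static void toy_close_locked(struct toy_sock *sk)
{
	assert(sk->lock_depth == 1);
	sk->closed = 1;
}

/* Public variant: takes the lock itself, then reuses the locked one. */
static void toy_close(struct toy_sock *sk)
{
	toy_lock(sk);
	toy_close_locked(sk);
	toy_unlock(sk);
}

/* Cleanup of one unaccepted socket: detach and close within a single
 * locked section instead of unlocking and then locking again to close.
 */
static void toy_queue_clean_one(struct toy_sock *sk)
{
	toy_lock(sk);
	sk->detached = 1;
	toy_close_locked(sk);
	toy_unlock(sk);
}
```

The point of the split is that the cleanup loop already holds the lock while it detaches the socket, so closing inside that section avoids a second lock round trip and the window between the two.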

Thanks,

Paolo

2022-09-05 09:46:12

by Menglong Dong

Subject: Re: [PATCH net] net: mptcp: fix unreleased socket in accept queue

On Mon, Sep 5, 2022 at 4:26 PM Paolo Abeni <[email protected]> wrote:
>
> Hello,
>
> On Mon, 2022-09-05 at 13:04 +0800, [email protected] wrote:
> > From: Menglong Dong <[email protected]>
> >
> > The mptcp socket and its subflow sockets in accept queue can't be
> > released after the process exit.
> >
> > While the release of a mptcp socket in listening state, the
> > corresponding tcp socket will be released too. Meanwhile, the tcp
> > socket in the unaccept queue will be released too. However, only init
> > subflow is in the unaccept queue, and the joined subflow is not in the
> > unaccept queue, which makes the joined subflow won't be released, and
> > therefore the corresponding unaccepted mptcp socket will not be released
> > too.
> >
> > This can be reproduced easily with following steps:
> >
> > 1. create 2 namespace and veth:
> > $ ip netns add mptcp-client
> > $ ip netns add mptcp-server
> > $ sysctl -w net.ipv4.conf.all.rp_filter=0
> > $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1
> > $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1
> > $ ip link add red-client netns mptcp-client type veth peer red-server \
> > netns mptcp-server
> > $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server
> > $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server
> > $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client
> > $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client
> > $ ip -n mptcp-server link set red-server up
> > $ ip -n mptcp-client link set red-client up
> >
> > 2. configure the endpoint and limit for client and server:
> > $ ip -n mptcp-server mptcp endpoint flush
> > $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2
> > $ ip -n mptcp-client mptcp endpoint flush
> > $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2
> > $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \
> > 1 subflow
> >
> > 3. listen and accept on a port, such as 9999. The nc command we used
> > here is modified, which makes it uses mptcp protocol by default.
> > And the default backlog is 1:
> > ip netns exec mptcp-server nc -l -k -p 9999
> >
> > 4. open another *two* terminal and connect to the server with the
> > following command:
> > $ ip netns exec mptcp-client nc 10.0.0.1 9999
> > input something after connect, to trigger the connection of the second
> > subflow
> >
> > 5. exit all the nc command, and check the tcp socket in server namespace.
> > And you will find that there is one tcp socket in CLOSE_WAIT state
> > and can't release forever.
>
> Thank you for the report!
>
> I have a doubt WRT the above scenario: AFAICS 'nc' will accept the
> incoming sockets ASAP, so the unaccepted queue should be empty at
> shutdown, but that does not fit with your description?!?
>

In my case at least, with '-k' set, nc doesn't accept a new connection
until the first connection closes. Therefore, the second connection
stays in the unaccepted queue.

> > There are some solutions that I thought:
> >
> > 1. release all unaccepted mptcp socket with mptcp_close() while the
> > listening tcp socket release in mptcp_subflow_queue_clean(). This is
> > what we do in this commit.
> > 2. release the mptcp socket with mptcp_close() in subflow_ulp_release().
> > 3. etc
> >
>
> Can you please point to a commit introducing the issue?
>

In fact, I'm not sure. I found this issue in kernel 5.10. I looked for
a fix upstream, but found that upstream has this issue too.

Hmm... I wonder whether this issue has existed from the beginning? I
can't find any path through which joined subflows that are still
unaccepted can be released.

> > Signed-off-by: Menglong Dong <[email protected]>
> > ---
> > net/mptcp/subflow.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> > index c7d49fb6e7bd..e39dff5d5d84 100644
> > --- a/net/mptcp/subflow.c
> > +++ b/net/mptcp/subflow.c
> > @@ -1770,6 +1770,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
> > msk->first = NULL;
> > msk->dl_next = NULL;
> > unlock_sock_fast(sk, slow);
> > +
> > + /* */
> > + sock_hold(sk);
> > + sk->sk_prot->close(sk);
>
> You can call mptcp_close() directly here.
>
> Perhaps we could as well drop the mptcp_sock_destruct() hack?

Do you mean to call mptcp_sock_destruct() directly here?

>
> Perhaps even providing a __mptcp_close() variant not acquiring the
> socket lock and move such close call inside the existing sk socket lock
> above?
>

Yeah, sounds nice.

Thanks!
Menglong Dong

> Thanks,
>
> Paolo
>

2022-09-06 07:42:04

by Paolo Abeni

Subject: Re: [PATCH net] net: mptcp: fix unreleased socket in accept queue

On Mon, 2022-09-05 at 17:03 +0800, Menglong Dong wrote:
> On Mon, Sep 5, 2022 at 4:26 PM Paolo Abeni <[email protected]> wrote:
> >
> > Hello,
> >
> > On Mon, 2022-09-05 at 13:04 +0800, [email protected] wrote:
> > > From: Menglong Dong <[email protected]>
> > >
> > > The mptcp socket and its subflow sockets in accept queue can't be
> > > released after the process exit.
> > >
> > > While the release of a mptcp socket in listening state, the
> > > corresponding tcp socket will be released too. Meanwhile, the tcp
> > > socket in the unaccept queue will be released too. However, only init
> > > subflow is in the unaccept queue, and the joined subflow is not in the
> > > unaccept queue, which makes the joined subflow won't be released, and
> > > therefore the corresponding unaccepted mptcp socket will not be released
> > > too.
> > >
> > > This can be reproduced easily with following steps:
> > >
> > > 1. create 2 namespace and veth:
> > > $ ip netns add mptcp-client
> > > $ ip netns add mptcp-server
> > > $ sysctl -w net.ipv4.conf.all.rp_filter=0
> > > $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1
> > > $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1
> > > $ ip link add red-client netns mptcp-client type veth peer red-server \
> > > netns mptcp-server
> > > $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server
> > > $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server
> > > $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client
> > > $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client
> > > $ ip -n mptcp-server link set red-server up
> > > $ ip -n mptcp-client link set red-client up
> > >
> > > 2. configure the endpoint and limit for client and server:
> > > $ ip -n mptcp-server mptcp endpoint flush
> > > $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2
> > > $ ip -n mptcp-client mptcp endpoint flush
> > > $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2
> > > $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \
> > > 1 subflow
> > >
> > > 3. listen and accept on a port, such as 9999. The nc command we used
> > > here is modified, which makes it uses mptcp protocol by default.
> > > And the default backlog is 1:
> > > ip netns exec mptcp-server nc -l -k -p 9999
> > >
> > > 4. open another *two* terminal and connect to the server with the
> > > following command:
> > > $ ip netns exec mptcp-client nc 10.0.0.1 9999
> > > input something after connect, to trigger the connection of the second
> > > subflow
> > >
> > > 5. exit all the nc command, and check the tcp socket in server namespace.
> > > And you will find that there is one tcp socket in CLOSE_WAIT state
> > > and can't release forever.
> >
> > Thank you for the report!
> >
> > I have a doubt WRT the above scenario: AFAICS 'nc' will accept the
> > incoming sockets ASAP, so the unaccepted queue should be empty at
> > shutdown, but that does not fit with your description?!?
> >
>
> By default, as far as in my case, nc won't accept the new connection
> until the first connection closes with the '-k' set. Therefore, the second
> connection will stay in the unaccepted queue.

I missed the fact you opened 2 connections. I guess that is point 4
above. Please rephrase that sentence with something like:

---
4. open another *two* terminals and use each of them to connect to the
server with the following command:
...
So that there are two established mptcp connections, with the second
one still unaccepted.
---
>
> > > There are some solutions that I thought:
> > >
> > > 1. release all unaccepted mptcp socket with mptcp_close() while the
> > > listening tcp socket release in mptcp_subflow_queue_clean(). This is
> > > what we do in this commit.
> > > 2. release the mptcp socket with mptcp_close() in subflow_ulp_release().
> > > 3. etc
> > >
> >
> > Can you please point to a commit introducing the issue?
> >
>
> In fact, I'm not sure. In my case, I found this issue in kernel 5.10.
> And I wanted to find the solution in the upstream, but find that
> upstream has this issue too.
>
> Hmm...I am curious if this issue exists in the beginning? I
> can't find the opportunity that the joined subflow which are
> unaccepted can be released.

It looks like the problem has been there since MPJ support was added, commit
f296234c98a8fcec94eec80304a873f635d350ea

>
> > > Signed-off-by: Menglong Dong <[email protected]>
> > > ---
> > > net/mptcp/subflow.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> > > index c7d49fb6e7bd..e39dff5d5d84 100644
> > > --- a/net/mptcp/subflow.c
> > > +++ b/net/mptcp/subflow.c
> > > @@ -1770,6 +1770,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
> > > msk->first = NULL;
> > > msk->dl_next = NULL;
> > > unlock_sock_fast(sk, slow);
> > > +
> > > + /* */
> > > + sock_hold(sk);
> > > + sk->sk_prot->close(sk);
> >
> > You can call mptcp_close() directly here.
> >
> > Perhaps we could as well drop the mptcp_sock_destruct() hack?
>
> Do you mean to call mptcp_sock_destruct() directly here?

I suspect that, with this change, setting msk->sk_destruct to
mptcp_sock_destruct in subflow_syn_recv_sock() is not needed anymore,
and the relevant initialization (and callback definition) could be
removed.

>
Cheers,

Paolo

2022-09-07 07:34:41

by Menglong Dong

Subject: Re: [PATCH net] net: mptcp: fix unreleased socket in accept queue

On Tue, Sep 6, 2022 at 3:02 PM Paolo Abeni <[email protected]> wrote:
>
> On Mon, 2022-09-05 at 17:03 +0800, Menglong Dong wrote:
> > On Mon, Sep 5, 2022 at 4:26 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > On Mon, 2022-09-05 at 13:04 +0800, [email protected] wrote:
> > > > From: Menglong Dong <[email protected]>
> > > >
> > > > The mptcp socket and its subflow sockets in accept queue can't be
> > > > released after the process exit.
> > > >
> > > > While the release of a mptcp socket in listening state, the
> > > > corresponding tcp socket will be released too. Meanwhile, the tcp
> > > > socket in the unaccept queue will be released too. However, only init
> > > > subflow is in the unaccept queue, and the joined subflow is not in the
> > > > unaccept queue, which makes the joined subflow won't be released, and
> > > > therefore the corresponding unaccepted mptcp socket will not be released
> > > > too.
> > > >
> > > > This can be reproduced easily with following steps:
> > > >
> > > > 1. create 2 namespace and veth:
> > > > $ ip netns add mptcp-client
> > > > $ ip netns add mptcp-server
> > > > $ sysctl -w net.ipv4.conf.all.rp_filter=0
> > > > $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1
> > > > $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1
> > > > $ ip link add red-client netns mptcp-client type veth peer red-server \
> > > > netns mptcp-server
> > > > $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server
> > > > $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server
> > > > $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client
> > > > $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client
> > > > $ ip -n mptcp-server link set red-server up
> > > > $ ip -n mptcp-client link set red-client up
> > > >
> > > > 2. configure the endpoint and limit for client and server:
> > > > $ ip -n mptcp-server mptcp endpoint flush
> > > > $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2
> > > > $ ip -n mptcp-client mptcp endpoint flush
> > > > $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2
> > > > $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \
> > > > 1 subflow
> > > >
> > > > 3. listen and accept on a port, such as 9999. The nc command we used
> > > > here is modified, which makes it uses mptcp protocol by default.
> > > > And the default backlog is 1:
> > > > ip netns exec mptcp-server nc -l -k -p 9999
> > > >
> > > > 4. open another *two* terminal and connect to the server with the
> > > > following command:
> > > > $ ip netns exec mptcp-client nc 10.0.0.1 9999
> > > > input something after connect, to trigger the connection of the second
> > > > subflow
> > > >
> > > > 5. exit all the nc command, and check the tcp socket in server namespace.
> > > > And you will find that there is one tcp socket in CLOSE_WAIT state
> > > > and can't release forever.
> > >
> > > Thank you for the report!
> > >
> > > I have a doubt WRT the above scenario: AFAICS 'nc' will accept the
> > > incoming sockets ASAP, so the unaccepted queue should be empty at
> > > shutdown, but that does not fit with your description?!?
> > >
> >
> > By default, as far as in my case, nc won't accept the new connection
> > until the first connection closes with the '-k' set. Therefore, the second
> > connection will stay in the unaccepted queue.
>
> I missed the fact you opened 2 connections. I guess that is point 4
> above. Please rephrase that sentence with something like:
>
> ---
> 4. open another *two* terminals and use each of them to connect to the
> server with the following command:
> ...
> So that there are two established mptcp connections, with the second
> one still unaccepted.
> ---

Sounds nice! Thanks~

> >
> > > > There are some solutions that I thought:
> > > >
> > > > 1. release all unaccepted mptcp socket with mptcp_close() while the
> > > > listening tcp socket release in mptcp_subflow_queue_clean(). This is
> > > > what we do in this commit.
> > > > 2. release the mptcp socket with mptcp_close() in subflow_ulp_release().
> > > > 3. etc
> > > >
> > >
> > > Can you please point to a commit introducing the issue?
> > >
> >
> > In fact, I'm not sure. In my case, I found this issue in kernel 5.10.
> > And I wanted to find the solution in the upstream, but find that
> > upstream has this issue too.
> >
> > Hmm...I am curious if this issue exists in the beginning? I
> > can't find the opportunity that the joined subflow which are
> > unaccepted can be released.
>
> It looks like the problem is there since MPJ support, commit
> f296234c98a8fcec94eec80304a873f635d350ea
>

Yeah, I'll add a Fixes tag for this commit.

> >
> > > > Signed-off-by: Menglong Dong <[email protected]>
> > > > ---
> > > > net/mptcp/subflow.c | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> > > > index c7d49fb6e7bd..e39dff5d5d84 100644
> > > > --- a/net/mptcp/subflow.c
> > > > +++ b/net/mptcp/subflow.c
> > > > @@ -1770,6 +1770,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
> > > > msk->first = NULL;
> > > > msk->dl_next = NULL;
> > > > unlock_sock_fast(sk, slow);
> > > > +
> > > > + /* */
> > > > + sock_hold(sk);
> > > > + sk->sk_prot->close(sk);
> > >
> > > You can call mptcp_close() directly here.
> > >
> > > Perhaps we could as well drop the mptcp_sock_destruct() hack?
> >
> > Do you mean to call mptcp_sock_destruct() directly here?
>
> I suspect that with this change setting msk->sk_destruct to
> mptcp_sock_destruct in subflow_syn_recv_sock() is not needed anymore,
> and the relevant initialization (and callback definition) could be
> removed.

Your suspicion should be right. mptcp_subflow_queue_clean() should
always be called before the unaccepted TCP socket and the
corresponding MPTCP socket are released.
Therefore, this change ensures that the MPTCP socket is in the CLOSE
state when it is released. I'll remove mptcp_sock_destruct() as well.

Thanks!
Menglong Dong

>
> >
> Cheers,
>
> Paolo
>