2003-05-19 13:27:11

by Hirokazu Takahashi

[permalink] [raw]
Subject: [BUG] [PATCH] Flow control problems(1)

Hello,

neilb> Finally I have had reports of poor stability with 2.5 nfsd, but I
neilb> don't have any specific details. If anyone wants to hammer nfsd
neilb> nice and hard and tell me lots of details about any negative
neilb> results, I would appreciate it.

There are some flow control problems when we use NFS over TCP.

svc_sock_enqueue() calls sock_wspace() to checkes whether there
is enough space in a sending buffer. But in case of TCP it won't
work correctly.

We should use tcp_wspace() instead of sock_wspace().


Thank you,
Hirokazu Takahashi.


--- net/sunrpc/svcsock.c.ORG Wed May 14 17:00:35 2031
+++ net/sunrpc/svcsock.c Fri May 16 11:43:46 2031
@@ -35,6 +35,7 @@
#include <net/sock.h>
#include <net/checksum.h>
#include <net/ip.h>
+#include <net/tcp.h>
#include <asm/uaccess.h>
#include <asm/ioctls.h>

@@ -116,6 +117,22 @@ svc_release_skb(struct svc_rqst *rqstp)
}

/*
+ * Any space to write?
+ */
+static inline unsigned long
+svc_sock_wspace(struct svc_sock *svsk)
+{
+ int wspace;
+
+ if (svsk->sk_sock->type == SOCK_STREAM)
+ wspace = tcp_wspace(svsk->sk_sk);
+ else
+ wspace = sock_wspace(svsk->sk_sk);
+
+ return wspace;
+}
+
+/*
* Queue up a socket with data pending. If there are idle nfsd
* processes, wake 'em up.
*
@@ -150,13 +167,13 @@ svc_sock_enqueue(struct svc_sock *svsk)
}

if (((svsk->sk_reserved + serv->sv_bufsz)*2
- > sock_wspace(svsk->sk_sk))
+ > svc_sock_wspace(svsk))
&& !test_bit(SK_CLOSE, &svsk->sk_flags)
&& !test_bit(SK_CONN, &svsk->sk_flags)) {
/* Don't enqueue while not enough space for reply */
dprintk("svc: socket %p no space, %d*2 > %ld, not enqueued\n",
svsk->sk_sk, svsk->sk_reserved+serv->sv_bufsz,
- sock_wspace(svsk->sk_sk));
+ svc_sock_wspace(svsk));
goto out_unlock;
}



-------------------------------------------------------
This SF.net email is sponsored by: If flattening out C++ or Java
code to make your application fit in a relational database is painful,
don't do it! Check out ObjectStore. Now part of Progress Software.
http://www.objectstore.net/sourceforge
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-05-19 13:29:01

by Hirokazu Takahashi

[permalink] [raw]
Subject: [BUG] [PATCH] Flow control problems(2)

Hello,

> There are some flow control problems when we use NFS over TCP.
>
> svc_sock_enqueue() checkes whether there is enough space in a
> sending buffer using sock_wspace(). But in case of TCP it won't
> work correctly.
>
> We should use tcp_wspace() instead of sock_wspace().

If there isn't enough space in the socket, svc_sock_enqueue()
won't do anythning. Then some buffers depending on the socket
will be released after a brief interval. But TCP/IP stack won't
notify about it as the socket doesn't have a SOCK_NOSPACE flag.

We should set a SOCK_NOSPACE flag on the socket while there isn't
enough space in it.


Thank you,
Hirokazu Takahashi.



--- net/sunrpc/svcsock.c.ORG Fri May 16 11:43:46 2031
+++ net/sunrpc/svcsock.c Fri May 16 11:45:27 2031
@@ -166,6 +166,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
goto out_unlock;
}

+ set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
if (((svsk->sk_reserved + serv->sv_bufsz)*2
> svc_sock_wspace(svsk))
&& !test_bit(SK_CLOSE, &svsk->sk_flags)
@@ -176,6 +177,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
svc_sock_wspace(svsk));
goto out_unlock;
}
+ clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);

/* Mark socket as busy. It will remain in this state until the
* server has processed all pending data and put the socket back


-------------------------------------------------------
This SF.net email is sponsored by: If flattening out C++ or Java
code to make your application fit in a relational database is painful,
don't do it! Check out ObjectStore. Now part of Progress Software.
http://www.objectstore.net/sourceforge
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-05-19 13:31:34

by Hirokazu Takahashi

[permalink] [raw]
Subject: [BUG] [PATCH] Flow control problems(3)

Hello,

> There are some flow control problems when we use NFS over TCP.

svc_tcp_sendto() may send a partial packet in case of heavily
loaded situations.

In this case we should shutdown the socket immediately
or many broken RPC packets will be sent and may make some NFS
clients stucked. (e.g. I saw NFS clients on Solaris have stucked.)


Thank you,
Hirokazu Takahashi.




--- include/linux/sunrpc/svcsock.h.ORG Thu May 15 17:52:27 2031
+++ include/linux/sunrpc/svcsock.h Fri May 16 11:47:01 2031
@@ -56,7 +56,7 @@ struct svc_sock {
* Function prototypes.
*/
int svc_makesock(struct svc_serv *, int, unsigned short);
-void svc_delete_socket(struct svc_sock *);
+void svc_delete_socket(struct svc_sock *, int);
int svc_recv(struct svc_serv *, struct svc_rqst *, long);
int svc_send(struct svc_rqst *);
void svc_drop(struct svc_rqst *);
--- net/sunrpc/svcsock.c.NEW2 Fri May 16 11:45:27 2031
+++ net/sunrpc/svcsock.c Fri May 16 11:47:01 2031
@@ -360,9 +360,6 @@ svc_sendto(struct svc_rqst *rqstp, struc

slen = xdr->len;

- /* Grab svsk->sk_sem to serialize outgoing data. */
- down(&svsk->sk_sem);
-
if (rqstp->rq_prot == IPPROTO_UDP) {
/* set the destination */
struct msghdr msg;
@@ -415,8 +412,6 @@ svc_sendto(struct svc_rqst *rqstp, struc
len += result;
}
out:
- up(&svsk->sk_sem);
-
dprintk("svc: socket %p sendto([%p %Zu... ], %d) = %d (addr %x)\n",
rqstp->rq_sock, xdr->head[0].iov_base, xdr->head[0].iov_len, xdr->len, len,
rqstp->rq_addr.sin_addr.s_addr);
@@ -867,7 +862,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
return svc_deferred_recv(rqstp);

if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
- svc_delete_socket(svsk);
+ svc_delete_socket(svsk, 0);
return 0;
}

@@ -981,7 +976,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
return len;

err_delete:
- svc_delete_socket(svsk);
+ svc_delete_socket(svsk, 0);
return -EAGAIN;

error:
@@ -1020,7 +1015,11 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
rqstp->rq_sock->sk_server->sv_name,
(sent<0)?"got error":"sent only",
sent, xbufp->len);
- svc_delete_socket(rqstp->rq_sock);
+ /*
+ * we can't keep the socket alive anymore
+ * if we wrote a partial packet.
+ */
+ svc_delete_socket(rqstp->rq_sock, sent > 0);
sent = -EAGAIN;
}
return sent;
@@ -1251,7 +1250,10 @@ svc_send(struct svc_rqst *rqstp)
xb->page_len +
xb->tail[0].iov_len;

+ /* Grab svsk->sk_sem to serialize outgoing data. */
+ down(&svsk->sk_sem);
len = svsk->sk_sendto(rqstp);
+ up(&svsk->sk_sem);
svc_sock_release(rqstp);

if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
@@ -1379,15 +1381,17 @@ bummer:
* Remove a dead socket
*/
void
-svc_delete_socket(struct svc_sock *svsk)
+svc_delete_socket(struct svc_sock *svsk, int doshutdown)
{
struct svc_serv *serv;
struct sock *sk;
+ struct socket *sock;

dprintk("svc: svc_delete_socket(%p)\n", svsk);

serv = svsk->sk_server;
sk = svsk->sk_sk;
+ sock = svsk->sk_sock;

sk->state_change = svsk->sk_ostate;
sk->data_ready = svsk->sk_odata;
@@ -1407,6 +1411,8 @@ svc_delete_socket(struct svc_sock *svsk)
kfree(svsk);
} else {
spin_unlock_bh(&serv->sv_lock);
+ if (doshutdown && sock->ops->shutdown)
+ sock->ops->shutdown(sock, 2/*SHUT_RDWR*/);
dprintk(KERN_NOTICE "svc: server socket destroy delayed\n");
/* svsk->sk_server = NULL; */
}
--- net/sunrpc/svc.c.ORG Wed May 14 17:56:33 2031
+++ net/sunrpc/svc.c Thu May 15 19:50:27 2031
@@ -90,13 +90,13 @@ svc_destroy(struct svc_serv *serv)
svsk = list_entry(serv->sv_tempsocks.next,
struct svc_sock,
sk_list);
- svc_delete_socket(svsk);
+ svc_delete_socket(svsk, 0);
}
while (!list_empty(&serv->sv_permsocks)) {
svsk = list_entry(serv->sv_permsocks.next,
struct svc_sock,
sk_list);
- svc_delete_socket(svsk);
+ svc_delete_socket(svsk, 0);
}

/* Unregister service with the portmapper */


-------------------------------------------------------
This SF.net email is sponsored by: If flattening out C++ or Java
code to make your application fit in a relational database is painful,
don't do it! Check out ObjectStore. Now part of Progress Software.
http://www.objectstore.net/sourceforge
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-05-20 04:39:25

by NeilBrown

[permalink] [raw]
Subject: Re: [BUG] [PATCH] Flow control problems(1)

On Monday May 19, [email protected] wrote:
> Hello,
>
> neilb> Finally I have had reports of poor stability with 2.5 nfsd, but I
> neilb> don't have any specific details. If anyone wants to hammer nfsd
> neilb> nice and hard and tell me lots of details about any negative
> neilb> results, I would appreciate it.
>
> There are some flow control problems when we use NFS over TCP.
>
> svc_sock_enqueue() calls sock_wspace() to checkes whether there
> is enough space in a sending buffer. But in case of TCP it won't
> work correctly.
>
> We should use tcp_wspace() instead of sock_wspace().

Thanks... but I wonder *why* do we need two different wspace
functions.

Infact, after hunting through the tcp and udp code looking at how
space is accounted and sndbuf is treated, it looks rather untidy.

I'll probably have to use this patch or something alot like it, but
I'm not happy about it.

Thanks again,

NeilBrown

>
>
> Thank you,
> Hirokazu Takahashi.
>
>
> --- net/sunrpc/svcsock.c.ORG Wed May 14 17:00:35 2031
> +++ net/sunrpc/svcsock.c Fri May 16 11:43:46 2031
> @@ -35,6 +35,7 @@
> #include <net/sock.h>
> #include <net/checksum.h>
> #include <net/ip.h>
> +#include <net/tcp.h>
> #include <asm/uaccess.h>
> #include <asm/ioctls.h>
>
> @@ -116,6 +117,22 @@ svc_release_skb(struct svc_rqst *rqstp)
> }
>
> /*
> + * Any space to write?
> + */
> +static inline unsigned long
> +svc_sock_wspace(struct svc_sock *svsk)
> +{
> + int wspace;
> +
> + if (svsk->sk_sock->type == SOCK_STREAM)
> + wspace = tcp_wspace(svsk->sk_sk);
> + else
> + wspace = sock_wspace(svsk->sk_sk);
> +
> + return wspace;
> +}
> +
> +/*
> * Queue up a socket with data pending. If there are idle nfsd
> * processes, wake 'em up.
> *
> @@ -150,13 +167,13 @@ svc_sock_enqueue(struct svc_sock *svsk)
> }
>
> if (((svsk->sk_reserved + serv->sv_bufsz)*2
> - > sock_wspace(svsk->sk_sk))
> + > svc_sock_wspace(svsk))
> && !test_bit(SK_CLOSE, &svsk->sk_flags)
> && !test_bit(SK_CONN, &svsk->sk_flags)) {
> /* Don't enqueue while not enough space for reply */
> dprintk("svc: socket %p no space, %d*2 > %ld, not enqueued\n",
> svsk->sk_sk, svsk->sk_reserved+serv->sv_bufsz,
> - sock_wspace(svsk->sk_sk));
> + svc_sock_wspace(svsk));
> goto out_unlock;
> }
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: If flattening out C++ or Java
> code to make your application fit in a relational database is painful,
> don't do it! Check out ObjectStore. Now part of Progress Software.
> http://www.objectstore.net/sourceforge
> _______________________________________________
> NFS maillist - [email protected]
> https://lists.sourceforge.net/lists/listinfo/nfs


-------------------------------------------------------
This SF.net email is sponsored by: ObjectStore.
If flattening out C++ or Java code to make your application fit in a
relational database is painful, don't do it! Check out ObjectStore.
Now part of Progress Software. http://www.objectstore.net/sourceforge
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs