2008-07-17 22:24:04

by Leisner, Martin

Subject: showmount issues

I have a system that uses the legacy Sun-derived rpcbind.

For some reason, showmount wasn't "always" working. It reported
nothing when run from the command line, but worked under strace
(ugh!)

I was running Fedora 8 (showmount 1.1.0).
I built nfs-utils 1.1.2 -- same problems.

(I just tried 1.0.6 on Red Hat Enterprise Linux 3 -- it worked fine)

Running the code under gdb, I found some "interesting" problems...

I changed:
bash2 :2 mleisner@mleisner-linux 05:59:55; rcsdiff -u showmount.c
===================================================================
RCS file: showmount.c,v
retrieving revision 1.1
diff -u -r1.1 showmount.c
--- showmount.c 2008/07/17 21:28:59 1.1
+++ showmount.c 2008/07/17 21:45:45
@@ -82,6 +82,8 @@
  *
  * tout contains the timeout. It will be modified to contain the time
  * remaining (i.e. time provided - time elasped).
+ *
+ * Returns 0 if it works
  */
 static int connect_nb(int fd, struct sockaddr_in *addr, struct timeval *tout)
 {
@@ -177,7 +179,7 @@
         tout.tv_sec = TIMEOUT_TCP;

         ret = connect_nb(sock, &saddr, &tout);
-        if (ret == -1) {
+        if (ret < 0) {
                 close(sock);
                 rpc_createerr.cf_stat = RPC_SYSTEMERROR;
                 rpc_createerr.cf_error.re_errno = errno;
@@ -350,7 +352,7 @@
                                         MOUNTPROG, MOUNTVERS,
                                         IPPROTO_TCP);
         if (server_addr.sin_port) {
                 ret = connect_nb(msock, &server_addr, 0);
-                if (ret != -1)
+                if (ret == 0)
                         mclient = clnttcp_create(&server_addr,
                                         MOUNTPROG, MOUNTVERS,
                                         &msock,
                                         0, 0);


and now it works....

marty


2008-07-21 20:10:35

by Chuck Lever III

Subject: Re: showmount issues

Hi Martin-

On Jul 17, 2008, at 6:01 PM, Leisner, Martin wrote:
> I have a system that uses the legacy Sun-derived rpcbind.
>
> For some reason, showmount wasn't "always" working. It reported
> nothing when run from the command line, but worked under strace
> (ugh!)
>
> I was running Fedora 8 (showmount 1.1.0).
> I built nfs-utils 1.1.2 -- same problems.
>
> (I just tried 1.0.6 on Red Hat Enterprise Linux 3 -- it worked
> fine)
>
> Running the code under gdb, I found some "interesting" problems...
>
> I changed:
> bash2 :2 mleisner@mleisner-linux 05:59:55; rcsdiff -u showmount.c
> ===================================================================
> RCS file: showmount.c,v
> retrieving revision 1.1
> diff -u -r1.1 showmount.c
> --- showmount.c 2008/07/17 21:28:59 1.1
> +++ showmount.c 2008/07/17 21:45:45
> @@ -82,6 +82,8 @@
>   *
>   * tout contains the timeout. It will be modified to contain the time
>   * remaining (i.e. time provided - time elasped).
> + *
> + * Returns 0 if it works
>   */
>  static int connect_nb(int fd, struct sockaddr_in *addr, struct timeval *tout)
>  {
> @@ -177,7 +179,7 @@
>          tout.tv_sec = TIMEOUT_TCP;
>
>          ret = connect_nb(sock, &saddr, &tout);
> -        if (ret == -1) {
> +        if (ret < 0) {
>                  close(sock);
>                  rpc_createerr.cf_stat = RPC_SYSTEMERROR;
>                  rpc_createerr.cf_error.re_errno = errno;
> @@ -350,7 +352,7 @@
>                                          MOUNTPROG, MOUNTVERS,
>                                          IPPROTO_TCP);
>          if (server_addr.sin_port) {
>                  ret = connect_nb(msock, &server_addr, 0);
> -                if (ret != -1)
> +                if (ret == 0)
>                          mclient = clnttcp_create(&server_addr,
>                                          MOUNTPROG, MOUNTVERS,
>                                          &msock,
>                                          0, 0);
>
>
> and now it works....

I think the underlying problem is that connect_nb() sometimes returns
-1 to signal an error and sometimes returns a negative errno-style
code.

It would be a slightly nicer fix if connect_nb() were changed to
always return 0 on success and -1 on error. connect_nb()'s callers do
not appear to care why it failed, so returning an errno is unnecessary.

Documenting connect_nb()'s return codes (as you did in your patch) is
a nice finishing touch.

When posting patches, can you also include a patch description and a
Signed-off-by: line? Some basic instructions for submitting Linux
kernel patches can be found here:

http://lxr.linux.no/linux/Documentation/SubmittingPatches

but most of these also apply to submitting to user space packages like
nfs-utils.

Steve, perhaps this should be included in 1.1.3?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2008-07-25 18:52:02

by Steve Dickson

Subject: Re: showmount issues



Leisner, Martin wrote:
> I have a system that uses the legacy Sun-derived rpcbind.
>
> For some reason, showmount wasn't "always" working. It reported
> nothing when run from the command line, but worked under strace
> (ugh!)
>
> I was running Fedora 8 (showmount 1.1.0).
> I built nfs-utils 1.1.2 -- same problems.
>
> (I just tried 1.0.6 on Red Hat Enterprise Linux 3 -- it worked fine)
>
> Running the code under gdb, I found some "interesting" problems...
Committed...

steved.