From: "Wim Colgate" Subject: question on the inner workings of NFS soft mount Date: Thu, 26 Jul 2007 15:29:19 -0700 Message-ID: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1281110349==" To: Return-path: Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1IEBrD-0005tB-Ho for nfs@lists.sourceforge.net; Thu, 26 Jul 2007 15:30:55 -0700 Received: from webmailrdm.xensource.com ([66.228.214.202]) by mail.sourceforge.net with esmtp (Exim 4.44) id 1IEBrG-0002Pi-Sy for nfs@lists.sourceforge.net; Thu, 26 Jul 2007 15:30:59 -0700 List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net This is a multi-part message in MIME format. --===============1281110349== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C7CFD4.A2BA91FE" This is a multi-part message in MIME format. ------_=_NextPart_001_01C7CFD4.A2BA91FE Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable I was wondering if anyone could help me interpret the results of the following experiment with regards to NFS softmount. =20 First a very brief description of my setup and test program: =20 I have a simple program that does a small, 512 byte, write to a file, and print the success/failure of the IO plus the time the write took to return, sleep one second, then loop back and write another - repeat until ^C. The file it is writing to is on an NFS filer. During its execution, I will disable the NFS service on the filer. The printfs of success/failure are below. =20 The mount is SOFT:=20 =20 172.24.0.90:/vhd1 on /mnt/snp1 type nfs (rw,soft,timeo=3D1,retrans=3D1,addr=3D172.24.0.90) =20 When I run one instance of the program here is the output: =20 # ./netwrite /mnt/snp1/foo2 write block 0 (succeeded): 0.006674 secs write block 1 (succeeded): 0.001639 secs write block 2 (succeeded): 0.001793 secs write block 3 (succeeded): 0.001784 secs write block 4 (succeeded): 0.004258 secs write block 5 (succeeded): 0.002426 secs write block 6 (succeeded): 0.001494 secs write block 7 (succeeded): 0.001477 secs write block 8 (succeeded): 0.001271 secs write block 9 (failed): 3.00014 secs write block 10 (failed): 3.00015 secs write block 11 (failed): 6.00012 secs write block 12 (failed): 12.0001 secs write block 13 (failed): 24 secs write block 14 (failed): 47.9999 secs write block 15 (failed): 59.9996 secs write block 16 (failed): 35 secs write block 17 (failed): 59.9997 secs write block 18 (failed): 59.9997 secs write block 19 (failed): 59.9997 secs write block 20 (failed): 9.00012 secs =20 =20 I'm curious as to why the write failure times are all over the place. Note the timeo and retrans are both set to 1. If I run two simultaneous instances (to different files on the same filer), the times are also all over the place - and go as high as two minutes (curiously twice the max as one...). =20 I'm using the following version of linux: =20 Linux wims 2.6.18-8.1.8.el5.xs4.0.0b2.102.154xen #1 SMP Tue Jul 24 16:57:36 EDT 2007 i686 i686 i386 GNU/Linux =20 The program is pretty simple: =20 #define O_DIRECT 040000 =20 #include #include #include #include #include #include #include #include #include =20 #include =20 #define ALIGN 512 #define SIZE 512 #define SEC_WAIT 1 =20 main(int argc, char *argv[]) { int fd; =20 int data=3D0; char *bufptr =3D 0; int result; struct timeval tim; struct timeval start; struct timeval end; =20 if (argc !=3D 2) { printf("usage: %s \n", argv[0]); goto done; } =20 posix_memalign((void **)&bufptr, ALIGN, SIZE); if (!bufptr) { printf("failed to allocate %d bytes", SIZE); goto done; } =20 fd =3D open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0777); =20 if (!fd) { printf("failed to open %s for write, errno=3D%d\n", argv[1], errno); goto done; } =20 while (1) { ftime(&start); tim.tv_sec =3D SEC_WAIT; tim.tv_usec =3D 0; memset(bufptr, data, SIZE); gettimeofday(&start, 0); result =3D write(fd, bufptr, SIZE); data =3D (data+1) & 255; gettimeofday(&end, 0); printf("write block %d (%s): %g secs\n", data-1, result < 0 ? "failed" : "succeeded", ((end.tv_sec*1000000 + end.tv_usec) - (start.tv_sec*1000000 + start.tv_usec)) /1000000.0); select(0, 0, 0, 0, &tim); } =20 done: return 0; } =20 Thanks, =20 Wim ------_=_NextPart_001_01C7CFD4.A2BA91FE Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

I was wondering if anyone could help me interpret the results of the following experiment with regards to NFS = softmount.

 

First a very brief description of my setup and test = program:

 

I have a simple program that does a small, 512 byte, = write to a file, and print the success/failure of the IO plus the time the = write took to return, sleep one second, then loop back and write another – = repeat until ^C. The file it is writing to is on an NFS filer. During its execution, = I will disable the NFS service on the filer. The printfs of success/failure are = below.

 

The mount is SOFT:

 

172.24.0.90:/vhd1 on /mnt/snp1 type nfs (rw,soft,timeo=3D1,retrans=3D1,addr=3D172.24.0.90)

 

When I run one instance of the program here is the = output:

 

# ./netwrite = /mnt/snp1/foo2

write block 0 (succeeded): 0.006674 = secs

write block 1 (succeeded): 0.001639 = secs

write block 2 (succeeded): 0.001793 = secs

write block 3 (succeeded): 0.001784 = secs

write block 4 (succeeded): 0.004258 = secs

write block 5 (succeeded): 0.002426 = secs

write block 6 (succeeded): 0.001494 = secs

write block 7 (succeeded): 0.001477 = secs

write block 8 (succeeded): 0.001271 = secs

write block 9 (failed): 3.00014 = secs

write block 10 (failed): 3.00015 = secs

write block 11 (failed): 6.00012 = secs

write block 12 (failed): 12.0001 = secs

write block 13 (failed): 24 = secs

write block 14 (failed): 47.9999 = secs

write block 15 (failed): 59.9996 = secs

write block 16 (failed): 35 = secs

write block 17 (failed): 59.9997 = secs

write block 18 (failed): 59.9997 = secs

write block 19 (failed): 59.9997 = secs

write block 20 (failed): 9.00012 = secs

 

 

I’m curious as to why the write failure times = are all over the place. Note the timeo and retrans are both set to 1. If I run = two simultaneous instances (to different files on the same filer), the times = are also all over the place – and go as high as two minutes (curiously = twice the max as one…).

 

I’m using the following version of = linux:

 

Linux wims 2.6.18-8.1.8.el5.xs4.0.0b2.102.154xen #1 = SMP Tue Jul 24 16:57:36 EDT 2007 i686 i686 i386 = GNU/Linux

 

The program is pretty = simple:

 

#define = O_DIRECT        040000

 

#include <stdio.h>

#include = <sys/types.h>

#include = <sys/stat.h>

#include <fcntl.h>

#include = <unistd.h>

#include = <sys/select.h>

#include <errno.h>

#include = <string.h>

#include = <stdlib.h>

 

#include = <sys/time.h>

 

#define ALIGN 512

#define SIZE 512

#define SEC_WAIT 1

 

main(int argc, char = *argv[])

{

    int = fd;

 

    int = data=3D0;

    char *bufptr =3D = 0;

    int = result;

    struct timeval = tim;

    struct timeval = start;

    struct timeval = end;

 

    if (argc !=3D 2) = {

        printf("usage: %s <filename>\n", = argv[0]);

        goto = done;

    }

 

    posix_memalign((void = **)&bufptr, ALIGN, SIZE);

    if (!bufptr) = {

        printf("failed to allocate %d bytes", = SIZE);

        goto = done;

    }

 

    fd =3D open(argv[1], O_RDWR | = O_CREAT | O_DIRECT, 0777);

 

    if (!fd) = {

        printf("failed to open %s for write, errno=3D%d\n", argv[1], = errno);

        goto = done;

    }

 

    while (1) = {

        ftime(&start);

        tim.tv_sec = =3D SEC_WAIT;

        = tim.tv_usec =3D 0;

        = memset(bufptr, data, SIZE);

        gettimeofday(&start, 0);

        result =3D = write(fd, bufptr, SIZE);

        data =3D = (data+1) & 255;

        gettimeofday(&end, 0);

    =     printf("write block %d (%s): %g secs\n",

         =             &= nbsp;           &n= bsp;   data-1,

         =             &= nbsp;           &n= bsp;   result < 0 ? "failed" : = "succeeded",

         =             &= nbsp;           &n= bsp;   ((end.tv_sec*1000000 + end.tv_usec) -

         =                     &= nbsp;        (start.tv_sec*100000= 0 + start.tv_usec))

         =             &= nbsp;           &n= bsp;    /1000000.0);

        select(0, = 0, 0, 0, &tim);

    }

 

done:

    return = 0;

}

 

Thanks,

 

Wim

------_=_NextPart_001_01C7CFD4.A2BA91FE-- --===============1281110349== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ --===============1281110349== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs --===============1281110349==--