From: Carlos André
Subject: Re: AutoFS+NFSv4 server down = LOOOOONG timeout.
Date: Mon, 10 Aug 2009 17:05:10 -0300
References: <4A7BCCCA.4020307@panasas.com> <20090807140425.GA18298@fieldses.org>
To: Chuck Lever
Cc: NFS list, Linux NFSv4 mailing list

Something funny:

Using the default tcp_syn_retries (5) I got "3,6,12,24,48,96" sec intervals... but if I change tcp_syn_retries to 1 I get "3,6,3,6,3,6..." sec intervals...

[root@KSERVER mnt]# time mount 1.2.3.4:/blabla tmp/ -t nfs4 -o sec=krb5p,proto=tcp
mount: mount to NFS server '1.2.3.4' failed: timed out (giving up).

real 3m9.000s
user 0m0.000s
sys 0m0.002s

[root@KSERVER /]# echo 1 > /proc/sys/net/ipv4/tcp_syn_retries

[root@KSERVER mnt]# time mount 1.2.3.4:/blabla tmp/ -t nfs4 -o sec=krb5p,proto=tcp
("retry=1" = no change)
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (retrying).
mount: mount to NFS server '1.2.3.4' failed: timed out (giving up).

real 2m6.004s
user 0m0.000s
sys 0m0.004s

(3,6,3,6... secs interval)

2009/8/10 Carlos André:
> No, i'm just using packages from CentOS repo...
>
> And u're right about expo retries... with tcpdump i've monitored
> traffic and i got SYN retries in 3, 6, 12, 24, 48, 96 secs on port
> 2049...
> I tried use "retry=1" option on mount without any change... I dont
> want change source or tcp timers... just NFSv4 client.
>
> 2009/8/10 Chuck Lever:
>> On Aug 10, 2009, at 2:29 PM, Carlos André wrote:
>>>
>>> Bruce, no... you're right. I'm describing a situation where my server
>>> died... i need mount fail faster (10 or 15 secs max) than 3 minutes
>>> and 9 seconds...
>>
>> The 189 second timeout is likely how long it takes the kernel to give up
>> trying to connect a TCP socket to the server (6 SYN attempts with
>> exponential retries, or something like that). For stock CentOS 5.3, I think
>> user space does only a DNS lookup for normal NFSv4 mounts -- the kernel just
>> tries to connect a TCP socket to port 2049, with no preceding rpcbind
>> request.
>>
>> Carlos, let us know if you have replaced any NFS-related CentOS components
>> (kernel, nfs-utils) with something you've built yourself.
>>
>>> 2009/8/7 J. Bruce Fields:
>>>>
>>>> On Fri, Aug 07, 2009 at 09:42:18AM +0300, Benny Halevy wrote:
>>>>>
>>>>> On Aug. 07, 2009, 3:18 +0300, Carlos André wrote:
>>>>>>
>>>>>> Anyone ?
>>>>>>
>>>>>> 2009/7/29 Carlos André:
>>>>>>>
>>>>>>> PPL, I need put a CentOS 5.3 (updated) NFSv4 server to work with Kerberos
>>>>>>> and AutoFS, but i got a problem: If NFS server goes down i get a LOOOOOOONG
>>>>>>> mount timeout on CentOS 5.3 (updated) NFSv4 client...
>>>>>>>
>>>>>>> Since i need mount some (3 to 6) dirs at user logon process, if mount hangs,
>>>>>>> user logon hangs.
>>>>>>> Then i want configure it to timeout (if server down) after
>>>>>>> 10-15 secs (MAX) on each mount attempt.
>>>>>>>
>>>>>>> I already make a lab and tried a LOT of combinations, there my findings
>>>>>>> (server DOWN IP: 172.16.0.10 / client IP: 172.16.1.10) using basic command
>>>>>>> (time mount 172.16.0.10:/remotedir /localdir/ -t nfs4 -o
>>>>>>> sec=krb5,proto=) from NFS client:
>>>>>>>
>>>>>>> - Once i try access mount point using AutoFS (proto=tcp OR proto=udp) it
>>>>>>> hangs for 189 secs (3m9s: real 3m9.001s) until show error (mount: mount to
>>>>>>> NFS server '172.16.0.10' failed: timed out (giving up))
>>>>>
>>>>> Sounds like you're hitting the server's grace period.
>>>>
>>>> I thought he was describing a situation where the server
>>>> is completely gone and isn't coming back, and wondering how to make the
>>>> mount fail faster. But I may be misunderstanding.
>>>>
>>>> --b.
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
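
The timings in this thread seem to fit simple exponential SYN backoff. The sketch below is only an illustration of that arithmetic, not a description of the kernel's code: it assumes one initial SYN plus tcp_syn_retries retransmissions, with the wait doubling from an initial 3 seconds (the interval seen in the tcpdump above). The function name and the initial_rto parameter are made up for this example.

```python
def syn_connect_timeout(syn_retries, initial_rto=3):
    """Return (intervals, total) in seconds for one TCP connect attempt.

    Assumption: the initial SYN plus `syn_retries` retransmissions each
    wait twice as long as the previous one, starting at `initial_rto`.
    """
    intervals = [initial_rto * 2 ** i for i in range(syn_retries + 1)]
    return intervals, sum(intervals)


for retries in (5, 1):
    intervals, total = syn_connect_timeout(retries)
    print(f"tcp_syn_retries={retries}: intervals={intervals} total={total}s")
```

Under these assumptions tcp_syn_retries=5 gives 3+6+12+24+48+96 = 189 seconds, matching the 3m9s mount. With tcp_syn_retries=1 each connect attempt takes 3+6 = 9 seconds, and 2m6s is 126 seconds, or 14 attempts of 9 seconds, which lines up with the 14 "timed out" messages printed by mount.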