From: Jan Kara
Subject: Re: Issues with a rather unusual configured NFS server
Date: Mon, 12 Aug 2013 16:36:40 +0200
Message-ID: <20130812143640.GF4596@quack.suse.cz>
References: <52075E01.7030506@gmx.de>
In-Reply-To: <52075E01.7030506-Mmb7MZpHnFY@public.gmane.org>
To: Toralf Förster
Cc: Linux NFS mailing list, "user-mode-linux-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org", linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux Kernel, "J. Bruce Fields"
Sender: linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-ext4.vger.kernel.org

On Sun 11-08-13 11:48:49, Toralf Förster wrote:
> so that the server either crashes (if it is a user mode linux image) or at least its reboot functionality got broken
> - if the NFS server is hammered with scary NFS calls using a fuzzing tool running at a remote NFS client under a non-privileged user id.
>
> It can be reproduced, if
> - the NFS share is an EXT3 or EXT4 directory
> - and it is created in a file located on tmpfs and mounted via loop device
> - and the NFS server is forced to umount the NFS share
> - and the server is forced to restart the NFS service afterwards
> - and trinity is used
>
> I could find a scenario for an automated bisect. 2 times it brought this commit
> commit 68a3396178e6688ad7367202cdf0af8ed03c8727
> Author: J. Bruce Fields
> Date:   Thu Mar 21 11:21:50 2013 -0400
>
>     nfsd4: shut down more of delegation earlier

Added Bruce to CC.

> to be the one after which the user mode linux server crashes with a back trace like this:
>
>
> $ cat /mnt/ramdisk/bt.v3.11-rc4-172-g8ae3f1d
> [New LWP 14025]
> Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> Program terminated with signal 6, Aborted.
> #0  0xb77ef424 in __kernel_vsyscall ()
> #0  0xb77ef424 in __kernel_vsyscall ()
> #1  0x083a33c5 in kill ()
> #2  0x0807163d in uml_abort () at arch/um/os-Linux/util.c:93
> #3  0x08071925 in os_dump_core () at arch/um/os-Linux/util.c:138
> #4  0x080613a7 in panic_exit (self=0x85a1518 , unused1=0, unused2=0x85d6ce0 ) at arch/um/kernel/um_arch.c:240
> #5  0x0809a3b8 in notifier_call_chain (nl=0x0, val=0, v=0x85d6ce0 , nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> #6  0x0809a503 in __atomic_notifier_call_chain (nr_calls=, nr_to_call=, v=, val=, nh=) at kernel/notifier.c:182
> #7  atomic_notifier_call_chain (nh=0x85d6cc4 , val=0, v=0x85d6ce0 ) at kernel/notifier.c:191
> #8  0x08400ba8 in panic (fmt=0x0) at kernel/panic.c:128
> #9  0x0818edf4 in ext4_put_super (sb=0x4a042690) at fs/ext4/super.c:818
> #10 0x081010d2 in generic_shutdown_super (sb=0x4a042690) at fs/super.c:418
> #11 0x0810209a in kill_block_super (sb=0x0) at fs/super.c:1028
> #12 0x08100f6a in deactivate_locked_super (s=0x4a042690) at fs/super.c:299
> #13 0x08101001 in deactivate_super (s=0x4a042690) at fs/super.c:324
> #14 0x08118e0c in mntfree (mnt=) at fs/namespace.c:891
> #15 mntput_no_expire (mnt=0x0) at fs/namespace.c:929
> #16 0x0811a2f5 in SYSC_umount (flags=, name=) at fs/namespace.c:1335
> #17 SyS_umount (name=134541632, flags=0) at fs/namespace.c:1305
> #18 0x0811a369 in SYSC_oldumount (name=) at fs/namespace.c:1347
> #19 SyS_oldumount (name=134541632) at fs/namespace.c:1345
> #20 0x080618e2 in handle_syscall (r=0x49e919d4) at arch/um/kernel/skas/syscall.c:35
> #21 0x08073c0d in handle_trap (local_using_sysemu=, regs=, pid=) at arch/um/os-Linux/skas/process.c:198
> #22 userspace (regs=0x49e919d4) at arch/um/os-Linux/skas/process.c:431
> #23 0x0805e65c in fork_handler () at arch/um/kernel/process.c:160
> #24 0x00000000 in ?? ()
>
>
>
> A real system however would not crash but would give a kernel BUG as reported here:
> http://article.gmane.org/gmane.comp.file-systems.ext4/38915

There are deleted inodes (regular files) still in the orphan list during
ext4_put_super(). My guess is that NFS is still holding references to
these inodes and thus they don't get deleted. So ext3/4 would be just a
victim here.

> Furthermore the server is no longer able to reboot - it hangs
> indefinitely in the reboot phase. Only the magic sysrq keys still work
> then.

Well, this is likely because the filesystem cannot be shut down.

								Honza
-- 
Jan Kara
SUSE Labs, CR