Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755169Ab2FSVlk (ORCPT ); Tue, 19 Jun 2012 17:41:40 -0400 Received: from ipmail07.adl2.internode.on.net ([150.101.137.131]:54918 "EHLO ipmail07.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755127Ab2FSVlg (ORCPT ); Tue, 19 Jun 2012 17:41:36 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgwKAF/x4E95LJ3s/2dsb2JhbABFDrQtA4EtgQiCGAEBBTocIxAIAxguFCUDIROICrlfFIsbIG6FCwOLHYoHhkGJPYInS4FF Date: Wed, 20 Jun 2012 07:41:30 +1000 From: Dave Chinner To: Alan Stern Cc: Dima Tisnek , Alexander Viro , Jens Axboe , USB list , linux-fsdevel@vger.kernel.org, Kernel development list Subject: Re: mount stuck, khubd blocked Message-ID: <20120619214130.GO25389@dastard> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5363 Lines: 109 On Tue, Jun 19, 2012 at 10:45:10AM -0400, Alan Stern wrote: > On Tue, 19 Jun 2012, Dima Tisnek wrote: > > > I made a microsd flash with 2 partitions, sdb1 is data partition, and > > sdb2 is a sentinel partition, 1 block in size. > > > > I attached the usb-microsd reader with that card in it and by mistake > > tried to mount the sentinel partition, I ran: > > mount /dev/sdb2 /mnt/flash/ > > > > mount got stuck, I was not able to kill or strace it, I pulled the usb > > reader from the port, mount was still stuck, here's the dmesg log: So where is the mount process stuck? It's holding the lock that khubd is stuck on.... > > > > [65464.536212] usb 4-1.2: new high-speed USB device number 3 using ehci_hcd > > [65464.700933] usbcore: registered new interface driver uas > > [65464.703478] Initializing USB Mass Storage driver... > > [65464.703762] scsi8 : usb-storage 4-1.2:1.0 > > [65464.703852] usbcore: registered new interface driver usb-storage > > [65464.703854] USB Mass Storage support registered. > > [65465.706479] scsi 8:0:0:0: Direct-Access Generic- Card Reader > > 1.00 PQ: 0 ANSI: 0 CCS > > [65466.389664] sd 8:0:0:0: [sdb] 3862528 512-byte logical blocks: > > (1.97 GB/1.84 GiB) > > [65466.390493] sd 8:0:0:0: [sdb] Write Protect is off > > [65466.390497] sd 8:0:0:0: [sdb] Mode Sense: 03 00 00 00 > > [65466.391263] sd 8:0:0:0: [sdb] No Caching mode page present > > [65466.391267] sd 8:0:0:0: [sdb] Assuming drive cache: write through > > [65466.394723] sd 8:0:0:0: [sdb] No Caching mode page present > > [65466.394727] sd 8:0:0:0: [sdb] Assuming drive cache: write through > > [65466.397500] sdb: sdb1 sdb2 > > [65466.400468] sd 8:0:0:0: [sdb] No Caching mode page present > > [65466.400471] sd 8:0:0:0: [sdb] Assuming drive cache: write through > > [65466.400474] sd 8:0:0:0: [sdb] Attached SCSI removable disk > > [66159.793752] usb 4-1.2: USB disconnect, device number 3 > > [66291.567080] INFO: task khubd:90 blocked for more than 120 seconds. > > [66291.567083] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > > disables this message. > > [66291.567086] khubd D 0000000000000001 0 90 2 0x00000000 > > [66291.567090] ffff880230313948 0000000000000046 ffff88023233e000 > > ffff880230313fd8 > > [66291.567095] ffff880230313fd8 ffff880230313fd8 ffff880194f8b000 > > ffff88023233e000 > > [66291.567099] ffffffff81244c85 ffff880194f8b048 0000000000000046 > > ffff8802303138c0 > > [66291.567104] Call Trace: > > [66291.567112] [] ? number.isra.2+0x315/0x350 > > [66291.567117] [] ? ep_poll_callback+0xeb/0x120 > > [66291.567121] [] ? string.isra.4+0x3e/0xd0 > > [66291.567125] [] schedule+0x3f/0x60 > > [66291.567129] [] rwsem_down_failed_common+0xc5/0x160 > > [66291.567133] [] ? find_inode+0xa9/0xb0 > > [66291.567136] [] rwsem_down_read_failed+0x15/0x17 > > [66291.567139] [] call_rwsem_down_read_failed+0x14/0x30 > > [66291.567143] [] ? down_read+0x17/0x20 > > [66291.567146] [] get_super+0x9f/0xe0 > > [66291.567149] [] fsync_bdev+0x1d/0x60 > > [66291.567152] [] invalidate_partition+0x2d/0x60 > > [66291.567155] [] del_gendisk+0x90/0x250 > > As can be seen from the stack entries above, this problem lies in the > block or filesystem layer and not in USB or SCSI. Don't blame the higher layers as the cause of the problem simply because they are the ones that show the visible symptoms ;) The problem lies in the fact that the error handling callback that is run when the device is removed triggers IO to the block device that was just removed. If all outstanding IOs have been error'd out correctly, and all new IOs return errors, then there is no reason for the fsync to block here. i.e. the mount process should have received an error. However, the mount could have hung because underlying device has not been cleaned up properly before the device disconnect has proceeded. i.e. that it is possible that the cause is a SCSI or USB issue, not a filesystem issue. :) So, what other blocked tasks are there in the system (echo w > /proc/sysrq-trigger)? As it is, I think that invalidate_partition() is doing something somewhat insane for a block device that has been removed - you can't write to it so fsync_bdev() is useless. And cleaning up the dentry and inode caches is something that should be done when unmounting the filesystem, not when the block device goes away as they can trigger more IO and potentially deadlock with other operations that have not handled the IO errors properly. Yes, shut a filesystem down that has had it's block device removed, but filesystem level cleanup should be left to the filesystem, not this error handling path. And another question - why doesn't having an active filesystem on a block device (i.e. an active reference to the gendisk) prevent the block device from being removed from underneath it? Cheers, Dave. -- Dave Chinner david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/