Date: Mon, 1 Apr 2013 05:06:53 -0700
Subject: Re: losetup kernel crash in drivers/block/loop.c kernel 3.4.11
From: Anatol Pomozov
To: Stefani Seibold
Cc: linux-kernel, Jens Axboe, Andrew Morton, Dmitry Monakhov, Dave Young, Jeff Moyer
In-Reply-To: <1349254268.16946.35.camel@wall-e>
References: <1349254268.16946.35.camel@wall-e>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi

On Wed, Oct 3, 2012 at 1:51 AM, Stefani Seibold wrote:
> Hi,
>
> I am faced with a strange kernel crash while removing a loopback device
> with losetup, during a software update of my embedded device, which was
> introduced between 3.0 and 3.4. All other kernels I have used (2.6.39,
> 2.6.35, 2.6.33, 2.6.29, 2.6.27 and 2.6.20) work well.
>
> BUG: unable to handle kernel NULL pointer derference at 00000041
> IP: [] invalidate_bdev+0x4/0x26
> *pde = 00000000
> Ooops: 0000 I#11 PREEMNT SMP
> Modules linked in: vfat fat i915 drm_kms_helper drm intel_agp i2c_algo_bit intel_gtt agpgart video backlight e1000e usb_storage
>
> Pid: 869, comm: losetup Tainted G 8.3.4
> EIP: 0060:[] EFLAGS: 00010282 CPU: 1
> EIP is at invalidate_bdev+0x4/0x26
> EAX: 00000029 EBX: f63c1c00 ECX: 00000000 EDX: f63c1e20
> ESI: f5c6bc80 EDI: f63c1c60 EBP: f596e500 ESP: f5053e54
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 00000041 CR3: 324ae000 CR4: 000407d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
> Process losetup (pid: 869, ti=f5052000 task=f616c0c0 task.ti=f5052000)
> Stack:
> f63c1c00 c0277449 000200da f63c1c00 ffffffe7 00004c01 f5c39900 c02784d0
> f5d750a4 00000000 f5053efc f5d750a4 f5269900 c017dda6 0000001d 00008000
> f63c1cfc c027897b ffffffe7 00004c01 f5053f10 c0202021 00000000 f5c39900
> Call Trace:
> [] ? loop_clr_fd+0x11/0x1d6
> [] ? lo_ioctl+0x455/0x62b
> [] ? do_last.clone.32+0x55b/0x5d5
> [] ? loop_switch.clone.13+0x67/0x67
> [] ? __blkdev_driver_ioctl+0x1d/0x25
> [] ? blkdev_ioctl+0x6a3/0x6c2
> [] ? handle_pte_fault+0x21d/0x7ad
> [] ? do_file_open+0x21/0x5d
> [] ? block_ioctl+0x2f/0x34
> [] ? block_ioctl+0x2f/0x34
> [] ? bd_set_size+0x60/0x60
> [] ? do_vfs_ioctl+0x455/0x492
> [] ? do_page_fault+0x30f/0x32c
> [] ? fd_install+0x1e/0x3d
> [] ? do_sys_open+0x17e/0x188
> [] ? sys_ioctl+0x2d/0x47
> [] ? syscall+0x7/0xb
> Code: 00 89 f0 5b 5e 5f c3 53 8b 40 08 8b 58 18 83 7b 3c 00 74 11 e8 3f b9 ff ff 89 d8 31 d2 31 c9 5b e9 ba 8e fc ff 5b c3 53 8b 40 08 (8b) 58 18 83 7b 3c 00 74 17 e8 1f b9 ff ff e8 4e 88 fc ff 89 d8
> EIP: [] invalidate_bdev+0x4/0x26 SS:ESP 0068:f5053e54
> CR2: 0000000000000041
>
> This dump was copied by hand from a smart phone screenshot, I hope there
> are no typos.
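
As a rough cross-check of the oops quoted above (a sketch only, and only as
reliable as the hand-copied values): the byte marked in the Code: line,
"(8b) 58 18", decodes as "mov 0x18(%eax),%ebx", and EAX is 00000029, so the
faulting read would be at 0x29 + 0x18 = 0x41, which matches CR2. The helper
name fault_addr below is made up for illustration, and it handles only the
simple [reg + disp8] addressing form.

```shell
#!/bin/sh
# Hypothetical helper: compute the address read by an x86 "8b /r"
# (mov r32, r/m32) instruction that uses [reg + disp8] addressing.
# Usage: fault_addr <modrm> <disp8> <base register value>  (hex, no 0x prefix)
fault_addr() {
    modrm=$((0x$1)); disp=$((0x$2)); base=$((0x$3))
    mod=$(( (modrm >> 6) & 3 ))
    # Only mod == 1 (8-bit displacement) without a SIB byte is handled.
    [ "$mod" -eq 1 ] || return 1
    [ $((modrm & 7)) -ne 4 ] || return 1
    # disp8 is a signed byte.
    [ "$disp" -lt 128 ] || disp=$((disp - 256))
    printf '%08x\n' $((base + disp))
}

# Values from the oops: faulting bytes "(8b) 58 18", EAX = 00000029.
fault_addr 58 18 00000029
```

That EAX value was itself loaded by the preceding "8b 40 08"
(mov 0x8(%eax),%eax), which would be consistent with invalidate_bdev()
following bdev->bd_inode->i_mapping from a pointer that is not a valid
block_device. Mapping offsets 0x8 and 0x18 to those fields is an assumption
that depends on the kernel configuration.
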
>
> It is not possible to write a demo program which reproduces this bug due
> to the complexity, so I will explain what is going on.
>
> First, boot a kernel which includes an initramfs doing the following:
>
> /bin/mount -t proc none /proc
> /bin/mount -o rw,data=journal,barrier=1,errors=remount-ro /dev/sda3 /mnt
> /bin/mount -o loop /mnt/rootfs.squashfs /rootfs
> /bin/mount -o loop modules.squashfs /rootfs/lib/modules
> /bin/mount -o move /mnt /rootfs/rw
> /bin/umount /proc
> exec /rootfs/bin/sh -c 'exec /sbin/switch_root -c /dev/console /rootfs /sbin/init'
> exec /bin/sh
>
> The squashfs image will be mounted and will become the new root
> filesystem; the file system of /dev/sda3 will then be mounted under /rw.
>
> The reason for doing this is that it makes it very easy to exchange the
> root filesystem, since it is only a plain image file. No extra partition
> is needed, which could turn out to be too small in the future.
>
> The kernel modules are also a squashfs image, shipped as part of the
> initramfs. This makes it safe to exchange the kernel, because it will
> change together with the modules.
>
> After starting the new init process of the rootfs.squashfs, the firmware
> image optfs.squashfs will also be mounted via a loopback block device
> at /opt.
>
> When the user decides to do an update, a new rootfs.squashfs will be
> copied into a ramdisk and the following script (snippet) will be
> executed:
>
> cat <<EOF >/tmp/init
> #!/bin/sh
> exec exec >/dev/console
> exec 2>/dev/console
> umount /init/opt
> umount -l -r /init/rw
> umount -l -r /init
> umount /etc
> rm -rf /tmp/etc
> sync
> for i in /dev/loop*
> do
> losetup -d $i 2>/dev/null
> done
> rm \$0
> exec /tmp/update.sh "$1" "$2"
> reboot -f
> EOF
> chmod a+x /tmp/init
>
> echo "::restart:/tmp/init" >/tmp/etc/inittab
>
> mount -o ro /dev/ramdisk /mnt
> cd /mnt
> /sbin/pivot_root . init
>
> mount -o move /init/tmp /tmp
> mount -o move /init/proc /proc
> mount -o move /init/sys /sys
> mount -o move /init/dev/pts /dev/pts
> mount -o move /init/dev/shm /dev/shm
> mount -o bind /tmp/etc /etc
>
> init -q
> sleep 1
> kill -SIGQUIT 1
> exit
>
> Now the update.sh script has control over the system; no applications
> or daemons are running any more, and all mass storage devices should be
> unmounted.
>
> Up to this point everything works fine. Then update.sh executes the
> following code:
>
> rm -f /rw/optfs.squashfs
>
> for i in /dev/loop*
> do
> losetup -d $i 2>/dev/null
> done
>
> This removes the old firmware and all possible loopback devices.
> Executing losetup crashes the kernel and produces the Oops above.
>
> This is independent of the underlying file system and the processor
> architecture; it happens on x86 and ppc, and on ext3 and yaffs2 alike.
>
> Any idea?

Here is a proposed fix:
http://marc.info/?l=linux-kernel&m=136481752606623&w=2