Message-ID: <53A308DE.7080000@fb.com>
Date: Thu, 19 Jun 2014 08:59:26 -0700
From: Jens Axboe
To: "Theodore Ts'o" , ,
Subject: Re: BUG: scheduling while atomic in blk_mq codepath?
References: <20140619153550.GA12836@thunk.org>
In-Reply-To: <20140619153550.GA12836@thunk.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2014-06-19 08:35, Theodore Ts'o wrote:
> While trying to bisect some problems which were introduced sometime
> between 3.15 and 3.16-rc1 (specifically, (1) reads to a block device
> at offset 262144 *
> 4k are failing with a short read, and (2) block
> device reads are sometimes causing the entire kernel to hang), the
> following BUG got hit.
>
> [ 0.000000] Linux version 3.15.0-rc8-06047-gaaeb255 (tytso@closure) (gcc version 4.8.3 (Debian 4.8.3-2) ) #1902 SMP Thu Jun 19 11:16:10 EDT 2014
>
> [....] Checking file systems...fsck from util-linux 2.20.1
> /dev/vdg was not cleanly unmounted, check forced.
> [ 4.161703] BUG: scheduling while atomic: fsck.ext4/2072/0x0000000266.5%
> [ 4.163673] no locks held by fsck.ext4/2072.
> [ 4.164318] Modules linked in:
> [ 4.164845] CPU: 0 PID: 2072 Comm: fsck.ext4 Not tainted 3.15.0-rc8-06047-gaaeb255 #1902
> [ 4.166047] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 4.166917] 00000000 00000000 f52c5ba0 c0832655 f5158610 f52c5bac c082f88a f6501e40
> [ 4.168188] f52c5c20 c08362ca c0eb3e40 c0eb3e40 374d3933 00000001 0396a8da 00000000
> [ 4.169474] f5158610 f51f1674 f4f46a00 f52c5be4 c015dd4b f4f46a00 f52c5bf0 c015dd5e
> [ 4.170781] Call Trace:
> [ 4.171159] [] dump_stack+0x48/0x60
> [ 4.171838] [] __schedule_bug+0x5c/0x6d
> [ 4.172572] [] __schedule+0x61/0x65a
> [ 4.173228] [] ? kvm_clock_read+0x1f/0x29
> [ 4.173977] [] ? kvm_clock_get_cycles+0x9/0xc
> [ 4.174771] [] ? timekeeping_get_ns.constprop.14+0x10/0x56
> [ 4.175701] [] schedule+0x5f/0x61
> [ 4.176345] [] io_schedule+0x50/0x67
> [ 4.177060] [] bt_get+0xaf/0xd1
> [ 4.177677] [] ? wake_up_atomic_t+0x1f/0x1f
> [ 4.178444] [] blk_mq_get_tag+0x26/0x82
> [ 4.179158] [] __blk_mq_alloc_request+0x2a/0x169
> [ 4.180022] [] blk_mq_map_request+0x137/0x1e3
> [ 4.180825] [] blk_sq_make_request+0x82/0x145
> [ 4.181630] [] generic_make_request+0x82/0xb5
> [ 4.182430] [] submit_bio+0xf0/0x109
> [ 4.183113] [] ? trace_hardirqs_on_caller+0x14e/0x169
> [ 4.184019] [] _submit_bh+0x1ad/0x1ca
> [ 4.184661] [] submit_bh+0xf/0x11
> [ 4.185267] [] block_read_full_page+0x1e2/0x1f2
> [ 4.186073] [] ? I_BDEV+0xa/0xa
> [ 4.186695] [] ? __lru_cache_add+0x24/0x46
> [ 4.187452] [] ? lru_cache_add+0xd/0xf
> [ 4.188130] [] blkdev_readpage+0x14/0x16
> [ 4.188832] [] __do_page_cache_readahead+0x1c0/0x1eb
> [ 4.189704] [] ondemand_readahead+0x1af/0x1b9
> [ 4.190508] [] page_cache_async_readahead+0x5f/0x6a
> [ 4.191424] [] generic_file_aio_read+0x226/0x4f4
> [ 4.192272] [] blkdev_aio_read+0x90/0x9e
> [ 4.193017] [] do_sync_read+0x52/0x79
> [ 4.193731] [] ? fdput_pos+0x25/0x25
> [ 4.194412] [] vfs_read+0x72/0xd1
> [ 4.195064] [] SyS_read+0x49/0x7c
> [ 4.195700] [] syscall_call+0x7/0xb
> [ 4.196385] [] ? print_usage_bug+0xcd/0x18e
>
> Is any of these known problems? This is blocking me from doing any
> kind of testing at the moment... (these problems are showing up while
> running KVM using virtio devices).

I believe you already reported this issue a while back, and it should be
fixed by commit cb96a42c in the kernel.

The other issue, not sure, not a lot of detail. It may be fixed by the
pull request I sent out yesterday. You can try pulling in:

git://git.kernel.dk/linux-block.git for-linus

and see if that fixes it for you.

--
Jens Axboe