In-Reply-To: <25708e84-6f35-04c3-a2e4-6854f0ed9e78@I-love.SAKURA.ne.jp>
References: <43327033306c3dd2f7c3717d64ce22415b6f3451.camel@wdc.com>
 <6db16aa3a7c56b6dcca2d10b4e100a780c740081.camel@wdc.com>
 <201805220652.BFH82351.SMQFFOJOtFOVLH@I-love.SAKURA.ne.jp>
 <201805222020.FEJ82897.OFtJMFHOVLQOSF@I-love.SAKURA.ne.jp>
 <25708e84-6f35-04c3-a2e4-6854f0ed9e78@I-love.SAKURA.ne.jp>
From: Dmitry Vyukov
Date: Mon, 4 Jun 2018 13:46:21 +0200
Subject: Re: INFO: task hung in blk_queue_enter
To: Tetsuo Handa
Cc: Bart Van Assche, LKML, linux-block@vger.kernel.org, Johannes Thumshirn,
 Alan Jenkins, syzbot, "Martin K. Petersen", Jens Axboe, Dan Williams,
 Christoph Hellwig, oleksandr@natalenko.name, ming.lei@redhat.com,
 martin@lichtvoll.de, Hannes Reinecke, syzkaller-bugs, Ross Zwisler,
 keith.busch@intel.com, linux-ext4@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 1, 2018 at 12:10 PM, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
>> Since the sum of the percpu counts did not change after percpu_ref_kill(),
>> this is not a race condition while folding the percpu counter values into
>> the atomic counter value.
>> That is, for some reason, someone who is responsible for calling
>> percpu_ref_put(&q->q_usage_counter) (presumably via blk_queue_exit()) is
>> unable to call percpu_ref_put().
>> But I don't know how to find out who is failing to call percpu_ref_put()...
>
> I found the someone. It was already there in the backtrace...

Nice! Do I understand correctly that this bug is probably the root cause of
a whole lot of syzbot "task hung" reports? E.g. this one too?
https://syzkaller.appspot.com/bug?id=cdc4add60bb95a4da3fec27c5fe6d75196b7f976
I guess we will need to sweep-close everything related to filesystems/block
devices once a fix is committed?

> ----------------------------------------
> [ 62.065852] a.out           D    0  4414   4337 0x00000000
> [ 62.067677] Call Trace:
> [ 62.068545]  __schedule+0x40b/0x860
> [ 62.069726]  schedule+0x31/0x80
> [ 62.070796]  schedule_timeout+0x1c1/0x3c0
> [ 62.072159]  ? __next_timer_interrupt+0xd0/0xd0
> [ 62.073670]  blk_queue_enter+0x218/0x520
> [ 62.074985]  ? remove_wait_queue+0x70/0x70
> [ 62.076361]  generic_make_request+0x3d/0x540
> [ 62.077785]  ? __bio_clone_fast+0x6b/0x80
> [ 62.079147]  ? bio_clone_fast+0x2c/0x70
> [ 62.080456]  blk_queue_split+0x29b/0x560
> [ 62.081772]  ? blk_queue_split+0x29b/0x560
> [ 62.083162]  blk_mq_make_request+0x7c/0x430
> [ 62.084562]  generic_make_request+0x276/0x540
> [ 62.086034]  submit_bio+0x6e/0x140
> [ 62.087185]  ? submit_bio+0x6e/0x140
> [ 62.088384]  ? guard_bio_eod+0x9d/0x1d0
> [ 62.089681]  do_mpage_readpage+0x328/0x730
> [ 62.091045]  ? __add_to_page_cache_locked+0x12e/0x1a0
> [ 62.092726]  mpage_readpages+0x120/0x190
> [ 62.094034]  ? check_disk_change+0x70/0x70
> [ 62.095454]  ? check_disk_change+0x70/0x70
> [ 62.096849]  ? alloc_pages_current+0x65/0xd0
> [ 62.098277]  blkdev_readpages+0x18/0x20
> [ 62.099568]  __do_page_cache_readahead+0x298/0x360
> [ 62.101157]  ondemand_readahead+0x1f6/0x490
> [ 62.102546]  ? ondemand_readahead+0x1f6/0x490
> [ 62.103995]  page_cache_sync_readahead+0x29/0x40
> [ 62.105539]  generic_file_read_iter+0x7d0/0x9d0
> [ 62.107067]  ? futex_wait+0x221/0x240
> [ 62.108303]  ? trace_hardirqs_on+0xd/0x10
> [ 62.109654]  blkdev_read_iter+0x30/0x40
> [ 62.110954]  generic_file_splice_read+0xc5/0x140
> [ 62.112538]  do_splice_to+0x74/0x90
> [ 62.113726]  splice_direct_to_actor+0xa4/0x1f0
> [ 62.115209]  ? generic_pipe_buf_nosteal+0x10/0x10
> [ 62.116773]  do_splice_direct+0x8a/0xb0
> [ 62.118056]  do_sendfile+0x1aa/0x390
> [ 62.119255]  __x64_sys_sendfile64+0x4e/0xc0
> [ 62.120666]  do_syscall_64+0x6e/0x210
> [ 62.121909]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> ----------------------------------------
>
> The someone is blk_queue_split() from blk_mq_make_request(), which depends
> on the assumption that blk_queue_enter() from the recursively called
> generic_make_request() does not get blocked by a
> percpu_ref_tryget_live(&q->q_usage_counter) failure.
>
> ----------------------------------------
> generic_make_request(struct bio *bio)
> {
>     if (blk_queue_enter(q, flags) < 0) { /* <= percpu_ref_tryget_live() succeeds here. */
>         if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
>             bio_wouldblock_error(bio);
>         else
>             bio_io_error(bio);
>         return ret;
>     }
>     (...snipped...)
>     ret = q->make_request_fn(q, bio);
>     (...snipped...)
>     if (q)
>         blk_queue_exit(q);
> }
> ----------------------------------------
>
> where q->make_request_fn == blk_mq_make_request, which does
>
> ----------------------------------------
> blk_mq_make_request(struct request_queue *q, struct bio *bio)
> {
>     blk_queue_split(q, &bio);
> }
>
> blk_queue_split(struct request_queue *q, struct bio **bio)
> {
>     generic_make_request(*bio); /* <= percpu_ref_tryget_live() fails and
>                                    waits until atomic_read(&q->mq_freeze_depth)
>                                    becomes 0. */
> }
> ----------------------------------------
>
> and meanwhile atomic_inc_return(&q->mq_freeze_depth) and
> percpu_ref_kill() are called by blk_freeze_queue_start()...
>
> Now it is up to you to decide how to fix this race.