From: Salman Qazi
Date: Thu, 30 Jan 2020 11:34:29 -0800
Subject: Hung tasks with multiple partitions
To: Jens Axboe, Linux Kernel Mailing List, linux-block@vger.kernel.org
Cc: Jesse Barnes, Gwendal Grignou

Hi,

I am writing on behalf of the Chromium OS team at Google. We found the
root cause of some hung tasks we were experiencing, and we would like
your opinion on potential solutions. The bugs were encountered on the
4.19 kernel, but my reading of the code suggests that the relevant
portions have not changed since then.

We have an eMMC flash drive that has been carved into partitions on an
8-CPU system. The repro case we came up with is to run an 8-threaded,
write-mostly fio workload against one partition, let the system use the
other partition as its read-write filesystem (i.e. just background
activity), and then run the following loop:

while true; do sync; sleep 1; done

The hung task stack traces look like the following:

[  128.994891] jbd2/dm-1-8     D    0   367      2 0x00000028 last_sleep: 96340206998.
last_runnable: 96340140151
[  128.994898] Call trace:
[  128.994903]  __switch_to+0x120/0x13c
[  128.994909]  __schedule+0x60c/0x7dc
[  128.994914]  schedule+0x74/0x94
[  128.994919]  io_schedule+0x1c/0x40
[  128.994925]  bit_wait_io+0x18/0x58
[  128.994930]  __wait_on_bit+0x78/0xdc
[  128.994935]  out_of_line_wait_on_bit+0xa0/0xcc
[  128.994943]  __wait_on_buffer+0x48/0x54
[  128.994948]  jbd2_journal_commit_transaction+0x1198/0x1a4c
[  128.994956]  kjournald2+0x19c/0x268
[  128.994961]  kthread+0x120/0x130
[  128.994967]  ret_from_fork+0x10/0x18

I added some more information to trace points to understand what was
going on. It turns out that blk_mq_sched_dispatch_requests had checked
hctx->dispatch, found it empty, and then began consuming requests from
the I/O scheduler (in blk_mq_do_dispatch_sched). Unfortunately, the
deluge from the I/O scheduler (BFQ in our case) doesn't stop for 30
seconds, and there is no mechanism in blk_mq_do_dispatch_sched to
terminate early or reconsider the contents of hctx->dispatch. In the
meantime, a flush command arrives on hctx->dispatch (via insertion in
blk_mq_sched_bypass_insert) and languishes there. Eventually the thread
waiting on the flush triggers the hung task watchdog.

The solution that comes to mind is to periodically check hctx->dispatch
in blk_mq_do_dispatch_sched and exit early if it is non-empty. However,
not being an expert in this subsystem, I am not sure whether there would
be other consequences.

Any help is appreciated,
Salman