Received: by 10.223.185.116 with SMTP id b49csp2214343wrg; Thu, 22 Feb 2018 09:55:09 -0800 (PST) X-Google-Smtp-Source: AH8x225CJDYwjoFbG9ou7N4e88HpTxH66v9bTYZDg9UIyqeP2WZunxh+erHQqkxt8W2aDfmAFZMV X-Received: by 10.98.31.79 with SMTP id f76mr7756818pff.60.1519322109754; Thu, 22 Feb 2018 09:55:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1519322109; cv=none; d=google.com; s=arc-20160816; b=H+u3yMqosDmg2dtu2h6yu6d1zrizdHatF1sHPM30nrPv1MmeY/TjPSAoXQs0oQtjKq 6eWAqQ1y2tbP69Cdx/U5CiLRxNFMsJeMXAz9rctHM1qa7idkVj9o7wOSOm5xWOF4ZzOS Kw1/0UpyimoxR1guVdXZPVckgzvm04rTTBib2tGjis8H0aEo0GMUqCJZDiJmjGdcGwxH S5b/iePjhhpePB50/D6G2tL9PC+DgE9Q3SOfxnovCW2spcyEFe7jg62xxXNNJEFt7Ej2 gm+0YL1E3my5YBxdrQk39YOupDq3NybDOEF6E4PK+NkSLsK5gmdekRnV/WxOnY45cOKA xejA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=ajvE5+c2m56gk/SUkoZC1jaNI660l47APgS7YbbXIxI=; b=d7Ar5Sm7D1JDArP14nzId+ob/qr1d3DQSFA1fEkVgK/Yq0l+rmjvSct0Q+LQkVzuC3 srqiL1LZK5Z1LHB/0c4ki65LN2rfHGieGSlULuc1zVDvmkNzQWnD6oBQUM1Q6msEmA/q 3L+X/AVMnEN1fArxTInp+kJI3ewMG3Gg73Sm3ADdMRY4u+JtqHcet1FsfPX+yuihr9lL Fc8HcMnf65LHP3TmhIuCUw+ygNK6mwqNTie7oFbQ31B0dKjgAEzAtIzMWFF+XIcHQ/RL kAxa29EOlp6ltNb6oom9lh5w1laN9kCryPGkv97XbGdVluFs1lYdfBnLvCR/SOWkPwmr F+KQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=bkWK26xZ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t12-v6si370004plm.392.2018.02.22.09.54.55; Thu, 22 Feb 2018 09:55:09 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=bkWK26xZ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933585AbeBVRyS (ORCPT + 99 others); Thu, 22 Feb 2018 12:54:18 -0500 Received: from mail-lf0-f52.google.com ([209.85.215.52]:34220 "EHLO mail-lf0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933447AbeBVRyN (ORCPT ); Thu, 22 Feb 2018 12:54:13 -0500 Received: by mail-lf0-f52.google.com with SMTP id l191so8620141lfe.1; Thu, 22 Feb 2018 09:54:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=ajvE5+c2m56gk/SUkoZC1jaNI660l47APgS7YbbXIxI=; b=bkWK26xZhen4g5wY44alMwbQ2ipLrUALuw1TsDq+8q6tKQ8bH17smqK7mjim4vJMgy LwSk/s0+ba7naF48olEszRd9dTfAMlXTQM/sLOm5NXr/jNZ7kwsiGYL4+tWspIE5v+/1 kSFzmzlIFE3kqzD0CGZ/pvmbP3oknAbZG5gGASWJ1tO/uiXwlAhY70leoq/UnFsIHg7i 5RbH5JLx/lwF2rbFwj47sd79HOMjLt/nNWbwhD+Nqi6ARqkpP94VjJ+ZXdRCfQx7mi44 HK7nN9w+Yyt42DYFzAvdTKOaRVth1yhtCzLHHWmQNz4s8ePv2aJa2T6S4qfi45Xhmkvy bF7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=ajvE5+c2m56gk/SUkoZC1jaNI660l47APgS7YbbXIxI=; b=mAyWkOupUe0xerr9XW354s/C9l714v4gCNb46PUhyCpk54MRZc3Le8rSWiy/qxU1MW PzNMsxu2HYxOA3EDII+8nrvhgq1oxTa9eLCOSY+cvVjm1HvJPpTu7Mc3ZyFtSzW0E0l/ YMIL0NTtSzqeCLB03MEPtu3Cx+Aw0ZdvOHwrVycFL1odf+vHQvJtyuVMhlMqnhPBFH7f jnl8kFkwMA0Ssyw16mKVM3/QUbKF2aPNLjneHOe1NcGtZZW9z25LIgBa7andNhu7T8zl sn+R8smPWELwVV5FwZfe0tk0jFsL5866dfxfia14sb3qOVBFoVQqZegAFkoXtfmKMjsX L+Sg== X-Gm-Message-State: APf1xPCoh/A4zF/q49zrQQo0wNSGXrye15+Uv5ioc6M1nR/yPU86jdHo /MkR1UqLQHmr/WWJtijArGU= X-Received: by 10.25.162.72 with SMTP id l69mr5927807lfe.38.1519322051138; Thu, 22 Feb 2018 09:54:11 -0800 (PST) Received: from [192.168.1.145] (ppp109-252-55-234.pppoe.spdop.ru. [109.252.55.234]) by smtp.googlemail.com with ESMTPSA id u13sm108171lfc.91.2018.02.22.09.54.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 22 Feb 2018 09:54:10 -0800 (PST) Subject: Re: [PATCH V15 06/22] mmc: block: Add blk-mq support To: Adrian Hunter , Ulf Hansson Cc: linux-mmc , linux-block , linux-kernel , Bough Chen , Alex Lemberg , Mateusz Nowak , Yuliy Izrailov , Jaehoon Chung , Dong Aisheng , Das Asutosh , Zhangfei Gao , Sahitya Tummala , Harjani Ritesh , Venu Byravarasu , Linus Walleij , Shawn Lin , Bartlomiej Zolnierkiewicz , Christoph Hellwig , Thierry Reding , Krishna Reddy References: <1511962879-24262-1-git-send-email-adrian.hunter@intel.com> <1511962879-24262-7-git-send-email-adrian.hunter@intel.com> <829308a3-3bf6-c173-65fa-e2a0f45f7f61@intel.com> From: Dmitry Osipenko Message-ID: <68886f99-97f5-897a-f754-6f414741bd5a@gmail.com> Date: Thu, 22 Feb 2018 20:54:08 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <829308a3-3bf6-c173-65fa-e2a0f45f7f61@intel.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 22.02.2018 10:42, Adrian Hunter wrote: > On 21/02/18 22:50, Dmitry Osipenko wrote: >> On 29.11.2017 16:41, Adrian Hunter wrote: >>> Define and use a blk-mq queue. Discards and flushes are processed >>> synchronously, but reads and writes asynchronously. In order to support >>> slow DMA unmapping, DMA unmapping is not done until after the next request >>> is started. That means the request is not completed until then. If there is >>> no next request then the completion is done by queued work. >> >> Hello, >> >> I'm using (running linux-next and doing some upstream development for) some old >> NVIDIA Tegra tablet that has built-in (internal) and external MMC's and with the >> blk-mq being enabled I'm observing a soft lockup. The lockup seems is >> reproducible quite reliably by running fsck on any MMC partition, sometimes >> kernels lockups on boot during probing partitions table (weirdly only when both >> SDHCI's are present, i.e. internal storage enabled in DT and external SD is >> inserted/enabled) and it also lockups pretty quickly in a case of just a general >> use. Reverting mmc/ commits up to 1bec43a3b18 ("Remove option not to use >> blk-mq") and disabling CONFIG_MMC_MQ_DEFAULT makes everything working fine >> again. There is also a third SDHCI populated with built-in WiFi/Bluetooth SDIO >> and I'm observing odd MMC timeouts with the blk-mq enabled, disabling >> CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well. >> >> Any thoughts? > > SDIO (unless it is a combo card) should be unaffected by changes to the > block driver. > > I don't have any ideas. Adding more NVIDIA people. > >> >> WiFi issue >> ======================== >> >> [ 38.247006] mmc2: Timeout waiting for hardware interrupt. >> [ 38.247027] brcmfmac: brcmf_escan_timeout: timer expired >> [ 38.247036] mmc2: sdhci: ============ SDHCI REGISTER DUMP =========== >> [ 38.247047] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00000001 >> [ 38.247055] mmc2: sdhci: Blk size: 0x00007008 | Blk cnt: 0x00000000 >> [ 38.247062] mmc2: sdhci: Argument: 0x21000008 | Trn mode: 0x00000013 >> [ 38.247070] mmc2: sdhci: Present: 0x01d70000 | Host ctl: 0x00000013 >> [ 38.247077] mmc2: sdhci: Power: 0x00000001 | Blk gap: 0x00000000 >> [ 38.247084] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007 >> [ 38.247091] mmc2: sdhci: Timeout: 0x0000000e | Int stat: 0x00000000 >> [ 38.247098] mmc2: sdhci: Int enab: 0x02ff000b | Sig enab: 0x02fc000b >> [ 38.247105] mmc2: sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000 >> [ 38.247112] mmc2: sdhci: Caps: 0x61ff30b0 | Caps_1: 0x00000000 >> [ 38.247119] mmc2: sdhci: Cmd: 0x0000353a | Max curr: 0x00000001 >> [ 38.247126] mmc2: sdhci: Resp[0]: 0x00001800 | Resp[1]: 0x08002db5 >> [ 38.247133] mmc2: sdhci: Resp[2]: 0x16da8000 | Resp[3]: 0x00000400 >> [ 38.247139] mmc2: sdhci: Host ctl2: 0x00000000 >> [ 38.247146] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x17c47200 >> [ 38.247152] mmc2: sdhci: ============================================ >> [ 38.247250] brcmfmac: brcmf_sdio_readframes: read 520 bytes from channel 1 >> failed: -84 >> [ 38.247274] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK >> [ 40.807019] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout >> [ 40.807042] brcmfmac: brcmf_notify_escan_complete: Scan abort failed >> [ 48.487007] mmc2: Timeout waiting for hardware interrupt. >> [ 48.487057] mmc2: sdhci: ============ SDHCI REGISTER DUMP =========== >> [ 48.487096] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00000001 >> [ 48.487128] mmc2: sdhci: Blk size: 0x00007040 | Blk cnt: 0x00000001 >> [ 48.487160] mmc2: sdhci: Argument: 0x21000040 | Trn mode: 0x00000013 >> [ 48.487191] mmc2: sdhci: Present: 0x01d70000 | Host ctl: 0x00000013 >> [ 48.487221] mmc2: sdhci: Power: 0x00000001 | Blk gap: 0x00000000 >> [ 48.487251] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007 >> [ 48.487281] mmc2: sdhci: Timeout: 0x0000000e | Int stat: 0x00000000 >> [ 48.487313] mmc2: sdhci: Int enab: 0x02ff000b | Sig enab: 0x02fc000b >> [ 48.487343] mmc2: sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000 >> [ 48.487374] mmc2: sdhci: Caps: 0x61ff30b0 | Caps_1: 0x00000000 >> [ 48.487404] mmc2: sdhci: Cmd: 0x0000353a | Max curr: 0x00000001 >> [ 48.487435] mmc2: sdhci: Resp[0]: 0x00001000 | Resp[1]: 0x08002db5 >> [ 48.487466] mmc2: sdhci: Resp[2]: 0x16da8000 | Resp[3]: 0x00000400 >> [ 48.487493] mmc2: sdhci: Host ctl2: 0x00000000 >> [ 48.487525] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x17c47200 >> [ 48.487552] mmc2: sdhci: ============================================ >> [ 48.487749] brcmfmac: brcmf_sdio_readframes: read 480 bytes from channel 1 >> failed: -84 >> [ 48.487822] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK >> >> >> Soft lockup issue >> ======================== >> >> # fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2 >> fsck из util-linux 2.30.2 >> e2fsck 1.43.9 (8-Feb-2018) >> Проход 1: Проверка inodes, блокs, а также размеров >> Pass 2: Checking каталог structure >> Pass 3: Checking каталог connectivity >> Pass 4: Checking reference counts >> Pass 5: Checking группа summary information >> .... >> >> [ 125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds. >> [ 125.925496] Not tainted >> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041 >> [ 125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this >> message. >> [ 125.925562] kworker/0:3H D 0 263 2 0x00000000 >> [ 125.925653] Workqueue: kblockd mmc_blk_mq_complete_work >> [ 125.925747] [] (__schedule) from [] (schedule+0x60/0xcc) >> [ 125.925805] [] (schedule) from [] >> (__mmc_claim_host+0xdc/0x24c) >> [ 125.925849] [] (__mmc_claim_host) from [] > > That claim host should not be there. Here is a fix for that: > > diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c > index 908e4db03535..62049f95116b 100644 > --- a/drivers/mmc/core/mmc_ops.c > +++ b/drivers/mmc/core/mmc_ops.c > @@ -932,9 +932,7 @@ static int mmc_read_bkops_status(struct mmc_card *card) > int err; > u8 *ext_csd; > > - mmc_claim_host(card->host); > err = mmc_get_ext_csd(card, &ext_csd); > - mmc_release_host(card->host); > if (err) > return err; Looks like this patch fixes all the problems. I'll keep testing it for a couple of days and then report back the final result. Thank you very much. >> (mmc_start_bkops+0x4c/0x190) >> [ 125.925895] [] (mmc_start_bkops) from [] >> (mmc_blk_urgent_bkops+0x48/0x5c) >> [ 125.925945] [] (mmc_blk_urgent_bkops) from [] >> (mmc_blk_mq_complete_prev_req.part.5+0x74/0x210) >> [ 125.925995] [] (mmc_blk_mq_complete_prev_req.part.5) from >> [] (mmc_blk_mq_complete_work+0x30/0x34) >> [ 125.926049] [] (mmc_blk_mq_complete_work) from [] >> (process_one_work+0x1f8/0x584) >> [ 125.926093] [] (process_one_work) from [] >> (worker_thread+0x68/0x5d4) >> [ 125.926144] [] (worker_thread) from [] (kthread+0x178/0x184) >> [ 125.926188] [] (kthread) from [] (ret_from_fork+0x14/0x2c) >> [ 125.926209] Exception stack(0xd579ffb0 to 0xd579fff8) >> [ 125.926239] ffa0: 00000000 00000000 >> 00000000 00000000 >> [ 125.926272] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 >> 00000000 00000000 >> [ 125.926301] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> [ 125.926361] INFO: task fsck.ext4:471 blocked for more than 60 seconds. >> [ 125.926399] Not tainted >> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041 >> [ 125.926427] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this >> message. >> [ 125.926455] fsck.ext4 D 0 471 470 0x00000000 >> [ 125.926536] [] (__schedule) from [] (schedule+0x60/0xcc) >> [ 125.926593] [] (schedule) from [] (io_schedule+0x20/0x40) >> [ 125.926654] [] (io_schedule) from [] >> (wait_on_page_bit+0x120/0x144) >> [ 125.926705] [] (wait_on_page_bit) from [] >> (__filemap_fdatawait_range+0xd4/0x114) >> [ 125.926757] [] (__filemap_fdatawait_range) from [] >> (file_write_and_wait_range+0x98/0xb4) >> [ 125.926810] [] (file_write_and_wait_range) from [] >> (blkdev_fsync+0x2c/0x5c) >> [ 125.926867] [] (blkdev_fsync) from [] >> (vfs_fsync_range+0x4c/0xb0) >> [ 125.926912] [] (vfs_fsync_range) from [] (do_fsync+0x4c/0x74) >> [ 125.926954] [] (do_fsync) from [] (SyS_fsync+0x1c/0x20) >> [ 125.926996] [] (SyS_fsync) from [] >> (ret_fast_syscall+0x0/0x54) >> [ 125.927016] Exception stack(0xd5631fa8 to 0xd5631ff0) >> [ 125.927050] 1fa0: 00480fc8 00481108 00000004 00481108 >> 00000000 00000000 >> [ 125.927086] 1fc0: 00480fc8 00481108 00000000 00000076 00000000 7f2bb750 >> 00483268 bec602e8 >> [ 125.927113] 1fe0: 00000076 bec60288 b6de5e8b b6d67cf6 >> [ 177.015618] random: crng init done >> [ 187.365434] INFO: task kworker/0:3H:263 blocked for more than 60 seconds. >> [ 187.365491] Not tainted >> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041 >> [ 187.365524] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this >> message. >> [ 187.365556] kworker/0:3H D 0 263 2 0x00000000 >> [ 187.365647] Workqueue: kblockd mmc_blk_mq_complete_work >> [ 187.365741] [] (__schedule) from [] (schedule+0x60/0xcc) >> [ 187.365798] [] (schedule) from [] >> (__mmc_claim_host+0xdc/0x24c) >> [ 187.365842] [] (__mmc_claim_host) from [] >> (mmc_start_bkops+0x4c/0x190) >> [ 187.365887] [] (mmc_start_bkops) from [] >> (mmc_blk_urgent_bkops+0x48/0x5c) >> [ 187.365936] [] (mmc_blk_urgent_bkops) from [] >> (mmc_blk_mq_complete_prev_req.part.5+0x74/0x210) >> [ 187.365986] [] (mmc_blk_mq_complete_prev_req.part.5) from >> [] (mmc_blk_mq_complete_work+0x30/0x34) >> [ 187.366039] [] (mmc_blk_mq_complete_work) from [] >> (process_one_work+0x1f8/0x584) >> [ 187.366083] [] (process_one_work) from [] >> (worker_thread+0x68/0x5d4) >> [ 187.366134] [] (worker_thread) from [] (kthread+0x178/0x184) >> [ 187.366178] [] (kthread) from [] (ret_from_fork+0x14/0x2c) >> [ 187.366200] Exception stack(0xd579ffb0 to 0xd579fff8) >> [ 187.366229] ffa0: 00000000 00000000 >> 00000000 00000000 >> [ 187.366262] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 >> 00000000 00000000 >> [ 187.366291] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> [ 187.366350] INFO: task fsck.ext4:471 blocked for more than 60 seconds. >> [ 187.366388] Not tainted >> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041 >> [ 187.366416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this >> message. >> [ 187.366445] fsck.ext4 D 0 471 470 0x00000000 >> [ 187.366526] [] (__schedule) from [] (schedule+0x60/0xcc) >> [ 187.366582] [] (schedule) from [] (io_schedule+0x20/0x40) >> [ 187.366644] [] (io_schedule) from [] >> (wait_on_page_bit+0x120/0x144) >> [ 187.366693] [] (wait_on_page_bit) from [] >> (__filemap_fdatawait_range+0xd4/0x114) >> [ 187.366745] [] (__filemap_fdatawait_range) from [] >> (file_write_and_wait_range+0x98/0xb4) >> [ 187.366799] [] (file_write_and_wait_range) from [] >> (blkdev_fsync+0x2c/0x5c) >> [ 187.366857] [] (blkdev_fsync) from [] >> (vfs_fsync_range+0x4c/0xb0) >> [ 187.366902] [] (vfs_fsync_range) from [] (do_fsync+0x4c/0x74) >> [ 187.366944] [] (do_fsync) from [] (SyS_fsync+0x1c/0x20) >> [ 187.366987] [] (SyS_fsync) from [] >> (ret_fast_syscall+0x0/0x54) >> [ 187.367008] Exception stack(0xd5631fa8 to 0xd5631ff0) >> [ 187.367041] 1fa0: 00480fc8 00481108 00000004 00481108 >> 00000000 00000000 >> [ 187.367078] 1fc0: 00480fc8 00481108 00000000 00000076 00000000 7f2bb750 >> 00483268 bec602e8 >> [ 187.367103] 1fe0: 00000076 bec60288 b6de5e8b b6d67cf6 >> >