From: "Zephaniah E. Loss-Cutler-Hull"
To: Paolo Valente, Jens Axboe
Cc: "Zephaniah E. Loss-Cutler-Hull", Linux Kernel Mailing List, linux-block, linux-scsi@vger.kernel.org
Subject: Re: General protection fault with use_blk_mq=1.
Date: Thu, 29 Mar 2018 02:12:39 -0700
Message-ID: <0a96375b-5a2e-e828-fa1f-a14af192be4c@aehallh.com>
In-Reply-To: <882A26D2-BEB8-4CE3-B132-0DE31BFD5D28@linaro.org>
References: <7d8a9c62-7d3e-879c-5b5b-30707f04553e@aehallh.com> <735c5d75-eacf-8ed2-ba9b-9ff4b0b5290d@kernel.dk> <882A26D2-BEB8-4CE3-B132-0DE31BFD5D28@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/28/2018 10:13 PM, Paolo Valente wrote:
>
>
>> On 29 Mar 2018, at 05:22, Jens Axboe wrote:
>>
>> On 3/28/18 9:13 PM, Zephaniah E. Loss-Cutler-Hull wrote:
>>> On 03/28/2018 06:02 PM, Jens Axboe wrote:
>>>> On 3/28/18 5:03 PM, Zephaniah E. Loss-Cutler-Hull wrote:
>>>>> I am not subscribed to any of the lists on the To list here, please CC
>>>>> me on any replies.
>>>>>
>>>>> I am encountering a fairly consistent crash anywhere from 15 minutes to
>>>>> 12 hours after boot with scsi_mod.use_blk_mq=1 dm_mod.use_blk_mq=1
>>>>>
>>>>> The crash looks like:
>>>>>
>>>
>>>>>
>>>>> Looking through the code, I'd guess that this is dying inside
>>>>> blkg_rwstat_add, which calls percpu_counter_add_batch, which is what RIP
>>>>> is pointing at.
>>>>
>>>> Leaving the whole thing here for Paolo - it's crashing off insertion of
>>>> a request coming out of SG_IO. Don't think we've seen this BFQ failure
>>>> case before.
>>>>
>>>> You can mitigate this by switching the scsi-mq devices to mq-deadline
>>>> instead.
>>>>
>>>
>>> I'm thinking that I should also be able to mitigate it by disabling
>>> CONFIG_DEBUG_BLK_CGROUP.
>>>
>>> That should remove that entire chunk of code.
>>>
>>> Of course, that won't help if this is actually a symptom of a bigger
>>> problem.
>>
>> Yes, it's not a given that it will fully mask the issue at hand. But
>> turning off BFQ has a much higher chance of working for you.
>>
>> This time actually CC'ing Paolo.
>>
>
> Hi Zephaniah,
> if you are actually interested in the benefits of BFQ (low latency,
> high responsiveness, fairness, ...) then it may be worth to try what
> you yourself suggest: disabling CONFIG_DEBUG_BLK_CGROUP. Also because
> this option activates the heavy computation of debug cgroup statistics,
> which probably you don't use.

I definitely am.

>
> In addition, the outcome of your attempt without
> CONFIG_DEBUG_BLK_CGROUP would give us useful bisection information:
> - if no failure occurs, then the issue is likely to be confined in
>   that debugging code (which, on the bright side, is likely to be of
>   occasional interest, for only a handful of developers)
> - if the issue still shows up, then we may have new hints on this odd
>   failure
>
> Finally, consider that this issue has been reported to disappear from
> 4.16 [1], and, as a plus, that the service quality of BFQ had a
> further boost exactly from 4.16.

I look forward to that either way then.

>
> Looking forward to your feedback, in case you try BFQ without
> CONFIG_DEBUG_BLK_CGROUP,

I'm running that now. Judging from past crashes, if it survives until
tomorrow evening then we're good, so I should hopefully know within the
next day.

Thank you,
Zephaniah E. Loss-Cutler-Hull.

> Paolo
>
> [1] https://www.spinics.net/lists/linux-block/msg21422.html
>
>>
>> --
>> Jens Axboe
>
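
For reference, the mq-deadline mitigation suggested in the quoted text comes down to the per-device queue/scheduler file in sysfs, which lists the available schedulers with the active one in square brackets. What follows is only a minimal sketch of reading and writing that file, not something taken from the thread: the function names (parse_schedulers, active_scheduler, set_scheduler), the sysfs_root parameter and the device name "sda" are all hypothetical, and writing the file needs root.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split the contents of a queue/scheduler file into (active, available).

    The kernel lists the available schedulers on one line and wraps the
    active one in square brackets, e.g. "mq-deadline kyber [bfq] none".
    """
    active = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def active_scheduler(device, sysfs_root="/sys/block"):
    """Return the active I/O scheduler for a block device (hypothetical helper).

    sysfs_root is an assumption added so the function can be pointed at a
    scratch directory instead of the real /sys/block.
    """
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    active, _ = parse_schedulers(path.read_text())
    return active


def set_scheduler(device, name, sysfs_root="/sys/block"):
    """Select a scheduler by writing its name to the sysfs file (needs root)."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    path.write_text(name + "\n")
```

On a live system one would call active_scheduler("sda") to see whether bfq is in use, and set_scheduler("sda", "mq-deadline") to switch away from it; the change made this way does not persist across reboots.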