Received: by 10.213.65.68 with SMTP id h4csp3585184imn; Tue, 3 Apr 2018 07:30:05 -0700 (PDT) X-Google-Smtp-Source: AIpwx49QUfLZtMgyiFKzyew9mNZkELqfRrNhtkvch53NlZmw+YaR71vnUY8jLjhI5U1PFsnguB17 X-Received: by 2002:a17:902:d03:: with SMTP id 3-v6mr14481238plu.245.1522765805872; Tue, 03 Apr 2018 07:30:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1522765805; cv=none; d=google.com; s=arc-20160816; b=cCujV/K9U6r9msgQeslLXxwqKKvbrVJUVodODyf6kGvYB8X5bg4MZvMQj/RDQFSzSd 0L3siCzClb00e5uTUquBGbw6DSoEMgWoBzzT5WUEFgqultXOTsSTaxum01QdMY7tQu8k BN1ThgT085ZJufFDKIXdqW6iHsgs849NdeoMfFGJORxSWAKLWsqI1f2lOVUN7neuwVmy /5asHTLk/KBdOUlwRD6/1Fu1vNuZoSodsejJTvc6PX7uUtOcXD1vZ1Lup4SzvXy6WRqg teFWHBceqYWi5S5eUG5q66jUr6N8R0hGVHCSJkV2kgmGQba+UTASIsacvg4+0N/KkF83 T4Rg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:to:cc:in-reply-to:date:subject :mime-version:message-id:from:dkim-signature :arc-authentication-results; bh=gi7SFjRaYB7jEJN/z6bwVoongVKDYcvfDdDSgDDNMPc=; b=KyCWQ8mWYbLSgXfig69Y9y7LuBJkdJFxu090GX4BYl17mGWHCb5XOJMj4t0NtFtOHd TnwHH990k2O3yBVBgDIZd5z8AHBDK3iT4BnlgKwj5CT8tiVb9j92j+7j+CcqAiZPdsR1 yITf9/Jh6fIC/MFSu0mUC1hGqHAA3Uuy2fbDgGZ7dCiYHQ2RiurVb9PrGFJehuivoUhJ tBA8BHsBAi6UMJ5qvN5v9jgaaTeotBXXWTpO6qoNewqyt7Z9Y1FSK2IptrM0KMSzzGMY rIZckq3t++nps81XSwgEjuoRkVDn1/Sf9wUcIwCO8qiFIRliH5QwpHd47UQIfQduHnSF rJ3g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@javigon-com.20150623.gappssmtp.com header.s=20150623 header.b=ZoVoxJHr; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o12-v6si640643plg.715.2018.04.03.07.29.51; Tue, 03 Apr 2018 07:30:05 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@javigon-com.20150623.gappssmtp.com header.s=20150623 header.b=ZoVoxJHr; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751433AbeDCO2l (ORCPT + 99 others); Tue, 3 Apr 2018 10:28:41 -0400 Received: from mail-wm0-f48.google.com ([74.125.82.48]:32971 "EHLO mail-wm0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751247AbeDCO2a (ORCPT ); Tue, 3 Apr 2018 10:28:30 -0400 Received: by mail-wm0-f48.google.com with SMTP id o23so16726675wmf.0 for ; Tue, 03 Apr 2018 07:28:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=javigon-com.20150623.gappssmtp.com; s=20150623; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=gi7SFjRaYB7jEJN/z6bwVoongVKDYcvfDdDSgDDNMPc=; b=ZoVoxJHrl9q4hLa9ZHYCdsoxoGoNm8dsbZORB8ePpkXgqZZtglzcxeURtKsT16zlTV oXHSGCGEdbqmDwQGvpvVtSY/gCewsYkOTJaewce0ym/i7D5K6/1HGlqL43/CmpIBjyZi 7y015RJGFW5acGPkfGascNY8eEcW0FSNyWFt26udwAQS2C7WXSaazjj4sURNg4Q8ajZL 5evzLPR/iJ3xKx32cbTT+lwM6K1LnkBuzKxHAFaL13w5Y0K9L1GkYuQccj9zjviqy5ml WUUt5sY+x0iMT+il42nXtI4Ywj7URTQSWMItAlhJvPhqDtHkSmymY8ruaYZmevUvby44 z39g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=gi7SFjRaYB7jEJN/z6bwVoongVKDYcvfDdDSgDDNMPc=; b=d/bGK5l1ydi+NEVs35mrF10BZuxg/gNGwyucQDnm8xQEu2Fy8AZPK+4crZRHjFO72V M/fzZkLAFg3BItcKvybCm7J/xUkDvKHq3oW3D/NvTbScHwNvjivdpJ49TN/FUJg/lQrO R25KVM+6bRCMhL/QlScCiCzCXG+TwaJzfdYlOCnrpuR86p7iS/Zim5JpkCHN0abE+ErU XDAM2lnp4Oox4R1k/3Rebv8wl8vIrbI64Xe2L0yOTb3K7QX2TX8jl3zyWkmxlS2HXnjP MFEB60Iuh4qEk6gMOt3vWeFLuGSNe9ViwPRG3NitxiUfuCir2HZR89+oju4dWFV1pewx KSAQ== X-Gm-Message-State: AElRT7EEJ/TpBxqahiipM4cBjV7h0FS2pt2VFfE/ljg7YcmKPFqw9wit lsdb858/E0aBsIKsoyg7jVpLlg== X-Received: by 10.80.166.211 with SMTP id f19mr17005694edc.266.1522765708618; Tue, 03 Apr 2018 07:28:28 -0700 (PDT) Received: from mac-halley13.cnexlabs.com (6164211-cl69.boa.fiberby.dk. [193.106.164.211]) by smtp.gmail.com with ESMTPSA id j42sm2016677eda.67.2018.04.03.07.28.27 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 03 Apr 2018 07:28:27 -0700 (PDT) From: =?utf-8?Q?Javier_Gonz=C3=A1lez?= Message-Id: <07972A79-4E2E-4CC3-B856-318EFAC553E3@javigon.com> Content-Type: multipart/signed; boundary="Apple-Mail=_4F16BE9B-32EA-4771-A04C-2A907A5DC976"; protocol="application/pgp-signature"; micalg=pgp-sha512 Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\)) Subject: Re: problem with bio handling on raid5 and pblk Date: Tue, 3 Apr 2018 16:28:24 +0200 In-Reply-To: <04D5C038-969C-49EB-A9AA-EBE0548521E6@javigon.com> Cc: =?utf-8?Q?Matias_Bj=C3=B8rling?= , Jens Axboe , shli@kernel.org, linux-raid@vger.kernel.org, linux-block@vger.kernel.org, LKML , Huaicheng Li To: =?utf-8?Q?Javier_Gonz=C3=A1lez?= References: <66350920-EC5E-447F-B5DF-0F3C2CDEAA65@javigon.com> <97678ddd-140f-5a8b-75ee-cbb584308260@lightnvm.io> <04D5C038-969C-49EB-A9AA-EBE0548521E6@javigon.com> X-Mailer: Apple Mail (2.3445.5.20) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --Apple-Mail=_4F16BE9B-32EA-4771-A04C-2A907A5DC976 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On 23 Mar 2018, at 13.52, Javier Gonz=C3=A1lez = wrote: >=20 >> On 22 Mar 2018, at 18.00, Matias Bj=C3=B8rling = wrote: >>=20 >> On 03/22/2018 03:34 PM, Javier Gonz=C3=A1lez wrote: >>> Hi, >>> I have been looking into a bug report when using pblk and raid5 on = top >>> and I am having problems understanding if the problem is in pblk's = bio >>> handling or on raid5's bio assumptions on the completion path. >>> The problem occurs on the read path. In pblk, we take a reference to >>> every read bio as it enters, and release it after completing the = bio. >>> generic_make_request() >>> pblk_submit_read() >>> bio_get() >>> ... >>> bio_endio() >>> bio_put() >>> The problem seems to be that on raid5's bi_end_io completion path, >>> raid5_end_read_request(), bio_reset() is called. When put together >>> with pblk's bio handling: >>> generic_make_request() >>> pblk_submit_read() >>> bio_get() >>> ... >>> bio_endio() >>> raid5_end_read_request() >>> bio_reset() >>> bio_put() >>> it results in the newly reset bio being put immediately, thus freed. >>> When the bio is reused then, we have an invalid pointer. In the = report >>> we received things crash at BUG_ON(bio->bi_next) at >>> generic_make_request(). >>> As far as I understand, it is part of the bio normal operation for >>> drivers under generic_make_request() to be able to take references = and >>> release them after bio completion. Thus, in this case, the = assumption >>> made by raid5, that it can issue a bio_reset() is incorrect. But I = might >>> be missing an implicit cross layer rule that we are violating in = pblk. >>> Any ideas? >>> This said, after analyzing the problem from pblk's perspective, I = see >>> not reason to use bio_get()/bio_put() in the read path as it is at = the >>> pblk level that we are submitting bio_endio(), thus we cannot risk = the >>> bio being freed underneath us. Is this reasoning correct? I remember = I >>> introduced these at the time there was a bug on the aio path, which = was >>> not cleaning up correctly and could trigger an early bio free, but >>> revisiting it now, it seems unnecessary. >>> Thanks for the help! >>> Javier >>=20 >> I think I sent a longer e-mail to you and Huaicheng about this a = while back. >=20 > I don't think I was in that email. >=20 > There are two parts to the question. One is raid5's bio completion > assumptions and the other is whether we can avoid bio_get()/put() in > pblk's read path. The first part is pblk independent and I would like = to > leave it open as I would like to understand how bio_reset() in this > context is correct. Right now, I cannot see how this is correct > behaviour. >=20 > For the pblk specific part, see below. >=20 >> The problem is that the pblk encapsulates the bio in its own request. >> So the bio's are freed before the struct request completion is done >> (as you identify). If you can make the completion path (as bio's are >> completed before the struct request completion fn is called) to not >> use the bio, then the bio_get/put code can be removed. >>=20 >> If it needs the bio on the completion path (e.g., for partial reads, >> and if needed in the struct request completion path), one should = clone >> the bio, submit, and complete the original bio afterwards. >=20 > I don't follow how the relationship with struct request completion is > different with bio_get/put and without. >=20 > The flow in terms of bio and struct request management is today: >=20 > generic_make_request() > pblk_submit_read() > bio_get() > ... > blk_init_request_from_bio() > blk_execute_rq_nowait() / blk_execute_rq() // denepnding on = sync/async > ... > bio_endio() > bio_put() > ... > blk_mq_free_request() >=20 > bios risk to always freed in any case, as bio_put() will the last pblk > reference. The only case in which this will not happen is that = somebody > else took a bio_get() on the way down. But we cannot assume anything. >=20 > I guess the problem I am having understanding this is how we can risk > the bio disappearing underneath when we are the ones completing the = bio. > As I understand it, in this case we are always guaranteed that the > bio is alive due to the allocation reference. Therefore, = bio_get()/put() > is not needed. Am I missing anything? >=20 > Thanks, > Javier After an offline discussion with Matias, we have agreed that bio_get/put() should not be necessary on pblk's read path (Matias, please correct me if my understanding is not correct). As mentioned before too, we have been running internally without them for some time without problems. I'll send a patch for the next window removing them. This said, I believe that a generic_make_request driver should be able to do bio_get/put without side consequences. Thus, I still doubt that raid5's assumption that it can reset the bio on bio_end_io is correct (and if yes, I would like to understand why). Thoughts? Javier --Apple-Mail=_4F16BE9B-32EA-4771-A04C-2A907A5DC976 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEU1dMZpvMIkj0jATvPEYBfS0leOAFAlrDj4gACgkQPEYBfS0l eOAAqBAAoIXO7qbN/66nLtwSVkgyT+1nsh9EeaK+pE/Ksfg1TtLkpFiB73Ahl6nx X3B+/PUyLZYZfW2+8xvxbMhWhav6c/LeG3PXL2XTYDCUX5XwH90Jx5n8HBZS44Cy jz5fTKk6KFF5LDUb6J91JsM3h0HXwWLn8LdemzYaBd05J5gcVEZkfEtkGZGIVrQv PkgKOeW5Gdk+pLIkfiXhtNi7GDJIF1pgyhlbinJVqf4NleHpL3OAy2apU6CWl7YS Yae46Su1BfTxn+mG5Px14xd2IXPJa/vQ/A0dXy2qolA+bf3lycv7//SD6vI6epjm qyROfCzpRo3lfLGlTMgW9/OJfZ/b8QD30/oCtPVgjP3EfrfcWJbzQQ+o3DkJKNxi dfyYds0Yus8eI4tJHH1Tcn8eAxvg+6q52Rhi8eZ3HidTDZdQ890czF6g18mstIZX h/oSJncM0nA5D6LnZQDsyDmjFUn6PbWE4EsBppVqTqVfCBcMmyiWwxgnRgfY693W WWElEMp6MT+IIhTgheAf6JAILGmF9HSqn4Yz5Z4NkjVd6TJw6QbqVC/eTfQ6gy+D p5UuL23LFHKMJjUDvEzYPa1zaBhF+u59OJW3fZIl8dXhkhAQ4kkxWWePVVDFMqzU mOgptaYuWNquCfvbUHIyJvNtd/JI7Xg5JjacVFSbu9N76YNC0uY= =kF43 -----END PGP SIGNATURE----- --Apple-Mail=_4F16BE9B-32EA-4771-A04C-2A907A5DC976--