Subject: Re: problem with bio handling on raid5 and pblk
From: Matias Bjørling
To: Javier González, Jens Axboe, shli@kernel.org
Cc: linux-raid@vger.kernel.org, linux-block@vger.kernel.org, LKML, Huaicheng Li
Date: Thu, 22 Mar 2018 18:00:54 +0100
Message-ID: <97678ddd-140f-5a8b-75ee-cbb584308260@lightnvm.io>
In-Reply-To: <66350920-EC5E-447F-B5DF-0F3C2CDEAA65@javigon.com>

On 03/22/2018 03:34 PM, Javier González wrote:
> Hi,
>
> I have been looking into a bug report when using pblk and raid5 on top
> and I am having problems understanding if the problem is in pblk's bio
> handling or on raid5's bio assumptions on the completion path.
>
> The problem occurs on the read path. In pblk, we take a reference to
> every read bio as it enters, and release it after completing the bio.
>
> generic_make_request()
>   pblk_submit_read()
>     bio_get()
>     ...
>     bio_endio()
>     bio_put()
>
> The problem seems to be that on raid5's bi_end_io completion path,
> raid5_end_read_request(), bio_reset() is called. When put together
> with pblk's bio handling:
>
> generic_make_request()
>   pblk_submit_read()
>     bio_get()
>     ...
>     bio_endio()
>       raid5_end_read_request()
>         bio_reset()
>     bio_put()
>
> it results in the newly reset bio being put immediately, thus freed.
> When the bio is reused then, we have an invalid pointer. In the report
> we received things crash at BUG_ON(bio->bi_next) at
> generic_make_request().
>
> As far as I understand, it is part of the bio normal operation for
> drivers under generic_make_request() to be able to take references and
> release them after bio completion. Thus, in this case, the assumption
> made by raid5, that it can issue a bio_reset() is incorrect. But I might
> be missing an implicit cross layer rule that we are violating in pblk.
> Any ideas?
>
> This said, after analyzing the problem from pblk's perspective, I see
> no reason to use bio_get()/bio_put() in the read path as it is at the
> pblk level that we are submitting bio_endio(), thus we cannot risk the
> bio being freed underneath us. Is this reasoning correct? I remember I
> introduced these at the time there was a bug on the aio path, which was
> not cleaning up correctly and could trigger an early bio free, but
> revisiting it now, it seems unnecessary.
>
> Thanks for the help!
>
> Javier
>

I think I sent a longer e-mail to you and Huaicheng about this a while
back. The problem is that pblk encapsulates the bio in its own request,
so the bios are freed before the struct request completion is done (as
you identify).
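
The sequence in the quoted report can be modelled in userspace. The sketch below is a toy, not kernel code: struct toy_bio and the toy_*() helpers are hypothetical names, and it assumes that resetting a bio discards the record of the extra reference taken by the lower driver, so the following put releases the bio while its owner still expects to reuse it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a bio; NOT the kernel's struct bio. */
struct toy_bio {
    bool reffed;  /* set once someone took an extra reference */
    int  refcnt;  /* only consulted when 'reffed' is set */
    bool freed;   /* stands in for the memory being released */
};

static void toy_init(struct toy_bio *b)
{
    b->reffed = false;
    b->refcnt = 1;      /* the owner's own reference */
    b->freed = false;
}

static void toy_get(struct toy_bio *b)
{
    b->reffed = true;
    b->refcnt++;
}

/* Assumption of this model: reset forgets that an extra ref was taken. */
static void toy_reset(struct toy_bio *b)
{
    b->reffed = false;
}

static void toy_put(struct toy_bio *b)
{
    assert(!b->freed);  /* a second free would be a bug in the model */
    if (!b->reffed || --b->refcnt == 0)
        b->freed = true;
}

/* Owner's completion callback that recycles the bio, as in the report. */
static void owner_end_io_reset(struct toy_bio *b)
{
    toy_reset(b);
}

/*
 * Lower driver's read path as described in the thread:
 * take a reference, complete the bio, drop the reference.
 * Returns true if the owner's bio was released underneath it.
 */
static bool lower_driver_read(struct toy_bio *b,
                              void (*end_io)(struct toy_bio *))
{
    toy_get(b);
    if (end_io)
        end_io(b);      /* completion runs the owner's callback */
    toy_put(b);
    return b->freed;
}
```

With no reset in the callback, the put only drops the extra reference and the owner keeps its bio; with owner_end_io_reset() the same put releases it.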

If you can make the completion path not use the bio (bios are completed
before the struct request completion fn is called), then the
bio_get()/bio_put() code can be removed. If the bio is needed on the
completion path (e.g., for partial reads, and if it is needed in the
struct request completion path), one should clone the bio, submit the
clone, and complete the original bio afterwards.
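
The clone-and-complete ordering can be sketched the same way. This is again a userspace toy under stated assumptions, not pblk code: struct toy_bio, toy_clone(), and driver_read() are hypothetical names. The point it illustrates is that all driver-side completion work reads the private clone, and the original is completed last and never touched afterwards, so the owner is free to reset or recycle it in its callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a bio; NOT the kernel's struct bio. */
struct toy_bio {
    int  len;                          /* bytes requested */
    int  done;                         /* bytes transferred so far */
    void (*end_io)(struct toy_bio *);  /* owner's completion callback */
};

/* Hypothetical clone helper: a private copy with no owner callback. */
static struct toy_bio *toy_clone(const struct toy_bio *src)
{
    struct toy_bio *c = malloc(sizeof(*c));
    if (c) {
        *c = *src;
        c->end_io = NULL;
    }
    return c;
}

/* Owner callback that recycles its bio on completion. */
static void owner_end_io_reset(struct toy_bio *b)
{
    b->len = 0;
    b->done = 0;
}

/* Read path: submit a clone, finish on the clone, complete the original last. */
static int driver_read(struct toy_bio *orig)
{
    assert(orig != NULL);

    struct toy_bio *clone = toy_clone(orig);
    if (!clone)
        return -1;

    clone->done = clone->len;          /* pretend the device filled the clone */

    /* Driver-side completion work reads the clone only. */
    int status = (clone->done == clone->len) ? 0 : -1;
    free(clone);

    orig->done = orig->len;
    if (orig->end_io)
        orig->end_io(orig);            /* owner may reset orig here */

    return status;                     /* orig is not touched after completion */
}
```

The cost of this ordering is one extra allocation per read, which is why dropping the reference entirely is preferable when the completion path does not need the bio.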