Date: Thu, 28 Mar 2019 10:36:51 +0800
From: Ming Lei
To: Evan Green
Cc: Jens Axboe, Martin K Petersen, Bart Van Assche, Gwendal Grignou,
	Alexis Savery, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] loop: Better discard support for block devices
Message-ID: <20190328023650.GD19708@ming.t460p>
References: <20190327222841.38650-1-evgreen@chromium.org>
	<20190327222841.38650-3-evgreen@chromium.org>
In-Reply-To: <20190327222841.38650-3-evgreen@chromium.org>
On Wed, Mar 27, 2019 at 03:28:41PM -0700, Evan Green wrote:
> If the backing device for a loop device is a block device,
> then mirror the discard properties of the underlying block
> device into the loop device. This new change only applies to
> loop devices backed directly by a block device, not loop
> devices backed by regular files.
>
> While in there, differentiate between REQ_OP_DISCARD and
> REQ_OP_WRITE_ZEROES, which are different for block devices,
> but which the loop device had just been lumping together, since
> they're largely the same for files.
>
> This change fixes blktest block/003, and removes an extraneous
> error print in block/013 when testing on a loop device backed
> by a block device that does not support discard.

I have seen this issue many times, and I believe the fix is needed.

>
> Signed-off-by: Evan Green
> ---
>
> Changes in v3:
> - Updated commit description
>
> Changes in v2: None
>
>  drivers/block/loop.c | 61 +++++++++++++++++++++++++++++---------------
>  1 file changed, 41 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index bbf21ebeccd3..e1edd004298a 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
>  	return ret;
>  }
>
> -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
> +static int lo_discard(struct loop_device *lo, struct request *rq,
> +		      int mode, loff_t pos)
>  {
> -	/*
> -	 * We use punch hole to reclaim the free space used by the
> -	 * image a.k.a. discard. However we do not support discard if
> -	 * encryption is enabled, because it may give an attacker
> -	 * useful information.
> -	 */
>  	struct file *file = lo->lo_backing_file;
> -	int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
> +	struct request_queue *q = lo->lo_queue;
>  	int ret;
>
> -	if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> +	if (!blk_queue_discard(q)) {
>  		ret = -EOPNOTSUPP;
>  		goto out;
>  	}
> @@ -599,8 +594,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
>  	case REQ_OP_FLUSH:
>  		return lo_req_flush(lo, rq);
>  	case REQ_OP_DISCARD:
> +		return lo_discard(lo, rq,
> +			FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos);
> +
>  	case REQ_OP_WRITE_ZEROES:
> -		return lo_discard(lo, rq, pos);
> +		return lo_discard(lo, rq,
> +			FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos);
> +
>  	case REQ_OP_WRITE:
>  		if (lo->transfer)
>  			return lo_write_transfer(lo, rq, pos);
> @@ -854,6 +854,25 @@ static void loop_config_discard(struct loop_device *lo)
>  	struct file *file = lo->lo_backing_file;
>  	struct inode *inode = file->f_mapping->host;
>  	struct request_queue *q = lo->lo_queue;
> +	struct request_queue *backingq;
> +
> +	/*
> +	 * If the backing device is a block device, mirror its discard
> +	 * capabilities.
> +	 */
> +	if (S_ISBLK(inode->i_mode)) {
> +		backingq = bdev_get_queue(inode->i_bdev);
> +		blk_queue_max_discard_sectors(q,
> +			backingq->limits.max_discard_sectors);
> +
> +		blk_queue_max_write_zeroes_sectors(q,
> +			backingq->limits.max_write_zeroes_sectors);
> +
> +		q->limits.discard_granularity =
> +			backingq->limits.discard_granularity;
> +
> +		q->limits.discard_alignment =
> +			backingq->limits.discard_alignment;

Loop usually doesn't mirror the backing queue's limits, and I believe it
isn't necessary in this case either. Just wondering why the following
simple setting can't work:
	if (S_ISBLK(inode->i_mode)) {
		backingq = bdev_get_queue(inode->i_bdev);
		q->limits.discard_alignment = 0;

		if (!blk_queue_discard(backingq)) {
			q->limits.discard_granularity = 0;
			blk_queue_max_discard_sectors(q, 0);
			blk_queue_max_write_zeroes_sectors(q, 0);
			blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
		} else {
			q->limits.discard_granularity =
				inode->i_sb->s_blocksize;
			blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
			blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
			blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
		}
	} else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
		...
	}

I remember you mentioned that the above code doesn't work in some of
your tests, but you never explained the reason. It is supposed to work,
though, given that bio splitting handles and respects the discard
limits. Or is there a bug in bio splitting for discard I/O?

Thanks,
Ming