Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753355AbcD0Uo1 (ORCPT ); Wed, 27 Apr 2016 16:44:27 -0400
Received: from mail-oi0-f48.google.com ([209.85.218.48]:34334 "EHLO
        mail-oi0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751710AbcD0Uo0 (ORCPT );
        Wed, 27 Apr 2016 16:44:26 -0400
MIME-Version: 1.0
In-Reply-To:
References:
From: Rafael David Tinoco
Date: Wed, 27 Apr 2016 17:43:55 -0300
Message-ID:
Subject: Re: [PATCH 3.16 106/217] sd: disable discard_zeroes_data for UNMAP
To: Ben Hutchings
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
        akpm@linux-foundation.org, Paolo Bonzini, "Martin K. Petersen",
        Christoph Hellwig
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4700
Lines: 114

It seems that changing discard method from UNMAP to WRITE SAME(16)
without using NDOB bit (as first described in sbc3r35b.pdf) can cause
performance problems on big discards (since data-out buffer will be
checked for every WRITE SAME command). I think this is happening after
this commit, since NDOB bit wasn't implemented with this change (afaik,
iirc).

From the spec:

"""
To ensure that subsequent read operations return all zeros in a logical
block, use the WRITE SAME (16) command with the NDOB bit set to one. If
the UNMAP bit is set to one, then the device server may unmap the
logical blocks specified by the WRITE SAME (16)
"""

And there were some problems with this change (specifically QEMU SCSI
WRITE SAME implementation). So the change (commit e461338b6cd4) was made
to guarantee that if LBPRZ=0, after VPD 0xB2, UNMAP is still picked.
WRITESAME(16) is picked only if LBPRZ=1.

This last commit violated spec in favor of a WRITE SAME "optout"
approach for QEMU.

I wonder if this should be taken to previous versions ...
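
For reference, a minimal sketch of the selection-order change made by the quoted diff, modelled outside the kernel. The field names (lbpu, lbpws, lbpws10, lbprz, max_unmap_blocks) come from the diff; struct lbp_caps, enum lbp_mode and the three function names are hypothetical and exist only for this illustration. It covers only the branches visible in the diff, not the rest of sd_read_block_limits() or sd_config_discard().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mode names mirror the SD_LBP_* cases in the quoted diff; the numeric
 * values are arbitrary and not taken from the kernel headers. */
enum lbp_mode { LBP_DISABLE, LBP_UNMAP, LBP_WS16, LBP_WS10 };

/* Hypothetical stand-in for the struct scsi_disk fields the diff reads. */
struct lbp_caps {
	bool lbpu;                 /* UNMAP advertised in the LBP VPD page */
	bool lbpws;                /* WRITE SAME(16) with UNMAP advertised */
	bool lbpws10;              /* WRITE SAME(10) with UNMAP advertised */
	bool lbprz;                /* unmapped blocks read back as zeroes */
	uint32_t max_unmap_blocks;
};

/* Order before the patch: UNMAP is preferred when usable. */
enum lbp_mode pick_mode_before(const struct lbp_caps *c)
{
	assert(c != NULL);
	if (c->lbpu && c->max_unmap_blocks)
		return LBP_UNMAP;
	else if (c->lbpws)
		return LBP_WS16;
	else if (c->lbpws10)
		return LBP_WS10;
	return LBP_DISABLE;
}

/* Order after the patch: the WRITE SAME variants win, UNMAP is a fallback. */
enum lbp_mode pick_mode_after(const struct lbp_caps *c)
{
	assert(c != NULL);
	if (c->lbpws)
		return LBP_WS16;
	else if (c->lbpws10)
		return LBP_WS10;
	else if (c->lbpu && c->max_unmap_blocks)
		return LBP_UNMAP;
	return LBP_DISABLE;
}

/* discard_zeroes_data after the patch: starts at 0 and is set from LBPRZ
 * only in the WS16/WS10 cases. Cases not shown in the diff are assumed
 * here to keep the initial 0; that is an assumption of this sketch. */
bool zeroes_data_after(enum lbp_mode mode, const struct lbp_caps *c)
{
	assert(c != NULL);
	if (mode == LBP_WS16 || mode == LBP_WS10)
		return c->lbprz;
	return false;
}
```

A device advertising both LBPU and LBPWS is the interesting case: pick_mode_before() returns LBP_UNMAP for it, while pick_mode_after() returns LBP_WS16, which is where the WRITE SAME(16) without NDOB concern above comes in.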

-Rafael Tinoco

On Tue, Apr 26, 2016 at 8:02 PM, Ben Hutchings wrote:
> 3.16.35-rc1 review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> From: "Martin K. Petersen"
>
> commit 7985090aa0201fa7760583f9f8e6ba41a8d4c392 upstream.
>
> The T10 SBC UNMAP command does not provide any hard guarantees that
> blocks will return zeroes on a subsequent READ. This is due to the fact
> that the device server is free to silently ignore all or parts of the
> request.
>
> The only way to ensure that a block consistently returns zeroes after
> being unmapped is to use WRITE SAME with the UNMAP bit set. Should the
> device be unable to unmap one or more blocks described by the command it
> is required to manually write zeroes to them.
>
> Until now we have preferred UNMAP over the WRITE SAME variants to
> accommodate thinly provisioned devices that predated the final SBC-3
> spec. This patch changes the heuristic so that we favor WRITE SAME(16)
> or (10) over UNMAP if these commands are marked as supported in the
> Logical Block Provisioning VPD page.
>
> The patch also disables discard_zeroes_data for devices operating in
> UNMAP mode.
>
> Signed-off-by: Martin K. Petersen
> Reviewed-by: Paolo Bonzini
> Signed-off-by: Christoph Hellwig
> Signed-off-by: Ben Hutchings
> ---
>  drivers/scsi/sd.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -627,7 +627,7 @@ static void sd_config_discard(struct scs
>         unsigned int logical_block_size = sdkp->device->sector_size;
>         unsigned int max_blocks = 0;
>
> -       q->limits.discard_zeroes_data = sdkp->lbprz;
> +       q->limits.discard_zeroes_data = 0;
>         q->limits.discard_alignment = sdkp->unmap_alignment *
>                 logical_block_size;
>         q->limits.discard_granularity =
> @@ -651,11 +651,13 @@ static void sd_config_discard(struct scs
>         case SD_LBP_WS16:
>                 max_blocks = min_not_zero(sdkp->max_ws_blocks,
>                                           (u32)SD_MAX_WS16_BLOCKS);
> +               q->limits.discard_zeroes_data = sdkp->lbprz;
>                 break;
>
>         case SD_LBP_WS10:
>                 max_blocks = min_not_zero(sdkp->max_ws_blocks,
>                                           (u32)SD_MAX_WS10_BLOCKS);
> +               q->limits.discard_zeroes_data = sdkp->lbprz;
>                 break;
>
>         case SD_LBP_ZERO:
> @@ -2572,12 +2574,12 @@ static void sd_read_block_limits(struct
>
>         } else {        /* LBP VPD page tells us what to use */
>
> -               if (sdkp->lbpu && sdkp->max_unmap_blocks)
> -                       sd_config_discard(sdkp, SD_LBP_UNMAP);
> -               else if (sdkp->lbpws)
> +               if (sdkp->lbpws)
>                         sd_config_discard(sdkp, SD_LBP_WS16);
>                 else if (sdkp->lbpws10)
>                         sd_config_discard(sdkp, SD_LBP_WS10);
> +               else if (sdkp->lbpu && sdkp->max_unmap_blocks)
> +                       sd_config_discard(sdkp, SD_LBP_UNMAP);
>                 else
>                         sd_config_discard(sdkp, SD_LBP_DISABLE);
>         }
>

-- 
Rafael David Tinoco
Canonical - Kernel & Userland Sustaining Engineer
Server Tech Lead for SEG - Manager: Brooks Warner
-
# Email: rafael.tinoco@canonical.com (GPG: 2B15B499)
# LP: ~inaddy | IRC: tinoco