Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754081AbdDMOM7 (ORCPT );
        Thu, 13 Apr 2017 10:12:59 -0400
Received: from mail-wr0-f196.google.com ([209.85.128.196]:36650
        "EHLO mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1750745AbdDMOM5 (ORCPT );
        Thu, 13 Apr 2017 10:12:57 -0400
MIME-Version: 1.0
In-Reply-To: <20170406124944.11074-1-jthumshirn@suse.de>
References: <20170406124944.11074-1-jthumshirn@suse.de>
From: Moni Shoua
Date: Thu, 13 Apr 2017 17:12:54 +0300
X-Google-Sender-Auth: 4OpadwMV0SSWUzujF7Zfje70PJc
Message-ID:
Subject: Re: [PATCH] IB/rxe: Don't clamp residual length to mtu
To: Johannes Thumshirn
Cc: Doug Ledford , Sean Hefty , Hal Rosenstock ,
        Linux Kernel Mailinglist , linux-rdma , Hannes Reinecke ,
        Sagi Grimberg , Max Gurtovoy
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3136
Lines: 68

On Thu, Apr 6, 2017 at 3:49 PM, Johannes Thumshirn wrote:
> When reading a RDMA WRITE FIRST packet we copy the DMA length from the RDMA
> header into the qp->resp.resid variable for later use. Later in check_rkey()
> we clamp it to the MTU if the packet is an RDMA WRITE packet and has a
> residual length bigger than the MTU. Later in write_data_in() we subtract the
> payload of the packet from the residual length. If the packet happens to have a
> payload of exactly the MTU size we end up with a residual length of 0 despite
> the packet not being the last in the conversation. When the next packet in the
> conversation arrives, we don't have any residual length left and thus set the QP
> into an error state.
>
> This broke NVMe over Fabrics functionality over rdma_rxe.ko
>
> The patch was verified using the following test.
>
> # echo eth0 > /sys/module/rdma_rxe/parameters/add
> # nvme connect -t rdma -a 192.168.155.101 -s 1023 -n nvmf-test
> # mkfs.xfs -fK /dev/nvme0n1
> meta-data=/dev/nvme0n1           isize=256    agcount=4, agsize=65536 blks
>          =                       sectsz=4096  attr=2, projid32bit=1
>          =                       crc=0        finobt=0, sparse=0
> data     =                       bsize=4096   blocks=262144, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> log      =internal log           bsize=4096   blocks=2560, version=2
>          =                       sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> # mount /dev/nvme0n1 /tmp/
> [  148.923263] XFS (nvme0n1): Mounting V4 Filesystem
> [  148.961196] XFS (nvme0n1): Ending clean mount
> # dd if=/dev/urandom of=test.bin bs=1M count=128
> 128+0 records in
> 128+0 records out
> 134217728 bytes (134 MB, 128 MiB) copied, 0.437991 s, 306 MB/s
> # sha256sum test.bin
> cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a  test.bin
> # cp test.bin /tmp/
> sha256sum /tmp/test.bin
> cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a  /tmp/test.bin
>
> Signed-off-by: Johannes Thumshirn
> Cc: Hannes Reinecke
> Cc: Sagi Grimberg
> Cc: Max Gurtovoy
> ---
>  drivers/infiniband/sw/rxe/rxe_resp.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
> index c9dd385..58764df 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -478,8 +478,6 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
>  				state = RESPST_ERR_LENGTH;
>  				goto err;
>  			}
> -
> -			qp->resp.resid = mtu;
>  		} else {
>  			if (pktlen != resid) {
>  				state = RESPST_ERR_LENGTH;
> --
> 2.10.2
>
> --

Thanks Johannes

Acked-by: Moni Shoua
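
For anyone following the thread, the residual-length arithmetic from the
commit message can be sketched as a small user-space model. This is only an
illustration under stated assumptions, not the driver code:
simulate_rdma_write() and its parameters are hypothetical names, and the
model assumes the sender splits the transfer into MTU-sized packets followed
by one shorter (or equal) final packet. The two length checks mirror the
branches visible in the diff context; the clamp_to_mtu flag stands in for the
removed "qp->resp.resid = mtu;" line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the responder's residual-length bookkeeping for one
 * RDMA WRITE message. Returns true if every packet passes the length checks
 * and the residual ends at zero, false if a packet would be rejected.
 */
bool simulate_rdma_write(size_t dma_len, size_t mtu, bool clamp_to_mtu)
{
	size_t resid = dma_len;      /* copied from the RDMA header on WRITE FIRST */
	size_t remaining = dma_len;  /* bytes the sender still has to transmit */

	assert(mtu > 0);             /* an MTU of zero would never make progress */

	while (remaining > 0) {
		/* Assumption: sender emits full-MTU packets, then the tail. */
		size_t pktlen = remaining < mtu ? remaining : mtu;

		if (resid > mtu) {
			/* More data expected: packet must be exactly one MTU. */
			if (pktlen != mtu)
				return false;
			if (clamp_to_mtu)
				resid = mtu; /* the line the patch removes */
		} else {
			/* Final packet: payload must match the residual. */
			if (pktlen != resid)
				return false;
		}

		/* Payload is subtracted afterwards, as in write_data_in(). */
		resid -= pktlen;
		remaining -= pktlen;
	}

	return resid == 0;
}
```

With the clamp enabled, a three-packet write such as
simulate_rdma_write(3072, 1024, true) fails on the second packet because the
residual has already dropped to zero; with the clamp disabled the same
transfer completes.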