Subject: Re: [External] Re: [PATCH] nvme-rdma: handle nvme completion data length
To: zhenwei pi
CC: Yibo Zhu
References: <20201022083850.1334880-1-pizhenwei@bytedance.com> <04a97f73-ba13-a4b5-3ea4-fc438391507e@huawei.com> <1c78dbe5-47a4-1590-e064-681cba5fb01d@bytedance.com>
From: Chao Leng
Date: Fri, 23 Oct 2020 10:07:10 +0800
In-Reply-To: <1c78dbe5-47a4-1590-e064-681cba5fb01d@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/10/22 18:05, zhenwei pi wrote:
> On 10/22/20 5:55 PM, Chao Leng wrote:
>>
>>
>> On 2020/10/22 16:38, zhenwei pi wrote:
>>> Hit a kernel warning:
>>> refcount_t: underflow; use-after-free.
>>> WARNING: CPU: 0 PID: 0 at lib/refcount.c:28
>>>
>>> RIP: 0010:refcount_warn_saturate+0xd9/0xe0
>>> Call Trace:
>>>
>>>   nvme_rdma_recv_done+0xf3/0x280 [nvme_rdma]
>>>   __ib_process_cq+0x76/0x150 [ib_core]
>>>   ...
>>>
>>> The reason is that a zero-byte message is received from the target and
>>> the host side continues processing without checking the length, so the
>>> previous CQE is processed twice.
>>>
>>> Handle the data length: ignore a zero-byte message, and try to recover
>>> in the corrupted CQE case.
>>>
>>> Signed-off-by: zhenwei pi
>>> ---
>>>   drivers/nvme/host/rdma.c | 11 +++++++++++
>>>   1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>> index 9e378d0a0c01..9f5112040d43 100644
>>> --- a/drivers/nvme/host/rdma.c
>>> +++ b/drivers/nvme/host/rdma.c
>>> @@ -1767,6 +1767,17 @@ static void nvme_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc)
>>>           return;
>>>       }
>>> +    if (unlikely(!wc->byte_len)) {
>>> +        /* zero bytes message could be ignored */
>>> +        return;

Resource leak: nvme_rdma_post_recv needs to be called here.

>>> +    } else if (unlikely(wc->byte_len < len)) {
>>> +        /* Corrupted completion, try to recover */
>>> +        dev_err(queue->ctrl->ctrl.device,
>>> +            "Unexpected nvme completion length(%d)\n", wc->byte_len);
>>> +        nvme_rdma_error_recovery(queue->ctrl);
>>> +        return;
>>> +    }
>> !wc->byte_len and wc->byte_len < len may be the same type of anomaly.
>> Why is the error handling different?
>> In which scenario is a zero-byte message received from the target: a fault injection test, or a normal test/run?
>
> A zero-byte message could be used as a transport layer keep-alive mechanism (I'm also developing target-side transport layer keep-alive now. To reclaim resources, the target side needs to close dead connections even if kato is set to 0).

The NVMe over Fabrics protocol does not define this. Maybe an async event is an option for target keep-alive (if kato is set to 0).

>
>>> +
>>>       ib_dma_sync_single_for_cpu(ibdev, qe->dma, len, DMA_FROM_DEVICE);
>>>       /*
>>>        * AEN requests are special as they don't time out and can
>>>
>
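
To make the two review points concrete (a zero-byte message may be ignored but its receive buffer still has to be reposted, and a short non-empty completion is treated as corrupted), here is a rough userspace sketch. It is a toy model under stated assumptions, not the driver code: `toy_queue`, `classify_recv` and `toy_recv_done` are hypothetical names, and the counters only stand in for what nvme_rdma_post_recv and nvme_rdma_error_recovery would do in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of inspecting one receive completion. */
enum recv_action {
    RECV_PROCESS,   /* full-size completion: handle it normally */
    RECV_REPOST,    /* zero-byte message: ignore, but repost the buffer */
    RECV_RECOVER,   /* short, non-empty completion: start error recovery */
};

/* Toy stand-in for a queue; the counters replace the real side effects. */
struct toy_queue {
    unsigned int posted_recvs;  /* receive buffers currently posted */
    unsigned int recoveries;    /* times error recovery was triggered */
    unsigned int processed;     /* completions handled normally */
};

/* Decide what to do based only on the received length. */
static enum recv_action classify_recv(uint32_t byte_len, uint32_t expected_len)
{
    assert(expected_len > 0);

    if (byte_len == 0)
        return RECV_REPOST;
    if (byte_len < expected_len)
        return RECV_RECOVER;
    return RECV_PROCESS;
}

/* Model of the completion handler: every completion consumes one buffer. */
static enum recv_action toy_recv_done(struct toy_queue *q, uint32_t byte_len,
                                      uint32_t expected_len)
{
    enum recv_action action = classify_recv(byte_len, expected_len);

    q->posted_recvs--;          /* the completion used up a posted buffer */

    switch (action) {
    case RECV_REPOST:
        q->posted_recvs++;      /* without this, the buffer would leak */
        break;
    case RECV_RECOVER:
        q->recoveries++;        /* recovery path; nothing is reposted here */
        break;
    case RECV_PROCESS:
        q->processed++;
        q->posted_recvs++;      /* normal path reposts after processing */
        break;
    }
    return action;
}
```

With a queue that starts with four posted buffers, a zero-byte completion leaves the count at four in this model; dropping the repost in the RECV_REPOST case would shrink it by one per keep-alive message, which is the leak pointed out above.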