Date: Sun, 25 Apr 2021 17:19:14 +0300
From: Dan Aloni
To: Chuck Lever
Cc: trondmy@hammerspace.com, linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: Re: [PATCH v3 15/26] xprtrdma: Do not recycle MR after FastReg/LocalInv flushes
Message-ID: <20210425141914.6govk2lm2hfosdie@gmail.com>
References: <161885481568.38598.16682844600209775665.stgit@manet.1015granger.net>
 <161885539285.38598.13978652738422395833.stgit@manet.1015granger.net>
In-Reply-To: <161885539285.38598.13978652738422395833.stgit@manet.1015granger.net>
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, Apr 19, 2021 at 02:03:12PM -0400, Chuck Lever wrote:
> Better not to touch MRs involved in a flush or post error until the
> Send and Receive Queues are drained and the transport is fully
> quiescent. Simply don't insert such MRs back onto the free list.
> They remain on mr_all and will be released when the connection is
> torn down.
>
> I had thought that recycling would prevent hardware resources from
> being tied up for a long time. However, since v5.7, a transport
> disconnect destroys the QP and other hardware-owned resources. The
> MRs get cleaned up nicely at that point.
>
> Signed-off-by: Chuck Lever

Is this a fix for the crash below? I just wonder if it appeared for
others in the wild, and the fix is not just theoretical.

    WARNING: CPU: 5 PID: 20312 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0
    list_del corruption, ffff9df150b06768->next is LIST_POISON1 (dead000000000100)
    Call Trace:
     [] dump_stack+0x19/0x1b
     [] __warn+0xd8/0x100
     [] warn_slowpath_fmt+0x5f/0x80
     [] ? kfree+0x106/0x140
     [] __list_del_entry+0x63/0xd0
     [] list_del+0xd/0x30
     [] frwr_mr_recycle+0xaf/0x150 [rpcrdma]
     [] frwr_wc_localinv+0x94/0xa0 [rpcrdma]
     [] __ib_process_cq+0x8e/0x100 [ib_core]
     [] ib_cq_poll_work+0x29/0x70 [ib_core]
     [] process_one_work+0x17f/0x440
     [] worker_thread+0x126/0x3c0
     [] ? manage_workers.isra.25+0x2a0/0x2a0
     [] kthread+0xd1/0xe0
     [] ? insert_kthread_work+0x40/0x40
     [] ret_from_fork_nospec_begin+0x21/0x21
     [] ? insert_kthread_work+0x40/0x40

-- 
Dan Aloni