From: Benjamin Coddington
To: Chuck Lever
Cc: Jeff Layton, Linux NFS Mailing List
Subject: Re: hangs during fstests testing with TLS
Date: Thu, 04 Jan 2024 09:37:23 -0500
Message-ID: <1317F466-EA4C-41F0-BF3E-E0034396C016@redhat.com>
References: <117352d5dc94d8f31bc6770e4bbb93a357982a93.camel@kernel.org> <8C3DFB5D-B967-4D59-BFC5-7B25315DB9AB@redhat.com>

On 4 Jan 2024, at 9:16, Chuck Lever wrote:

> On Thu, Jan 04, 2024 at 07:22:18AM -0500, Benjamin Coddington wrote:
>> On 3 Jan 2024, at 16:46, Chuck Lever III wrote:
>>
>>>> On Jan 3, 2024, at 3:16 PM, Benjamin Coddington wrote:
>>>>
>>>> On 3 Jan 2024, at 14:12, Chuck Lever III wrote:
>>>>
>>>>>> On Jan 3, 2024, at 1:47 PM, Benjamin Coddington wrote:
>>>>>>
>>>>>> This looks like it started out as the problem I've been sending patches to
>>>>>> fix on 6.7, latest here:
>>>>>> https://lore.kernel.org/linux-nfs/e28038fba1243f00b0dd66b7c5296a1e181645ea.1702496910.git.bcodding@redhat.com/
>>>>>>
>>>>>> .. however whenever I encounter the issue, the client reconnects the
>>>>>> transport again - so I think there might be an additional problem here.
>>>>>
>>>>> I'm looking at the same problem as you, Ben. It doesn't seem to be
>>>>> similar to what Jeff reports.
>>>>>
>>>>> But I'm wondering if jury-rigging the timeouts is the right answer
>>>>> for backchannel replies. The problem, fundamentally, is that when a
>>>>> forechannel RPC task holds the transport lock, the backchannel's reply
>>>>> transmit path thinks that means the transport connection is down and
>>>>> triggers a transport disconnect.
>>>>
>>>> Why shouldn't backchannel replies have normal timeout values?
>>>
>>> RPC Replies are "send and forget". The server forechannel sends
>>> its Replies without a timeout. There is no such thing as a
>>> retransmitted RPC Reply (though a reliable transport might
>>> retransmit portions of it, the RPC server itself is not aware of
>>> that).
>>>
>>> And I don't see anything in the client's backchannel path that
>>> makes me think there's a different protocol-level requirement
>>> in the backchannel.
>>
>> It's not strictly a protocol thing; the timeouts are used to decide what to
>> do with a req or flag the transport state even if the request doesn't make
>> it to the wire. That's why the zero timeout values for this req improperly
>> reset the transport.
>
> I guess I'm harping on this a bit because forechannel v. backchannel
> is already confusing enough. The use of timeouts for RPC Replies is
> just heaping on to that confusion.

Yeah, it's super confusing for me too. Add to this the fact that we do it
completely differently for 4.0 and 4.1, but all the variable names are
similar and/or reversed.

> If we're going to keep an explicit timeout when sending the Reply,
> it should have a little documentation. I suggest adding this to
> xprt_init_bc_request() before the new call to
> xprt_init_majortimeo():
>
> /*
>  * Backchannel Replies are sent with !RPC_TASK_SOFT and
>  * RPC_TASK_NO_RETRANS_TIMEOUT. The major timeout setting
>  * affects only how long each Reply waits to be sent when
>  * a transport connection cannot be established.
>  */
> xprt_init_majortimeo(task, req, ...

Ok, I will add that on 2/2 v4 which is getting some basic testing ATM.

Ben