Received: by 2002:ac0:b08d:0:0:0:0:0 with SMTP id l13csp1854816imc; Fri, 22 Feb 2019 12:28:57 -0800 (PST) X-Google-Smtp-Source: AHgI3Ib0i5tJFnROdC4r0ijgpbaMwScwH5qk8Qi3F10spaB9QfDym3fa5ucqczHyrjwdFVdcKeZS X-Received: by 2002:a62:b415:: with SMTP id h21mr6028286pfn.26.1550867337458; Fri, 22 Feb 2019 12:28:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1550867337; cv=none; d=google.com; s=arc-20160816; b=d72gjltDVlXwrdv3aBGRXxcsRPkEyZa5244q0ImERTh5ePrPjkuqB7FvHaQV4+q4Pl Y/NqeJ2H983x9E8m5Qpe7H6+TiaywkWGrUh/IUaBKC6F4ithlg1srZb2OuicShjjaczC gXBqhIxg8DeUKYeRblBVyxMd70L/9rTyWQiqDRBH03mhBpvCmykOti5K5NWNdoEvaOX1 CSArK3EY0jqmaoI/FHYNu9BMw3mNuv7+/H80c9oMeMPpZgmx750q4aLsEGE6iMa/6DQ9 Jo/4BiujYtmy5wZ1JJmSsOs99P1oEzKNKLTf1eyknAxGap3nG9GKuwwx2JRGtGxWzrDJ SB9A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=RCqN1N0FBSQRGtaeFZLT4htm7lDwpchLyqK9QVXv6zM=; b=aHuHq/wRjPkyDepeTSkw/rkXpC8MTde15y17ilLWT2BA6OcxKLdGeODn1D33S6Nrvs 1P2j3b5OegGK+x7kVUnwo4/s2zJNCuG030c6BHhnRJHqoy6VXkmBEnAWJQXyoQL3oUHY 1jO29oJ5zWAo2hobU3QFgwt3+D2tXrKK1MOJ7T4weiaDqSaJbdFK+vKx5h7pwSTjPUBs FTTads+ACJ/Ao/UtDX3v0yDVzmdn7VFOfG8TWuhEnA1oqJoKiziJyiQno5qbBykZDYTa UDS/Pt0SlknNkRILVSPvo8LcnrvN2aXVW3ndd1zwskDFQiV+QUFzwRGqqC1Ff46cRLef //PA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x9si2104396pgx.386.2019.02.22.12.28.40; Fri, 22 Feb 2019 12:28:57 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727065AbfBVU2P (ORCPT + 99 others); Fri, 22 Feb 2019 15:28:15 -0500 Received: from nautica.notk.org ([91.121.71.147]:39960 "EHLO nautica.notk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726287AbfBVU2O (ORCPT ); Fri, 22 Feb 2019 15:28:14 -0500 Received: by nautica.notk.org (Postfix, from userid 1001) id A9889C009; Fri, 22 Feb 2019 21:28:09 +0100 (CET) Date: Fri, 22 Feb 2019 21:27:54 +0100 From: Dominique Martinet To: Tom Herbert Cc: Tom Herbert , David Miller , Doron Roberts-Kedes , Dave Watson , Linux Kernel Network Developers , LKML Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages Message-ID: <20190222202754.GA20806@nautica> References: <20190215015705.GA17974@nautica> <20190215033102.GA3099@nautica> <20190215045214.GA13123@nautica> <20190220041151.GA13520@nautica> <20190221082209.GA32719@nautica> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Tom Herbert wrote on Fri, Feb 22, 2019: > > > So basically it sounds like you're interested in supporting TCP > > > connections that are half closed. I believe that the error in half > > > closed is EPIPE, so if the TCP socket returns that it can be ignored > > > and the socket can continue being attached and used to send data. > > > > Did you mean 'can continue being attached and used to receive data'? > > No, I meant shutdown on receive side when FIN is receved. TX is still > allowed to drain an queued bytes. To support shutdown on the TX side > would require additional logic since we need to effectively detach the > transmit path but retain the receive path. I'm not sure this is a > compelling use case to support. Hm, it must be a matter of how we see thing but from what I understand it's exactly the other way around. The remote closed the connection, so trying to send anything would just yield a RST, so TX doesn't make sense. On the other hand, anything that had been sent by the remote before the FIN and is on the local side's memory should still be receivable. When you think about it as a TCP stream it's really weird: data coming, data coming, data coming, FIN received. But in the networking stack that received FIN short-circuits all the data that was left around and immediately raises an EPIPE error. I don't see what makes this FIN packet so great that it should be processed before the data; we should only see that EPIPE when we're done reading the data before it or trying to send something. I'll check tomorrow/next week but I'm pretty sure the packets before that have been ack'd at a tcp level as well, so losing them in the application level is really unexpected. > > I can confirm getsockopt with SO_ERROR gets me EPIPE, but I don't see > > how to efficiently ignore EPIPE until POLLIN gets unset -- polling on > > both the csock and kcm socket will do many needless wakeups on only the > > csock from what I can see, so I'd need some holdoff timer or something. > > I guess it's possible though. > > We might need to clear the error somehow. May a read of zero bytes? Can try. > > After a bit more debugging, this part works (__strp_recv() is called > > again); but the next packet that is treated properly is rejected because > > by the time __strp_recv() was called again a new skb was read and the > > length isn't large enough to go all the way into the new packet, so this > > test fails: > > } else if (len <= (ssize_t)head->len - > > skb->len - stm->strp.offset) { > > /* Length must be into new skb (and also > > * greater than zero) > > */ > > STRP_STATS_INCR(strp->stats.bad_hdr_len); > > strp_parser_err(strp, -EPROTO, desc); > > > > So I need to figure a way to say "call this function again without > > reading more data" somehow, or make this check more lax e.g. accept any > > len > 0 after a retry maybe... > > Removing that branch altogether seems to work at least but I'm not sure > > we'd want to? > > I like the check since it's conservative and covers the normal case. > Maybe just need some more logic? I can add a "retrying" state and not fail here if we ewre retrying for whatever reason perhaps... But I'm starting to wonder how this would work if my client didn't keep on sending data, I'll try to fail on the last client's packet and see how __strp_recv is called again through the timer, with the same skb perhaps? -- Dominique