Received: by 2002:ac0:b08d:0:0:0:0:0 with SMTP id l13csp1880034imc; Fri, 22 Feb 2019 13:02:38 -0800 (PST) X-Google-Smtp-Source: AHgI3Ib34Zks4wcPlvIvf4jwyi/4qZ4tt5nQbGDM6Ejoq7WhF/fM2YFmS292lUIPmpYLxEXTVw9k X-Received: by 2002:a17:902:2a47:: with SMTP id i65mr6176342plb.237.1550869358694; Fri, 22 Feb 2019 13:02:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1550869358; cv=none; d=google.com; s=arc-20160816; b=HMeyXtq7ZRRlqRPPfyZfLwm9jIpaCtPxLd2JtharNIQtOXiJEQ+zUJzRMbDEo2vkG5 PS9c9YHZjvAh1+Sol/n6q1NUqOtm165YSDcXGelNES28UchWzHiw4AZKSdgISkSqrtj9 EAa6neS4Wsv7e12GBRxJofy+mWVpYlmwQ0fk6rTUJJ+q9XtbbPsUPpyo3wm+l5MyBW3+ CUhZgBca54TsbZ4vq9KAfj+RZIz9KhjQEX5HIuWLfxBI9gXyuouooE3It7Lsm3S9NhKw DjHQ5jl8MtJ7E8GIovX7uGh9aXb5jG/vSZIWhcM4PYdaWDsmEZMtLeFQ+ho1nxgitFNJ PzrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=Lb/fVcrySkv3EtiT8/Enj2b41YW5wcL7v3RLqGTBTo4=; b=IUqzHcuT1u8UShGVIcPJ4yplGqOX6c5AtYIJ8x5I7jCOMXTKtg7EJY9LE6KuZYKpTM LcjGod0gJZ1H3nCIRZVLmBB6BnoCkg6skYWleHe95JvboQC8Lg4AIbjMv91KYqGbx8QN pobssjYrce8/vp9P7fPT4ME3SPlJHx+GDCONOz2nOBPpvO9gdqGwQj1Nqn8x4rjI9vKL kqplJZpIghjHBFMBaydRts4Pr2oeiZ6ymRO6HBaEywxGceWkQO+MM5x9xqF8YEuMscOi fMqmjdbEzosI7N1aQHdcs3HiYWW6yJxTZ/3W00hovfFh1aRvNKd4oBII66G5eoLOdtVI G5pw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@quantonium-net.20150623.gappssmtp.com header.s=20150623 header.b=Zb8HDbeA; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q3si1743588pfh.235.2019.02.22.13.02.22; Fri, 22 Feb 2019 13:02:38 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@quantonium-net.20150623.gappssmtp.com header.s=20150623 header.b=Zb8HDbeA; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726443AbfBVVCB (ORCPT + 99 others); Fri, 22 Feb 2019 16:02:01 -0500 Received: from mail-wr1-f46.google.com ([209.85.221.46]:33234 "EHLO mail-wr1-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725832AbfBVVCA (ORCPT ); Fri, 22 Feb 2019 16:02:00 -0500 Received: by mail-wr1-f46.google.com with SMTP id i12so3821501wrw.0 for ; Fri, 22 Feb 2019 13:01:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quantonium-net.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Lb/fVcrySkv3EtiT8/Enj2b41YW5wcL7v3RLqGTBTo4=; b=Zb8HDbeASuK6qrlfBdrGmuXii5UZgSGw5hE3qCf8TTSj5R+LaEJ0p2hBvT/AqUEgZp Mhby0sWMKgeugrLbQjW6jESOJnR1/JlMJzACOQfnnrp/uxiJ/7Jl8gpXZIeCZstP19zL PudwUU4XJQ/PpCz3465JF8rqQgX7n+rkgNlcXKUpR88JebTR9eZA/bbWsLgyqZVnDzLO OuORL7xc+N2wETtYBeWVFUjtskyY+T7TTT13NAZ4ZO2imJWL0bjX8bc8YL15z9LbZrzn wfq6vhG+ZSYA7JbOKAgFfgMDuXvZxk00lcUotZuNkjzyFJlVupYl4L4CCvWXcYR34y55 AemA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Lb/fVcrySkv3EtiT8/Enj2b41YW5wcL7v3RLqGTBTo4=; b=Jxb8+DMMz1UgUFKAfvC4V4HwCwlNBTffqNOIQaW6ajXvz2MS7kqMXMOIV7M95afxRn vDJ8DE9FlUz1jezzCaY3jZtVX2XKYQqjOk32P13B+aJdwIX8tW3u948KeHOGFhCZblK+ ea07yd7tbaWwQUugl4SqC5TQ0WKEp5D5sYh4yrPgTk8bsJZqM73PD7afU0aEW1vOb1xI Q2dE1tHUTjruDPdu+jqjtJRNtV6ME4MZsrgiuUzcd0drxng0D97drMMjRYn4xdCaSWPc GUL80pHXYFTgOBos5OzKze7svfYNHJ7xA7xKEFR5tMdAPXjCCHviOTRTgNEuOOKT4Wks QCug== X-Gm-Message-State: AHQUAua7SGsnbk7E7teSP5z09Gh292WmgWIvnw87Tf03cQqvy5u+uO9k LBMgjVU7RPjhAZz0vRG/B0O+yWrZMhUqLQsTq4FiHw== X-Received: by 2002:a5d:4702:: with SMTP id y2mr4675791wrq.149.1550869318331; Fri, 22 Feb 2019 13:01:58 -0800 (PST) MIME-Version: 1.0 References: <20190215015705.GA17974@nautica> <20190215033102.GA3099@nautica> <20190215045214.GA13123@nautica> <20190220041151.GA13520@nautica> <20190221082209.GA32719@nautica> <20190222202754.GA20806@nautica> In-Reply-To: <20190222202754.GA20806@nautica> From: Tom Herbert Date: Fri, 22 Feb 2019 13:01:47 -0800 Message-ID: Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages To: Dominique Martinet Cc: Tom Herbert , David Miller , Doron Roberts-Kedes , Dave Watson , Linux Kernel Network Developers , LKML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 22, 2019 at 12:28 PM Dominique Martinet wrote: > > Tom Herbert wrote on Fri, Feb 22, 2019: > > > > So basically it sounds like you're interested in supporting TCP > > > > connections that are half closed. I believe that the error in half > > > > closed is EPIPE, so if the TCP socket returns that it can be ignored > > > > and the socket can continue being attached and used to send data. > > > > > > Did you mean 'can continue being attached and used to receive data'? > > > > No, I meant shutdown on receive side when FIN is receved. TX is still > > allowed to drain an queued bytes. To support shutdown on the TX side > > would require additional logic since we need to effectively detach the > > transmit path but retain the receive path. I'm not sure this is a > > compelling use case to support. > Dominque, > Hm, it must be a matter of how we see thing but from what I understand > it's exactly the other way around. The remote closed the connection, so > trying to send anything would just yield a RST, so TX doesn't make > sense. > On the other hand, anything that had been sent by the remote before the > FIN and is on the local side's memory should still be receivable. That's right, any data sent before the FIN can be received. After the FIN, there is not more data to received. But, the FIN doesn't say anything about the reverse path. For instance, if the peer did shutdown(SHUT_WR), then it sent the FIN so it no longer transmits on the connection, but still can receive data. > > When you think about it as a TCP stream it's really weird: data coming, > data coming, data coming, FIN received. > But in the networking stack that received FIN short-circuits all the > data that was left around and immediately raises an EPIPE error. > Right, it doesn't help that most people automatically assume a received FIN means the connection is closed! (although in practice that assumption works pretty well for most applications). > I don't see what makes this FIN packet so great that it should be > processed before the data; we should only see that EPIPE when we're > done reading the data before it or trying to send something. It's not supposed to be. If your seeing a problem, it might because state_change is processed before the received data is drained. If plain recvmsg were being used, zero bytes is on returned after all outstanding received data on the socket has been read. We might need some additional logic in KCM to handle this. > > I'll check tomorrow/next week but I'm pretty sure the packets before > that have been ack'd at a tcp level as well, so losing them in the > application level is really unexpected. > > > > I can confirm getsockopt with SO_ERROR gets me EPIPE, but I don't see > > > how to efficiently ignore EPIPE until POLLIN gets unset -- polling on > > > both the csock and kcm socket will do many needless wakeups on only the > > > csock from what I can see, so I'd need some holdoff timer or something. > > > I guess it's possible though. > > > > We might need to clear the error somehow. May a read of zero bytes? > > Can try. > > > > After a bit more debugging, this part works (__strp_recv() is called > > > again); but the next packet that is treated properly is rejected because > > > by the time __strp_recv() was called again a new skb was read and the > > > length isn't large enough to go all the way into the new packet, so this > > > test fails: > > > } else if (len <= (ssize_t)head->len - > > > skb->len - stm->strp.offset) { > > > /* Length must be into new skb (and also > > > * greater than zero) > > > */ > > > STRP_STATS_INCR(strp->stats.bad_hdr_len); > > > strp_parser_err(strp, -EPROTO, desc); > > > > > > So I need to figure a way to say "call this function again without > > > reading more data" somehow, or make this check more lax e.g. accept any > > > len > 0 after a retry maybe... > > > Removing that branch altogether seems to work at least but I'm not sure > > > we'd want to? > > > > I like the check since it's conservative and covers the normal case. > > Maybe just need some more logic? > > I can add a "retrying" state and not fail here if we ewre retrying for > whatever reason perhaps... > But I'm starting to wonder how this would work if my client didn't keep > on sending data, I'll try to fail on the last client's packet and see > how __strp_recv is called again through the timer, with the same skb > perhaps? Right, a timer or some sort of aynchronous callbask is needed. Like I said, I don't think dealing with memory failure like this is unique to KCM. Thanks again for looking into this, I think this is good progress! Tom > > -- > Dominique