In-Reply-To: <57252918.7070302@candelatech.com>
References: <5720E1F0.9010203@candelatech.com> <1461780469.5102.0.camel@decadent.org.uk> <1461801603.3971874.591751457.2DB91B98@webmail.messagingengine.com> <572155F4.10405@candelatech.com> <20160428102953.GA7656@bistromath.localdomain> <1462041181.17662.3.camel@decadent.org.uk> <57250A17.5090804@candelatech.com> <57251CB3.1040504@candelatech.com> <572523C4.4080307@candelatech.com> <57252918.7070302@candelatech.com>
From: Vijay Pandurangan
Date: Sat, 30 Apr 2016 18:01:29 -0400
Subject: Re: [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
To: Ben Greear
Cc: Tom Herbert, Ben Hutchings, Sabrina Dubroca, Hannes Frederic Sowa, LKML, stable@vger.kernel.org, akpm@linux-foundation.org, "David S. Miller", Cong Wang, Linux Kernel Network Developers, Evan Jones, Nicolas Dichtel, Phil Sutter, Toshiaki Makita, Cong Wang
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Apr 30, 2016 at 5:52 PM, Ben Greear wrote:
>>
>> Good point, so if you had:
>>
>> eth0 <-> raw <-> user space-bridge <-> raw <-> vethA <-> veth B <->
>> userspace-stub <->eth1
>>
>> and user-space hub enabled this elide flag, things would work, right?
>> Then, it seems like what we need is a way to tell the kernel
>> router/bridge logic to follow elide signals in packets coming from
>> veth.
>> I'm not sure what the best way to do this is because I'm less
>> familiar with conventions in that part of the kernel, but assuming
>> there's a way to do this, would it be acceptable?
>
>
> You cannot receive on one veth without transmitting on the other, so
> I think the elide csum logic can go on the raw-socket, and apply to packets
> in the transmit-from-user-space direction. Just allowing the socket to make
> the veth behave like it used to before this patch in question should be good
> enough, since that worked for us for years. So, just an option to modify the
> ip_summed for pkts sent on a socket is probably sufficient.

I don't think this is right. Consider:

- App A sends out corrupt packets 50% of the time and discards inbound data.
- App B doesn't care about corrupt packets and is happy to receive
  them and has some way of dealing with them (special case)
- App C is a regular app, say nc or something.

In your world, where A decides what happens to data it transmits, then
A<--veth-->B and A<---wire-->B will have the same behaviour

but

A<-- veth --> C and A<-- wire --> C will have _different_ behaviour: C
will behave incorrectly if it's connected over veth but correctly if
connected with a wire. That is a bug.

Since A cannot know what the app it's talking to will desire, I argue
that both sides of a message must be opted in to this optimization.

>
>>> There may be no sockets on the vethB port. And reader/writer is not
>>> a good way to look at it since I am implementing a bi-directional bridge
>>> in user-space and each packet-socket is for both rx and tx.
>>
>>
>> Sure, but we could model a bidirectional connection as two
>> unidirectional sockets for our discussions here, right?
>
>
> Best not to I think, you want to make sure that one socket can
> correctly handle tx and rx. As long as that works, then using
> uni-directional sockets should work too.
>
>
> Thanks,
> Ben
>
> --
> Ben Greear
> Candela Technologies Inc http://www.candelatech.com
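
In case it helps, the A/B/C argument above can be written down as a toy model. This is only a sketch, not kernel code: the function name, its parameters and the two policy names are all hypothetical, and "receiver accepts corrupt packets" stands in for a receiver that has opted in to skipping checksum verification.

```python
def delivered(link, checksum_ok, sender_elides, receiver_accepts_corrupt, policy):
    """Toy model: does the receiving app see this packet?

    link:   "wire" or "veth"
    policy: "sender_only" (the sender's flag alone skips verification on veth)
            or "both_sides" (sender and receiver must both opt in)
    All names are hypothetical; this models the argument, not the kernel.
    """
    if checksum_ok or receiver_accepts_corrupt:
        # Good packets always arrive; an app like B takes corrupt ones too.
        return True
    if link == "wire":
        # A regular receiver on a wire never sees a corrupt packet.
        return False
    if policy == "sender_only":
        # Verification is skipped purely on the sender's say-so.
        return sender_elides
    # both_sides: the receiver did not opt in (handled above), so verify.
    return False


# A sends a corrupt packet with the elide flag set; compare veth against wire.
for policy in ("sender_only", "both_sides"):
    for name, accepts in (("B", True), ("C", False)):
        veth = delivered("veth", False, True, accepts, policy)
        wire = delivered("wire", False, True, accepts, policy)
        print(policy, name, "veth:", veth, "wire:", wire, "consistent:", veth == wire)
```

Under these assumptions the only inconsistent row is C with the sender-only policy: C sees the corrupt packet over veth but not over a wire. With the both-sides policy, veth and wire agree for both B and C.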