From: Cong Wang
To: Vijay Pandurangan
Cc: Evan Jones, Nicolas Dichtel, Phil Sutter, Toshiaki Makita, Linux Kernel Network Developers, LKML
Date: Sat, 19 Dec 2015 13:41:18 -0800
Subject: Re: [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
In-Reply-To: <1450467299-7188-1-git-send-email-vijayp@vijayp.ca>

On Fri, Dec 18, 2015 at 11:34 AM, Vijay Pandurangan wrote:
> Packets that arrive from real hardware devices have ip_summed ==
> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
> current version of veth will replace CHECKSUM_NONE with
> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
> a veth device to be delivered to the application. This caused applications
> at Twitter to receive corrupt data when network hardware was corrupting
> packets.
>
> We believe this was added as an optimization to skip computing and
> verifying checksums for communication between containers. However, locally
> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
> written does nothing for them.
> As far as we can tell, after removing this
> code, these packets are transmitted from one stack to another unmodified
> (tcpdump shows invalid checksums on both sides, as expected), and they are
> delivered correctly to applications. We didn’t test every possible network
> configuration, but we tried a few common ones such as bridging containers,
> using NAT between the host and a container, and routing from hardware
> devices to containers. We have effectively deployed this in production at
> Twitter (by disabling RX checksum offloading on veth devices).
>
> This code dates back to the first version of the driver, commit
> ("[NET]: Virtual ethernet device driver"), so I
> suspect this bug occurred mostly because the driver API has evolved
> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
> packet checksumming") (in December 2010) fixed this for packets that get
> created locally and sent to hardware devices, by not changing
> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
> in from hardware devices.
>
> Co-authored-by: Evan Jones
> Signed-off-by: Evan Jones
> Cc: Nicolas Dichtel
> Cc: Phil Sutter
> Cc: Toshiaki Makita
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Vijay Pandurangan

Acked-by: Cong Wang
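
For readers following the thread, the ip_summed handling described in the quoted patch can be modelled in user space. This is only a sketch of the behaviour as the description states it, not the driver source: the enum, the function names and the rx_csum_offload parameter are hypothetical stand-ins (the parameter assumes the rewrite is tied to RX checksum offload, which the description's workaround suggests), and CHECKSUM_COMPLETE is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's ip_summed states. The MODEL_
 * prefix marks them as illustration only, not the kernel macros. */
enum model_csum {
    MODEL_CSUM_NONE,        /* not verified, or hardware could not vouch for it */
    MODEL_CSUM_UNNECESSARY, /* already verified; the stack may skip the check */
    MODEL_CSUM_PARTIAL      /* locally generated; checksum not yet filled in */
};

/* Behaviour the patch description attributes to the current veth code:
 * NONE is rewritten to UNNECESSARY, every other state is left alone.
 * Assumption: the rewrite only happens with RX checksum offload enabled. */
static enum model_csum veth_forward_before(enum model_csum in,
                                           bool rx_csum_offload)
{
    if (in == MODEL_CSUM_NONE && rx_csum_offload)
        return MODEL_CSUM_UNNECESSARY;
    return in;
}

/* Behaviour after the patch: the state passes through unmodified. */
static enum model_csum veth_forward_after(enum model_csum in)
{
    return in;
}

/* Simplified receiver: software verification happens only for NONE. */
static bool receiver_verifies(enum model_csum state)
{
    return state == MODEL_CSUM_NONE;
}

/* Walks through the cases discussed in the patch description. */
void model_selfcheck(void)
{
    /* Hardware-originated packet that nobody verified: before the patch
     * the receiver is told to skip the check, so corruption gets through. */
    assert(!receiver_verifies(veth_forward_before(MODEL_CSUM_NONE, true)));

    /* After the patch the receiver verifies it in software. */
    assert(receiver_verifies(veth_forward_after(MODEL_CSUM_NONE)));

    /* Locally generated packets were never touched by the old code. */
    assert(veth_forward_before(MODEL_CSUM_PARTIAL, true) == MODEL_CSUM_PARTIAL);

    /* With the offload flag off, the old code also leaves NONE alone,
     * which matches the production workaround mentioned above. */
    assert(veth_forward_before(MODEL_CSUM_NONE, false) == MODEL_CSUM_NONE);
}
```

Under these assumptions the only case that changes is the CHECKSUM_NONE one, which is the point of the patch: container-to-container traffic (CHECKSUM_PARTIAL) is unaffected, while packets arriving unverified from hardware get checked again.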