Subject: Re: [PATCH net] r8152: Fix broken RX checksums.
To: David Miller
Cc: nic_swsd@realtek.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
From: Mark Lord
Date: Sun, 30 Oct 2016 22:07:25 -0400
Message-ID: <1f847ae0-4928-01e7-f1e7-3cbc37529961@pobox.com>
In-Reply-To: <20161030.205755.1198665157526465556.davem@davemloft.net>
References: <987cfbab-2b48-e28c-1706-967cb2051d63@pobox.com> <9fb6be7b-95f3-6e59-c0f4-1d6c3357416d@pobox.com> <20161030.205755.1198665157526465556.davem@davemloft.net>

On 16-10-30 08:57 PM, David Miller wrote:
> From: Mark Lord
> Date: Sun, 30 Oct 2016 19:28:27 -0400
>
>> The r8152 driver has been broken since (approx) 3.16.xx
>> when support was added for hardware RX checksums
>> on newer chip versions.  Symptoms include random
>> segfaults and silent data corruption over NFS.
>>
>> The hardware checksum logic does not work on the VER_02
>> dongles I have here when used with a slow embedded system CPU.
>> Google reveals others reporting similar issues on Raspberry Pi.
>>
>> So, disable hardware RX checksum support for VER_02, and fix
>> an obvious coding error for IPV6 checksums in the same function.
>>
>> Because this bug results in silent data corruption,
>> it is a good candidate for back-porting to -stable >= 3.16.xx.
>>
>> Signed-off-by: Mark Lord
>
> Applied and queued up for -stable, thanks.

Thanks.  Now that this is taken care of, I do wonder if perhaps
RX checksums ought to be enabled at all for ANY versions of this chip?

My theory is that the checksums probably work okay most of the time,
except when the hardware RX buffer overflows.

In my case, and in the case of the Raspberry Pi, the receiving CPU
is quite a bit slower than mainstream x86, so it can quite easily
fall behind in emptying the RX buffer on the chip.
The only indication this has happened may be an incorrect RX checksum.

This is only a theory, but I otherwise have trouble explaining
why we are seeing invalid RX checksums -- direct cable connections
to a switch, shared only with the NFS server.  No reason for it
to have bad RX checksums in the first place.

Should we just blanket disable RX checksums for all versions here
unless proven otherwise/safe?

Anyone out there know better?

Cheers
Mark
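
For illustration only, here is a rough userspace sketch of the per-version
policy being discussed.  Every name in it (chip_version, rx_status,
hw_rx_csum_trusted, rx_csum) is hypothetical and is not taken from the
r8152 driver.  It only models the decision: for a chip version whose
hardware RX checksum is not trusted, report "no checksum information" so
the network stack verifies the packet in software, which is what
CHECKSUM_NONE means for a received skb in the kernel.

```c
#include <stdbool.h>

/* Hypothetical identifiers throughout; not the driver's real ones. */
enum chip_version { CHIP_VER_01, CHIP_VER_02, CHIP_VER_03 };

/* Mirrors the idea of CHECKSUM_NONE / CHECKSUM_UNNECESSARY on receive. */
enum csum_result {
	CSUM_NONE,        /* stack must verify the checksum in software */
	CSUM_UNNECESSARY  /* hardware result is trusted, skip software check */
};

/* Assumed, simplified view of what an RX descriptor reports. */
struct rx_status {
	bool is_tcp_or_udp;   /* hardware parsed a TCP or UDP payload */
	bool hw_csum_failed;  /* hardware flagged a checksum error */
};

/*
 * Per-version trust policy.  Only VER_02 is excluded here, matching the
 * patch description; a blanket disable would simply return false.
 */
static bool hw_rx_csum_trusted(enum chip_version ver)
{
	return ver != CHIP_VER_02;
}

static enum csum_result rx_csum(enum chip_version ver,
				const struct rx_status *st)
{
	if (!hw_rx_csum_trusted(ver))
		return CSUM_NONE;
	if (!st->is_tcp_or_udp || st->hw_csum_failed)
		return CSUM_NONE;
	return CSUM_UNNECESSARY;
}
```

With this sketch, a VER_02 packet always comes back as CSUM_NONE no matter
what the hardware reported, while other versions still rely on the
hardware flags.  The cost of widening the exclusion is extra CPU time for
software verification, which matters most on exactly the slow systems
where the problem shows up.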