Subject: Re: [PATCH net 2/2] r8152: rx descriptor check
From: Mark Lord
To: David Miller, hayeswang@realtek.com
Cc: netdev@vger.kernel.org, nic_swsd@realtek.com, linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org
Date: Sun, 13 Nov 2016 15:34:05 -0500
In-Reply-To: <20161113.123954.2134945576362221851.davem@davemloft.net>

On 16-11-13 12:39 PM, David Miller wrote:
> From: Hayes Wang
> Date: Fri, 11 Nov 2016 15:15:41 +0800
>
>> For some platforms, the data in memory is not the same with the one
>> from the device. That is, the data of memory is unbelievable. The
>> check is used to find out this situation.
>>
>> Signed-off-by: Hayes Wang
>
> I'm all for adding consistency checks, but I disagree with proceeding
> in this manner for this.
>
> If you add this patch now, there is a much smaller likelihood that you
> will work with a high priority to figure out _why_ this is happening.
>
> For all we know this could be a platform bug in the DMA API for the
> systems in question.
>
> It could also be a bug elsewhere in the driver, either in setting up
> the descriptor DMA mappings or how the chip is programmed.
>
> Either way the true cause must be found before we start throwing
> changes like this into the driver.

I agree.

The system I use it with is a 32-bit ppc476, with non-coherent RAM,
and using 16KB page sizes.

The dongle instantly becomes a lot more reliable when r8152.c
is updated to use usb_alloc_coherent() for URB buffers,
rather than kmalloc().

Not sure why that would be though, as the USB stack normally would
handle kmalloc'd buffers just fine. It is calling the appropriate
routines, which boil down to invalidating the dcache lines (for inbound
bulk xfers) as part of usb_submit_urb(), and yet the problem there
persists.

It could be caused by cache-line sharing with other allocations,
but that seems unlikely as the kmalloc() size is 16384 bytes per buffer.

Perhaps the driver is somehow accessing the buffer space again
after doing usb_submit_urb()? That would certainly produce
this kind of behaviour.

Or maybe there's just a memory barrier missing somewhere in the path.

The really weird thing is that ASIX-based dongles (which use a different
driver) don't have this problem, and yet they also use kmalloc'd buffers.

I have access to the test system only for a day or two a week,
and it takes a few hours to do a good test as to whether something
helps or not.

I'll continue to poke at it as time and New Ideas permit.
New Ideas welcome!

-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com
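
On the cache-line sharing point above, the check can be sketched in a few
lines of plain C. This is only an illustration, not code from r8152.c or
the USB core: buf_shares_cache_line() is a hypothetical helper, and the
cache line size is passed in because it is an assumption that has to be
confirmed for the platform. A buffer can share a line with a neighbouring
allocation only if its start or end falls inside a line, so a 16384-byte
buffer is safe only if the allocator also returns it aligned to at least
the line size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of r8152.c or the USB core): returns true
 * if the buffer [addr, addr + size) only partially covers its first or
 * last cache line, i.e. if it could share a line with a neighbouring
 * allocation. line_size is an assumption supplied by the caller and must
 * be a power of two.
 */
static bool buf_shares_cache_line(uintptr_t addr, size_t size, size_t line_size)
{
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

	uintptr_t mask = (uintptr_t)line_size - 1;

	/* Start or end off a line boundary means the edge line is shared. */
	return (addr & mask) != 0 || ((addr + size) & mask) != 0;
}
```

Calling it with the buffer address, a size of 16384, and the platform's
line size shows the size alone does not settle the question: the result
depends just as much on the alignment of the start address.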