Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp1592781ybf; Thu, 27 Feb 2020 14:08:09 -0800 (PST) X-Google-Smtp-Source: APXvYqwo3vfTdULTNSRBpomLpH/LtTpy9HIqHbAtmJWx/RnG4WvVAJkX/m7Tnxoq968GPbxWEkYP X-Received: by 2002:a9d:4c09:: with SMTP id l9mr893305otf.180.1582841289277; Thu, 27 Feb 2020 14:08:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582841289; cv=none; d=google.com; s=arc-20160816; b=yPqMZjp/RUSfHDv83fOaPd6GREBBhJj53Zj7Ox/aWDDQL/POkc6C4GGvLXOgQ10RJf 47tBfqfm0m4qMZZxEzzkvGsXX+1eJ/facEAfzolvc6OdcIzkREwoFOvRmDy/3WH2J/JO p7CR/BJgglRDciwTEbPsXmZdvNggQXYkXSGJeLi7T+QgBhxJrd7WyQvmNz5lvNE9P/xX UgJPnzIT/kwr23sUqvFZ0x6S9MDprtgdTi5E66lJOgP/QRu4hkiM1mi+aW1Ou9yFVXLa Zsmida/IrFHRDIDvniGezPU+LCeTx0aTTBDzTdRG08fL0bLBzFzN6Hkrek7BswOWQ5Hk Jc5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=h1JBtpGGQqknlPjXxRS4o0KE0TNrNFTSYULOaY5i3mE=; b=OPRj/vnlVYU4XABYJf/mrzBsRCZ/F42Jlx6sYauk3wFQL3qF09dViAuIlnG9cQqFxG Qh4/68Att3HOnXF6VZTku+bLxNIpQe1ARO5eb8SKcWiRPg4h7Qnnb0Ohm1W3n97ARNa1 pCYzFqjkZ9npe5ptKFd9iMEqGR26aYD4Ix/A0U+vuhsfXCtsDN0dMv3oiPLbTWQLmqut M2kNrB2vFaVggwRgH72DzEmHT//E3uykZ7nS7Zb5Ht9dH+GBA2ghRqEH259TNAxIuzWK Arbd6pTa6v0a5F6WGylw68TwsRMMmbUVYUNBGVK2+uxcfqTdEvVbcMT+6gfVaaE6mAw6 62BQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=FT9JV2RA; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d6si303531ote.72.2020.02.27.14.07.56; Thu, 27 Feb 2020 14:08:09 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=FT9JV2RA; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730043AbgB0WGl (ORCPT + 99 others); Thu, 27 Feb 2020 17:06:41 -0500 Received: from mail-vs1-f67.google.com ([209.85.217.67]:46227 "EHLO mail-vs1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729932AbgB0WGk (ORCPT ); Thu, 27 Feb 2020 17:06:40 -0500 Received: by mail-vs1-f67.google.com with SMTP id t12so676585vso.13 for ; Thu, 27 Feb 2020 14:06:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=h1JBtpGGQqknlPjXxRS4o0KE0TNrNFTSYULOaY5i3mE=; b=FT9JV2RABj0iYNv516V9CYCqmtUq6csZgKudwouFIKmAzTKQj0fmK1o3zI4sZU1QUO I7UCVqt6pHmTITu710pWwrqzoIUlkHmDA/vyDYIrhuU5xIpPqfzxC72FPusuImyd1G86 UUXlRESkMxaMIn4eEUSs5e6YgY/yJ90UZIbdg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=h1JBtpGGQqknlPjXxRS4o0KE0TNrNFTSYULOaY5i3mE=; b=D02QQ9DYC64ncUPQdduey3B52D+7TcxpaAn8RwFfHBI4IxjU34iUqFsKCZoo+mtl0B nxTkFh6Yww4GdwLNHn5tOaStkSiFDH3faVMxyQK4kpNeSMpPKCbRbqKlvdM/kfM1NpND lMrc42WYaRXRKAmwg+soOPV+CjXg9aAsIgoXEXltprEbVicorxQ9SMjsOgzf3/i/HPgH 7x4Vn8U+xwtHSP9bua+0ZpLMcGemz5XgUZ7V2TIf4RICvfb3RNaBtmw5eeKagRVo9AAy 0vzBh6t4fZ+ubUVjTdhYSewlZfF5ZdVdJNurTIVly8tTNA5vKQvZGWe0cA+EbxY2wxnH h3AQ== X-Gm-Message-State: ANhLgQ2jHXh7e0J6gzm51S8Dp4ic4KS8ZA6zkZwnGA6jOlYsjzg5jNfi AvhSeyv0JBL1VNK4T5ZHHxwr7A3fxNM= X-Received: by 2002:a67:f755:: with SMTP id w21mr933726vso.107.1582841199028; Thu, 27 Feb 2020 14:06:39 -0800 (PST) Received: from mail-vs1-f42.google.com (mail-vs1-f42.google.com. [209.85.217.42]) by smtp.gmail.com with ESMTPSA id c10sm2416477vkm.29.2020.02.27.14.06.37 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 27 Feb 2020 14:06:37 -0800 (PST) Received: by mail-vs1-f42.google.com with SMTP id p6so679291vsj.11 for ; Thu, 27 Feb 2020 14:06:37 -0800 (PST) X-Received: by 2002:a67:fa1a:: with SMTP id i26mr1029918vsq.169.1582841196490; Thu, 27 Feb 2020 14:06:36 -0800 (PST) MIME-Version: 1.0 References: <20200226210414.28133-1-linux@roeck-us.net> <20200226210414.28133-2-linux@roeck-us.net> In-Reply-To: <20200226210414.28133-2-linux@roeck-us.net> From: Doug Anderson Date: Thu, 27 Feb 2020 14:06:25 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [RFT PATCH 1/4] usb: dwc2: Simplify and fix DMA alignment code To: Guenter Roeck Cc: Minas Harutyunyan , Greg Kroah-Hartman , =?UTF-8?B?QW50dGkgU2VwcMOkbMOk?= , Boris ARZUR , linux-usb@vger.kernel.org, LKML , Felipe Balbi , Stefan Wahren , Martin Schiller Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On Wed, Feb 26, 2020 at 1:04 PM Guenter Roeck wrote: > > The code to align buffers for DMA was first introduced with commit > 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more supported way"). > It was updated with commit 56406e017a88 ("usb: dwc2: Fix DMA alignment > to start at allocated boundary") because it did not really align buffers to > DMA boundaries but to word offsets. This was then optimized in commit > 1e111e885238 ("usb: dwc2: Fix inefficient copy of unaligned buffers") > to only copy actual data rather than the whole buffer. Commit 4a4863bf2e79 > ("usb: dwc2: Fix DMA cache alignment issues") changed this further to add > a padding at the end of the buffer to ensure that the old data pointer is > not in the same cache line as the buffer. > > This last commit states "Otherwise, the stored_xfer_buffer gets corrupted > for IN URBs on non-cache-coherent systems". However, such corruptions are > still observed. This suggests that the commit may have been hiding a > problem rather than fixing it. Further analysis shows that this is indeed > the case: The code in dwc2_hc_start_transfer() assumes that the transfer > size is a multiple of wMaxPacketSize, and rounds up the transfer size > communicated to the chip accordingly. Added debug code confirms that > the chip does under some circumstances indeed send more data than requested > in the urb receive buffer size. > > On top of that, it turns out that buffers are still not guaranteed to be > aligned to dma_get_cache_alignment(), but to DWC2_USB_DMA_ALIGN (4). > Further debugging shows that packets aligned to DWC2_USB_DMA_ALIGN > but not to dma_get_cache_alignment() are indeed common and work just fine. > This suggests that commit 56406e017a88 was not really necessary because > even without it packets were already aligned to DWC2_USB_DMA_ALIGN. > > To simplify the code, move the old data pointer back to the beginning of > the new buffer, restoring most of the original commit. Stop aligning the > buffer to dma_get_cache_alignment() since it isn't needed and only makes > the code more complex. Instead, ensure that the allocated buffer is a > multiple of wMaxPacketSize to ensure that the chip does not write beyond > the end of the buffer. I do like the cleanliness of being able to easily identify the old buffer (AKA by putting it first) and I agree that the existing code was only really guaranteeing 4-byte alignment and if we truly needed more alignment then we'd be allocating a lot more bounce buffers (which is pretty expensive). ...but the argument in commit 56406e017a88 ("usb: dwc2: Fix DMA alignment to start at allocated boundary") is still a compelling one. Maybe at least put a comment in the code next to the "#define DWC2_USB_DMA_ALIGN" saying that we think that this is enough alignment for anyone using dwc2's built-in DMA logic? > Cc: Douglas Anderson > Cc: Boris Arzur > Fixes: 56406e017a88 ("usb: dwc2: Fix DMA alignment to start at allocated boundary") > Signed-off-by: Guenter Roeck > --- > drivers/usb/dwc2/hcd.c | 67 ++++++++++++++++++++++-------------------- > 1 file changed, 35 insertions(+), 32 deletions(-) Sorry for such a mess and thank you for all the work tracking down and documenting all the problems. Clearly deep understanding of DMA is not something I can claim. :( A few points of order first: * Although get_maintainer doesn't identify him, it has seemed like Felipe Balbi lands most of the dwc2 things. Probably a good idea to CC him. * I have historically found Stefan Wahren interested in dwc2 fixes and willing to test them on Raspberry Pi w/ various peripherals. * Since one of the commits you refer to was written by Martin Schiller it probably wouldn't hurt to CC him. > diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c > index b90f858af960..f6d8cc9cee34 100644 > --- a/drivers/usb/dwc2/hcd.c > +++ b/drivers/usb/dwc2/hcd.c > @@ -2471,19 +2471,21 @@ static int dwc2_alloc_split_dma_aligned_buf(struct dwc2_hsotg *hsotg, > > #define DWC2_USB_DMA_ALIGN 4 > > +struct dma_aligned_buffer { > + void *old_xfer_buffer; > + u8 data[0]; > +}; > + > static void dwc2_free_dma_aligned_buffer(struct urb *urb) > { > - void *stored_xfer_buffer; > + struct dma_aligned_buffer *dma; > size_t length; > > if (!(urb->transfer_flags & URB_ALIGNED_TEMP_BUFFER)) > return; > > - /* Restore urb->transfer_buffer from the end of the allocated area */ > - memcpy(&stored_xfer_buffer, > - PTR_ALIGN(urb->transfer_buffer + urb->transfer_buffer_length, > - dma_get_cache_alignment()), > - sizeof(urb->transfer_buffer)); > + dma = container_of(urb->transfer_buffer, > + struct dma_aligned_buffer, data); > > if (usb_urb_dir_in(urb)) { > if (usb_pipeisoc(urb->pipe)) > @@ -2491,49 +2493,50 @@ static void dwc2_free_dma_aligned_buffer(struct urb *urb) > else > length = urb->actual_length; > > - memcpy(stored_xfer_buffer, urb->transfer_buffer, length); > + memcpy(dma->old_xfer_buffer, dma->data, length); > } > - kfree(urb->transfer_buffer); > - urb->transfer_buffer = stored_xfer_buffer; > + urb->transfer_buffer = dma->old_xfer_buffer; > + kfree(dma); > > urb->transfer_flags &= ~URB_ALIGNED_TEMP_BUFFER; > } > > static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags) > { > - void *kmalloc_ptr; > + struct dma_aligned_buffer *dma; > size_t kmalloc_size; > > - if (urb->num_sgs || urb->sg || > - urb->transfer_buffer_length == 0 || > + if (urb->num_sgs || urb->sg || urb->transfer_buffer_length == 0 || > + (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) || > !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1))) Maybe I'm misunderstanding things, but it feels like we need something more here. Specifically I'm worried about the fact when the transfer buffer is already aligned but the length is not a multiple of the endpoint's maximum transfer size. You need to handle that, right? AKA something like this (untested): /* Simple case of not having to allocate a bounce buffer */ if (urb->num_sgs || urb->sg || urb->transfer_buffer_length == 0 || (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) return 0; /* Can also avoid bounce buffer if alignment and size are good */ maxp = usb_endpoint_maxp(&ep->desc); if (maxp == urb->transfer_buffer_length && !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1))) return 0; Other than that this looks good to me. -Doug