Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756640AbdDFTYq (ORCPT ); Thu, 6 Apr 2017 15:24:46 -0400
Received: from mail-io0-f196.google.com ([209.85.223.196]:35479 "EHLO
        mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1752227AbdDFTYf (ORCPT );
        Thu, 6 Apr 2017 15:24:35 -0400
Subject: Re: [PATCH V10 06/12] of: device: Fix overflow of coherent_dma_mask
To: Robin Murphy, Sricharan R, will.deacon@arm.com, joro@8bytes.org,
        lorenzo.pieralisi@arm.com, iommu@lists.linux-foundation.org,
        linux-arm-kernel@lists.infradead.org, linux-arm-msm@vger.kernel.org,
        m.szyprowski@samsung.com, bhelgaas@google.com,
        linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org,
        tn@semihalf.com, hanjun.guo@linaro.org, okaya@codeaurora.org,
        robh+dt@kernel.org, devicetree@vger.kernel.org,
        linux-kernel@vger.kernel.org, sudeep.holla@arm.com,
        rjw@rjwysocki.net, lenb@kernel.org, catalin.marinas@arm.com,
        arnd@arndb.de, linux-arch@vger.kernel.org, gregkh@linuxfoundation.org
References: <1489086061-9356-1-git-send-email-sricharan@codeaurora.org>
        <1491301105-5274-1-git-send-email-sricharan@codeaurora.org>
        <1491301105-5274-7-git-send-email-sricharan@codeaurora.org>
        <58E5E7B7.1050400@gmail.com>
From: Frank Rowand
Message-ID: <58E695DC.7010808@gmail.com>
Date: Thu, 6 Apr 2017 12:24:12 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
        Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4397
Lines: 99

On 04/06/17 03:24, Robin Murphy wrote:
> On 06/04/17 08:01, Frank Rowand wrote:
>> On 04/04/17 03:18, Sricharan R wrote:
>>> Size of the dma-range is calculated as coherent_dma_mask + 1
>>> and passed to arch_setup_dma_ops further.
>>> It overflows when the coherent_dma_mask is set for full 64 bits
>>> 0xFFFFFFFFFFFFFFFF, resulting in size getting passed as 0 wrongly.
>>> Fix this by passing in max(mask, mask + 1). Note that in this case
>>> when the mask is set to full 64bits, we will be passing the mask
>>> itself to arch_setup_dma_ops instead of the size. The real fix
>>> for this should be to make arch_setup_dma_ops receive the
>>> mask and handle it, to be done in the future.
>>>
>>> Signed-off-by: Sricharan R
>>> ---
>>>  drivers/of/device.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/of/device.c b/drivers/of/device.c
>>> index c17c19d..c2ae6bb 100644
>>> --- a/drivers/of/device.c
>>> +++ b/drivers/of/device.c
>>> @@ -107,7 +107,7 @@ void of_dma_configure(struct device *dev, struct device_node *np)
>>>          ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
>>>          if (ret < 0) {
>>>                  dma_addr = offset = 0;
>>> -                size = dev->coherent_dma_mask + 1;
>>> +                size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
>>>          } else {
>>>                  offset = PFN_DOWN(paddr - dma_addr);
>>>                  dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>>>
>>
>> NACK.
>>
>> Passing an invalid size to arch_setup_dma_ops() is only part of the problem.
>> size is also used in of_dma_configure() before calling arch_setup_dma_ops():
>>
>>         dev->coherent_dma_mask = min(dev->coherent_dma_mask,
>>                                      DMA_BIT_MASK(ilog2(dma_addr + size)));
>>         *dev->dma_mask = min((*dev->dma_mask),
>>                              DMA_BIT_MASK(ilog2(dma_addr + size)));
>>
>> which would be incorrect for size == 0xffffffffffffffffULL when
>> dma_addr != 0. So the proposed fix really is not papering over
>> the base problem very well.
>
> I'm not sure I agree there.
> Granted, there exist many more problematic
> aspects than are dealt with here (I've got more patches cooking to sort
> out some of the other issues we have with dma-ranges), but considering
> size specifically:
>
> - It is not possible to explicitly specify a range with a size of 2^64
> in DT. If someone does specify a size of 0, they've done a silly thing
> and should not be surprised that it ends badly.
>
> - It *is* perfectly legitimate for bus code (or a previous device
> driver, once we start coming here at probe time) to have set a device's
> DMA mask to 0xffffffffffffffffULL. If this code then blindly overflows
> and infers an invalid size of 0 from that, breaking things in the
> process, that is this code's fault alone. It just so happens that
> nothing managed to trigger the latent problem until patch #7 here shakes
> up the callsites.

The existing code that uses size does not appear capable of dealing
with the case of DMA mask of 0xffffffffffffffffULL since 2^64 does not
fit into size.

The code affected by the DMA mask is not within my area of knowledge,
so take the following with a grain of salt. If a DMA mask of
0xffffffffffffffffULL is provided, would the code still work without
error (though with reduced capability) if the mask was changed to
0xefffffffffffffffULL? I would guess that the location to do so
would be where dev->coherent_dma_mask is set, or some other location
that is not of_dma_configure(). This would just be a temporary
workaround.

> Yes, wacky impossible base + size combinations in DT were a theoretical
> problem before, and remain a theoretical problem, but also fall into the
> "how did you ever expect this to work?" category. There's certainly
> plenty more we can do to improve the DT parsing/validation, but that
> still doesn't apply to this path where the information is *not* coming
> from the DT at all.
>
>> I agree that the proper solution involves passing a mask instead
>> of a size to arch_setup_dma_ops().
>
> Having started writing that patch too, I can tell you it's a big bugger
> touching multiple architectures and fixing up various drivers doing
> stupid things, hence why I'm happy with this point fix being the lesser
> of two evils in terms of not holding up this mostly-orthogonal series.
>
> Robin.
>
>>
>> -Frank
>>
>
>
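
For reference, the arithmetic discussed in this thread can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
kernel code: size_old(), size_patched(), clamp_mask(), ilog2_u64() and
self_check() are hypothetical names made up for the sketch, and
DMA_BIT_MASK() is redefined locally as a stand-in for the kernel macro of
the same name. It shows the wrap to 0 in the original computation, the
value that max(mask, mask + 1) produces instead, and what the mask clamp
quoted above computes when that size is combined with a non-zero dma_addr.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel macro of the same name (assumption). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* floor(log2(v)) for v != 0; hypothetical stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Original computation: wraps to 0 when mask is all ones. */
static uint64_t size_old(uint64_t mask)
{
	return mask + 1;
}

/* Patched computation: max(mask, mask + 1). */
static uint64_t size_patched(uint64_t mask)
{
	uint64_t next = mask + 1;	/* unsigned wrap-around is well defined */

	return mask > next ? mask : next;
}

/* The mask clamp quoted in the NACK: min(mask, DMA_BIT_MASK(ilog2(dma_addr + size))). */
static uint64_t clamp_mask(uint64_t mask, uint64_t dma_addr, uint64_t size)
{
	uint64_t limit = DMA_BIT_MASK(ilog2_u64(dma_addr + size));

	return mask < limit ? mask : limit;
}

/* Hypothetical self-check covering the cases raised in the thread. */
static void self_check(void)
{
	/* Full 64-bit mask: old size wraps to 0, patched size is the mask itself. */
	assert(size_old(UINT64_MAX) == 0);
	assert(size_patched(UINT64_MAX) == UINT64_MAX);
	/* Ordinary 32-bit mask: both give 2^32. */
	assert(size_patched(0xffffffffULL) == 0x100000000ULL);
	/* dma_addr == 0: the clamp reduces the mask to 63 bits. */
	assert(clamp_mask(UINT64_MAX, 0, UINT64_MAX) == 0x7fffffffffffffffULL);
	/* dma_addr != 0: dma_addr + size wraps to 0xfff, mask collapses to 11 bits. */
	assert(clamp_mask(UINT64_MAX, 0x1000, UINT64_MAX) == 0x7ffULL);
}
```

Calling self_check() from a main() exercises all five cases. The last one
is the dma_addr != 0 situation from the NACK. The sketch deliberately
avoids a sum that wraps to exactly 0, since ilog2(0) is not meaningful.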