From: Tomasz Figa
Date: Thu, 22 Feb 2018 17:13:56 +0900
Subject: Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers
To: Robin Murphy
Cc: Will Deacon, Rob Clark, "list@263.net:IOMMU DRIVERS", Joerg Roedel,
    Rob Herring, Mark Rutland, "Rafael J. Wysocki",
    devicetree@vger.kernel.org, Linux Kernel Mailing List, Linux PM,
    dri-devel, freedreno, David Airlie, Greg KH, Stephen Boyd,
    linux-arm-msm, jcrouse@codeaurora.org, Vivek Gautam
References: <1517999482-17317-1-git-send-email-vivek.gautam@codeaurora.org>
    <7406f1ce-c2c9-a6bd-2886-5a34de45add6@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 16, 2018 at 9:13 AM, Tomasz Figa wrote:
> On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy wrote:
>> On 15/02/18 04:17, Tomasz Figa wrote:
>> [...]
>>>>
>>>> Could you elaborate on what kind of locking you are concerned about?
>>>> As I explained before, the normally happening fast path would lock
>>>> dev->power_lock only for the brief moment of incrementing the runtime
>>>> PM usage counter.
>>>
>>>
>>> My bad, that's not even it.
>>>
>>> The atomic usage counter is incremented beforehand, without any
>>> locking [1], and the spinlock is acquired only for the sake of
>>> validating that the device's runtime PM state indeed remained valid [2],
>>> which would be the case in the fast path of the same driver doing two
>>> mappings in parallel, with the master powered on (and so the SMMU,
>>> through device links; if the master was not powered on already, powering
>>> on the SMMU is unavoidable anyway and it would add much more latency
>>> than the spinlock itself).
>>
>>
>> We now have no locking at all in the map path, and only a per-domain lock
>> around TLB sync in unmap which is unfortunately necessary for correctness;
>> the latter isn't too terrible, since in "serious" hardware it should only be
>> serialising a few cpus serving the same device against each other (e.g. for
>> multiple queues on a single NIC).
>>
>> Putting in a global lock which serialises *all* concurrent map and unmap
>> calls for *all* unrelated devices makes things worse. Period. Even if the
>> lock itself were held for the minimum possible time, i.e. trivially
>> "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing that
>> one cache line around between 96 CPUs across two sockets is not negligible.
>
> Fair enough. Note that we're in a quite interesting situation now:
> a) We need to have runtime PM enabled on Qualcomm SoCs to have power
> properly managed,
> b) We need to have lock-free map/unmap on such distributed systems,
> c) If runtime PM is enabled, we need to call into runtime PM from any
> code that does hardware accesses, otherwise the IOMMU API (and so the DMA
> API and then any V4L2 driver) becomes unusable.
>
> I can see one more way that could potentially let us have all
> three. How about enabling runtime PM only on selected implementations
> (e.g. qcom,smmu) and then having all the runtime PM calls surrounded
> with if (pm_runtime_enabled()), which is lockless?
>

Sorry for pinging, but any opinion on this kind of approach?

Best regards,
Tomasz

>>
>>> [1]
>>> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L1028
>>> [2]
>>> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L613
>>>
>>> In any case, I can't imagine this working with V4L2 or anything else
>>> relying on any memory management more generic than calling IOMMU API
>>> directly from the driver, with the IOMMU device having runtime PM
>>> enabled, but without managing the runtime PM from the IOMMU driver's
>>> callbacks that need access to the hardware. As I mentioned before,
>>> only the IOMMU driver knows when exactly the real hardware access
>>> needs to be done (e.g. Rockchip/Exynos don't need to do that for
>>> map/unmap if the power is down, but some implementations of SMMU with
>>> TLB powered separately might need to do so).
>>
>>
>> It's worth noting that Exynos and Rockchip are relatively small
>> self-contained IP blocks integrated closely with the interfaces of their
>> relevant master devices; SMMU is an architecture, implementations of which
>> may be large, distributed, and have complex and wildly differing internal
>> topologies. As such, it's a lot harder to make hardware-specific assumptions
>> and/or be correct for all possible cases.
>>
>> Don't get me wrong, I do ultimately agree that the IOMMU driver is the only
>> agent who ultimately knows what calls are going to be necessary for whatever
>> operation it's performing on its own hardware*; it's just that for SMMU it
>> needs to be implemented in a way that has zero impact on the cases where it
>> doesn't matter, because it's not viable to specialise that driver for any
>> particular IP implementation/use-case.
>
> Still, exactly the same holds for the low power embedded use cases,
> where we strive for the lowest possible power consumption, while
> keeping performance levels high as well. And so the SMMU code is
> expected to also work with our use cases, such as V4L2 or DRM drivers.
> Since these points don't hold for the current SMMU code, I could say that
> it has already been specialized for large, distributed
> implementations.
>
> Best regards,
> Tomasz
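
The approach asked about above (enable runtime PM only on selected
implementations and guard every runtime PM call with a lockless "is it
enabled" check) can be sketched as a small userspace model. This is only an
illustration under stated assumptions: struct mock_dev, mock_pm_enabled(),
mock_pm_get(), mock_pm_put() and mock_map() are hypothetical stand-ins and
not kernel APIs, and the real driver code may well be structured differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for a device; NOT the kernel's struct device. */
struct mock_dev {
    bool pm_enabled;        /* assumed to be set once at probe time */
    atomic_int usage_count; /* models the runtime PM usage counter */
    int slow_path_calls;    /* counts trips through the locked PM path */
};

/* Models a pm_runtime_enabled()-style check: a plain read, no lock. */
static bool mock_pm_enabled(const struct mock_dev *dev)
{
    return dev->pm_enabled;
}

/* Models the "get" side, i.e. the path that may contend on a spinlock. */
static void mock_pm_get(struct mock_dev *dev)
{
    atomic_fetch_add(&dev->usage_count, 1);
    dev->slow_path_calls++;
}

/* Models the "put" side; an unbalanced put would be a bug. */
static void mock_pm_put(struct mock_dev *dev)
{
    int prev = atomic_fetch_sub(&dev->usage_count, 1);

    assert(prev > 0);
}

/* Hypothetical map operation with the runtime PM calls guarded. */
static int mock_map(struct mock_dev *dev)
{
    /* Sample the flag once so that get and put always stay paired. */
    bool pm = mock_pm_enabled(dev);

    if (pm)
        mock_pm_get(dev);

    /* ... page table update / hardware access would happen here ... */

    if (pm)
        mock_pm_put(dev);

    return 0;
}
```

In this model a device with pm_enabled == false never reaches the contended
path, which is the "zero impact where it doesn't matter" property discussed
in the thread, while a device that opts in pays the runtime PM cost on every
map call. Whether the enabled state can change between the check and the
hardware access in a real driver is left open here.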