2018-08-18 15:59:59

by Dmitry Osipenko

Subject: [PATCH v3 00/19] IOMMU: Tegra GART driver clean up and optimization

Hello,

In the previous iteration Thierry Reding suggested that it is better to
break/change GART's device-tree ABI in order to integrate it with the
Memory Controller without much churn. So this series now includes the
device-tree changes.

After making GART disallow more than one active IOMMU domain at a time,
I realized that the code managing domain clients had a few significant
bugs, which are now fixed. While squashing those bugs, I found that the
driver's code requires a major cleanup, hence there are now a couple more
patches that make the code less tangled and easier to maintain.

Changelog:

v3: Memory Controller integration part has been reworked and now GART's
device-tree binding is changed. Adding Rob Herring to review the
device-tree changes.

GART now disallows more than one active domain at a time.

Fixed "spinlock recursion", "NULL pointer dereference" and "detaching
of all devices from inactive domains".

New code-refactoring patches.

The previously standalone patch "memory: tegra: Don't invoke Tegra30+
specific memory timing setup on Tegra20" is now included in this
series because the series depends on that patch and it hasn't been
applied yet.

v2: Addressed review comments from Robin Murphy on v1 by moving the
device iommu_fwspec check to gart_iommu_add_device().

Dropped the "Provide single domain and group for all devices" patch from
the series for now because, after some more consideration, it is no longer
clear whether that is what we need; Robin Murphy also suggested this
in a review comment. Maybe something like runtime IOMMU usage for
devices would be a better solution, allowing transparent context
switching of virtual IOMMU domains to be implemented.

Some very minor code cleanups, reworded commit messages.

Dmitry Osipenko (19):
iommu/tegra: gart: Remove pr_fmt and clean up includes
iommu/tegra: gart: Clean up driver probe errors handling
iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
iommu: Introduce iotlb_sync_map callback
iommu/tegra: gart: Optimize mapping / unmapping performance
dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
ARM: dts: tegra20: Update Memory Controller node to the new binding
memory: tegra: Don't invoke Tegra30+ specific memory timing setup on
Tegra20
memory: tegra: Adapt to Tegra20 device-tree binding changes
memory: tegra: Read client ID on GART page fault
iommu/tegra: gart: Integrate with Memory Controller driver
iommu/tegra: gart: Fix spinlock recursion
iommu/tegra: gart: Fix NULL pointer dereference
iommu/tegra: gart: Allow only one active domain at a time
iommu/tegra: gart: Don't use managed resources
iommu/tegra: gart: Prepend error/debug messages with "GART:"
iommu/tegra: gart: Don't detach devices from inactive domains
iommu/tegra: gart: Simplify clients-tracking code
iommu/tegra: gart: Perform code refactoring

.../bindings/iommu/nvidia,tegra20-gart.txt | 14 -
.../memory-controllers/nvidia,tegra20-mc.txt | 23 +-
arch/arm/boot/dts/tegra20.dtsi | 13 +-
drivers/iommu/Kconfig | 1 +
drivers/iommu/iommu.c | 8 +-
drivers/iommu/tegra-gart.c | 466 +++++++-----------
drivers/memory/tegra/mc.c | 83 +++-
drivers/memory/tegra/mc.h | 6 -
include/linux/iommu.h | 1 +
include/soc/tegra/mc.h | 29 +-
10 files changed, 287 insertions(+), 357 deletions(-)
delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

--
2.18.0



2018-08-18 15:57:31

by Dmitry Osipenko

Subject: [PATCH v3 15/19] iommu/tegra: gart: Don't use managed resources

GART is a part of the Memory Controller driver that is always built-in,
hence there is no benefit from the use of managed resources.
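
For illustration only (not part of the patch): a minimal userspace sketch
of the explicit unwind pattern that replaces the managed allocation. The
names model_gart, model_probe() and model_free() are hypothetical, and the
"sysfs add" and "register" steps are stand-ins selected by fail_step, not
the real IOMMU core calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver state; not the real gart_device. */
struct model_gart {
	int sysfs_added;
	int registered;
};

/* Counts explicit frees so that the unwind path can be observed. */
static int model_frees;

static void model_free(struct model_gart *g)
{
	free(g);
	model_frees++;
}

/*
 * fail_step: 0 = every step succeeds, 1 = the modelled "sysfs add" fails,
 * 2 = the modelled "register" step fails.
 */
static struct model_gart *model_probe(int fail_step, int *err)
{
	struct model_gart *g = calloc(1, sizeof(*g));

	if (!g) {
		*err = -ENOMEM;
		return NULL;
	}

	if (fail_step == 1) {
		*err = -EIO;
		goto free_gart;		/* nothing else to undo yet */
	}
	g->sysfs_added = 1;

	if (fail_step == 2) {
		*err = -EIO;
		goto remove_sysfs;	/* undo in reverse order */
	}
	g->registered = 1;

	*err = 0;
	return g;

remove_sysfs:
	g->sysfs_added = 0;
free_gart:
	/* Without a managed allocation the error path must free explicitly. */
	model_free(g);
	return NULL;
}
```

Each failure exit falls through the labels below it, so the allocation is
released exactly once regardless of which step failed.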

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 9f7d3afb686f..d019ae8ecfc9 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -171,7 +171,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
struct gart_client *client, *c;
int err = 0;

- client = devm_kzalloc(gart->dev, sizeof(*c), GFP_KERNEL);
+ client = kzalloc(sizeof(*c), GFP_KERNEL);
if (!client)
return -ENOMEM;
client->dev = dev;
@@ -197,7 +197,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
return 0;

fail:
- devm_kfree(gart->dev, client);
+ kfree(client);
spin_unlock(&gart->client_lock);
return err;
}
@@ -212,7 +212,7 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
list_for_each_entry(c, &gart->client, list) {
if (c->dev == dev) {
list_del(&c->list);
- devm_kfree(gart->dev, c);
+ kfree(c);
if (list_empty(&gart->client))
gart->active_domain = NULL;
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
@@ -459,7 +459,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
return ERR_PTR(-ENXIO);
}

- gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
+ gart = kzalloc(sizeof(*gart), GFP_KERNEL);
if (!gart) {
dev_err(dev, "failed to allocate gart_device\n");
return ERR_PTR(-ENOMEM);
@@ -468,7 +468,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
if (ret) {
dev_err(dev, "Failed to register IOMMU in sysfs\n");
- return ERR_PTR(ret);
+ goto free_gart;
}

iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
@@ -506,6 +506,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
iommu_device_unregister(&gart->iommu);
remove_sysfs:
iommu_device_sysfs_remove(&gart->iommu);
+free_gart:
+ kfree(gart);

return ERR_PTR(ret);
}
--
2.18.0


2018-08-18 15:57:31

by Dmitry Osipenko

Subject: [PATCH v3 18/19] iommu/tegra: gart: Simplify clients-tracking code

GART is a simple IOMMU provider that has a single address space. There is
no need to set up a global clients list and manage it to track the
active domain, hence lots of code can be safely removed and replaced
with a simpler alternative.
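
For illustration only (not part of the patch): a minimal userspace model
of tracking the active domain with a per-device pointer and a counter
instead of a clients list. The model_* names are hypothetical, and the
locking that the driver does around these updates is deliberately left
out of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's iommu_domain or device. */
struct model_domain {
	int id;
};

struct model_dev {
	const struct model_domain *attached;	/* domain this device uses */
};

struct model_gart {
	const struct model_domain *active_domain;
	unsigned int active_devices;
};

/* Attach succeeds only if no other domain is currently active. */
static int model_attach(struct model_gart *g, const struct model_domain *d,
			struct model_dev *dev)
{
	if (g->active_domain && g->active_domain != d)
		return -EBUSY;

	if (dev->attached != d) {
		dev->attached = d;
		g->active_domain = d;
		g->active_devices++;
	}

	return 0;
}

/* The domain stops being active once its last device is detached. */
static void model_detach(struct model_gart *g, const struct model_domain *d,
			 struct model_dev *dev)
{
	if (dev->attached == d) {
		dev->attached = NULL;

		if (--g->active_devices == 0)
			g->active_domain = NULL;
	}
}
```

Attaching the same device to the same domain twice leaves the counter
unchanged, and detaching from a domain that the device is not attached to
is a no-op, so no list walk is needed to find the client.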

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 157 +++++++++----------------------------
1 file changed, 39 insertions(+), 118 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 306e9644a676..7182445c3b76 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,7 +19,6 @@

#include <linux/io.h>
#include <linux/iommu.h>
-#include <linux/list.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
@@ -42,30 +41,20 @@
#define GART_PAGE_MASK \
(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)

-struct gart_client {
- struct device *dev;
- struct list_head list;
-};
-
struct gart_device {
void __iomem *regs;
u32 *savedata;
u32 page_count; /* total remappable size */
dma_addr_t iovmm_base; /* offset to vmm_area */
spinlock_t pte_lock; /* for pagetable */
- struct list_head client;
- spinlock_t client_lock; /* for client list */
+ spinlock_t dom_lock; /* for active domain */
+ unsigned int active_devices; /* number of active devices */
struct iommu_domain *active_domain; /* current active domain */
struct device *dev;

struct iommu_device iommu; /* IOMMU Core handle */
};

-struct gart_domain {
- struct iommu_domain domain; /* generic domain handle */
- struct gart_device *gart; /* link to gart device */
-};
-
static struct gart_device *gart_handle; /* unique for a system */

static bool gart_debug;
@@ -73,11 +62,6 @@ static bool gart_debug;
#define GART_PTE(_pfn) \
(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))

-static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
-{
- return container_of(dom, struct gart_domain, domain);
-}
-
/*
* Any interaction between any block on PPSB and a block on APB or AHB
* must have these read-back to ensure the APB/AHB bus transaction is
@@ -166,128 +150,69 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
static int gart_iommu_attach_dev(struct iommu_domain *domain,
struct device *dev)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_handle;
- struct gart_client *client, *c;
- int err = 0;
-
- client = kzalloc(sizeof(*c), GFP_KERNEL);
- if (!client)
- return -ENOMEM;
- client->dev = dev;
-
- spin_lock(&gart->client_lock);
- list_for_each_entry(c, &gart->client, list) {
- if (c->dev == dev) {
- dev_err(gart->dev, "GART: %s is already attached\n",
- dev_name(dev));
- err = -EINVAL;
- goto fail;
- }
- }
- if (gart->active_domain && gart->active_domain != domain) {
- dev_err(gart->dev,
- "GART: Only one domain can be active at a time\n");
- err = -EINVAL;
- goto fail;
- }
- gart->active_domain = domain;
- gart_domain->gart = gart;
- list_add(&client->list, &gart->client);
- spin_unlock(&gart->client_lock);
- dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
- return 0;
+ int ret = 0;

-fail:
- kfree(client);
- spin_unlock(&gart->client_lock);
- return err;
-}
+ spin_lock(&gart->dom_lock);

-static void __gart_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
-{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
- struct gart_client *c;
-
- list_for_each_entry(c, &gart->client, list) {
- if (c->dev == dev) {
- list_del(&c->list);
- kfree(c);
- if (list_empty(&gart->client)) {
- gart->active_domain = NULL;
- gart_domain->gart = NULL;
- }
- dev_dbg(gart->dev, "GART: Detached %s\n",
- dev_name(dev));
- return;
- }
+ if (gart->active_domain && gart->active_domain != domain) {
+ ret = -EBUSY;
+ } else if (dev->archdata.iommu != domain) {
+ dev->archdata.iommu = domain;
+ gart->active_domain = domain;
+ gart->active_devices++;
}

- dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
- dev_name(dev));
+ spin_unlock(&gart->dom_lock);
+
+ return ret;
}

static void gart_iommu_detach_dev(struct iommu_domain *domain,
struct device *dev)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
+ struct gart_device *gart = gart_handle;
+
+ spin_lock(&gart->dom_lock);

- spin_lock(&gart->client_lock);
- __gart_iommu_detach_dev(domain, dev);
- spin_unlock(&gart->client_lock);
+ if (dev->archdata.iommu == domain) {
+ dev->archdata.iommu = NULL;
+
+ if (--gart->active_devices == 0)
+ gart->active_domain = NULL;
+ }
+
+ spin_unlock(&gart->dom_lock);
}

static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
{
- struct gart_domain *gart_domain;
- struct gart_device *gart;
+ struct gart_device *gart = gart_handle;
+ struct iommu_domain *domain;

if (type != IOMMU_DOMAIN_UNMANAGED)
return NULL;

- gart = gart_handle;
- if (!gart)
- return NULL;
-
- gart_domain = kzalloc(sizeof(*gart_domain), GFP_KERNEL);
- if (!gart_domain)
- return NULL;
-
- gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
- gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
+ domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+ if (domain) {
+ domain->geometry.aperture_start = gart->iovmm_base;
+ domain->geometry.aperture_end = gart->iovmm_base +
gart->page_count * GART_PAGE_SIZE - 1;
- gart_domain->domain.geometry.force_aperture = true;
+ domain->geometry.force_aperture = true;
+ }

- return &gart_domain->domain;
+ return domain;
}

static void gart_iommu_domain_free(struct iommu_domain *domain)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
-
- if (gart) {
- spin_lock(&gart->client_lock);
- if (!list_empty(&gart->client)) {
- struct gart_client *c, *tmp;
-
- list_for_each_entry_safe(c, tmp, &gart->client, list)
- __gart_iommu_detach_dev(domain, c->dev);
- }
- spin_unlock(&gart->client_lock);
- }
-
- kfree(gart_domain);
+ kfree(domain);
}

static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t pa, size_t bytes, int prot)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
+ struct gart_device *gart = gart_handle;
unsigned long flags;
unsigned long pfn;
unsigned long pte;
@@ -318,8 +243,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
size_t bytes)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
+ struct gart_device *gart = gart_handle;
unsigned long flags;

if (!gart_iova_range_valid(gart, iova, bytes))
@@ -334,8 +258,7 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
+ struct gart_device *gart = gart_handle;
unsigned long pte;
phys_addr_t pa;
unsigned long flags;
@@ -394,8 +317,7 @@ static int gart_iommu_of_xlate(struct device *dev,

static void gart_iommu_sync(struct iommu_domain *domain)
{
- struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
+ struct gart_device *gart = gart_handle;

FLUSH_GART_REGS(gart);
}
@@ -486,8 +408,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
gart->dev = dev;
gart_regs = mc->regs + GART_REG_BASE;
spin_lock_init(&gart->pte_lock);
- spin_lock_init(&gart->client_lock);
- INIT_LIST_HEAD(&gart->client);
+ spin_lock_init(&gart->dom_lock);
gart->regs = gart_regs;
gart->iovmm_base = (dma_addr_t)res_remap->start;
gart->page_count = (resource_size(res_remap) >> GART_PAGE_SHIFT);
--
2.18.0


2018-08-18 15:57:33

by Dmitry Osipenko

Subject: [PATCH v3 19/19] iommu/tegra: gart: Perform code refactoring

Perform a major code cleanup to make it more readable and, as a result,
easier to maintain. I've removed some redundant safety checks in the code
and some debug code that isn't actually very useful for debugging, like
the enormous pagetable dump on each fault. The majority of the changes are
code reshuffling and variable/whitespace cleanup.
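
For illustration only (not part of the patch): a small userspace check
that the GENMASK-style page mask selects the same bits as the old
open-coded expression, plus a model of the end-exclusive range check. The
MODEL_* macros are hypothetical 32-bit re-implementations rather than the
kernel's BIT()/GENMASK(), and the aperture bounds used with
model_range_invalid() are example values only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit equivalents of the kernel's BIT() and GENMASK(). */
#define MODEL_BIT(n)		(UINT32_C(1) << (n))
#define MODEL_GENMASK(h, l) \
	(((~UINT32_C(0)) << (l)) & ((~UINT32_C(0)) >> (31 - (h))))

#define MODEL_PAGE_SHIFT	12
#define MODEL_PAGE_SIZE		(UINT32_C(1) << MODEL_PAGE_SHIFT)
#define MODEL_ENTRY_VALID	MODEL_BIT(31)

/* Old form: everything above the page offset except the valid bit. */
#define MODEL_OLD_PAGE_MASK \
	(~(MODEL_PAGE_SIZE - 1) & ~MODEL_ENTRY_VALID)

/* New form: bits 30..12 spelled out directly. */
#define MODEL_NEW_PAGE_MASK	MODEL_GENMASK(30, MODEL_PAGE_SHIFT)

/*
 * End-exclusive range check: "end" is one past the last valid byte, so a
 * request is invalid if it starts before base or runs past end. 64-bit
 * arithmetic keeps iova + bytes from wrapping in this model.
 */
static bool model_range_invalid(uint64_t base, uint64_t end,
				uint64_t iova, uint64_t bytes)
{
	return iova < base || iova + bytes > end;
}
```

Storing an exclusive end address means the loop bound, the range check
and the aperture end can all be derived without recomputing
base + page_count * page_size each time.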

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 203 +++++++++++++++----------------------
1 file changed, 79 insertions(+), 124 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 7182445c3b76..aebbb2ccc536 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -34,63 +34,59 @@
#define GART_CONFIG (0x24 - GART_REG_BASE)
#define GART_ENTRY_ADDR (0x28 - GART_REG_BASE)
#define GART_ENTRY_DATA (0x2c - GART_REG_BASE)
-#define GART_ENTRY_PHYS_ADDR_VALID (1 << 31)
+
+#define GART_ENTRY_PHYS_ADDR_VALID BIT(31)

#define GART_PAGE_SHIFT 12
#define GART_PAGE_SIZE (1 << GART_PAGE_SHIFT)
-#define GART_PAGE_MASK \
- (~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
+#define GART_PAGE_MASK GENMASK(30, GART_PAGE_SHIFT)

struct gart_device {
void __iomem *regs;
u32 *savedata;
- u32 page_count; /* total remappable size */
- dma_addr_t iovmm_base; /* offset to vmm_area */
+ unsigned long iovmm_base; /* offset to vmm_area start */
+ unsigned long iovmm_end; /* offset to vmm_area end */
spinlock_t pte_lock; /* for pagetable */
spinlock_t dom_lock; /* for active domain */
unsigned int active_devices; /* number of active devices */
struct iommu_domain *active_domain; /* current active domain */
- struct device *dev;
-
struct iommu_device iommu; /* IOMMU Core handle */
+ struct device *dev;
};

static struct gart_device *gart_handle; /* unique for a system */

static bool gart_debug;

-#define GART_PTE(_pfn) \
- (GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
-
/*
* Any interaction between any block on PPSB and a block on APB or AHB
* must have these read-back to ensure the APB/AHB bus transaction is
* complete before initiating activity on the PPSB block.
*/
-#define FLUSH_GART_REGS(gart) ((void)readl((gart)->regs + GART_CONFIG))
+#define FLUSH_GART_REGS(gart) readl_relaxed((gart)->regs + GART_CONFIG)

#define for_each_gart_pte(gart, iova) \
for (iova = gart->iovmm_base; \
- iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
+ iova < gart->iovmm_end; \
iova += GART_PAGE_SIZE)

static inline void gart_set_pte(struct gart_device *gart,
- unsigned long offs, u32 pte)
+ unsigned long iova, u32 pte)
{
- writel(offs, gart->regs + GART_ENTRY_ADDR);
- writel(pte, gart->regs + GART_ENTRY_DATA);
+ writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+ writel_relaxed(pte, gart->regs + GART_ENTRY_DATA);

- dev_dbg(gart->dev, "GART: %s %08lx:%08x\n",
- pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
+ dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
+ pte ? "map" : "unmap", iova, pte & GART_PAGE_MASK);
}

static inline unsigned long gart_read_pte(struct gart_device *gart,
- unsigned long offs)
+ unsigned long iova)
{
unsigned long pte;

- writel(offs, gart->regs + GART_ENTRY_ADDR);
- pte = readl(gart->regs + GART_ENTRY_DATA);
+ writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+ pte = readl_relaxed(gart->regs + GART_ENTRY_DATA);

return pte;
}
@@ -102,49 +98,20 @@ static void do_gart_setup(struct gart_device *gart, const u32 *data)
for_each_gart_pte(gart, iova)
gart_set_pte(gart, iova, data ? *(data++) : 0);

- writel(1, gart->regs + GART_CONFIG);
+ writel_relaxed(1, gart->regs + GART_CONFIG);
FLUSH_GART_REGS(gart);
}

-#ifdef DEBUG
-static void gart_dump_table(struct gart_device *gart)
-{
- unsigned long iova;
- unsigned long flags;
-
- spin_lock_irqsave(&gart->pte_lock, flags);
- for_each_gart_pte(gart, iova) {
- unsigned long pte;
-
- pte = gart_read_pte(gart, iova);
-
- dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
- (GART_ENTRY_PHYS_ADDR_VALID & pte) ? "v" : " ",
- iova, pte & GART_PAGE_MASK);
- }
- spin_unlock_irqrestore(&gart->pte_lock, flags);
-}
-#else
-static inline void gart_dump_table(struct gart_device *gart)
+static inline bool gart_iova_range_invalid(struct gart_device *gart,
+ unsigned long iova, size_t bytes)
{
+ return unlikely(iova < gart->iovmm_base ||
+ iova + bytes > gart->iovmm_end);
}
-#endif

-static inline bool gart_iova_range_valid(struct gart_device *gart,
- unsigned long iova, size_t bytes)
+static inline bool gart_pte_valid(struct gart_device *gart, unsigned long iova)
{
- unsigned long iova_start, iova_end, gart_start, gart_end;
-
- iova_start = iova;
- iova_end = iova_start + bytes - 1;
- gart_start = gart->iovmm_base;
- gart_end = gart_start + gart->page_count * GART_PAGE_SIZE - 1;
-
- if (iova_start < gart_start)
- return false;
- if (iova_end > gart_end)
- return false;
- return true;
+ return !!(gart_read_pte(gart, iova) & GART_ENTRY_PHYS_ADDR_VALID);
}

static int gart_iommu_attach_dev(struct iommu_domain *domain,
@@ -187,7 +154,6 @@ static void gart_iommu_detach_dev(struct iommu_domain *domain,

static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
{
- struct gart_device *gart = gart_handle;
struct iommu_domain *domain;

if (type != IOMMU_DOMAIN_UNMANAGED)
@@ -195,9 +161,8 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)

domain = kzalloc(sizeof(*domain), GFP_KERNEL);
if (domain) {
- domain->geometry.aperture_start = gart->iovmm_base;
- domain->geometry.aperture_end = gart->iovmm_base +
- gart->page_count * GART_PAGE_SIZE - 1;
+ domain->geometry.aperture_start = gart_handle->iovmm_base;
+ domain->geometry.aperture_end = gart_handle->iovmm_end - 1;
domain->geometry.force_aperture = true;
}

@@ -209,35 +174,44 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
kfree(domain);
}

+static int __gart_iommu_map(struct gart_device *gart, unsigned long iova,
+ phys_addr_t pa)
+{
+ if (unlikely(gart_debug) && gart_pte_valid(gart, iova)) {
+ dev_WARN(gart->dev, "GART: Page entry is in-use\n");
+ return -EBUSY;
+ }
+
+ gart_set_pte(gart, iova, GART_ENTRY_PHYS_ADDR_VALID | pa);
+
+ return 0;
+}
+
static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t pa, size_t bytes, int prot)
{
struct gart_device *gart = gart_handle;
unsigned long flags;
- unsigned long pfn;
- unsigned long pte;
+ int ret;

- if (!gart_iova_range_valid(gart, iova, bytes))
+ if (gart_iova_range_invalid(gart, iova, bytes))
return -EINVAL;

spin_lock_irqsave(&gart->pte_lock, flags);
- pfn = __phys_to_pfn(pa);
- if (!pfn_valid(pfn)) {
- dev_err(gart->dev, "GART: Invalid page: %pa\n", &pa);
- spin_unlock_irqrestore(&gart->pte_lock, flags);
- return -EINVAL;
- }
- if (gart_debug) {
- pte = gart_read_pte(gart, iova);
- if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
- spin_unlock_irqrestore(&gart->pte_lock, flags);
- dev_err(gart->dev, "GART: Page entry is in-use\n");
- return -EBUSY;
- }
- }
- gart_set_pte(gart, iova, GART_PTE(pfn));
+ ret = __gart_iommu_map(gart, iova, pa);
spin_unlock_irqrestore(&gart->pte_lock, flags);
- return 0;
+
+ return ret;
+}
+
+static void __gart_iommu_unmap(struct gart_device *gart, unsigned long iova)
+{
+ if (unlikely(gart_debug) && !gart_pte_valid(gart, iova)) {
+ dev_WARN(gart->dev, "GART: Page entry is invalid\n");
+ return;
+ }
+
+ gart_set_pte(gart, iova, 0);
}

static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
@@ -246,12 +220,13 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
struct gart_device *gart = gart_handle;
unsigned long flags;

- if (!gart_iova_range_valid(gart, iova, bytes))
+ if (gart_iova_range_invalid(gart, iova, bytes))
return 0;

spin_lock_irqsave(&gart->pte_lock, flags);
- gart_set_pte(gart, iova, 0);
+ __gart_iommu_unmap(gart, iova);
spin_unlock_irqrestore(&gart->pte_lock, flags);
+
return bytes;
}

@@ -259,25 +234,17 @@ static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova)
{
struct gart_device *gart = gart_handle;
- unsigned long pte;
- phys_addr_t pa;
unsigned long flags;
+ unsigned long pte;

- if (!gart_iova_range_valid(gart, iova, 0))
+ if (gart_iova_range_invalid(gart, iova, SZ_4K))
return -EINVAL;

spin_lock_irqsave(&gart->pte_lock, flags);
pte = gart_read_pte(gart, iova);
spin_unlock_irqrestore(&gart->pte_lock, flags);

- pa = (pte & GART_PAGE_MASK);
- if (!pfn_valid(__phys_to_pfn(pa))) {
- dev_err(gart->dev, "GART: No entry for %08llx:%pa\n",
- (unsigned long long)iova, &pa);
- gart_dump_table(gart);
- return -EINVAL;
- }
- return pa;
+ return pte & GART_PAGE_MASK;
}

static bool gart_iommu_capable(enum iommu_cap cap)
@@ -342,24 +309,19 @@ static const struct iommu_ops gart_iommu_ops = {

int tegra_gart_suspend(struct gart_device *gart)
{
- unsigned long iova;
u32 *data = gart->savedata;
- unsigned long flags;
+ unsigned long iova;

- spin_lock_irqsave(&gart->pte_lock, flags);
for_each_gart_pte(gart, iova)
*(data++) = gart_read_pte(gart, iova);
- spin_unlock_irqrestore(&gart->pte_lock, flags);
+
return 0;
}

int tegra_gart_resume(struct gart_device *gart)
{
- unsigned long flags;
-
- spin_lock_irqsave(&gart->pte_lock, flags);
do_gart_setup(gart, gart->savedata);
- spin_unlock_irqrestore(&gart->pte_lock, flags);
+
return 0;
}

@@ -368,8 +330,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,
struct tegra_mc *mc)
{
struct gart_device *gart;
- struct resource *res_remap;
- void __iomem *gart_regs;
+ struct resource *res;
int ret;

BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
@@ -379,9 +340,8 @@ struct gart_device *tegra_gart_probe(struct device *dev,
return NULL;

/* the GART memory aperture is required */
- res_remap = platform_get_resource(to_platform_device(dev),
- IORESOURCE_MEM, 1);
- if (!res_remap) {
+ res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 1);
+ if (!res) {
dev_err(dev, "GART: Memory aperture resource unavailable\n");
return ERR_PTR(-ENXIO);
}
@@ -390,39 +350,34 @@ struct gart_device *tegra_gart_probe(struct device *dev,
if (!gart)
return ERR_PTR(-ENOMEM);

+ gart_handle = gart;
+
+ gart->dev = dev;
+ gart->regs = mc->regs + GART_REG_BASE;
+ gart->iovmm_base = res->start;
+ gart->iovmm_end = res->start + resource_size(res);
+ spin_lock_init(&gart->pte_lock);
+ spin_lock_init(&gart->dom_lock);
+
+ do_gart_setup(gart, NULL);
+
ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
- if (ret) {
- dev_err(dev, "GART: Failed to register IOMMU sysfs\n");
+ if (ret)
goto free_gart;
- }

iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
iommu_device_set_fwnode(&gart->iommu, dev->fwnode);

ret = iommu_device_register(&gart->iommu);
- if (ret) {
- dev_err(dev, "GART: Failed to register IOMMU\n");
+ if (ret)
goto remove_sysfs;
- }
-
- gart->dev = dev;
- gart_regs = mc->regs + GART_REG_BASE;
- spin_lock_init(&gart->pte_lock);
- spin_lock_init(&gart->dom_lock);
- gart->regs = gart_regs;
- gart->iovmm_base = (dma_addr_t)res_remap->start;
- gart->page_count = (resource_size(res_remap) >> GART_PAGE_SHIFT);

- gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
+ gart->savedata = vmalloc(resource_size(res) / GART_PAGE_SIZE * sizeof(u32));
if (!gart->savedata) {
ret = -ENOMEM;
goto unregister_iommu;
}

- do_gart_setup(gart, NULL);
-
- gart_handle = gart;
-
return gart;

unregister_iommu:
--
2.18.0


2018-08-18 15:57:42

by Dmitry Osipenko

Subject: [PATCH v3 05/19] iommu/tegra: gart: Optimize mapping / unmapping performance

Currently GART writes one page entry at a time. It is more optimal to
aggregate the writes and flush the BUS buffer at the end; this gives
map/unmap a 10-40% performance boost (depending on the size of the
mapping) in comparison to flushing after each page entry update.
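
For illustration only (not part of the patch): a minimal userspace model
contrasting a flush after every entry update with a single flush after a
batch of updates. The fake_gart structure and all function names are
hypothetical; the model only counts flushes and says nothing about the
real hardware timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGES 16u

/* Hypothetical stand-in for the GART register file. */
struct fake_gart {
	uint32_t pte[MODEL_PAGES];	/* modelled page table entries */
	unsigned int flushes;		/* number of bus read-backs issued */
};

/* Models a posted register write: the entry changes, nothing is flushed. */
static void fake_set_pte(struct fake_gart *g, size_t idx, uint32_t pte)
{
	g->pte[idx] = pte;
}

/* Models the read-back that drains the bus write buffer. */
static void fake_flush(struct fake_gart *g)
{
	g->flushes++;
}

/* Old scheme: flush after each page entry update. */
static void map_flush_each(struct fake_gart *g, size_t first, size_t count,
			   uint32_t base)
{
	for (size_t i = 0; i < count; i++) {
		fake_set_pte(g, first + i, base + (uint32_t)i);
		fake_flush(g);
	}
}

/* New scheme: write all entries, then flush once at the end. */
static void map_flush_once(struct fake_gart *g, size_t first, size_t count,
			   uint32_t base)
{
	for (size_t i = 0; i < count; i++)
		fake_set_pte(g, first + i, base + (uint32_t)i);

	fake_flush(g);	/* plays the role of the sync callback */
}
```

Both variants leave the modelled table in the same state; only the number
of flushes differs, which is where the saving for multi-page mappings
comes from.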

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index f6cf5cd5aaca..86a855c0d031 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -287,7 +287,6 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
}
}
gart_set_pte(gart, iova, GART_PTE(pfn));
- FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(&gart->pte_lock, flags);
return 0;
}
@@ -304,7 +303,6 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,

spin_lock_irqsave(&gart->pte_lock, flags);
gart_set_pte(gart, iova, 0);
- FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(&gart->pte_lock, flags);
return bytes;
}
@@ -370,6 +368,14 @@ static int gart_iommu_of_xlate(struct device *dev,
return 0;
}

+static void gart_iommu_sync(struct iommu_domain *domain)
+{
+ struct gart_domain *gart_domain = to_gart_domain(domain);
+ struct gart_device *gart = gart_domain->gart;
+
+ FLUSH_GART_REGS(gart);
+}
+
static const struct iommu_ops gart_iommu_ops = {
.capable = gart_iommu_capable,
.domain_alloc = gart_iommu_domain_alloc,
@@ -384,6 +390,8 @@ static const struct iommu_ops gart_iommu_ops = {
.iova_to_phys = gart_iommu_iova_to_phys,
.pgsize_bitmap = GART_IOMMU_PGSIZES,
.of_xlate = gart_iommu_of_xlate,
+ .iotlb_sync_map = gart_iommu_sync,
+ .iotlb_sync = gart_iommu_sync,
};

static int tegra_gart_suspend(struct device *dev)
--
2.18.0


2018-08-18 15:57:53

by Dmitry Osipenko

Subject: [PATCH v3 16/19] iommu/tegra: gart: Prepend error/debug messages with "GART:"

GART became a part of the Memory Controller, hence the driver's device
is now the Memory Controller and not GART. As a result all printed messages
are prefixed with "tegra-mc 7000f000.memory-controller:", so let's
prefix GART's messages with "GART:" in order to differentiate them
from the MC's.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index d019ae8ecfc9..284cddf90888 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -96,7 +96,7 @@ static inline void gart_set_pte(struct gart_device *gart,
writel(offs, gart->regs + GART_ENTRY_ADDR);
writel(pte, gart->regs + GART_ENTRY_DATA);

- dev_dbg(gart->dev, "%s %08lx:%08x\n",
+ dev_dbg(gart->dev, "GART: %s %08lx:%08x\n",
pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
}

@@ -134,7 +134,7 @@ static void gart_dump_table(struct gart_device *gart)

pte = gart_read_pte(gart, iova);

- dev_dbg(gart->dev, "%s %08lx:%08lx\n",
+ dev_dbg(gart->dev, "GART: %s %08lx:%08lx\n",
(GART_ENTRY_PHYS_ADDR_VALID & pte) ? "v" : " ",
iova, pte & GART_PAGE_MASK);
}
@@ -179,21 +179,22 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
spin_lock(&gart->client_lock);
list_for_each_entry(c, &gart->client, list) {
if (c->dev == dev) {
- dev_err(gart->dev,
- "%s is already attached\n", dev_name(dev));
+ dev_err(gart->dev, "GART: %s is already attached\n",
+ dev_name(dev));
err = -EINVAL;
goto fail;
}
}
if (gart->active_domain && gart->active_domain != domain) {
- dev_err(gart->dev, "Only one domain can be active at a time\n");
+ dev_err(gart->dev,
+ "GART: Only one domain can be active at a time\n");
err = -EINVAL;
goto fail;
}
gart->active_domain = domain;
list_add(&client->list, &gart->client);
spin_unlock(&gart->client_lock);
- dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
+ dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
return 0;

fail:
@@ -215,12 +216,14 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
kfree(c);
if (list_empty(&gart->client))
gart->active_domain = NULL;
- dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
+ dev_dbg(gart->dev, "GART: Detached %s\n",
+ dev_name(dev));
return;
}
}

- dev_err(gart->dev, "Couldn't find %s to detach\n", dev_name(dev));
+ dev_err(gart->dev, "GART: Couldn't find %s to detach\n",
+ dev_name(dev));
}

static void gart_iommu_detach_dev(struct iommu_domain *domain,
@@ -293,7 +296,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
spin_lock_irqsave(&gart->pte_lock, flags);
pfn = __phys_to_pfn(pa);
if (!pfn_valid(pfn)) {
- dev_err(gart->dev, "Invalid page: %pa\n", &pa);
+ dev_err(gart->dev, "GART: Invalid page: %pa\n", &pa);
spin_unlock_irqrestore(&gart->pte_lock, flags);
return -EINVAL;
}
@@ -301,7 +304,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
pte = gart_read_pte(gart, iova);
if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
spin_unlock_irqrestore(&gart->pte_lock, flags);
- dev_err(gart->dev, "Page entry is in-use\n");
+ dev_err(gart->dev, "GART: Page entry is in-use\n");
return -EBUSY;
}
}
@@ -344,7 +347,7 @@ static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,

pa = (pte & GART_PAGE_MASK);
if (!pfn_valid(__phys_to_pfn(pa))) {
- dev_err(gart->dev, "No entry for %08llx:%pa\n",
+ dev_err(gart->dev, "GART: No entry for %08llx:%pa\n",
(unsigned long long)iova, &pa);
gart_dump_table(gart);
return -EINVAL;
@@ -455,19 +458,17 @@ struct gart_device *tegra_gart_probe(struct device *dev,
res_remap = platform_get_resource(to_platform_device(dev),
IORESOURCE_MEM, 1);
if (!res_remap) {
- dev_err(dev, "GART memory aperture expected\n");
+ dev_err(dev, "GART: Memory aperture resource unavailable\n");
return ERR_PTR(-ENXIO);
}

gart = kzalloc(sizeof(*gart), GFP_KERNEL);
- if (!gart) {
- dev_err(dev, "failed to allocate gart_device\n");
+ if (!gart)
return ERR_PTR(-ENOMEM);
- }

ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
if (ret) {
- dev_err(dev, "Failed to register IOMMU in sysfs\n");
+ dev_err(dev, "GART: Failed to register IOMMU sysfs\n");
goto free_gart;
}

@@ -476,7 +477,7 @@ struct gart_device *tegra_gart_probe(struct device *dev,

ret = iommu_device_register(&gart->iommu);
if (ret) {
- dev_err(dev, "Failed to register IOMMU\n");
+ dev_err(dev, "GART: Failed to register IOMMU\n");
goto remove_sysfs;
}

@@ -491,7 +492,6 @@ struct gart_device *tegra_gart_probe(struct device *dev,

gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
if (!gart->savedata) {
- dev_err(dev, "failed to allocate context save area\n");
ret = -ENOMEM;
goto unregister_iommu;
}
--
2.18.0


2018-08-18 15:58:01

by Dmitry Osipenko

Subject: [PATCH v3 17/19] iommu/tegra: gart: Don't detach devices from inactive domains

There could be an unlimited number of allocated domains, but only one domain
can be active at a time. Hence devices must be detached only from the
active domain.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 284cddf90888..306e9644a676 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -167,7 +167,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
struct device *dev)
{
struct gart_domain *gart_domain = to_gart_domain(domain);
- struct gart_device *gart = gart_domain->gart;
+ struct gart_device *gart = gart_handle;
struct gart_client *client, *c;
int err = 0;

@@ -192,6 +192,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
goto fail;
}
gart->active_domain = domain;
+ gart_domain->gart = gart;
list_add(&client->list, &gart->client);
spin_unlock(&gart->client_lock);
dev_dbg(gart->dev, "GART: Attached %s\n", dev_name(dev));
@@ -214,8 +215,10 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
if (c->dev == dev) {
list_del(&c->list);
kfree(c);
- if (list_empty(&gart->client))
+ if (list_empty(&gart->client)) {
gart->active_domain = NULL;
+ gart_domain->gart = NULL;
+ }
dev_dbg(gart->dev, "GART: Detached %s\n",
dev_name(dev));
return;
@@ -253,7 +256,6 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
if (!gart_domain)
return NULL;

- gart_domain->gart = gart;
gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
gart->page_count * GART_PAGE_SIZE - 1;
--
2.18.0


2018-08-18 15:58:22

by Dmitry Osipenko

Subject: [PATCH v3 14/19] iommu/tegra: gart: Allow only one active domain at a time

GART has a single address space that is shared by all devices, hence only
one domain can be active at a time.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1d45b023adea..9f7d3afb686f 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -55,6 +55,7 @@ struct gart_device {
spinlock_t pte_lock; /* for pagetable */
struct list_head client;
spinlock_t client_lock; /* for client list */
+ struct iommu_domain *active_domain; /* current active domain */
struct device *dev;

struct iommu_device iommu; /* IOMMU Core handle */
@@ -184,6 +185,12 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
goto fail;
}
}
+ if (gart->active_domain && gart->active_domain != domain) {
+ dev_err(gart->dev, "Only one domain can be active at a time\n");
+ err = -EINVAL;
+ goto fail;
+ }
+ gart->active_domain = domain;
list_add(&client->list, &gart->client);
spin_unlock(&gart->client_lock);
dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
@@ -206,6 +213,8 @@ static void __gart_iommu_detach_dev(struct iommu_domain *domain,
if (c->dev == dev) {
list_del(&c->list);
devm_kfree(gart->dev, c);
+ if (list_empty(&gart->client))
+ gart->active_domain = NULL;
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
return;
}
--
2.18.0
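
The rule introduced by this patch can be sketched outside the kernel as a small state machine. This is only an illustration: `toy_gart`, `toy_domain`, `toy_attach()` and `toy_detach()` are hypothetical stand-ins, and the client list is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct iommu_domain and struct gart_device. */
struct toy_domain { int id; };

struct toy_gart {
	struct toy_domain *active_domain; /* NULL while no device is attached */
	unsigned int nr_clients;          /* models the length of the client list */
};

/* Attach succeeds only if no domain is active or the same one already is. */
static int toy_attach(struct toy_gart *gart, struct toy_domain *domain)
{
	if (gart->active_domain && gart->active_domain != domain)
		return -EINVAL;

	gart->active_domain = domain;
	gart->nr_clients++;
	return 0;
}

/* Dropping the last client frees the hardware for another domain. */
static void toy_detach(struct toy_gart *gart, struct toy_domain *domain)
{
	/* Requests coming from an inactive domain are ignored. */
	if (gart->active_domain != domain || gart->nr_clients == 0)
		return;

	if (--gart->nr_clients == 0)
		gart->active_domain = NULL;
}
```

A second domain can attach only after every client of the first one has been detached.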


2018-08-18 15:58:30

by Dmitry Osipenko

Subject: [PATCH v3 12/19] iommu/tegra: gart: Fix spinlock recursion

Fix a spinlock recursion bug that happens on IOMMU domain destruction if
any of the allocated domains has devices attached to it.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1c89b20ba4bb..e6fe139576c3 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -195,25 +195,33 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
return err;
}

-static void gart_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
+static void __gart_iommu_detach_dev(struct iommu_domain *domain,
+ struct device *dev)
{
struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_domain->gart;
struct gart_client *c;

- spin_lock(&gart->client_lock);
-
list_for_each_entry(c, &gart->client, list) {
if (c->dev == dev) {
list_del(&c->list);
devm_kfree(gart->dev, c);
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
- goto out;
+ return;
}
}
- dev_err(gart->dev, "Couldn't find\n");
-out:
+
+ dev_err(gart->dev, "Couldn't find %s to detach\n", dev_name(dev));
+}
+
+static void gart_iommu_detach_dev(struct iommu_domain *domain,
+ struct device *dev)
+{
+ struct gart_domain *gart_domain = to_gart_domain(domain);
+ struct gart_device *gart = gart_domain->gart;
+
+ spin_lock(&gart->client_lock);
+ __gart_iommu_detach_dev(domain, dev);
spin_unlock(&gart->client_lock);
}

@@ -253,7 +261,7 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
struct gart_client *c;

list_for_each_entry(c, &gart->client, list)
- gart_iommu_detach_dev(domain, c->dev);
+ __gart_iommu_detach_dev(domain, c->dev);
}
spin_unlock(&gart->client_lock);
}
--
2.18.0
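
The locked/unlocked helper split used by this fix can be illustrated with a toy model. All names here are hypothetical, and the "lock" merely counts recursive acquisitions where a real spinlock would deadlock; the kernel spells the helper with a `__` prefix, which is avoided here because such identifiers are reserved in user-space C.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: records a recursion instead of deadlocking. */
struct toy_lock {
	bool held;
	unsigned int recursions;
};

static void toy_lock_acquire(struct toy_lock *lock)
{
	if (lock->held)
		lock->recursions++; /* a real spinlock would spin forever here */
	lock->held = true;
}

static void toy_lock_release(struct toy_lock *lock)
{
	lock->held = false;
}

struct toy_gart {
	struct toy_lock client_lock;
	unsigned int nr_clients;
};

/* Helper: the caller must already hold client_lock. */
static void toy_detach_locked(struct toy_gart *gart)
{
	if (gart->nr_clients)
		gart->nr_clients--;
}

/* Public entry point: takes the lock, then calls the helper. */
static void toy_detach(struct toy_gart *gart)
{
	toy_lock_acquire(&gart->client_lock);
	toy_detach_locked(gart);
	toy_lock_release(&gart->client_lock);
}

/* Teardown already holds the lock, so it must call the helper directly. */
static void toy_domain_free(struct toy_gart *gart)
{
	toy_lock_acquire(&gart->client_lock);
	while (gart->nr_clients)
		toy_detach_locked(gart);
	toy_lock_release(&gart->client_lock);
}
```

Calling `toy_detach()` from inside `toy_domain_free()` would bump `recursions`, which corresponds to the bug being fixed.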


2018-08-18 15:58:31

by Dmitry Osipenko

Subject: [PATCH v3 13/19] iommu/tegra: gart: Fix NULL pointer dereference

Fix a NULL pointer dereference on IOMMU domain destruction that happens
because the clients list is iterated unsafely: its elements are deleted
during the iteration.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index e6fe139576c3..1d45b023adea 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -258,9 +258,9 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
if (gart) {
spin_lock(&gart->client_lock);
if (!list_empty(&gart->client)) {
- struct gart_client *c;
+ struct gart_client *c, *tmp;

- list_for_each_entry(c, &gart->client, list)
+ list_for_each_entry_safe(c, tmp, &gart->client, list)
__gart_iommu_detach_dev(domain, c->dev);
}
spin_unlock(&gart->client_lock);
--
2.18.0
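
Why the `_safe` iterator matters can be shown with a stripped-down list. This is a sketch under assumptions: `toy_client` is a hypothetical singly linked node, whereas the driver uses `struct list_head`; the point is only that the successor must be saved before the current node is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked client list; the kernel uses struct list_head. */
struct toy_client {
	int id;
	struct toy_client *next;
};

/* Push a new client at the head; returns 0 on success, -1 if malloc fails. */
static int toy_client_add(struct toy_client **head, int id)
{
	struct toy_client *c = malloc(sizeof(*c));

	if (!c)
		return -1;
	c->id = id;
	c->next = *head;
	*head = c;
	return 0;
}

/*
 * Free every client and return how many were freed. The successor is
 * saved in 'tmp' before the current node is freed, which is the idea
 * behind list_for_each_entry_safe(); reading c->next after free(c)
 * would be a use-after-free.
 */
static unsigned int toy_client_free_all(struct toy_client **head)
{
	struct toy_client *c, *tmp;
	unsigned int freed = 0;

	for (c = *head; c; c = tmp) {
		tmp = c->next;
		free(c);
		freed++;
	}
	*head = NULL;
	return freed;
}
```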


2018-08-18 15:59:01

by Dmitry Osipenko

Subject: [PATCH v3 08/19] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20

This fixes the irrelevant "tegra-mc 7000f000.memory-controller: no memory
timings for RAM code 0 registered" warning message during kernel
boot-up on Tegra20.

Fixes: a8d502fd3348 ("memory: tegra: Squash tegra20-mc into common tegra-mc driver")
Signed-off-by: Dmitry Osipenko <[email protected]>
Acked-by: Jon Hunter <[email protected]>
---
drivers/memory/tegra/mc.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index bd25faf6d13d..e56862495f36 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -664,12 +664,13 @@ static int tegra_mc_probe(struct platform_device *pdev)
}

isr = tegra_mc_irq;
- }

- err = tegra_mc_setup_timings(mc);
- if (err < 0) {
- dev_err(&pdev->dev, "failed to setup timings: %d\n", err);
- return err;
+ err = tegra_mc_setup_timings(mc);
+ if (err < 0) {
+ dev_err(&pdev->dev, "failed to setup timings: %d\n",
+ err);
+ return err;
+ }
}

mc->irq = platform_get_irq(pdev, 0);
--
2.18.0


2018-08-18 15:59:09

by Dmitry Osipenko

Subject: [PATCH v3 11/19] iommu/tegra: gart: Integrate with Memory Controller driver

The device-tree binding has been changed. There is no separate GART device
anymore, it is squashed into the Memory Controller. Integrate the GART module
with the MC in the same way as it is done for the SMMU of Tegra30+.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/Kconfig | 1 +
drivers/iommu/tegra-gart.c | 98 ++++++++++----------------------------
drivers/memory/tegra/mc.c | 41 ++++++++++++++++
include/soc/tegra/mc.h | 27 +++++++++++
4 files changed, 93 insertions(+), 74 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c60395b7470f..33f97e5f07ca 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -269,6 +269,7 @@ config ROCKCHIP_IOMMU
config TEGRA_IOMMU_GART
bool "Tegra GART IOMMU Support"
depends on ARCH_TEGRA_2x_SOC
+ depends on TEGRA_MC
select IOMMU_API
help
Enables support for remapping discontiguous physical memory
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 86a855c0d031..1c89b20ba4bb 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -21,11 +21,13 @@
#include <linux/iommu.h>
#include <linux/list.h>
#include <linux/module.h>
-#include <linux/of_device.h>
+#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/vmalloc.h>

+#include <soc/tegra/mc.h>
+
/* bitmap of the page sizes currently supported */
#define GART_IOMMU_PGSIZES (SZ_4K)

@@ -394,9 +396,8 @@ static const struct iommu_ops gart_iommu_ops = {
.iotlb_sync = gart_iommu_sync,
};

-static int tegra_gart_suspend(struct device *dev)
+int tegra_gart_suspend(struct gart_device *gart)
{
- struct gart_device *gart = dev_get_drvdata(dev);
unsigned long iova;
u32 *data = gart->savedata;
unsigned long flags;
@@ -408,9 +409,8 @@ static int tegra_gart_suspend(struct device *dev)
return 0;
}

-static int tegra_gart_resume(struct device *dev)
+int tegra_gart_resume(struct gart_device *gart)
{
- struct gart_device *gart = dev_get_drvdata(dev);
unsigned long flags;

spin_lock_irqsave(&gart->pte_lock, flags);
@@ -419,41 +419,39 @@ static int tegra_gart_resume(struct device *dev)
return 0;
}

-static int tegra_gart_probe(struct platform_device *pdev)
+struct gart_device *tegra_gart_probe(struct device *dev,
+ const struct tegra_smmu_soc *soc,
+ struct tegra_mc *mc)
{
struct gart_device *gart;
- struct resource *res, *res_remap;
+ struct resource *res_remap;
void __iomem *gart_regs;
- struct device *dev = &pdev->dev;
int ret;

BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);

+ /* Tegra30+ has an SMMU and no GART */
+ if (soc)
+ return NULL;
+
/* the GART memory aperture is required */
- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- res_remap = platform_get_resource(pdev, IORESOURCE_MEM, 1);
- if (!res || !res_remap) {
+ res_remap = platform_get_resource(to_platform_device(dev),
+ IORESOURCE_MEM, 1);
+ if (!res_remap) {
dev_err(dev, "GART memory aperture expected\n");
- return -ENXIO;
+ return ERR_PTR(-ENXIO);
}

gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
if (!gart) {
dev_err(dev, "failed to allocate gart_device\n");
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
}

- gart_regs = devm_ioremap(dev, res->start, resource_size(res));
- if (!gart_regs) {
- dev_err(dev, "failed to remap GART registers\n");
- return -ENXIO;
- }
-
- ret = iommu_device_sysfs_add(&gart->iommu, &pdev->dev, NULL,
- dev_name(&pdev->dev));
+ ret = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart");
if (ret) {
dev_err(dev, "Failed to register IOMMU in sysfs\n");
- return ret;
+ return ERR_PTR(ret);
}

iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
@@ -465,7 +463,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
goto remove_sysfs;
}

- gart->dev = &pdev->dev;
+ gart->dev = dev;
+ gart_regs = mc->regs + GART_REG_BASE;
spin_lock_init(&gart->pte_lock);
spin_lock_init(&gart->client_lock);
INIT_LIST_HEAD(&gart->client);
@@ -480,72 +479,23 @@ static int tegra_gart_probe(struct platform_device *pdev)
goto unregister_iommu;
}

- platform_set_drvdata(pdev, gart);
do_gart_setup(gart, NULL);

gart_handle = gart;

- return 0;
+ return gart;

unregister_iommu:
iommu_device_unregister(&gart->iommu);
remove_sysfs:
iommu_device_sysfs_remove(&gart->iommu);

- return ret;
-}
-
-static int tegra_gart_remove(struct platform_device *pdev)
-{
- struct gart_device *gart = platform_get_drvdata(pdev);
-
- iommu_device_unregister(&gart->iommu);
- iommu_device_sysfs_remove(&gart->iommu);
-
- writel(0, gart->regs + GART_CONFIG);
- if (gart->savedata)
- vfree(gart->savedata);
- gart_handle = NULL;
- return 0;
-}
-
-static const struct dev_pm_ops tegra_gart_pm_ops = {
- .suspend = tegra_gart_suspend,
- .resume = tegra_gart_resume,
-};
-
-static const struct of_device_id tegra_gart_of_match[] = {
- { .compatible = "nvidia,tegra20-gart", },
- { },
-};
-MODULE_DEVICE_TABLE(of, tegra_gart_of_match);
-
-static struct platform_driver tegra_gart_driver = {
- .probe = tegra_gart_probe,
- .remove = tegra_gart_remove,
- .driver = {
- .name = "tegra-gart",
- .pm = &tegra_gart_pm_ops,
- .of_match_table = tegra_gart_of_match,
- },
-};
-
-static int tegra_gart_init(void)
-{
- return platform_driver_register(&tegra_gart_driver);
-}
-
-static void __exit tegra_gart_exit(void)
-{
- platform_driver_unregister(&tegra_gart_driver);
+ return ERR_PTR(ret);
}

-subsys_initcall(tegra_gart_init);
-module_exit(tegra_gart_exit);
module_param(gart_debug, bool, 0644);

MODULE_PARM_DESC(gart_debug, "Enable GART debugging");
MODULE_DESCRIPTION("IOMMU API for GART in Tegra20");
MODULE_AUTHOR("Hiroshi DOYU <[email protected]>");
-MODULE_ALIAS("platform:tegra-gart");
MODULE_LICENSE("GPL v2");
diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index ab383d7a7842..f2ace46d951d 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -706,13 +706,54 @@ static int tegra_mc_probe(struct platform_device *pdev)
PTR_ERR(mc->smmu));
}

+ if (IS_ENABLED(CONFIG_TEGRA_IOMMU_GART)) {
+ mc->gart = tegra_gart_probe(&pdev->dev, mc->soc->smmu, mc);
+ if (IS_ERR(mc->gart))
+ dev_err(&pdev->dev, "failed to probe GART: %ld\n",
+ PTR_ERR(mc->gart));
+ }
+
+ return 0;
+}
+
+static int tegra_mc_suspend(struct device *dev)
+{
+ struct tegra_mc *mc = dev_get_drvdata(dev);
+ int err;
+
+ if (mc->gart) {
+ err = tegra_gart_suspend(mc->gart);
+ if (err)
+ return err;
+ }
+
return 0;
}

+static int tegra_mc_resume(struct device *dev)
+{
+ struct tegra_mc *mc = dev_get_drvdata(dev);
+ int err;
+
+ if (mc->gart) {
+ err = tegra_gart_resume(mc->gart);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static const struct dev_pm_ops tegra_mc_pm_ops = {
+ .suspend = tegra_mc_suspend,
+ .resume = tegra_mc_resume,
+};
+
static struct platform_driver tegra_mc_driver = {
.driver = {
.name = "tegra-mc",
.of_match_table = tegra_mc_of_match,
+ .pm = &tegra_mc_pm_ops,
.suppress_bind_attrs = true,
},
.prevent_deferred_probe = true,
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index db5bfdf589b4..5da42e3fb801 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -77,6 +77,7 @@ struct tegra_smmu_soc {

struct tegra_mc;
struct tegra_smmu;
+struct gart_device;

#ifdef CONFIG_TEGRA_IOMMU_SMMU
struct tegra_smmu *tegra_smmu_probe(struct device *dev,
@@ -96,6 +97,31 @@ static inline void tegra_smmu_remove(struct tegra_smmu *smmu)
}
#endif

+#ifdef CONFIG_TEGRA_IOMMU_GART
+struct gart_device *tegra_gart_probe(struct device *dev,
+ const struct tegra_smmu_soc *soc,
+ struct tegra_mc *mc);
+int tegra_gart_suspend(struct gart_device *gart);
+int tegra_gart_resume(struct gart_device *gart);
+#else
+static inline struct gart_device *
+tegra_gart_probe(struct device *dev, const struct tegra_smmu_soc *soc,
+ struct tegra_mc *mc)
+{
+ return NULL;
+}
+
+static inline int tegra_gart_suspend(struct gart_device *gart)
+{
+ return -ENODEV;
+}
+
+static inline int tegra_gart_resume(struct gart_device *gart)
+{
+ return -ENODEV;
+}
+#endif
+
struct tegra_mc_reset {
const char *name;
unsigned long id;
@@ -144,6 +170,7 @@ struct tegra_mc_soc {
struct tegra_mc {
struct device *dev;
struct tegra_smmu *smmu;
+ struct gart_device *gart;
void __iomem *regs;
struct clk *clk;
int irq;
--
2.18.0


2018-08-18 15:59:16

by Dmitry Osipenko

Subject: [PATCH v3 10/19] memory: tegra: Read client ID on GART page fault

With the device tree binding changes, the Memory Controller now has access to
the GART registers. Hence it is now possible to read the client ID on a GART
page fault to find out which memory client caused the fault.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/memory/tegra/mc.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 3bf3138769f4..ab383d7a7842 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -38,6 +38,7 @@

#define MC_ERR_ADR 0x0c

+#define MC_GART_ERROR_REQ 0x30
#define MC_DECERR_EMEM_OTHERS_STATUS 0x58
#define MC_SECURITY_VIOLATION_STATUS 0x74

@@ -575,8 +576,15 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, void *data)
break;

case MC_INT_INVALID_GART_PAGE:
- dev_err_ratelimited(mc->dev, "%s\n", error);
- continue;
+ reg = MC_GART_ERROR_REQ;
+ value = mc_readl(mc, reg);
+
+ id = (value >> 1) & mc->soc->client_id_mask;
+ desc = error_names[2];
+
+ if (value & BIT(0))
+ direction = "write";
+ break;

case MC_INT_SECURITY_VIOLATION:
reg = MC_SECURITY_VIOLATION_STATUS;
--
2.18.0


2018-08-18 15:59:24

by Dmitry Osipenko

Subject: [PATCH v3 06/19] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

Splitting GART and Memory Controller wasn't a good decision that was made
back in the day. Given that the GART driver hasn't ever been used by
anything in the kernel, we decided that it will be better to correct the
mistakes of the past and merge two bindings into a single one. In a result
there is a DT ABI change for the Memory Controller that allows not to
break newer kernels using older DT by introducing a new required property,
the memory clock. Adding the new clock property also puts the tegra20-mc
binding in line with the bindings of the later Tegra generations.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
.../bindings/iommu/nvidia,tegra20-gart.txt | 14 -----------
.../memory-controllers/nvidia,tegra20-mc.txt | 23 ++++++++++++++-----
2 files changed, 17 insertions(+), 20 deletions(-)
delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
deleted file mode 100644
index 099d9362ebc1..000000000000
--- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
+++ /dev/null
@@ -1,14 +0,0 @@
-NVIDIA Tegra 20 GART
-
-Required properties:
-- compatible: "nvidia,tegra20-gart"
-- reg: Two pairs of cells specifying the physical address and size of
- the memory controller registers and the GART aperture respectively.
-
-Example:
-
- gart {
- compatible = "nvidia,tegra20-gart";
- reg = <0x7000f024 0x00000018 /* controller registers */
- 0x58000000 0x02000000>; /* GART aperture */
- };
diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
index 7d60a50a4fa1..1564df89392a 100644
--- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
@@ -2,25 +2,36 @@ NVIDIA Tegra20 MC(Memory Controller)

Required properties:
- compatible : "nvidia,tegra20-mc"
-- reg : Should contain 2 register ranges(address and length); see the
- example below. Note that the MC registers are interleaved with the
- GART registers, and hence must be represented as multiple ranges.
+- reg : Should contain 2 register ranges: physical base address and length of
+ the controller's registers and the GART aperture respectively.
+- clocks: Must contain an entry for each entry in clock-names.
+ See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+ - mc: the module's clock input
- interrupts : Should contain MC General interrupt.
- #reset-cells : Should be 1. This cell represents memory client module ID.
The assignments may be found in header file <dt-bindings/memory/tegra20-mc.h>
or in the TRM documentation.
+- #iommu-cells: Should be 0. This cell represents the number of cells in an
+ IOMMU specifier needed to encode an address. GART supports only a single
+ address space that is shared by all devices, therefore no additional
+ information needed for the address encoding.

Example:
mc: memory-controller@7000f000 {
compatible = "nvidia,tegra20-mc";
- reg = <0x7000f000 0x024
- 0x7000f03c 0x3c4>;
- interrupts = <0 77 0x04>;
+ reg = <0x7000f000 0x400 /* controller registers */
+ 0x58000000 0x02000000>; /* GART aperture */
+ clocks = <&tegra_car TEGRA20_CLK_MC>;
+ clock-names = "mc";
+ interrupts = <GIC_SPI 77 0x04>;
#reset-cells = <1>;
+ #iommu-cells = <0>;
};

video-codec@6001a000 {
compatible = "nvidia,tegra20-vde";
...
resets = <&mc TEGRA20_MC_RESET_VDE>;
+ iommus = <&mc>;
};
--
2.18.0


2018-08-18 15:59:32

by Dmitry Osipenko

Subject: [PATCH v3 03/19] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT

GART can't handle all devices, hence ignore devices that aren't related
to GART. The IOMMU phandle must be explicitly assigned to devices in the
device tree.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index e9524ed264cf..f6cf5cd5aaca 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -342,8 +342,12 @@ static bool gart_iommu_capable(enum iommu_cap cap)

static int gart_iommu_add_device(struct device *dev)
{
- struct iommu_group *group = iommu_group_get_for_dev(dev);
+ struct iommu_group *group;

+ if (!dev->iommu_fwspec)
+ return -ENODEV;
+
+ group = iommu_group_get_for_dev(dev);
if (IS_ERR(group))
return PTR_ERR(group);

@@ -360,6 +364,12 @@ static void gart_iommu_remove_device(struct device *dev)
iommu_device_unlink(&gart_handle->iommu, dev);
}

+static int gart_iommu_of_xlate(struct device *dev,
+ struct of_phandle_args *args)
+{
+ return 0;
+}
+
static const struct iommu_ops gart_iommu_ops = {
.capable = gart_iommu_capable,
.domain_alloc = gart_iommu_domain_alloc,
@@ -373,6 +383,7 @@ static const struct iommu_ops gart_iommu_ops = {
.unmap = gart_iommu_unmap,
.iova_to_phys = gart_iommu_iova_to_phys,
.pgsize_bitmap = GART_IOMMU_PGSIZES,
+ .of_xlate = gart_iommu_of_xlate,
};

static int tegra_gart_suspend(struct device *dev)
@@ -438,6 +449,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
}

iommu_device_set_ops(&gart->iommu, &gart_iommu_ops);
+ iommu_device_set_fwnode(&gart->iommu, dev->fwnode);

ret = iommu_device_register(&gart->iommu);
if (ret) {
--
2.18.0


2018-08-18 15:59:36

by Dmitry Osipenko

Subject: [PATCH v3 07/19] ARM: dts: tegra20: Update Memory Controller node to the new binding

The device tree binding of the Memory Controller has been changed: GART has
been squashed into the MC, and there are new mandatory clock and #iommu-cells
properties.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
arch/arm/boot/dts/tegra20.dtsi | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 979f38293fe5..74f6e52291c5 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -617,16 +617,13 @@

mc: memory-controller@7000f000 {
compatible = "nvidia,tegra20-mc";
- reg = <0x7000f000 0x024
- 0x7000f03c 0x3c4>;
+ reg = <0x7000f000 0x400 /* controller registers */
+ 0x58000000 0x02000000>; /* GART aperture */
+ clocks = <&tegra_car TEGRA20_CLK_MC>;
+ clock-names = "mc";
interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
#reset-cells = <1>;
- };
-
- iommu@7000f024 {
- compatible = "nvidia,tegra20-gart";
- reg = <0x7000f024 0x00000018 /* controller registers */
- 0x58000000 0x02000000>; /* GART aperture */
+ #iommu-cells = <0>;
};

memory-controller@7000f400 {
--
2.18.0


2018-08-18 15:59:53

by Dmitry Osipenko

Subject: [PATCH v3 01/19] iommu/tegra: gart: Remove pr_fmt and clean up includes

Remove unneeded header inclusions and sort the headers in alphabetical order.
Remove the pr_fmt macro since there is no pr_*() in the code and it doesn't
affect dev_*() functions.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 7b1361d57a17..6dda7ee1d36c 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -17,21 +17,14 @@
* 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
*/

-#define pr_fmt(fmt) "%s(): " fmt, __func__
-
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/list.h>
#include <linux/module.h>
-#include <linux/platform_device.h>
-#include <linux/spinlock.h>
+#include <linux/of_device.h>
#include <linux/slab.h>
+#include <linux/spinlock.h>
#include <linux/vmalloc.h>
-#include <linux/mm.h>
-#include <linux/list.h>
-#include <linux/device.h>
-#include <linux/io.h>
-#include <linux/iommu.h>
-#include <linux/of.h>
-
-#include <asm/cacheflush.h>

/* bitmap of the page sizes currently supported */
#define GART_IOMMU_PGSIZES (SZ_4K)
--
2.18.0


2018-08-18 16:00:05

by Dmitry Osipenko

Subject: [PATCH v3 09/19] memory: tegra: Adapt to Tegra20 device-tree binding changes

The tegra20-mc DT binding has been changed: GART has been squashed
into the Memory Controller and the clock property is now mandatory for
Tegra20. Adapt the driver to the DT changes.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/memory/tegra/mc.c | 19 +++++++------------
drivers/memory/tegra/mc.h | 6 ------
include/soc/tegra/mc.h | 2 +-
3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index e56862495f36..3bf3138769f4 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -638,24 +638,19 @@ static int tegra_mc_probe(struct platform_device *pdev)
if (IS_ERR(mc->regs))
return PTR_ERR(mc->regs);

+ mc->clk = devm_clk_get(&pdev->dev, "mc");
+ if (IS_ERR(mc->clk)) {
+ dev_err(&pdev->dev, "failed to get MC clock: %ld\n",
+ PTR_ERR(mc->clk));
+ return PTR_ERR(mc->clk);
+ }
+
#ifdef CONFIG_ARCH_TEGRA_2x_SOC
if (mc->soc == &tegra20_mc_soc) {
- res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
- mc->regs2 = devm_ioremap_resource(&pdev->dev, res);
- if (IS_ERR(mc->regs2))
- return PTR_ERR(mc->regs2);
-
isr = tegra20_mc_irq;
} else
#endif
{
- mc->clk = devm_clk_get(&pdev->dev, "mc");
- if (IS_ERR(mc->clk)) {
- dev_err(&pdev->dev, "failed to get MC clock: %ld\n",
- PTR_ERR(mc->clk));
- return PTR_ERR(mc->clk);
- }
-
err = tegra_mc_setup_latency_allowance(mc);
if (err < 0) {
dev_err(&pdev->dev, "failed to setup latency allowance: %d\n",
diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 01065f12ebeb..9856f085e487 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -26,18 +26,12 @@

static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
{
- if (mc->regs2 && offset >= 0x24)
- return readl(mc->regs2 + offset - 0x3c);
-
return readl(mc->regs + offset);
}

static inline void mc_writel(struct tegra_mc *mc, u32 value,
unsigned long offset)
{
- if (mc->regs2 && offset >= 0x24)
- return writel(value, mc->regs2 + offset - 0x3c);
-
writel(value, mc->regs + offset);
}

diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index b43f37fea096..db5bfdf589b4 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -144,7 +144,7 @@ struct tegra_mc_soc {
struct tegra_mc {
struct device *dev;
struct tegra_smmu *smmu;
- void __iomem *regs, *regs2;
+ void __iomem *regs;
struct clk *clk;
int irq;

--
2.18.0


2018-08-18 16:00:13

by Dmitry Osipenko

Subject: [PATCH v3 02/19] iommu/tegra: gart: Clean up driver probe errors handling

Properly clean up allocated resources on driver probe failure and
remove unneeded checks.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/iommu/tegra-gart.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 6dda7ee1d36c..e9524ed264cf 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -408,9 +408,6 @@ static int tegra_gart_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
int ret;

- if (gart_handle)
- return -EIO;
-
BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);

/* the GART memory aperture is required */
@@ -445,8 +442,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
ret = iommu_device_register(&gart->iommu);
if (ret) {
dev_err(dev, "Failed to register IOMMU\n");
- iommu_device_sysfs_remove(&gart->iommu);
- return ret;
+ goto remove_sysfs;
}

gart->dev = &pdev->dev;
@@ -460,7 +456,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
if (!gart->savedata) {
dev_err(dev, "failed to allocate context save area\n");
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto unregister_iommu;
}

platform_set_drvdata(pdev, gart);
@@ -469,6 +466,13 @@ static int tegra_gart_probe(struct platform_device *pdev)
gart_handle = gart;

return 0;
+
+unregister_iommu:
+ iommu_device_unregister(&gart->iommu);
+remove_sysfs:
+ iommu_device_sysfs_remove(&gart->iommu);
+
+ return ret;
}

static int tegra_gart_remove(struct platform_device *pdev)
--
2.18.0


2018-08-18 16:00:44

by Dmitry Osipenko

Subject: [PATCH v3 04/19] iommu: Introduce iotlb_sync_map callback

Introduce an iotlb_sync_map() callback that is invoked at the end of
iommu_map(). This new callback allows IOMMU drivers to avoid syncing
after mapping each contiguous chunk and to sync only once the whole
mapping is completed, optimizing performance of the mapping operation.

Signed-off-by: Dmitry Osipenko <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
---
drivers/iommu/iommu.c | 8 ++++++--
include/linux/iommu.h | 1 +
2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8c15c5980299..8979b16caf61 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1545,13 +1545,14 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
int iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
{
+ const struct iommu_ops *ops = domain->ops;
unsigned long orig_iova = iova;
unsigned int min_pagesz;
size_t orig_size = size;
phys_addr_t orig_paddr = paddr;
int ret = 0;

- if (unlikely(domain->ops->map == NULL ||
+ if (unlikely(ops->map == NULL ||
domain->pgsize_bitmap == 0UL))
return -ENODEV;

@@ -1580,7 +1581,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
iova, &paddr, pgsize);

- ret = domain->ops->map(domain, iova, paddr, pgsize, prot);
+ ret = ops->map(domain, iova, paddr, pgsize, prot);
if (ret)
break;

@@ -1589,6 +1590,9 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
size -= pgsize;
}

+ if (ops->iotlb_sync_map)
+ ops->iotlb_sync_map(domain);
+
/* unroll mapping in case something went wrong */
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 87994c265bf5..4c488eb69752 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -202,6 +202,7 @@ struct iommu_ops {
void (*flush_iotlb_all)(struct iommu_domain *domain);
void (*iotlb_range_add)(struct iommu_domain *domain,
unsigned long iova, size_t size);
+ void (*iotlb_sync_map)(struct iommu_domain *domain);
void (*iotlb_sync)(struct iommu_domain *domain);
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
int (*add_device)(struct device *dev);
--
2.18.0
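
The deferred sync described in the commit message can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: every `demo_`-prefixed type and function is a hypothetical stand-in, and it assumes a single page size and a mapping size that is a multiple of it. It shows only the control flow of the patch: the per-chunk `map` callback runs in a loop, and the optional `iotlb_sync_map` callback runs once after the loop.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for struct iommu_domain / iommu_ops. */
struct demo_domain;

struct demo_ops {
	int  (*map)(struct demo_domain *d, unsigned long iova,
		    unsigned long paddr, size_t size);
	void (*iotlb_sync_map)(struct demo_domain *d);	/* optional */
};

struct demo_domain {
	const struct demo_ops *ops;
	size_t pgsize;		/* the only page size this model supports */
	unsigned int mapped;	/* chunks mapped so far */
	unsigned int syncs;	/* number of syncs issued */
};

/* Pretend to write one page table entry. */
int demo_map_page(struct demo_domain *d, unsigned long iova,
		  unsigned long paddr, size_t size)
{
	(void)iova; (void)paddr; (void)size;
	d->mapped++;
	return 0;
}

/* Pretend to flush the hardware once. */
void demo_sync(struct demo_domain *d)
{
	d->syncs++;
}

const struct demo_ops demo_ops_with_sync = {
	.map = demo_map_page,
	.iotlb_sync_map = demo_sync,
};

const struct demo_ops demo_ops_no_sync = {
	.map = demo_map_page,
};

/* Same shape as the patched iommu_map(): map chunk by chunk, sync once. */
int demo_iommu_map(struct demo_domain *d, unsigned long iova,
		   unsigned long paddr, size_t size)
{
	const struct demo_ops *ops = d->ops;
	int ret = 0;

	if (!ops->map || d->pgsize == 0)
		return -ENODEV;
	if (size % d->pgsize)
		return -EINVAL;

	while (size) {
		ret = ops->map(d, iova, paddr, d->pgsize);
		if (ret)
			break;

		iova += d->pgsize;
		paddr += d->pgsize;
		size -= d->pgsize;
	}

	/* Drivers that do not provide the callback are unaffected. */
	if (ops->iotlb_sync_map)
		ops->iotlb_sync_map(d);

	return ret;
}
```

Mapping four pages through `demo_ops_with_sync` calls `demo_map_page()` four times and `demo_sync()` once; with `demo_ops_no_sync` no sync is issued at all.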


2018-08-20 19:13:32

by Rob Herring (Arm)

Subject: Re: [PATCH v3 06/19] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

On Sat, Aug 18, 2018 at 06:54:17PM +0300, Dmitry Osipenko wrote:
> Splitting GART and Memory Controller wasn't a good decision that was made
> back in the day. Given that the GART driver hasn't ever been used by
> anything in the kernel, we decided that it will be better to correct the
> mistakes of the past and merge two bindings into a single one. In a result

As a result...

> there is a DT ABI change for the Memory Controller that allows not to
> break newer kernels using older DT by introducing a new required property,
> the memory clock. Adding the new clock property also puts the tegra20-mc
> binding in line with the bindings of the later Tegra generations.

I don't understand this part. It looks to me like you are breaking
compatibility. The driver failing to probe with an old DT is okay?

OS's like OpenSUSE use new DTs with older kernel versions, so you should
consider how to not break them as well. I guess if all this is optional
or has been unused, then there shouldn't be a problem.

> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> .../bindings/iommu/nvidia,tegra20-gart.txt | 14 -----------
> .../memory-controllers/nvidia,tegra20-mc.txt | 23 ++++++++++++++-----
> 2 files changed, 17 insertions(+), 20 deletions(-)
> delete mode 100644 Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
>
> diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> deleted file mode 100644
> index 099d9362ebc1..000000000000
> --- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
> +++ /dev/null
> @@ -1,14 +0,0 @@
> -NVIDIA Tegra 20 GART
> -
> -Required properties:
> -- compatible: "nvidia,tegra20-gart"
> -- reg: Two pairs of cells specifying the physical address and size of
> - the memory controller registers and the GART aperture respectively.
> -
> -Example:
> -
> - gart {
> - compatible = "nvidia,tegra20-gart";
> - reg = <0x7000f024 0x00000018 /* controller registers */
> - 0x58000000 0x02000000>; /* GART aperture */
> - };
> diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> index 7d60a50a4fa1..1564df89392a 100644
> --- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> +++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
> @@ -2,25 +2,36 @@ NVIDIA Tegra20 MC(Memory Controller)
>
> Required properties:
> - compatible : "nvidia,tegra20-mc"
> -- reg : Should contain 2 register ranges(address and length); see the
> - example below. Note that the MC registers are interleaved with the
> - GART registers, and hence must be represented as multiple ranges.
> +- reg : Should contain 2 register ranges: physical base address and length of
> + the controller's registers and the GART aperture respectively.
> +- clocks: Must contain an entry for each entry in clock-names.
> + See ../clocks/clock-bindings.txt for details.
> +- clock-names: Must include the following entries:
> + - mc: the module's clock input
> - interrupts : Should contain MC General interrupt.
> - #reset-cells : Should be 1. This cell represents memory client module ID.
> The assignments may be found in header file <dt-bindings/memory/tegra20-mc.h>
> or in the TRM documentation.
> +- #iommu-cells: Should be 0. This cell represents the number of cells in an
> + IOMMU specifier needed to encode an address. GART supports only a single
> + address space that is shared by all devices, therefore no additional
> + information needed for the address encoding.
>
> Example:
> mc: memory-controller@7000f000 {
> compatible = "nvidia,tegra20-mc";
> - reg = <0x7000f000 0x024
> - 0x7000f03c 0x3c4>;
> - interrupts = <0 77 0x04>;
> + reg = <0x7000f000 0x400 /* controller registers */
> + 0x58000000 0x02000000>; /* GART aperture */
> + clocks = <&tegra_car TEGRA20_CLK_MC>;
> + clock-names = "mc";
> + interrupts = <GIC_SPI 77 0x04>;
> #reset-cells = <1>;
> + #iommu-cells = <0>;
> };
>
> video-codec@6001a000 {
> compatible = "nvidia,tegra20-vde";
> ...
> resets = <&mc TEGRA20_MC_RESET_VDE>;
> + iommus = <&mc>;
> };
> --
> 2.18.0
>

2018-08-20 19:29:13

by Dmitry Osipenko

Subject: Re: [PATCH v3 06/19] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

On 20.08.2018 22:12, Rob Herring wrote:
> On Sat, Aug 18, 2018 at 06:54:17PM +0300, Dmitry Osipenko wrote:
>> Splitting GART and Memory Controller wasn't a good decision that was made
>> back in the day. Given that the GART driver hasn't ever been used by
>> anything in the kernel, we decided that it will be better to correct the
>> mistakes of the past and merge two bindings into a single one. In a result
>
> As a result...
>
>> there is a DT ABI change for the Memory Controller that allows not to
>> break newer kernels using older DT by introducing a new required property,
>> the memory clock. Adding the new clock property also puts the tegra20-mc
>> binding in line with the bindings of the later Tegra generations.
>
> I don't understand this part. It looks to me like you are breaking
> compatibility. The driver failing to probe with an old DT is okay?

Yes, DT compatibility is broken. The new driver won't probe/load with the old
DT; that's what we want.

> OS's like OpenSUSE use new DTs with older kernel versions, so you should
> consider how to not break them as well. I guess if all this is optional
> or has been unused, then there shouldn't be a problem.

That's interesting. The Memory Controller isn't optional; I guess we could
change the compatible to "nvidia,tegra20-mc-gart".

Thierry, do you have any other suggestions?
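
The "new driver, old DT" case above can be modelled in user space. This is only a sketch under stated assumptions: the `demo_` types and functions are hypothetical, the device-tree node is reduced to a list of clock names, and the `-ENOENT` error code is an assumption rather than something stated in this thread. It mirrors the idea that the clock named "mc" is now looked up unconditionally, so a node without it fails to probe.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_CLOCKS 4

/* Hypothetical, heavily reduced view of a device-tree node. */
struct demo_dt_node {
	const char *compatible;
	const char *clock_names[DEMO_MAX_CLOCKS];	/* unused slots are NULL */
};

/* Stand-in for a clock lookup by name; the error code is an assumption. */
int demo_clk_lookup(const struct demo_dt_node *np, const char *name)
{
	for (size_t i = 0; i < DEMO_MAX_CLOCKS && np->clock_names[i]; i++) {
		if (strcmp(np->clock_names[i], name) == 0)
			return 0;
	}

	return -ENOENT;
}

/* Probe fails early when the now-mandatory "mc" clock is missing. */
int demo_mc_probe(const struct demo_dt_node *np)
{
	int err = demo_clk_lookup(np, "mc");

	if (err)
		return err;

	return 0;
}
```

A node written against the old binding has no `clock-names` entry, so `demo_mc_probe()` returns an error for it, while a node carrying "mc" probes successfully.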

2018-08-20 19:37:28

by Dmitry Osipenko

Subject: Re: [PATCH v3 06/19] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

On 20.08.2018 22:27, Dmitry Osipenko wrote:
> On 20.08.2018 22:12, Rob Herring wrote:
>> On Sat, Aug 18, 2018 at 06:54:17PM +0300, Dmitry Osipenko wrote:
>>> Splitting GART and Memory Controller wasn't a good decision that was made
>>> back in the day. Given that the GART driver hasn't ever been used by
>>> anything in the kernel, we decided that it will be better to correct the
>>> mistakes of the past and merge two bindings into a single one. In a result
>>
>> As a result...
>>
>>> there is a DT ABI change for the Memory Controller that allows not to
>>> break newer kernels using older DT by introducing a new required property,
>>> the memory clock. Adding the new clock property also puts the tegra20-mc
>>> binding in line with the bindings of the later Tegra generations.
>>
>> I don't understand this part. It looks to me like you are breaking
>> compatibility. The driver failing to probe with an old DT is okay?
>
> Yes, DT compatibility is broken. New driver won't probe/load with the old DT,
> that's what we want.
>
>> OS's like OpenSUSE use new DTs with older kernel versions, so you should
>> consider how to not break them as well. I guess if all this is optional
>> or has been unused, then there shouldn't be a problem.
>
> That's interesting.. Memory Controller isn't optional, I guess we could change
> compatible to "nvidia,tegra20-mc-gart".

* I meant that it's not optional in the sense that it's enabled in the kernel's
config by default and the driver is functional, but it's okay if the MC driver
stops probing with older kernels, as it is used only for reporting memory errors.

2018-08-28 10:49:26

by Thierry Reding

Subject: Re: [PATCH v3 06/19] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

On Mon, Aug 20, 2018 at 10:35:54PM +0300, Dmitry Osipenko wrote:
> On 20.08.2018 22:27, Dmitry Osipenko wrote:
> > On 20.08.2018 22:12, Rob Herring wrote:
> >> On Sat, Aug 18, 2018 at 06:54:17PM +0300, Dmitry Osipenko wrote:
> >>> Splitting GART and Memory Controller wasn't a good decision that was made
> >>> back in the day. Given that the GART driver hasn't ever been used by
> >>> anything in the kernel, we decided that it will be better to correct the
> >>> mistakes of the past and merge two bindings into a single one. In a result
> >>
> >> As a result...
> >>
> >>> there is a DT ABI change for the Memory Controller that allows not to
> >>> break newer kernels using older DT by introducing a new required property,
> >>> the memory clock. Adding the new clock property also puts the tegra20-mc
> >>> binding in line with the bindings of the later Tegra generations.
> >>
> >> I don't understand this part. It looks to me like you are breaking
> >> compatibility. The driver failing to probe with an old DT is okay?
> >
> > Yes, DT compatibility is broken. New driver won't probe/load with the old DT,
> > that's what we want.
> >
> >> OS's like OpenSUSE use new DTs with older kernel versions, so you should
> >> consider how to not break them as well. I guess if all this is optional
> >> or has been unused, then there shouldn't be a problem.
> >
> > That's interesting.. Memory Controller isn't optional, I guess we could change
> > compatible to "nvidia,tegra20-mc-gart".
>
> * I meant it's not optional in a sense that it's enabled in kernels config by
> default and driver is functional, but it's okay if MC driver will stop to probe
> with older kernels as it is used only for reporting memory errors.

Yeah, we don't really regress at runtime. The errors reported by the
current driver are very rare, and even if you encounter them, they're
pretty cryptic, so I think this is one of the exceptional cases where
breaking the ABI "for the greater good" is acceptable.

Thierry



2018-08-28 13:10:37

by Dmitry Osipenko

Subject: Re: [PATCH v3 06/19] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

On 28.08.2018 13:47, Thierry Reding wrote:
> On Mon, Aug 20, 2018 at 10:35:54PM +0300, Dmitry Osipenko wrote:
>> On 20.08.2018 22:27, Dmitry Osipenko wrote:
>>> On 20.08.2018 22:12, Rob Herring wrote:
>>>> On Sat, Aug 18, 2018 at 06:54:17PM +0300, Dmitry Osipenko wrote:
>>>>> Splitting GART and Memory Controller wasn't a good decision that was made
>>>>> back in the day. Given that the GART driver hasn't ever been used by
>>>>> anything in the kernel, we decided that it will be better to correct the
>>>>> mistakes of the past and merge two bindings into a single one. In a result
>>>>
>>>> As a result...
>>>>
>>>>> there is a DT ABI change for the Memory Controller that allows not to
>>>>> break newer kernels using older DT by introducing a new required property,
>>>>> the memory clock. Adding the new clock property also puts the tegra20-mc
>>>>> binding in line with the bindings of the later Tegra generations.
>>>>
>>>> I don't understand this part. It looks to me like you are breaking
>>>> compatibility. The driver failing to probe with an old DT is okay?
>>>
>>> Yes, DT compatibility is broken. New driver won't probe/load with the old DT,
>>> that's what we want.
>>>
>>>> OS's like OpenSUSE use new DTs with older kernel versions, so you should
>>>> consider how to not break them as well. I guess if all this is optional
>>>> or has been unused, then there shouldn't be a problem.
>>>
>>> That's interesting.. Memory Controller isn't optional, I guess we could change
>>> compatible to "nvidia,tegra20-mc-gart".
>>
>> * I meant it's not optional in a sense that it's enabled in kernels config by
>> default and driver is functional, but it's okay if MC driver will stop to probe
>> with older kernels as it is used only for reporting memory errors.
>
> Yeah, we don't really regress at runtime. The errors reported by the
> current driver are very rare, and even if you encounter them, they're
> pretty cryptic, so I think this is one of the exceptional cases where
> breaking the ABI "for the greater good" is acceptable.

It has now become apparent that factoring out EMC from MC isn't a good idea
either, because MC needs to interact with EMC and probably vice versa. It looks
like we should consider restructuring MC for all Tegras.

2018-09-03 21:07:30

by Marcel Ziswiler

Subject: Re: [PATCH v3 09/19] memory: tegra: Adapt to Tegra20 device-tree binding changes

On Sat, 2018-08-18 at 18:54 +0300, Dmitry Osipenko wrote:
> The tegra20-mc DT binding has been changed, GART has been squashed
> into Memory Controller and now the clock property is mandatory for
> Tegra20. Adapt driver the to DT changes.

Minor nitpick concerning the commit message:

Adapt driver to the DT changes.

> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> drivers/memory/tegra/mc.c | 19 +++++++------------
> drivers/memory/tegra/mc.h | 6 ------
> include/soc/tegra/mc.h | 2 +-
> 3 files changed, 8 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index e56862495f36..3bf3138769f4 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -638,24 +638,19 @@ static int tegra_mc_probe(struct
> platform_device *pdev)
> if (IS_ERR(mc->regs))
> return PTR_ERR(mc->regs);
>
> + mc->clk = devm_clk_get(&pdev->dev, "mc");
> + if (IS_ERR(mc->clk)) {
> + dev_err(&pdev->dev, "failed to get MC clock: %ld\n",
> + PTR_ERR(mc->clk));
> + return PTR_ERR(mc->clk);
> + }
> +
> #ifdef CONFIG_ARCH_TEGRA_2x_SOC
> if (mc->soc == &tegra20_mc_soc) {
> - res = platform_get_resource(pdev, IORESOURCE_MEM,
> 1);
> - mc->regs2 = devm_ioremap_resource(&pdev->dev, res);
> - if (IS_ERR(mc->regs2))
> - return PTR_ERR(mc->regs2);
> -
> isr = tegra20_mc_irq;
> } else
> #endif
> {
> - mc->clk = devm_clk_get(&pdev->dev, "mc");
> - if (IS_ERR(mc->clk)) {
> - dev_err(&pdev->dev, "failed to get MC clock:
> %ld\n",
> - PTR_ERR(mc->clk));
> - return PTR_ERR(mc->clk);
> - }
> -
> err = tegra_mc_setup_latency_allowance(mc);
> if (err < 0) {
> dev_err(&pdev->dev, "failed to setup latency
> allowance: %d\n",
> diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
> index 01065f12ebeb..9856f085e487 100644
> --- a/drivers/memory/tegra/mc.h
> +++ b/drivers/memory/tegra/mc.h
> @@ -26,18 +26,12 @@
>
> static inline u32 mc_readl(struct tegra_mc *mc, unsigned long
> offset)
> {
> - if (mc->regs2 && offset >= 0x24)
> - return readl(mc->regs2 + offset - 0x3c);
> -
> return readl(mc->regs + offset);
> }
>
> static inline void mc_writel(struct tegra_mc *mc, u32 value,
> unsigned long offset)
> {
> - if (mc->regs2 && offset >= 0x24)
> - return writel(value, mc->regs2 + offset - 0x3c);
> -
> writel(value, mc->regs + offset);
> }
>
> diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
> index b43f37fea096..db5bfdf589b4 100644
> --- a/include/soc/tegra/mc.h
> +++ b/include/soc/tegra/mc.h
> @@ -144,7 +144,7 @@ struct tegra_mc_soc {
> struct tegra_mc {
> struct device *dev;
> struct tegra_smmu *smmu;
> - void __iomem *regs, *regs2;
> + void __iomem *regs;
> struct clk *clk;
> int irq;
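
The removed `regs2` translation in the quoted diff can be checked with a small user-space model. This is a sketch only: the `demo_` names are hypothetical, and a plain byte array stands in for the 0x400-byte register block at 0x7000f000. Under the old binding the MC registers were split into two ranges (0x000..0x023 and 0x03c..0x3ff) around the GART registers; under the new binding a single range covers the whole block. The model shows that both accessors return the same value for every offset that was reachable under the old layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MC_SIZE 0x400	/* size of the whole register block */

struct demo_mc {
	const uint8_t *regs;	/* start of the block (offset 0x00) */
	const uint8_t *regs2;	/* old layout only: window starting at 0x3c */
};

/* Read a 32-bit value from the model without alignment assumptions. */
uint32_t demo_read32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Fill the block so that each 32-bit word holds its own offset. */
void demo_fill(uint8_t *block)
{
	for (uint32_t off = 0; off < DEMO_MC_SIZE; off += 4)
		memcpy(block + off, &off, sizeof(off));
}

/* Old accessor: offsets at or above 0x24 are redirected to the second window. */
uint32_t demo_mc_readl_old(const struct demo_mc *mc, unsigned long offset)
{
	if (mc->regs2 && offset >= 0x24)
		return demo_read32(mc->regs2 + (offset - 0x3c));

	return demo_read32(mc->regs + offset);
}

/* New accessor: one contiguous range, no translation. */
uint32_t demo_mc_readl_new(const struct demo_mc *mc, unsigned long offset)
{
	return demo_read32(mc->regs + offset);
}
```

Offsets 0x24..0x3b belong to the GART registers and were never valid inputs for the old accessor, so the model should only be exercised with offsets below 0x24 or from 0x3c upwards.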

2018-09-04 08:57:59

by Dmitry Osipenko

Subject: Re: [PATCH v3 09/19] memory: tegra: Adapt to Tegra20 device-tree binding changes

On Tuesday 04 September 2018 00:06:01 Marcel Ziswiler wrote:
> On Sat, 2018-08-18 at 18:54 +0300, Dmitry Osipenko wrote:
> > The tegra20-mc DT binding has been changed, GART has been squashed
> > into Memory Controller and now the clock property is mandatory for
> > Tegra20. Adapt driver the to DT changes.
>
> Minor nitpick concerning the commit message:
>
> Adapt driver to the DT changes.

Yes, thank you.