From: Krishna Reddy <vdumpa@nvidia.com>
To: will.deacon@arm.com, robin.murphy@arm.com, joro@8bytes.org
Cc: linux-arm-kernel@lists.infradead.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, linux-tegra@vger.kernel.org, Thierry Reding, Yu-Huan Hsu, Sachin Nikam, Pritesh Raithatha, Timo Alho, Alexander Van Brunt, Nicolin Chen
Subject: RE: [PATCH v2 1/5] iommu/arm-smmu: rearrange arm-smmu.c code
Date: Thu, 15 Nov 2018 02:25:11 +0000
In-Reply-To: <1541029716-14353-2-git-send-email-vdumpa@nvidia.com>
References: <1541029716-14353-2-git-send-email-vdumpa@nvidia.com>
Hi Will,

Could you provide feedback on this V2 patch set? Your early feedback and
direction on how to get the Tegra194 SMMU driver into upstream would be
highly appreciated.

Thanks,
-KR

-----Original Message-----
From: Krishna Reddy
Sent: Wednesday, October 31, 2018 4:49 PM
To: will.deacon@arm.com; robin.murphy@arm.com; joro@8bytes.org
Cc: linux-arm-kernel@lists.infradead.org; iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org; linux-tegra@vger.kernel.org; Thierry Reding; Yu-Huan Hsu; Sachin Nikam; Pritesh Raithatha; Timo Alho; Alexander Van Brunt; Nicolin Chen; Krishna Reddy
Subject: [PATCH v2 1/5] iommu/arm-smmu: rearrange arm-smmu.c code

Rearrange the arm-smmu.c code into arm-smmu-common.h, arm-smmu-common.c and
arm-smmu.c. This rearrangement allows the common ARM SMMU driver code to be
shared with the dual-ARM-SMMU-based Tegra194 SMMU driver.
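For context, a rough sketch (not part of this patch) of how a Tegra194
front-end driver could sit on top of the shared code once the split is in
place. The file name, the compatible string, and the assumption that the
common layer exposes its device structure and a probe helper (called
arm_smmu_common_device_probe() below) are illustrative only; the actual
interface is whatever the later patches in this series define.

/* Illustrative sketch only; the names and exports below are assumptions. */
#include <linux/module.h>
#include <linux/of_device.h>
#include <linux/platform_device.h>

#include "arm-smmu-common.h"

static const struct of_device_id tegra194_smmu_of_match[] = {
	{ .compatible = "nvidia,tegra194-smmu" },	/* hypothetical binding */
	{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, tegra194_smmu_of_match);

static int tegra194_smmu_probe(struct platform_device *pdev)
{
	/*
	 * Apply any Tegra194-specific (dual-instance) setup here, then
	 * defer to the shared probe path assumed to be provided by
	 * arm-smmu-common.c.
	 */
	return arm_smmu_common_device_probe(pdev);
}

static struct platform_driver tegra194_smmu_driver = {
	.driver = {
		.name		= "tegra194-arm-smmu",
		.of_match_table	= tegra194_smmu_of_match,
	},
	.probe	= tegra194_smmu_probe,
};
module_platform_driver(tegra194_smmu_driver);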
Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 drivers/iommu/arm-smmu-common.c | 1922 +++++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu-common.h |  256 +++++
 drivers/iommu/arm-smmu.c        | 2133 +--------------------------------------
 3 files changed, 2180 insertions(+), 2131 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-common.c
 create mode 100644 drivers/iommu/arm-smmu-common.h

diff --git a/drivers/iommu/arm-smmu-common.c b/drivers/iommu/arm-smmu-common.c
new file mode 100644
index 0000000..1ad8e5f
--- /dev/null
+++ b/drivers/iommu/arm-smmu-common.c
@@ -0,0 +1,1922 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon
+ */
+
+static int force_stage;
+module_param(force_stage, int, S_IRUGO);
+MODULE_PARM_DESC(force_stage,
+	"Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation.");
+static bool disable_bypass;
+module_param(disable_bypass, bool, S_IRUGO);
+MODULE_PARM_DESC(disable_bypass,
+	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
+
+#define ARM_SMMU_MATCH_DATA(name, ver, imp)	\
+static struct arm_smmu_match_data name = { .version = ver, .model = imp }
+
+static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu);
+static void arm_smmu_tlb_sync_context(void *cookie);
+static irqreturn_t arm_smmu_context_fault(int irq, void *dev);
+static irqreturn_t arm_smmu_global_fault(int irq, void *dev);
+
+static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct arm_smmu_domain, domain);
+}
+
+static void parse_driver_options(struct arm_smmu_device *smmu)
+{
+	int i = 0;
+
+	do {
+		if (of_property_read_bool(smmu->dev->of_node,
+						arm_smmu_options[i].prop)) {
+			smmu->options |= arm_smmu_options[i].opt;
+			dev_notice(smmu->dev, "option %s\n",
+				arm_smmu_options[i].prop);
+		}
+	} while (arm_smmu_options[++i].opt);
+}
+
+static struct device_node *dev_get_dev_node(struct device *dev)
+{
+	if (dev_is_pci(dev)) {
+		struct pci_bus *bus = to_pci_dev(dev)->bus;
+
+		while (!pci_is_root_bus(bus))
+			bus = bus->parent;
+		return of_node_get(bus->bridge->parent->of_node);
+	}
+
+	return of_node_get(dev->of_node);
+}
+
+static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *data)
+{
+	*((__be32 *)data) = cpu_to_be32(alias);
+	return 0; /* Continue walking */
+}
+
+static int __find_legacy_master_phandle(struct device *dev, void *data)
+{
+	struct of_phandle_iterator *it = *(void **)data;
+	struct device_node *np = it->node;
+	int err;
+
+	of_for_each_phandle(it, err,
dev->of_node, "mmu-masters", + "#stream-id-cells", 0) + if (it->node =3D=3D np) { + *(void **)data =3D dev; + return 1; + } + it->node =3D np; + return err =3D=3D -ENOENT ? 0 : err; +} + +static struct platform_driver arm_smmu_driver; +static struct iommu_ops arm_smmu_ops; + +static int arm_smmu_register_legacy_master(struct device *dev, + struct arm_smmu_device **smmu) +{ + struct device *smmu_dev; + struct device_node *np; + struct of_phandle_iterator it; + void *data =3D ⁢ + u32 *sids; + __be32 pci_sid; + int err; + + np =3D dev_get_dev_node(dev); + if (!np || !of_find_property(np, "#stream-id-cells", NULL)) { + of_node_put(np); + return -ENODEV; + } + + it.node =3D np; + err =3D driver_for_each_device(&arm_smmu_driver.driver, NULL, &data, + __find_legacy_master_phandle); + smmu_dev =3D data; + of_node_put(np); + if (err =3D=3D 0) + return -ENODEV; + if (err < 0) + return err; + + if (dev_is_pci(dev)) { + /* "mmu-masters" assumes Stream ID =3D=3D Requester ID */ + pci_for_each_dma_alias(to_pci_dev(dev), __arm_smmu_get_pci_sid, + &pci_sid); + it.cur =3D &pci_sid; + it.cur_count =3D 1; + } + + err =3D iommu_fwspec_init(dev, &smmu_dev->of_node->fwnode, + &arm_smmu_ops); + if (err) + return err; + + sids =3D kcalloc(it.cur_count, sizeof(*sids), GFP_KERNEL); + if (!sids) + return -ENOMEM; + + *smmu =3D dev_get_drvdata(smmu_dev); + of_phandle_iterator_args(&it, sids, it.cur_count); + err =3D iommu_fwspec_add_ids(dev, sids, it.cur_count); + kfree(sids); + return err; +} + +static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end) +{ + int idx; + + do { + idx =3D find_next_zero_bit(map, end, start); + if (idx =3D=3D end) + return -ENOSPC; + } while (test_and_set_bit(idx, map)); + + return idx; +} + +static void __arm_smmu_free_bitmap(unsigned long *map, int idx) +{ + clear_bit(idx, map); +} + +/* Wait for any pending TLB invalidations to complete */ +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, + void __iomem *sync, void __iomem *status) +{ + unsigned int spin_cnt, delay; + + writel_relaxed(0, sync); + for (delay =3D 1; delay < TLB_LOOP_TIMEOUT; delay *=3D 2) { + for (spin_cnt =3D TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { + if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE)) + return; + cpu_relax(); + } + udelay(delay); + } + dev_err_ratelimited(smmu->dev, + "TLB sync timed out -- SMMU may be deadlocked\n"); +} + +static void arm_smmu_tlb_sync_vmid(void *cookie) +{ + struct arm_smmu_domain *smmu_domain =3D cookie; + + arm_smmu_tlb_sync_global(smmu_domain->smmu); +} + +static void arm_smmu_tlb_inv_context_s1(void *cookie) +{ + struct arm_smmu_domain *smmu_domain =3D cookie; + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; + void __iomem *base =3D ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); + + /* + * NOTE: this is not a relaxed write; it needs to guarantee that PTEs + * cleared by the current CPU are visible to the SMMU before the TLBI. 
+ */ + writel(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); + arm_smmu_tlb_sync_context(cookie); +} + +static void arm_smmu_tlb_inv_context_s2(void *cookie) +{ + struct arm_smmu_domain *smmu_domain =3D cookie; + struct arm_smmu_device *smmu =3D smmu_domain->smmu; + void __iomem *base =3D ARM_SMMU_GR0(smmu); + + /* NOTE: see above */ + writel(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); + arm_smmu_tlb_sync_global(smmu); +} + +static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, + size_t granule, bool leaf, void *cookie) +{ + struct arm_smmu_domain *smmu_domain =3D cookie; + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; + bool stage1 =3D cfg->cbar !=3D CBAR_TYPE_S2_TRANS; + void __iomem *reg =3D ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); + + if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) + wmb(); + + if (stage1) { + reg +=3D leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; + + if (cfg->fmt !=3D ARM_SMMU_CTX_FMT_AARCH64) { + iova &=3D ~12UL; + iova |=3D cfg->asid; + do { + writel_relaxed(iova, reg); + iova +=3D granule; + } while (size -=3D granule); + } else { + iova >>=3D 12; + iova |=3D (u64)cfg->asid << 48; + do { + writeq_relaxed(iova, reg); + iova +=3D granule >> 12; + } while (size -=3D granule); + } + } else { + reg +=3D leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : + ARM_SMMU_CB_S2_TLBIIPAS2; + iova >>=3D 12; + do { + smmu_write_atomic_lq(iova, reg); + iova +=3D granule >> 12; + } while (size -=3D granule); + } +} + +/* + * On MMU-401 at least, the cost of firing off multiple TLBIVMIDs appears + * almost negligible, but the benefit of getting the first one in as far a= head + * of the sync as possible is significant, hence we don't just make this a + * no-op and set .tlb_sync to arm_smmu_inv_context_s2() as you might think= . 
+ */ +static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size, + size_t granule, bool leaf, void *cookie) +{ + struct arm_smmu_domain *smmu_domain =3D cookie; + void __iomem *base =3D ARM_SMMU_GR0(smmu_domain->smmu); + + if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) + wmb(); + + writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); +} + +static const struct iommu_gather_ops arm_smmu_s1_tlb_ops =3D { + .tlb_flush_all =3D arm_smmu_tlb_inv_context_s1, + .tlb_add_flush =3D arm_smmu_tlb_inv_range_nosync, + .tlb_sync =3D arm_smmu_tlb_sync_context, +}; + +static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 =3D { + .tlb_flush_all =3D arm_smmu_tlb_inv_context_s2, + .tlb_add_flush =3D arm_smmu_tlb_inv_range_nosync, + .tlb_sync =3D arm_smmu_tlb_sync_context, +}; + +static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v1 =3D { + .tlb_flush_all =3D arm_smmu_tlb_inv_context_s2, + .tlb_add_flush =3D arm_smmu_tlb_inv_vmid_nosync, + .tlb_sync =3D arm_smmu_tlb_sync_vmid, +}; + +static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain= , + struct io_pgtable_cfg *pgtbl_cfg) +{ + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; + struct arm_smmu_cb *cb =3D &smmu_domain->smmu->cbs[cfg->cbndx]; + bool stage1 =3D cfg->cbar !=3D CBAR_TYPE_S2_TRANS; + + cb->cfg =3D cfg; + + /* TTBCR */ + if (stage1) { + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { + cb->tcr[0] =3D pgtbl_cfg->arm_v7s_cfg.tcr; + } else { + cb->tcr[0] =3D pgtbl_cfg->arm_lpae_s1_cfg.tcr; + cb->tcr[1] =3D pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32; + cb->tcr[1] |=3D TTBCR2_SEP_UPSTREAM; + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) + cb->tcr[1] |=3D TTBCR2_AS; + } + } else { + cb->tcr[0] =3D pgtbl_cfg->arm_lpae_s2_cfg.vtcr; + } + + /* TTBRs */ + if (stage1) { + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { + cb->ttbr[0] =3D pgtbl_cfg->arm_v7s_cfg.ttbr[0]; + cb->ttbr[1] =3D pgtbl_cfg->arm_v7s_cfg.ttbr[1]; + } else { + cb->ttbr[0] =3D pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0]; + cb->ttbr[0] |=3D (u64)cfg->asid << TTBRn_ASID_SHIFT; + cb->ttbr[1] =3D pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1]; + cb->ttbr[1] |=3D (u64)cfg->asid << TTBRn_ASID_SHIFT; + } + } else { + cb->ttbr[0] =3D pgtbl_cfg->arm_lpae_s2_cfg.vttbr; + } + + /* MAIRs (stage-1 only) */ + if (stage1) { + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { + cb->mair[0] =3D pgtbl_cfg->arm_v7s_cfg.prrr; + cb->mair[1] =3D pgtbl_cfg->arm_v7s_cfg.nmrr; + } else { + cb->mair[0] =3D pgtbl_cfg->arm_lpae_s1_cfg.mair[0]; + cb->mair[1] =3D pgtbl_cfg->arm_lpae_s1_cfg.mair[1]; + } + } +} + +static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int = idx) +{ + u32 reg; + bool stage1; + struct arm_smmu_cb *cb =3D &smmu->cbs[idx]; + struct arm_smmu_cfg *cfg =3D cb->cfg; + void __iomem *cb_base, *gr1_base; + + cb_base =3D ARM_SMMU_CB(smmu, idx); + + /* Unassigned context banks only need disabling */ + if (!cfg) { + writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR); + return; + } + + gr1_base =3D ARM_SMMU_GR1(smmu); + stage1 =3D cfg->cbar !=3D CBAR_TYPE_S2_TRANS; + + /* CBA2R */ + if (smmu->version > ARM_SMMU_V1) { + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) + reg =3D CBA2R_RW64_64BIT; + else + reg =3D CBA2R_RW64_32BIT; + /* 16-bit VMIDs live in CBA2R */ + if (smmu->features & ARM_SMMU_FEAT_VMID16) + reg |=3D cfg->vmid << CBA2R_VMID_SHIFT; + + writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(idx)); + } + + /* CBAR */ + reg =3D cfg->cbar; + if (smmu->version < ARM_SMMU_V2) + reg |=3D cfg->irptndx << CBAR_IRPTNDX_SHIFT; + + /* + * 
Use the weakest shareability/memory types, so they are + * overridden by the ttbcr/pte. + */ + if (stage1) { + reg |=3D (CBAR_S1_BPSHCFG_NSH << CBAR_S1_BPSHCFG_SHIFT) | + (CBAR_S1_MEMATTR_WB << CBAR_S1_MEMATTR_SHIFT); + } else if (!(smmu->features & ARM_SMMU_FEAT_VMID16)) { + /* 8-bit VMIDs live in CBAR */ + reg |=3D cfg->vmid << CBAR_VMID_SHIFT; + } + writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(idx)); + + /* + * TTBCR + * We must write this before the TTBRs, since it determines the + * access behaviour of some fields (in particular, ASID[15:8]). + */ + if (stage1 && smmu->version > ARM_SMMU_V1) + writel_relaxed(cb->tcr[1], cb_base + ARM_SMMU_CB_TTBCR2); + writel_relaxed(cb->tcr[0], cb_base + ARM_SMMU_CB_TTBCR); + + /* TTBRs */ + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { + writel_relaxed(cfg->asid, cb_base + ARM_SMMU_CB_CONTEXTIDR); + writel_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0); + writel_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1); + } else { + writeq_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0); + if (stage1) + writeq_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1); + } + + /* MAIRs (stage-1 only) */ + if (stage1) { + writel_relaxed(cb->mair[0], cb_base + ARM_SMMU_CB_S1_MAIR0); + writel_relaxed(cb->mair[1], cb_base + ARM_SMMU_CB_S1_MAIR1); + } + + /* SCTLR */ + reg =3D SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M; + if (stage1) + reg |=3D SCTLR_S1_ASIDPNE; + if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) + reg |=3D SCTLR_E; + + writel_relaxed(reg, cb_base + ARM_SMMU_CB_SCTLR); +} + +static int arm_smmu_init_domain_context(struct iommu_domain *domain, + struct arm_smmu_device *smmu) +{ + int irq, start, ret =3D 0; + unsigned long ias, oas; + struct io_pgtable_ops *pgtbl_ops; + struct io_pgtable_cfg pgtbl_cfg; + enum io_pgtable_fmt fmt; + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; + + mutex_lock(&smmu_domain->init_mutex); + if (smmu_domain->smmu) + goto out_unlock; + + if (domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) { + smmu_domain->stage =3D ARM_SMMU_DOMAIN_BYPASS; + smmu_domain->smmu =3D smmu; + goto out_unlock; + } + + /* + * Mapping the requested stage onto what we support is surprisingly + * complicated, mainly because the spec allows S1+S2 SMMUs without + * support for nested translation. That means we end up with the + * following table: + * + * Requested Supported Actual + * S1 N S1 + * S1 S1+S2 S1 + * S1 S2 S2 + * S1 S1 S1 + * N N N + * N S1+S2 S2 + * N S2 S2 + * N S1 S1 + * + * Note that you can't actually request stage-2 mappings. + */ + if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) + smmu_domain->stage =3D ARM_SMMU_DOMAIN_S2; + if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2)) + smmu_domain->stage =3D ARM_SMMU_DOMAIN_S1; + + /* + * Choosing a suitable context format is even more fiddly. Until we + * grow some way for the caller to express a preference, and/or move + * the decision into the io-pgtable code where it arguably belongs, + * just aim for the closest thing to the rest of the system, and hope + * that the hardware isn't esoteric enough that we can't assume AArch64 + * support to be a superset of AArch32 support... 
+ */ + if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_L) + cfg->fmt =3D ARM_SMMU_CTX_FMT_AARCH32_L; + if (IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) && + !IS_ENABLED(CONFIG_64BIT) && !IS_ENABLED(CONFIG_ARM_LPAE) && + (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) && + (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_S1)) + cfg->fmt =3D ARM_SMMU_CTX_FMT_AARCH32_S; + if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_NONE) &= & + (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K | + ARM_SMMU_FEAT_FMT_AARCH64_16K | + ARM_SMMU_FEAT_FMT_AARCH64_4K))) + cfg->fmt =3D ARM_SMMU_CTX_FMT_AARCH64; + + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_NONE) { + ret =3D -EINVAL; + goto out_unlock; + } + + switch (smmu_domain->stage) { + case ARM_SMMU_DOMAIN_S1: + cfg->cbar =3D CBAR_TYPE_S1_TRANS_S2_BYPASS; + start =3D smmu->num_s2_context_banks; + ias =3D smmu->va_size; + oas =3D smmu->ipa_size; + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) { + fmt =3D ARM_64_LPAE_S1; + } else if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_L) { + fmt =3D ARM_32_LPAE_S1; + ias =3D min(ias, 32UL); + oas =3D min(oas, 40UL); + } else { + fmt =3D ARM_V7S; + ias =3D min(ias, 32UL); + oas =3D min(oas, 32UL); + } + smmu_domain->tlb_ops =3D &arm_smmu_s1_tlb_ops; + break; + case ARM_SMMU_DOMAIN_NESTED: + /* + * We will likely want to change this if/when KVM gets + * involved. + */ + case ARM_SMMU_DOMAIN_S2: + cfg->cbar =3D CBAR_TYPE_S2_TRANS; + start =3D 0; + ias =3D smmu->ipa_size; + oas =3D smmu->pa_size; + if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) { + fmt =3D ARM_64_LPAE_S2; + } else { + fmt =3D ARM_32_LPAE_S2; + ias =3D min(ias, 40UL); + oas =3D min(oas, 40UL); + } + if (smmu->version =3D=3D ARM_SMMU_V2) + smmu_domain->tlb_ops =3D &arm_smmu_s2_tlb_ops_v2; + else + smmu_domain->tlb_ops =3D &arm_smmu_s2_tlb_ops_v1; + break; + default: + ret =3D -EINVAL; + goto out_unlock; + } + ret =3D __arm_smmu_alloc_bitmap(smmu->context_map, start, + smmu->num_context_banks); + if (ret < 0) + goto out_unlock; + + cfg->cbndx =3D ret; + if (smmu->version < ARM_SMMU_V2) { + cfg->irptndx =3D atomic_inc_return(&smmu->irptndx); + cfg->irptndx %=3D smmu->num_context_irqs; + } else { + cfg->irptndx =3D cfg->cbndx; + } + + if (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_S2) + cfg->vmid =3D cfg->cbndx + 1 + smmu->cavium_id_base; + else + cfg->asid =3D cfg->cbndx + smmu->cavium_id_base; + + pgtbl_cfg =3D (struct io_pgtable_cfg) { + .pgsize_bitmap =3D smmu->pgsize_bitmap, + .ias =3D ias, + .oas =3D oas, + .tlb =3D smmu_domain->tlb_ops, + .iommu_dev =3D smmu->dev, + }; + + if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) + pgtbl_cfg.quirks =3D IO_PGTABLE_QUIRK_NO_DMA; + + if (smmu_domain->non_strict) + pgtbl_cfg.quirks |=3D IO_PGTABLE_QUIRK_NON_STRICT; + + smmu_domain->smmu =3D smmu; + pgtbl_ops =3D alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain); + if (!pgtbl_ops) { + ret =3D -ENOMEM; + goto out_clear_smmu; + } + + /* Update the domain's page sizes to reflect the page table format */ + domain->pgsize_bitmap =3D pgtbl_cfg.pgsize_bitmap; + domain->geometry.aperture_end =3D (1UL << ias) - 1; + domain->geometry.force_aperture =3D true; + + /* Initialise the context bank with our page table cfg */ + arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg); + arm_smmu_write_context_bank(smmu, cfg->cbndx); + + /* + * Request context fault interrupt. Do this last to avoid the + * handler seeing a half-initialised domain state. 
+ */ + irq =3D smmu->irqs[smmu->num_global_irqs + cfg->irptndx]; + ret =3D devm_request_irq(smmu->dev, irq, arm_smmu_context_fault, + IRQF_SHARED, "arm-smmu-context-fault", domain); + if (ret < 0) { + dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n", + cfg->irptndx, irq); + cfg->irptndx =3D INVALID_IRPTNDX; + } + + mutex_unlock(&smmu_domain->init_mutex); + + /* Publish page table ops for map/unmap */ + smmu_domain->pgtbl_ops =3D pgtbl_ops; + return 0; + +out_clear_smmu: + smmu_domain->smmu =3D NULL; +out_unlock: + mutex_unlock(&smmu_domain->init_mutex); + return ret; +} + +static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + struct arm_smmu_device *smmu =3D smmu_domain->smmu; + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; + int irq; + + if (!smmu || domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) + return; + + /* + * Disable the context bank and free the page tables before freeing + * it. + */ + smmu->cbs[cfg->cbndx].cfg =3D NULL; + arm_smmu_write_context_bank(smmu, cfg->cbndx); + + if (cfg->irptndx !=3D INVALID_IRPTNDX) { + irq =3D smmu->irqs[smmu->num_global_irqs + cfg->irptndx]; + devm_free_irq(smmu->dev, irq, domain); + } + + free_io_pgtable_ops(smmu_domain->pgtbl_ops); + __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); +} + +static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) +{ + struct arm_smmu_domain *smmu_domain; + + if (type !=3D IOMMU_DOMAIN_UNMANAGED && + type !=3D IOMMU_DOMAIN_DMA && + type !=3D IOMMU_DOMAIN_IDENTITY) + return NULL; + /* + * Allocate the domain and initialise some of its data structures. + * We can't really do anything meaningful until we've added a + * master. + */ + smmu_domain =3D kzalloc(sizeof(*smmu_domain), GFP_KERNEL); + if (!smmu_domain) + return NULL; + + if (type =3D=3D IOMMU_DOMAIN_DMA && (using_legacy_binding || + iommu_get_dma_cookie(&smmu_domain->domain))) { + kfree(smmu_domain); + return NULL; + } + + mutex_init(&smmu_domain->init_mutex); + spin_lock_init(&smmu_domain->cb_lock); + + return &smmu_domain->domain; +} + +static void arm_smmu_domain_free(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + + /* + * Free the domain resources. We assume that all devices have + * already been detached. 
+ */ + iommu_put_dma_cookie(domain); + arm_smmu_destroy_domain_context(domain); + kfree(smmu_domain); +} + +static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx) +{ + struct arm_smmu_smr *smr =3D smmu->smrs + idx; + u32 reg =3D smr->id << SMR_ID_SHIFT | smr->mask << SMR_MASK_SHIFT; + + if (!(smmu->features & ARM_SMMU_FEAT_EXIDS) && smr->valid) + reg |=3D SMR_VALID; + writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx)); +} + +static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx) +{ + struct arm_smmu_s2cr *s2cr =3D smmu->s2crs + idx; + u32 reg =3D (s2cr->type & S2CR_TYPE_MASK) << S2CR_TYPE_SHIFT | + (s2cr->cbndx & S2CR_CBNDX_MASK) << S2CR_CBNDX_SHIFT | + (s2cr->privcfg & S2CR_PRIVCFG_MASK) << S2CR_PRIVCFG_SHIFT; + + if (smmu->features & ARM_SMMU_FEAT_EXIDS && smmu->smrs && + smmu->smrs[idx].valid) + reg |=3D S2CR_EXIDVALID; + writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_S2CR(idx)); +} + +static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx) +{ + arm_smmu_write_s2cr(smmu, idx); + if (smmu->smrs) + arm_smmu_write_smr(smmu, idx); +} + +/* + * The width of SMR's mask field depends on sCR0_EXIDENABLE, so this funct= ion + * should be called after sCR0 is written. + */ +static void arm_smmu_test_smr_masks(struct arm_smmu_device *smmu) +{ + void __iomem *gr0_base =3D ARM_SMMU_GR0(smmu); + u32 smr; + + if (!smmu->smrs) + return; + + /* + * SMR.ID bits may not be preserved if the corresponding MASK + * bits are set, so check each one separately. We can reject + * masters later if they try to claim IDs outside these masks. + */ + smr =3D smmu->streamid_mask << SMR_ID_SHIFT; + writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0)); + smr =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0)); + smmu->streamid_mask =3D smr >> SMR_ID_SHIFT; + + smr =3D smmu->streamid_mask << SMR_MASK_SHIFT; + writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0)); + smr =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0)); + smmu->smr_mask_mask =3D smr >> SMR_MASK_SHIFT; +} + +static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mas= k) +{ + struct arm_smmu_smr *smrs =3D smmu->smrs; + int i, free_idx =3D -ENOSPC; + + /* Stream indexing is blissfully easy */ + if (!smrs) + return id; + + /* Validating SMRs is... less so */ + for (i =3D 0; i < smmu->num_mapping_groups; ++i) { + if (!smrs[i].valid) { + /* + * Note the first free entry we come across, which + * we'll claim in the end if nothing else matches. + */ + if (free_idx < 0) + free_idx =3D i; + continue; + } + /* + * If the new entry is _entirely_ matched by an existing entry, + * then reuse that, with the guarantee that there also cannot + * be any subsequent conflicting entries. In normal use we'd + * expect simply identical entries for this case, but there's + * no harm in accommodating the generalisation. + */ + if ((mask & smrs[i].mask) =3D=3D mask && + !((id ^ smrs[i].id) & ~smrs[i].mask)) + return i; + /* + * If the new entry has any other overlap with an existing one, + * though, then there always exists at least one stream ID + * which would cause a conflict, and we can't allow that risk. 
+ */ + if (!((id ^ smrs[i].id) & ~(smrs[i].mask | mask))) + return -EINVAL; + } + + return free_idx; +} + +static bool arm_smmu_free_sme(struct arm_smmu_device *smmu, int idx) +{ + if (--smmu->s2crs[idx].count) + return false; + + smmu->s2crs[idx] =3D s2cr_init_val; + if (smmu->smrs) + smmu->smrs[idx].valid =3D false; + + return true; +} + +static int arm_smmu_master_alloc_smes(struct device *dev) +{ + struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; + struct arm_smmu_master_cfg *cfg =3D fwspec->iommu_priv; + struct arm_smmu_device *smmu =3D cfg->smmu; + struct arm_smmu_smr *smrs =3D smmu->smrs; + struct iommu_group *group; + int i, idx, ret; + + mutex_lock(&smmu->stream_map_mutex); + /* Figure out a viable stream map entry allocation */ + for_each_cfg_sme(fwspec, i, idx) { + u16 sid =3D fwspec->ids[i]; + u16 mask =3D fwspec->ids[i] >> SMR_MASK_SHIFT; + + if (idx !=3D INVALID_SMENDX) { + ret =3D -EEXIST; + goto out_err; + } + + ret =3D arm_smmu_find_sme(smmu, sid, mask); + if (ret < 0) + goto out_err; + + idx =3D ret; + if (smrs && smmu->s2crs[idx].count =3D=3D 0) { + smrs[idx].id =3D sid; + smrs[idx].mask =3D mask; + smrs[idx].valid =3D true; + } + smmu->s2crs[idx].count++; + cfg->smendx[i] =3D (s16)idx; + } + + group =3D iommu_group_get_for_dev(dev); + if (!group) + group =3D ERR_PTR(-ENOMEM); + if (IS_ERR(group)) { + ret =3D PTR_ERR(group); + goto out_err; + } + iommu_group_put(group); + + /* It worked! Now, poke the actual hardware */ + for_each_cfg_sme(fwspec, i, idx) { + arm_smmu_write_sme(smmu, idx); + smmu->s2crs[idx].group =3D group; + } + + mutex_unlock(&smmu->stream_map_mutex); + return 0; + +out_err: + while (i--) { + arm_smmu_free_sme(smmu, cfg->smendx[i]); + cfg->smendx[i] =3D INVALID_SMENDX; + } + mutex_unlock(&smmu->stream_map_mutex); + return ret; +} + +static void arm_smmu_master_free_smes(struct iommu_fwspec *fwspec) +{ + struct arm_smmu_device *smmu =3D fwspec_smmu(fwspec); + struct arm_smmu_master_cfg *cfg =3D fwspec->iommu_priv; + int i, idx; + + mutex_lock(&smmu->stream_map_mutex); + for_each_cfg_sme(fwspec, i, idx) { + if (arm_smmu_free_sme(smmu, idx)) + arm_smmu_write_sme(smmu, idx); + cfg->smendx[i] =3D INVALID_SMENDX; + } + mutex_unlock(&smmu->stream_map_mutex); +} + +static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, + struct iommu_fwspec *fwspec) +{ + struct arm_smmu_device *smmu =3D smmu_domain->smmu; + struct arm_smmu_s2cr *s2cr =3D smmu->s2crs; + u8 cbndx =3D smmu_domain->cfg.cbndx; + enum arm_smmu_s2cr_type type; + int i, idx; + + if (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_BYPASS) + type =3D S2CR_TYPE_BYPASS; + else + type =3D S2CR_TYPE_TRANS; + + for_each_cfg_sme(fwspec, i, idx) { + if (type =3D=3D s2cr[idx].type && cbndx =3D=3D s2cr[idx].cbndx) + continue; + + s2cr[idx].type =3D type; + s2cr[idx].privcfg =3D S2CR_PRIVCFG_DEFAULT; + s2cr[idx].cbndx =3D cbndx; + arm_smmu_write_s2cr(smmu, idx); + } + return 0; +} + +static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device = *dev) +{ + int ret; + struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; + struct arm_smmu_device *smmu; + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + + if (!fwspec || fwspec->ops !=3D &arm_smmu_ops) { + dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n"); + return -ENXIO; + } + + /* + * FIXME: The arch/arm DMA API code tries to attach devices to its own + * domains between of_xlate() and add_device() - we have no way to cope + * with that, so until ARM gets converted to rely on groups and default + * domains, 
just say no (but more politely than by dereferencing NULL). + * This should be at least a WARN_ON once that's sorted. + */ + if (!fwspec->iommu_priv) + return -ENODEV; + + smmu =3D fwspec_smmu(fwspec); + /* Ensure that the domain is finalised */ + ret =3D arm_smmu_init_domain_context(domain, smmu); + if (ret < 0) + return ret; + + /* + * Sanity check the domain. We don't support domains across + * different SMMUs. + */ + if (smmu_domain->smmu !=3D smmu) { + dev_err(dev, + "cannot attach to SMMU %s whilst already attached to domain on SMMU %s\= n", + dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev)); + return -EINVAL; + } + + /* Looks ok, so add the device to the domain */ + return arm_smmu_domain_add_master(smmu_domain, fwspec); +} + +static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t size, int prot) +{ + struct io_pgtable_ops *ops =3D to_smmu_domain(domain)->pgtbl_ops; + + if (!ops) + return -ENODEV; + + return ops->map(ops, iova, paddr, size, prot); +} + +static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long io= va, + size_t size) +{ + struct io_pgtable_ops *ops =3D to_smmu_domain(domain)->pgtbl_ops; + + if (!ops) + return 0; + + return ops->unmap(ops, iova, size); +} + +static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + + if (smmu_domain->tlb_ops) + smmu_domain->tlb_ops->tlb_flush_all(smmu_domain); +} + +static void arm_smmu_iotlb_sync(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + + if (smmu_domain->tlb_ops) + smmu_domain->tlb_ops->tlb_sync(smmu_domain); +} + +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + struct arm_smmu_device *smmu =3D smmu_domain->smmu; + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; + struct io_pgtable_ops *ops=3D smmu_domain->pgtbl_ops; + struct device *dev =3D smmu->dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + unsigned long va, flags; + + cb_base =3D ARM_SMMU_CB(smmu, cfg->cbndx); + + spin_lock_irqsave(&smmu_domain->cb_lock, flags); + /* ATS1 registers can only be written atomically */ + va =3D iova & ~0xfffUL; + if (smmu->version =3D=3D ARM_SMMU_V2) + smmu_write_atomic_lq(va, cb_base + ARM_SMMU_CB_ATS1PR); + else /* Register is only 32-bit in v1 */ + writel_relaxed(va, cb_base + ARM_SMMU_CB_ATS1PR); + + if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp & ATSR_ACTIVE), 5, 50)) { + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); + dev_err(dev, + "iova to phys timed out on %pad. 
Falling back to software table walk.\n= ", + &iova); + return ops->iova_to_phys(ops, iova); + } + + phys =3D readq_relaxed(cb_base + ARM_SMMU_CB_PAR); + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); + if (phys & CB_PAR_F) { + dev_err(dev, "translation fault!\n"); + dev_err(dev, "PAR =3D 0x%llx\n", phys); + return 0; + } + + return (phys & GENMASK_ULL(39, 12)) | (iova & 0xfff); +} + +static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + struct io_pgtable_ops *ops =3D smmu_domain->pgtbl_ops; + + if (domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) + return iova; + + if (!ops) + return 0; + + if (smmu_domain->smmu->features & ARM_SMMU_FEAT_TRANS_OPS && + smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_S1) + return arm_smmu_iova_to_phys_hard(domain, iova); + + return ops->iova_to_phys(ops, iova); +} + +static bool arm_smmu_capable(enum iommu_cap cap) +{ + switch (cap) { + case IOMMU_CAP_CACHE_COHERENCY: + /* + * Return true here as the SMMU can always send out coherent + * requests. + */ + return true; + case IOMMU_CAP_NOEXEC: + return true; + default: + return false; + } +} + +static int arm_smmu_match_node(struct device *dev, void *data) +{ + return dev->fwnode =3D=3D data; +} + +static +struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnod= e) +{ + struct device *dev =3D driver_find_device(&arm_smmu_driver.driver, NULL, + fwnode, arm_smmu_match_node); + put_device(dev); + return dev ? dev_get_drvdata(dev) : NULL; +} + +static int arm_smmu_add_device(struct device *dev) +{ + struct arm_smmu_device *smmu; + struct arm_smmu_master_cfg *cfg; + struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; + int i, ret; + + if (using_legacy_binding) { + ret =3D arm_smmu_register_legacy_master(dev, &smmu); + + /* + * If dev->iommu_fwspec is initally NULL, arm_smmu_register_legacy_maste= r() + * will allocate/initialise a new one. Thus we need to update fwspec for + * later use. 
+ */ + fwspec =3D dev->iommu_fwspec; + if (ret) + goto out_free; + } else if (fwspec && fwspec->ops =3D=3D &arm_smmu_ops) { + smmu =3D arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); + } else { + return -ENODEV; + } + + ret =3D -EINVAL; + for (i =3D 0; i < fwspec->num_ids; i++) { + u16 sid =3D fwspec->ids[i]; + u16 mask =3D fwspec->ids[i] >> SMR_MASK_SHIFT; + + if (sid & ~smmu->streamid_mask) { + dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n", + sid, smmu->streamid_mask); + goto out_free; + } + if (mask & ~smmu->smr_mask_mask) { + dev_err(dev, "SMR mask 0x%x out of range for SMMU (0x%x)\n", + mask, smmu->smr_mask_mask); + goto out_free; + } + } + + ret =3D -ENOMEM; + cfg =3D kzalloc(offsetof(struct arm_smmu_master_cfg, smendx[i]), + GFP_KERNEL); + if (!cfg) + goto out_free; + + cfg->smmu =3D smmu; + fwspec->iommu_priv =3D cfg; + while (i--) + cfg->smendx[i] =3D INVALID_SMENDX; + + ret =3D arm_smmu_master_alloc_smes(dev); + if (ret) + goto out_cfg_free; + + iommu_device_link(&smmu->iommu, dev); + + return 0; + +out_cfg_free: + kfree(cfg); +out_free: + iommu_fwspec_free(dev); + return ret; +} + +static void arm_smmu_remove_device(struct device *dev) +{ + struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; + struct arm_smmu_master_cfg *cfg; + struct arm_smmu_device *smmu; + + + if (!fwspec || fwspec->ops !=3D &arm_smmu_ops) + return; + + cfg =3D fwspec->iommu_priv; + smmu =3D cfg->smmu; + + iommu_device_unlink(&smmu->iommu, dev); + arm_smmu_master_free_smes(fwspec); + iommu_group_remove_device(dev); + kfree(fwspec->iommu_priv); + iommu_fwspec_free(dev); +} + +static struct iommu_group *arm_smmu_device_group(struct device *dev) +{ + struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; + struct arm_smmu_device *smmu =3D fwspec_smmu(fwspec); + struct iommu_group *group =3D NULL; + int i, idx; + + for_each_cfg_sme(fwspec, i, idx) { + if (group && smmu->s2crs[idx].group && + group !=3D smmu->s2crs[idx].group) + return ERR_PTR(-EINVAL); + + group =3D smmu->s2crs[idx].group; + } + + if (group) + return iommu_group_ref_get(group); + + if (dev_is_pci(dev)) + group =3D pci_device_group(dev); + else if (dev_is_fsl_mc(dev)) + group =3D fsl_mc_device_group(dev); + else + group =3D generic_device_group(dev); + + return group; +} + +static int arm_smmu_domain_get_attr(struct iommu_domain *domain, + enum iommu_attr attr, void *data) +{ + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + + switch(domain->type) { + case IOMMU_DOMAIN_UNMANAGED: + switch (attr) { + case DOMAIN_ATTR_NESTING: + *(int *)data =3D (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_NESTED); + return 0; + default: + return -ENODEV; + } + break; + case IOMMU_DOMAIN_DMA: + switch (attr) { + case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: + *(int *)data =3D smmu_domain->non_strict; + return 0; + default: + return -ENODEV; + } + break; + default: + return -EINVAL; + } +} + +static int arm_smmu_domain_set_attr(struct iommu_domain *domain, + enum iommu_attr attr, void *data) +{ + int ret =3D 0; + struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); + + mutex_lock(&smmu_domain->init_mutex); + + switch(domain->type) { + case IOMMU_DOMAIN_UNMANAGED: + switch (attr) { + case DOMAIN_ATTR_NESTING: + if (smmu_domain->smmu) { + ret =3D -EPERM; + goto out_unlock; + } + + if (*(int *)data) + smmu_domain->stage =3D ARM_SMMU_DOMAIN_NESTED; + else + smmu_domain->stage =3D ARM_SMMU_DOMAIN_S1; + break; + default: + ret =3D -ENODEV; + } + break; + case IOMMU_DOMAIN_DMA: + switch (attr) { + case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: + 
smmu_domain->non_strict =3D *(int *)data; + break; + default: + ret =3D -ENODEV; + } + break; + default: + ret =3D -EINVAL; + } +out_unlock: + mutex_unlock(&smmu_domain->init_mutex); + return ret; +} + +static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *a= rgs) +{ + u32 mask, fwid =3D 0; + + if (args->args_count > 0) + fwid |=3D (u16)args->args[0]; + + if (args->args_count > 1) + fwid |=3D (u16)args->args[1] << SMR_MASK_SHIFT; + else if (!of_property_read_u32(args->np, "stream-match-mask", &mask)) + fwid |=3D (u16)mask << SMR_MASK_SHIFT; + + return iommu_fwspec_add_ids(dev, &fwid, 1); +} + +static void arm_smmu_get_resv_regions(struct device *dev, + struct list_head *head) +{ + struct iommu_resv_region *region; + int prot =3D IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO; + + region =3D iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH, + prot, IOMMU_RESV_SW_MSI); + if (!region) + return; + + list_add_tail(®ion->list, head); + + iommu_dma_get_resv_regions(dev, head); +} + +static void arm_smmu_put_resv_regions(struct device *dev, + struct list_head *head) +{ + struct iommu_resv_region *entry, *next; + + list_for_each_entry_safe(entry, next, head, list) + kfree(entry); +} + +static struct iommu_ops arm_smmu_ops =3D { + .capable =3D arm_smmu_capable, + .domain_alloc =3D arm_smmu_domain_alloc, + .domain_free =3D arm_smmu_domain_free, + .attach_dev =3D arm_smmu_attach_dev, + .map =3D arm_smmu_map, + .unmap =3D arm_smmu_unmap, + .flush_iotlb_all =3D arm_smmu_flush_iotlb_all, + .iotlb_sync =3D arm_smmu_iotlb_sync, + .iova_to_phys =3D arm_smmu_iova_to_phys, + .add_device =3D arm_smmu_add_device, + .remove_device =3D arm_smmu_remove_device, + .device_group =3D arm_smmu_device_group, + .domain_get_attr =3D arm_smmu_domain_get_attr, + .domain_set_attr =3D arm_smmu_domain_set_attr, + .of_xlate =3D arm_smmu_of_xlate, + .get_resv_regions =3D arm_smmu_get_resv_regions, + .put_resv_regions =3D arm_smmu_put_resv_regions, + .pgsize_bitmap =3D -1UL, /* Restricted during device attach */ +}; + +static void arm_smmu_device_reset(struct arm_smmu_device *smmu) +{ + void __iomem *gr0_base =3D ARM_SMMU_GR0(smmu); + int i; + u32 reg, major; + + /* clear global FSR */ + reg =3D readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); + writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); + + /* + * Reset stream mapping groups: Initial values mark all SMRn as + * invalid and all S2CRn as bypass unless overridden. + */ + for (i =3D 0; i < smmu->num_mapping_groups; ++i) + arm_smmu_write_sme(smmu, i); + + if (smmu->model =3D=3D ARM_MMU500) { + /* + * Before clearing ARM_MMU500_ACTLR_CPRE, need to + * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK + * bit is only present in MMU-500r2 onwards. + */ + reg =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7); + major =3D (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK; + reg =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_sACR); + if (major >=3D 2) + reg &=3D ~ARM_MMU500_ACR_CACHE_LOCK; + /* + * Allow unmatched Stream IDs to allocate bypass + * TLB entries for reduced latency. 
+ */ + reg |=3D ARM_MMU500_ACR_SMTNMB_TLBEN | ARM_MMU500_ACR_S2CRB_TLBEN; + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_sACR); + } + + /* Make sure all context banks are disabled and clear CB_FSR */ + for (i =3D 0; i < smmu->num_context_banks; ++i) { + void __iomem *cb_base =3D ARM_SMMU_CB(smmu, i); + + arm_smmu_write_context_bank(smmu, i); + writel_relaxed(FSR_FAULT, cb_base + ARM_SMMU_CB_FSR); + /* + * Disable MMU-500's not-particularly-beneficial next-page + * prefetcher for the sake of errata #841119 and #826419. + */ + if (smmu->model =3D=3D ARM_MMU500) { + reg =3D readl_relaxed(cb_base + ARM_SMMU_CB_ACTLR); + reg &=3D ~ARM_MMU500_ACTLR_CPRE; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ACTLR); + } + } + + /* Invalidate the TLB, just in case */ + writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); + writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); + + reg =3D readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); + + /* Enable fault reporting */ + reg |=3D (sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE); + + /* Disable TLB broadcasting. */ + reg |=3D (sCR0_VMIDPNE | sCR0_PTM); + + /* Enable client access, handling unmatched streams as appropriate */ + reg &=3D ~sCR0_CLIENTPD; + if (disable_bypass) + reg |=3D sCR0_USFCFG; + else + reg &=3D ~sCR0_USFCFG; + + /* Disable forced broadcasting */ + reg &=3D ~sCR0_FB; + + /* Don't upgrade barriers */ + reg &=3D ~(sCR0_BSU_MASK << sCR0_BSU_SHIFT); + + if (smmu->features & ARM_SMMU_FEAT_VMID16) + reg |=3D sCR0_VMID16EN; + + if (smmu->features & ARM_SMMU_FEAT_EXIDS) + reg |=3D sCR0_EXIDENABLE; + + /* Push the button */ + arm_smmu_tlb_sync_global(smmu); + writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); +} + +static int arm_smmu_id_size_to_bits(int size) +{ + switch (size) { + case 0: + return 32; + case 1: + return 36; + case 2: + return 40; + case 3: + return 42; + case 4: + return 44; + case 5: + default: + return 48; + } +} + +static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu) +{ + unsigned long size; + void __iomem *gr0_base =3D ARM_SMMU_GR0(smmu); + u32 id; + bool cttw_reg, cttw_fw =3D smmu->features & ARM_SMMU_FEAT_COHERENT_WALK; + int i; + + dev_notice(smmu->dev, "probing hardware configuration...\n"); + dev_notice(smmu->dev, "SMMUv%d with:\n", + smmu->version =3D=3D ARM_SMMU_V2 ? 2 : 1); + + /* ID0 */ + id =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_ID0); + + /* Restrict available stages based on module parameter */ + if (force_stage =3D=3D 1) + id &=3D ~(ID0_S2TS | ID0_NTS); + else if (force_stage =3D=3D 2) + id &=3D ~(ID0_S1TS | ID0_NTS); + + if (id & ID0_S1TS) { + smmu->features |=3D ARM_SMMU_FEAT_TRANS_S1; + dev_notice(smmu->dev, "\tstage 1 translation\n"); + } + + if (id & ID0_S2TS) { + smmu->features |=3D ARM_SMMU_FEAT_TRANS_S2; + dev_notice(smmu->dev, "\tstage 2 translation\n"); + } + + if (id & ID0_NTS) { + smmu->features |=3D ARM_SMMU_FEAT_TRANS_NESTED; + dev_notice(smmu->dev, "\tnested translation\n"); + } + + if (!(smmu->features & + (ARM_SMMU_FEAT_TRANS_S1 | ARM_SMMU_FEAT_TRANS_S2))) { + dev_err(smmu->dev, "\tno translation support!\n"); + return -ENODEV; + } + + if ((id & ID0_S1TS) && + ((smmu->version < ARM_SMMU_V2) || !(id & ID0_ATOSNS))) { + smmu->features |=3D ARM_SMMU_FEAT_TRANS_OPS; + dev_notice(smmu->dev, "\taddress translation ops\n"); + } + + /* + * In order for DMA API calls to work properly, we must defer to what + * the FW says about coherency, regardless of what the hardware claims. 
+	 * Fortunately, this also opens up a workaround for systems where the
+	 * ID register value has ended up configured incorrectly.
+	 */
+	cttw_reg = !!(id & ID0_CTTW);
+	if (cttw_fw || cttw_reg)
+		dev_notice(smmu->dev, "\t%scoherent table walk\n",
+			   cttw_fw ? "" : "non-");
+	if (cttw_fw != cttw_reg)
+		dev_notice(smmu->dev,
+			   "\t(IDR0.CTTW overridden by FW configuration)\n");
+
+	/* Max. number of entries we have for stream matching/indexing */
+	if (smmu->version == ARM_SMMU_V2 && id & ID0_EXIDS) {
+		smmu->features |= ARM_SMMU_FEAT_EXIDS;
+		size = 1 << 16;
+	} else {
+		size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
+	}
+	smmu->streamid_mask = size - 1;
+	if (id & ID0_SMS) {
+		smmu->features |= ARM_SMMU_FEAT_STREAM_MATCH;
+		size = (id >> ID0_NUMSMRG_SHIFT) & ID0_NUMSMRG_MASK;
+		if (size == 0) {
+			dev_err(smmu->dev,
+				"stream-matching supported, but no SMRs present!\n");
+			return -ENODEV;
+		}
+
+		/* Zero-initialised to mark as invalid */
+		smmu->smrs = devm_kcalloc(smmu->dev, size, sizeof(*smmu->smrs),
+					  GFP_KERNEL);
+		if (!smmu->smrs)
+			return -ENOMEM;
+
+		dev_notice(smmu->dev,
+			   "\tstream matching with %lu register groups", size);
+	}
+	/* s2cr->type == 0 means translation, so initialise explicitly */
+	smmu->s2crs = devm_kmalloc_array(smmu->dev, size, sizeof(*smmu->s2crs),
+					 GFP_KERNEL);
+	if (!smmu->s2crs)
+		return -ENOMEM;
+	for (i = 0; i < size; i++)
+		smmu->s2crs[i] = s2cr_init_val;
+
+	smmu->num_mapping_groups = size;
+	mutex_init(&smmu->stream_map_mutex);
+	spin_lock_init(&smmu->global_sync_lock);
+
+	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
+		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
+		if (!(id & ID0_PTFS_NO_AARCH32S))
+			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_S;
+	}
+
+	/* ID1 */
+	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID1);
+	smmu->pgshift = (id & ID1_PAGESIZE) ? 16 : 12;
+
+	/* Check for size mismatch of SMMU address space from mapped region */
+	size = 1 << (((id >> ID1_NUMPAGENDXB_SHIFT) & ID1_NUMPAGENDXB_MASK) + 1);
+	size <<= smmu->pgshift;
+	if (smmu->cb_base != gr0_base + size)
+		dev_warn(smmu->dev,
+			"SMMU address space size (0x%lx) differs from mapped region size (0x%tx)!\n",
+			size * 2, (smmu->cb_base - gr0_base) * 2);
+
+	smmu->num_s2_context_banks = (id >> ID1_NUMS2CB_SHIFT) & ID1_NUMS2CB_MASK;
+	smmu->num_context_banks = (id >> ID1_NUMCB_SHIFT) & ID1_NUMCB_MASK;
+	if (smmu->num_s2_context_banks > smmu->num_context_banks) {
+		dev_err(smmu->dev, "impossible number of S2 context banks!\n");
+		return -ENODEV;
+	}
+	dev_notice(smmu->dev, "\t%u context banks (%u stage-2 only)\n",
+		   smmu->num_context_banks, smmu->num_s2_context_banks);
+	/*
+	 * Cavium CN88xx erratum #27704.
+	 * Ensure ASID and VMID allocation is unique across all SMMUs in
+	 * the system.
+	 */
+	if (smmu->model == CAVIUM_SMMUV2) {
+		smmu->cavium_id_base =
+			atomic_add_return(smmu->num_context_banks,
+					  &cavium_smmu_context_count);
+		smmu->cavium_id_base -= smmu->num_context_banks;
+		dev_notice(smmu->dev, "\tenabling workaround for Cavium erratum 27704\n");
+	}
+	smmu->cbs = devm_kcalloc(smmu->dev, smmu->num_context_banks,
+				 sizeof(*smmu->cbs), GFP_KERNEL);
+	if (!smmu->cbs)
+		return -ENOMEM;
+
+	/* ID2 */
+	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID2);
+	size = arm_smmu_id_size_to_bits((id >> ID2_IAS_SHIFT) & ID2_IAS_MASK);
+	smmu->ipa_size = size;
+
+	/* The output mask is also applied for bypass */
+	size = arm_smmu_id_size_to_bits((id >> ID2_OAS_SHIFT) & ID2_OAS_MASK);
+	smmu->pa_size = size;
+
+	if (id & ID2_VMID16)
+		smmu->features |= ARM_SMMU_FEAT_VMID16;
+
+	/*
+	 * What the page table walker can address actually depends on which
+	 * descriptor format is in use, but since a) we don't know that yet,
+	 * and b) it can vary per context bank, this will have to do...
+	 */
+	if (dma_set_mask_and_coherent(smmu->dev, DMA_BIT_MASK(size)))
+		dev_warn(smmu->dev,
+			 "failed to set DMA mask for table walker\n");
+
+	if (smmu->version < ARM_SMMU_V2) {
+		smmu->va_size = smmu->ipa_size;
+		if (smmu->version == ARM_SMMU_V1_64K)
+			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
+	} else {
+		size = (id >> ID2_UBS_SHIFT) & ID2_UBS_MASK;
+		smmu->va_size = arm_smmu_id_size_to_bits(size);
+		if (id & ID2_PTFS_4K)
+			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_4K;
+		if (id & ID2_PTFS_16K)
+			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_16K;
+		if (id & ID2_PTFS_64K)
+			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
+	}
+
+	/* Now we've corralled the various formats, what'll it do? */
+	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S)
+		smmu->pgsize_bitmap |= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
+	if (smmu->features &
+	    (ARM_SMMU_FEAT_FMT_AARCH32_L | ARM_SMMU_FEAT_FMT_AARCH64_4K))
+		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
+	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_16K)
+		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
+	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_64K)
+		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
+
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
+	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n",
+		   smmu->pgsize_bitmap);
+
+
+	if (smmu->features & ARM_SMMU_FEAT_TRANS_S1)
+		dev_notice(smmu->dev, "\tStage-1: %lu-bit VA -> %lu-bit IPA\n",
+			   smmu->va_size, smmu->ipa_size);
+
+	if (smmu->features & ARM_SMMU_FEAT_TRANS_S2)
+		dev_notice(smmu->dev, "\tStage-2: %lu-bit IPA -> %lu-bit PA\n",
+			   smmu->ipa_size, smmu->pa_size);
+
+	return 0;
+}
+
+#ifdef CONFIG_ACPI
+static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
+{
+	int ret = 0;
+
+	switch (model) {
+	case ACPI_IORT_SMMU_V1:
+	case ACPI_IORT_SMMU_CORELINK_MMU400:
+		smmu->version = ARM_SMMU_V1;
+		smmu->model = GENERIC_SMMU;
+		break;
+	case ACPI_IORT_SMMU_CORELINK_MMU401:
+		smmu->version = ARM_SMMU_V1_64K;
+		smmu->model = GENERIC_SMMU;
+		break;
+	case ACPI_IORT_SMMU_V2:
+		smmu->version = ARM_SMMU_V2;
+		smmu->model = GENERIC_SMMU;
+		break;
+	case ACPI_IORT_SMMU_CORELINK_MMU500:
+		smmu->version = ARM_SMMU_V2;
+		smmu->model = ARM_MMU500;
+		break;
+	case ACPI_IORT_SMMU_CAVIUM_THUNDERX:
+		smmu->version = ARM_SMMU_V2;
+		smmu->model = CAVIUM_SMMUV2;
+		break;
+	default:
+		ret = -ENODEV;
+	}
+
+	return ret;
+}
+
+static int
arm_smmu_device_acpi_probe(struct platform_device *pdev, + struct arm_smmu_device *smmu) +{ + struct device *dev =3D smmu->dev; + struct acpi_iort_node *node =3D + *(struct acpi_iort_node **)dev_get_platdata(dev); + struct acpi_iort_smmu *iort_smmu; + int ret; + + /* Retrieve SMMU1/2 specific data */ + iort_smmu =3D (struct acpi_iort_smmu *)node->node_data; + + ret =3D acpi_smmu_get_data(iort_smmu->model, smmu); + if (ret < 0) + return ret; + + /* Ignore the configuration access interrupt */ + smmu->num_global_irqs =3D 1; + + if (iort_smmu->flags & ACPI_IORT_SMMU_COHERENT_WALK) + smmu->features |=3D ARM_SMMU_FEAT_COHERENT_WALK; + + return 0; +} +#else +static inline int arm_smmu_device_acpi_probe(struct platform_device *pdev, + struct arm_smmu_device *smmu) +{ + return -ENODEV; +} +#endif + +static int arm_smmu_device_dt_probe(struct platform_device *pdev, + struct arm_smmu_device *smmu) +{ + const struct arm_smmu_match_data *data; + struct device *dev =3D &pdev->dev; + bool legacy_binding; + + if (of_property_read_u32(dev->of_node, "#global-interrupts", + &smmu->num_global_irqs)) { + dev_err(dev, "missing #global-interrupts property\n"); + return -ENODEV; + } + + data =3D of_device_get_match_data(dev); + smmu->version =3D data->version; + smmu->model =3D data->model; + + parse_driver_options(smmu); + + legacy_binding =3D of_find_property(dev->of_node, "mmu-masters", NULL); + if (legacy_binding && !using_generic_binding) { + if (!using_legacy_binding) + pr_notice("deprecated \"mmu-masters\" DT property in use; DMA API suppo= rt unavailable\n"); + using_legacy_binding =3D true; + } else if (!legacy_binding && !using_legacy_binding) { + using_generic_binding =3D true; + } else { + dev_err(dev, "not probing due to mismatched DT properties\n"); + return -ENODEV; + } + + if (of_dma_is_coherent(dev->of_node)) + smmu->features |=3D ARM_SMMU_FEAT_COHERENT_WALK; + + return 0; +} + +static void arm_smmu_bus_init(void) +{ + /* Oh, for a proper bus abstraction */ + if (!iommu_present(&platform_bus_type)) + bus_set_iommu(&platform_bus_type, &arm_smmu_ops); +#ifdef CONFIG_ARM_AMBA + if (!iommu_present(&amba_bustype)) + bus_set_iommu(&amba_bustype, &arm_smmu_ops); +#endif +#ifdef CONFIG_PCI + if (!iommu_present(&pci_bus_type)) { + pci_request_acs(); + bus_set_iommu(&pci_bus_type, &arm_smmu_ops); + } +#endif +#ifdef CONFIG_FSL_MC_BUS + if (!iommu_present(&fsl_mc_bus_type)) + bus_set_iommu(&fsl_mc_bus_type, &arm_smmu_ops); +#endif +} + +static int arm_smmu_device_probe(struct platform_device *pdev) +{ + struct resource *res; + resource_size_t ioaddr; + struct arm_smmu_device *smmu; + struct device *dev =3D &pdev->dev; + int num_irqs, i, err; + + smmu =3D devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL); + if (!smmu) { + dev_err(dev, "failed to allocate arm_smmu_device\n"); + return -ENOMEM; + } + smmu->dev =3D dev; + + if (dev->of_node) + err =3D arm_smmu_device_dt_probe(pdev, smmu); + else + err =3D arm_smmu_device_acpi_probe(pdev, smmu); + + if (err) + return err; + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + ioaddr =3D res->start; + smmu->base =3D devm_ioremap_resource(dev, res); + if (IS_ERR(smmu->base)) + return PTR_ERR(smmu->base); + smmu->cb_base =3D smmu->base + resource_size(res) / 2; + + num_irqs =3D 0; + while ((res =3D platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) { + num_irqs++; + if (num_irqs > smmu->num_global_irqs) + smmu->num_context_irqs++; + } + + if (!smmu->num_context_irqs) { + dev_err(dev, "found %d interrupts but expected at least %d\n", + num_irqs, 
smmu->num_global_irqs + 1); + return -ENODEV; + } + + smmu->irqs =3D devm_kcalloc(dev, num_irqs, sizeof(*smmu->irqs), + GFP_KERNEL); + if (!smmu->irqs) { + dev_err(dev, "failed to allocate %d irqs\n", num_irqs); + return -ENOMEM; + } + + for (i =3D 0; i < num_irqs; ++i) { + int irq =3D platform_get_irq(pdev, i); + + if (irq < 0) { + dev_err(dev, "failed to get irq index %d\n", i); + return -ENODEV; + } + smmu->irqs[i] =3D irq; + } + + err =3D arm_smmu_device_cfg_probe(smmu); + if (err) + return err; + + if (smmu->version =3D=3D ARM_SMMU_V2) { + if (smmu->num_context_banks > smmu->num_context_irqs) { + dev_err(dev, + "found only %d context irq(s) but %d required\n", + smmu->num_context_irqs, smmu->num_context_banks); + return -ENODEV; + } + + /* Ignore superfluous interrupts */ + smmu->num_context_irqs =3D smmu->num_context_banks; + } + + for (i =3D 0; i < smmu->num_global_irqs; ++i) { + err =3D devm_request_irq(smmu->dev, smmu->irqs[i], + arm_smmu_global_fault, + IRQF_SHARED, + "arm-smmu global fault", + smmu); + if (err) { + dev_err(dev, "failed to request global IRQ %d (%u)\n", + i, smmu->irqs[i]); + return err; + } + } + + err =3D iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL, + "smmu.%pa", &ioaddr); + if (err) { + dev_err(dev, "Failed to register iommu in sysfs\n"); + return err; + } + + iommu_device_set_ops(&smmu->iommu, &arm_smmu_ops); + iommu_device_set_fwnode(&smmu->iommu, dev->fwnode); + + err =3D iommu_device_register(&smmu->iommu); + if (err) { + dev_err(dev, "Failed to register iommu\n"); + return err; + } + + platform_set_drvdata(pdev, smmu); + arm_smmu_device_reset(smmu); + arm_smmu_test_smr_masks(smmu); + + /* + * For ACPI and generic DT bindings, an SMMU will be probed before + * any device which might need it, so we want the bus ops in place + * ready to handle default domain setup as soon as any SMMU exists. + */ + if (!using_legacy_binding) + arm_smmu_bus_init(); + + return 0; +} + +/* + * With the legacy DT binding in play, though, we have no guarantees about + * probe order, but then we're also not doing default domains, so we can + * delay setting bus ops until we're sure every possible SMMU is ready, + * and that way ensure that no add_device() calls get missed. + */ +static int arm_smmu_legacy_bus_init(void) +{ + if (using_legacy_binding) + arm_smmu_bus_init(); + return 0; +} +device_initcall_sync(arm_smmu_legacy_bus_init); + +static int arm_smmu_device_remove(struct platform_device *pdev) +{ + struct arm_smmu_device *smmu =3D platform_get_drvdata(pdev); + + if (!smmu) + return -ENODEV; + + if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS)) + dev_err(&pdev->dev, "removing device with active domains!\n"); + + /* Turn the thing off */ + writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); + return 0; +} + +static void arm_smmu_device_shutdown(struct platform_device *pdev) +{ + arm_smmu_device_remove(pdev); +} + +static int __maybe_unused arm_smmu_pm_resume(struct device *dev) +{ + struct arm_smmu_device *smmu =3D dev_get_drvdata(dev); + + arm_smmu_device_reset(smmu); + return 0; +} + +static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume); diff --git a/drivers/iommu/arm-smmu-common.h b/drivers/iommu/arm-smmu-commo= n.h new file mode 100644 index 0000000..33feefd --- /dev/null +++ b/drivers/iommu/arm-smmu-common.h @@ -0,0 +1,256 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, U= SA. + * + * Copyright (C) 2013 ARM Limited + * + * Author: Will Deacon + */ + +#ifndef __ARM_SMMU_COMMON_H +#define __ARM_SMMU_COMMON_H + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "arm-smmu-regs.h" +#include "io-pgtable.h" + +#define ARM_MMU500_ACTLR_CPRE (1 << 1) + +#define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) +#define ARM_MMU500_ACR_S2CRB_TLBEN (1 << 10) +#define ARM_MMU500_ACR_SMTNMB_TLBEN (1 << 8) + +#define TLB_LOOP_TIMEOUT 1000000 /* 1s! */ +#define TLB_SPIN_COUNT 10 + +/* Maximum number of context banks per SMMU */ +#define ARM_SMMU_MAX_CBS 128 + +/* SMMU global address space */ +#define ARM_SMMU_GR0(smmu) ((smmu)->base) +#define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift)) + +/* + * SMMU global address space with conditional offset to access secure + * aliases of non-secure registers (e.g. nsCR0: 0x400, nsGFSR: 0x448, + * nsGFSYNR0: 0x450) + */ +#define ARM_SMMU_GR0_NS(smmu) \ + ((smmu)->base + \ + ((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS) \ + ? 0x400 : 0)) + +/* + * Some 64-bit registers only make sense to write atomically, but in such + * cases all the data relevant to AArch32 formats lies within the lower wo= rd, + * therefore this actually makes more sense than it might first appear. + */ +#ifdef CONFIG_64BIT +#define smmu_write_atomic_lq writeq_relaxed +#else +#define smmu_write_atomic_lq writel_relaxed +#endif + +/* Translation context bank */ +#define ARM_SMMU_CB(smmu, n) ((smmu)->cb_base + ((n) << (smmu)->pgshift)) + +#define MSI_IOVA_BASE 0x8000000 +#define MSI_IOVA_LENGTH 0x100000 + +enum arm_smmu_arch_version { + ARM_SMMU_V1, + ARM_SMMU_V1_64K, + ARM_SMMU_V2, +}; + +enum arm_smmu_implementation { + GENERIC_SMMU, + ARM_MMU500, + CAVIUM_SMMUV2, +}; + +struct arm_smmu_s2cr { + struct iommu_group *group; + int count; + enum arm_smmu_s2cr_type type; + enum arm_smmu_s2cr_privcfg privcfg; + u8 cbndx; +}; + +#define s2cr_init_val (struct arm_smmu_s2cr){ \ + .type =3D disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS, \ +} + +struct arm_smmu_smr { + u16 mask; + u16 id; + bool valid; +}; + +struct arm_smmu_cb { + u64 ttbr[2]; + u32 tcr[2]; + u32 mair[2]; + struct arm_smmu_cfg *cfg; +}; + +struct arm_smmu_master_cfg { + struct arm_smmu_device *smmu; + s16 smendx[]; +}; +#define INVALID_SMENDX -1 +#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv) +#define fwspec_smmu(fw) (__fwspec_cfg(fw)->smmu) +#define fwspec_smendx(fw, i) \ + (i >=3D fw->num_ids ? 
INVALID_SMENDX : __fwspec_cfg(fw)->smendx[i]) +#define for_each_cfg_sme(fw, i, idx) \ + for (i =3D 0; idx =3D fwspec_smendx(fw, i), i < fw->num_ids; ++i) + +struct arm_smmu_device { + struct device *dev; + + void __iomem *base; + void __iomem *cb_base; + unsigned long pgshift; + +#define ARM_SMMU_FEAT_COHERENT_WALK (1 << 0) +#define ARM_SMMU_FEAT_STREAM_MATCH (1 << 1) +#define ARM_SMMU_FEAT_TRANS_S1 (1 << 2) +#define ARM_SMMU_FEAT_TRANS_S2 (1 << 3) +#define ARM_SMMU_FEAT_TRANS_NESTED (1 << 4) +#define ARM_SMMU_FEAT_TRANS_OPS (1 << 5) +#define ARM_SMMU_FEAT_VMID16 (1 << 6) +#define ARM_SMMU_FEAT_FMT_AARCH64_4K (1 << 7) +#define ARM_SMMU_FEAT_FMT_AARCH64_16K (1 << 8) +#define ARM_SMMU_FEAT_FMT_AARCH64_64K (1 << 9) +#define ARM_SMMU_FEAT_FMT_AARCH32_L (1 << 10) +#define ARM_SMMU_FEAT_FMT_AARCH32_S (1 << 11) +#define ARM_SMMU_FEAT_EXIDS (1 << 12) + u32 features; + +#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0) + u32 options; + enum arm_smmu_arch_version version; + enum arm_smmu_implementation model; + + u32 num_context_banks; + u32 num_s2_context_banks; + DECLARE_BITMAP(context_map, ARM_SMMU_MAX_CBS); + struct arm_smmu_cb *cbs; + atomic_t irptndx; + + u32 num_mapping_groups; + u16 streamid_mask; + u16 smr_mask_mask; + struct arm_smmu_smr *smrs; + struct arm_smmu_s2cr *s2crs; + struct mutex stream_map_mutex; + + unsigned long va_size; + unsigned long ipa_size; + unsigned long pa_size; + unsigned long pgsize_bitmap; + + u32 num_global_irqs; + u32 num_context_irqs; + unsigned int *irqs; + + u32 cavium_id_base; /* Specific to Cavium */ + + spinlock_t global_sync_lock; + + /* IOMMU core code handle */ + struct iommu_device iommu; +}; + +enum arm_smmu_context_fmt { + ARM_SMMU_CTX_FMT_NONE, + ARM_SMMU_CTX_FMT_AARCH64, + ARM_SMMU_CTX_FMT_AARCH32_L, + ARM_SMMU_CTX_FMT_AARCH32_S, +}; + +struct arm_smmu_cfg { + u8 cbndx; + u8 irptndx; + union { + u16 asid; + u16 vmid; + }; + u32 cbar; + enum arm_smmu_context_fmt fmt; +}; +#define INVALID_IRPTNDX 0xff + +enum arm_smmu_domain_stage { + ARM_SMMU_DOMAIN_S1 =3D 0, + ARM_SMMU_DOMAIN_S2, + ARM_SMMU_DOMAIN_NESTED, + ARM_SMMU_DOMAIN_BYPASS, +}; + +struct arm_smmu_domain { + struct arm_smmu_device *smmu; + struct io_pgtable_ops *pgtbl_ops; + const struct iommu_gather_ops *tlb_ops; + struct arm_smmu_cfg cfg; + enum arm_smmu_domain_stage stage; + bool non_strict; + struct mutex init_mutex; /* Protects smmu pointer */ + spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */ + struct iommu_domain domain; +}; + +struct arm_smmu_option_prop { + u32 opt; + const char *prop; +}; + +static atomic_t cavium_smmu_context_count =3D ATOMIC_INIT(0); + +static bool using_legacy_binding, using_generic_binding; + +static struct arm_smmu_option_prop arm_smmu_options[] =3D { + { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" }, + { 0, NULL}, +}; + +struct arm_smmu_match_data { + enum arm_smmu_arch_version version; + enum arm_smmu_implementation model; +}; + +#endif /* __ARM_SMMU_COMMON_H */ diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 5a28ae8..a341c9f 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -29,388 +29,9 @@ =20 #define pr_fmt(fmt) "arm-smmu: " fmt =20 -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include +#include "arm-smmu-common.h" =20 -#include -#include - -#include "io-pgtable.h" -#include "arm-smmu-regs.h" - -#define ARM_MMU500_ACTLR_CPRE 
(1 << 1) - -#define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) -#define ARM_MMU500_ACR_S2CRB_TLBEN (1 << 10) -#define ARM_MMU500_ACR_SMTNMB_TLBEN (1 << 8) - -#define TLB_LOOP_TIMEOUT 1000000 /* 1s! */ -#define TLB_SPIN_COUNT 10 - -/* Maximum number of context banks per SMMU */ -#define ARM_SMMU_MAX_CBS 128 - -/* SMMU global address space */ -#define ARM_SMMU_GR0(smmu) ((smmu)->base) -#define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift)) - -/* - * SMMU global address space with conditional offset to access secure - * aliases of non-secure registers (e.g. nsCR0: 0x400, nsGFSR: 0x448, - * nsGFSYNR0: 0x450) - */ -#define ARM_SMMU_GR0_NS(smmu) \ - ((smmu)->base + \ - ((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS) \ - ? 0x400 : 0)) - -/* - * Some 64-bit registers only make sense to write atomically, but in such - * cases all the data relevant to AArch32 formats lies within the lower wo= rd, - * therefore this actually makes more sense than it might first appear. - */ -#ifdef CONFIG_64BIT -#define smmu_write_atomic_lq writeq_relaxed -#else -#define smmu_write_atomic_lq writel_relaxed -#endif - -/* Translation context bank */ -#define ARM_SMMU_CB(smmu, n) ((smmu)->cb_base + ((n) << (smmu)->pgshift)) - -#define MSI_IOVA_BASE 0x8000000 -#define MSI_IOVA_LENGTH 0x100000 - -static int force_stage; -module_param(force_stage, int, S_IRUGO); -MODULE_PARM_DESC(force_stage, - "Force SMMU mappings to be installed at a particular stage of translation= . A value of '1' or '2' forces the corresponding stage. All other values ar= e ignored (i.e. no stage is forced). Note that selecting a specific stage w= ill disable support for nested translation."); -static bool disable_bypass; -module_param(disable_bypass, bool, S_IRUGO); -MODULE_PARM_DESC(disable_bypass, - "Disable bypass streams such that incoming transactions from devices that= are not attached to an iommu domain will report an abort back to the devic= e and will not be allowed to pass through the SMMU."); - -enum arm_smmu_arch_version { - ARM_SMMU_V1, - ARM_SMMU_V1_64K, - ARM_SMMU_V2, -}; - -enum arm_smmu_implementation { - GENERIC_SMMU, - ARM_MMU500, - CAVIUM_SMMUV2, -}; - -struct arm_smmu_s2cr { - struct iommu_group *group; - int count; - enum arm_smmu_s2cr_type type; - enum arm_smmu_s2cr_privcfg privcfg; - u8 cbndx; -}; - -#define s2cr_init_val (struct arm_smmu_s2cr){ \ - .type =3D disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS, \ -} - -struct arm_smmu_smr { - u16 mask; - u16 id; - bool valid; -}; - -struct arm_smmu_cb { - u64 ttbr[2]; - u32 tcr[2]; - u32 mair[2]; - struct arm_smmu_cfg *cfg; -}; - -struct arm_smmu_master_cfg { - struct arm_smmu_device *smmu; - s16 smendx[]; -}; -#define INVALID_SMENDX -1 -#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv) -#define fwspec_smmu(fw) (__fwspec_cfg(fw)->smmu) -#define fwspec_smendx(fw, i) \ - (i >=3D fw->num_ids ? 
INVALID_SMENDX : __fwspec_cfg(fw)->smendx[i]) -#define for_each_cfg_sme(fw, i, idx) \ - for (i =3D 0; idx =3D fwspec_smendx(fw, i), i < fw->num_ids; ++i) - -struct arm_smmu_device { - struct device *dev; - - void __iomem *base; - void __iomem *cb_base; - unsigned long pgshift; - -#define ARM_SMMU_FEAT_COHERENT_WALK (1 << 0) -#define ARM_SMMU_FEAT_STREAM_MATCH (1 << 1) -#define ARM_SMMU_FEAT_TRANS_S1 (1 << 2) -#define ARM_SMMU_FEAT_TRANS_S2 (1 << 3) -#define ARM_SMMU_FEAT_TRANS_NESTED (1 << 4) -#define ARM_SMMU_FEAT_TRANS_OPS (1 << 5) -#define ARM_SMMU_FEAT_VMID16 (1 << 6) -#define ARM_SMMU_FEAT_FMT_AARCH64_4K (1 << 7) -#define ARM_SMMU_FEAT_FMT_AARCH64_16K (1 << 8) -#define ARM_SMMU_FEAT_FMT_AARCH64_64K (1 << 9) -#define ARM_SMMU_FEAT_FMT_AARCH32_L (1 << 10) -#define ARM_SMMU_FEAT_FMT_AARCH32_S (1 << 11) -#define ARM_SMMU_FEAT_EXIDS (1 << 12) - u32 features; - -#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0) - u32 options; - enum arm_smmu_arch_version version; - enum arm_smmu_implementation model; - - u32 num_context_banks; - u32 num_s2_context_banks; - DECLARE_BITMAP(context_map, ARM_SMMU_MAX_CBS); - struct arm_smmu_cb *cbs; - atomic_t irptndx; - - u32 num_mapping_groups; - u16 streamid_mask; - u16 smr_mask_mask; - struct arm_smmu_smr *smrs; - struct arm_smmu_s2cr *s2crs; - struct mutex stream_map_mutex; - - unsigned long va_size; - unsigned long ipa_size; - unsigned long pa_size; - unsigned long pgsize_bitmap; - - u32 num_global_irqs; - u32 num_context_irqs; - unsigned int *irqs; - - u32 cavium_id_base; /* Specific to Cavium */ - - spinlock_t global_sync_lock; - - /* IOMMU core code handle */ - struct iommu_device iommu; -}; - -enum arm_smmu_context_fmt { - ARM_SMMU_CTX_FMT_NONE, - ARM_SMMU_CTX_FMT_AARCH64, - ARM_SMMU_CTX_FMT_AARCH32_L, - ARM_SMMU_CTX_FMT_AARCH32_S, -}; - -struct arm_smmu_cfg { - u8 cbndx; - u8 irptndx; - union { - u16 asid; - u16 vmid; - }; - u32 cbar; - enum arm_smmu_context_fmt fmt; -}; -#define INVALID_IRPTNDX 0xff - -enum arm_smmu_domain_stage { - ARM_SMMU_DOMAIN_S1 =3D 0, - ARM_SMMU_DOMAIN_S2, - ARM_SMMU_DOMAIN_NESTED, - ARM_SMMU_DOMAIN_BYPASS, -}; - -struct arm_smmu_domain { - struct arm_smmu_device *smmu; - struct io_pgtable_ops *pgtbl_ops; - const struct iommu_gather_ops *tlb_ops; - struct arm_smmu_cfg cfg; - enum arm_smmu_domain_stage stage; - bool non_strict; - struct mutex init_mutex; /* Protects smmu pointer */ - spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */ - struct iommu_domain domain; -}; - -struct arm_smmu_option_prop { - u32 opt; - const char *prop; -}; - -static atomic_t cavium_smmu_context_count =3D ATOMIC_INIT(0); - -static bool using_legacy_binding, using_generic_binding; - -static struct arm_smmu_option_prop arm_smmu_options[] =3D { - { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" }, - { 0, NULL}, -}; - -static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) -{ - return container_of(dom, struct arm_smmu_domain, domain); -} - -static void parse_driver_options(struct arm_smmu_device *smmu) -{ - int i =3D 0; - - do { - if (of_property_read_bool(smmu->dev->of_node, - arm_smmu_options[i].prop)) { - smmu->options |=3D arm_smmu_options[i].opt; - dev_notice(smmu->dev, "option %s\n", - arm_smmu_options[i].prop); - } - } while (arm_smmu_options[++i].opt); -} - -static struct device_node *dev_get_dev_node(struct device *dev) -{ - if (dev_is_pci(dev)) { - struct pci_bus *bus =3D to_pci_dev(dev)->bus; - - while (!pci_is_root_bus(bus)) - bus =3D bus->parent; - return 
of_node_get(bus->bridge->parent->of_node); - } - - return of_node_get(dev->of_node); -} - -static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *d= ata) -{ - *((__be32 *)data) =3D cpu_to_be32(alias); - return 0; /* Continue walking */ -} - -static int __find_legacy_master_phandle(struct device *dev, void *data) -{ - struct of_phandle_iterator *it =3D *(void **)data; - struct device_node *np =3D it->node; - int err; - - of_for_each_phandle(it, err, dev->of_node, "mmu-masters", - "#stream-id-cells", 0) - if (it->node =3D=3D np) { - *(void **)data =3D dev; - return 1; - } - it->node =3D np; - return err =3D=3D -ENOENT ? 0 : err; -} - -static struct platform_driver arm_smmu_driver; -static struct iommu_ops arm_smmu_ops; - -static int arm_smmu_register_legacy_master(struct device *dev, - struct arm_smmu_device **smmu) -{ - struct device *smmu_dev; - struct device_node *np; - struct of_phandle_iterator it; - void *data =3D ⁢ - u32 *sids; - __be32 pci_sid; - int err; - - np =3D dev_get_dev_node(dev); - if (!np || !of_find_property(np, "#stream-id-cells", NULL)) { - of_node_put(np); - return -ENODEV; - } - - it.node =3D np; - err =3D driver_for_each_device(&arm_smmu_driver.driver, NULL, &data, - __find_legacy_master_phandle); - smmu_dev =3D data; - of_node_put(np); - if (err =3D=3D 0) - return -ENODEV; - if (err < 0) - return err; - - if (dev_is_pci(dev)) { - /* "mmu-masters" assumes Stream ID =3D=3D Requester ID */ - pci_for_each_dma_alias(to_pci_dev(dev), __arm_smmu_get_pci_sid, - &pci_sid); - it.cur =3D &pci_sid; - it.cur_count =3D 1; - } - - err =3D iommu_fwspec_init(dev, &smmu_dev->of_node->fwnode, - &arm_smmu_ops); - if (err) - return err; - - sids =3D kcalloc(it.cur_count, sizeof(*sids), GFP_KERNEL); - if (!sids) - return -ENOMEM; - - *smmu =3D dev_get_drvdata(smmu_dev); - of_phandle_iterator_args(&it, sids, it.cur_count); - err =3D iommu_fwspec_add_ids(dev, sids, it.cur_count); - kfree(sids); - return err; -} - -static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end) -{ - int idx; - - do { - idx =3D find_next_zero_bit(map, end, start); - if (idx =3D=3D end) - return -ENOSPC; - } while (test_and_set_bit(idx, map)); - - return idx; -} - -static void __arm_smmu_free_bitmap(unsigned long *map, int idx) -{ - clear_bit(idx, map); -} - -/* Wait for any pending TLB invalidations to complete */ -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, - void __iomem *sync, void __iomem *status) -{ - unsigned int spin_cnt, delay; - - writel_relaxed(0, sync); - for (delay =3D 1; delay < TLB_LOOP_TIMEOUT; delay *=3D 2) { - for (spin_cnt =3D TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { - if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE)) - return; - cpu_relax(); - } - udelay(delay); - } - dev_err_ratelimited(smmu->dev, - "TLB sync timed out -- SMMU may be deadlocked\n"); -} +#include "arm-smmu-common.c" =20 static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu) { @@ -436,114 +57,6 @@ static void arm_smmu_tlb_sync_context(void *cookie) spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); } =20 -static void arm_smmu_tlb_sync_vmid(void *cookie) -{ - struct arm_smmu_domain *smmu_domain =3D cookie; - - arm_smmu_tlb_sync_global(smmu_domain->smmu); -} - -static void arm_smmu_tlb_inv_context_s1(void *cookie) -{ - struct arm_smmu_domain *smmu_domain =3D cookie; - struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; - void __iomem *base =3D ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); - - /* - * NOTE: this is not a relaxed write; it needs to guarantee that PTEs 
-	 * cleared by the current CPU are visible to the SMMU before the TLBI.
-	 */
-	writel(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
-	arm_smmu_tlb_sync_context(cookie);
-}
-
-static void arm_smmu_tlb_inv_context_s2(void *cookie)
-{
-	struct arm_smmu_domain *smmu_domain = cookie;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	void __iomem *base = ARM_SMMU_GR0(smmu);
-
-	/* NOTE: see above */
-	writel(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
-	arm_smmu_tlb_sync_global(smmu);
-}
-
-static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
-					  size_t granule, bool leaf, void *cookie)
-{
-	struct arm_smmu_domain *smmu_domain = cookie;
-	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
-	void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
-
-	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
-		wmb();
-
-	if (stage1) {
-		reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
-
-		if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) {
-			iova &= ~12UL;
-			iova |= cfg->asid;
-			do {
-				writel_relaxed(iova, reg);
-				iova += granule;
-			} while (size -= granule);
-		} else {
-			iova >>= 12;
-			iova |= (u64)cfg->asid << 48;
-			do {
-				writeq_relaxed(iova, reg);
-				iova += granule >> 12;
-			} while (size -= granule);
-		}
-	} else {
-		reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L :
-			      ARM_SMMU_CB_S2_TLBIIPAS2;
-		iova >>= 12;
-		do {
-			smmu_write_atomic_lq(iova, reg);
-			iova += granule >> 12;
-		} while (size -= granule);
-	}
-}
-
-/*
- * On MMU-401 at least, the cost of firing off multiple TLBIVMIDs appears
- * almost negligible, but the benefit of getting the first one in as far ahead
- * of the sync as possible is significant, hence we don't just make this a
- * no-op and set .tlb_sync to arm_smmu_inv_context_s2() as you might think.
- */ -static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size, - size_t granule, bool leaf, void *cookie) -{ - struct arm_smmu_domain *smmu_domain =3D cookie; - void __iomem *base =3D ARM_SMMU_GR0(smmu_domain->smmu); - - if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) - wmb(); - - writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); -} - -static const struct iommu_gather_ops arm_smmu_s1_tlb_ops =3D { - .tlb_flush_all =3D arm_smmu_tlb_inv_context_s1, - .tlb_add_flush =3D arm_smmu_tlb_inv_range_nosync, - .tlb_sync =3D arm_smmu_tlb_sync_context, -}; - -static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 =3D { - .tlb_flush_all =3D arm_smmu_tlb_inv_context_s2, - .tlb_add_flush =3D arm_smmu_tlb_inv_range_nosync, - .tlb_sync =3D arm_smmu_tlb_sync_context, -}; - -static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v1 =3D { - .tlb_flush_all =3D arm_smmu_tlb_inv_context_s2, - .tlb_add_flush =3D arm_smmu_tlb_inv_vmid_nosync, - .tlb_sync =3D arm_smmu_tlb_sync_vmid, -}; - static irqreturn_t arm_smmu_context_fault(int irq, void *dev) { u32 fsr, fsynr; @@ -595,1360 +108,6 @@ static irqreturn_t arm_smmu_global_fault(int irq, vo= id *dev) return IRQ_HANDLED; } =20 -static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain= , - struct io_pgtable_cfg *pgtbl_cfg) -{ - struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; - struct arm_smmu_cb *cb =3D &smmu_domain->smmu->cbs[cfg->cbndx]; - bool stage1 =3D cfg->cbar !=3D CBAR_TYPE_S2_TRANS; - - cb->cfg =3D cfg; - - /* TTBCR */ - if (stage1) { - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { - cb->tcr[0] =3D pgtbl_cfg->arm_v7s_cfg.tcr; - } else { - cb->tcr[0] =3D pgtbl_cfg->arm_lpae_s1_cfg.tcr; - cb->tcr[1] =3D pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32; - cb->tcr[1] |=3D TTBCR2_SEP_UPSTREAM; - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) - cb->tcr[1] |=3D TTBCR2_AS; - } - } else { - cb->tcr[0] =3D pgtbl_cfg->arm_lpae_s2_cfg.vtcr; - } - - /* TTBRs */ - if (stage1) { - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { - cb->ttbr[0] =3D pgtbl_cfg->arm_v7s_cfg.ttbr[0]; - cb->ttbr[1] =3D pgtbl_cfg->arm_v7s_cfg.ttbr[1]; - } else { - cb->ttbr[0] =3D pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0]; - cb->ttbr[0] |=3D (u64)cfg->asid << TTBRn_ASID_SHIFT; - cb->ttbr[1] =3D pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1]; - cb->ttbr[1] |=3D (u64)cfg->asid << TTBRn_ASID_SHIFT; - } - } else { - cb->ttbr[0] =3D pgtbl_cfg->arm_lpae_s2_cfg.vttbr; - } - - /* MAIRs (stage-1 only) */ - if (stage1) { - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { - cb->mair[0] =3D pgtbl_cfg->arm_v7s_cfg.prrr; - cb->mair[1] =3D pgtbl_cfg->arm_v7s_cfg.nmrr; - } else { - cb->mair[0] =3D pgtbl_cfg->arm_lpae_s1_cfg.mair[0]; - cb->mair[1] =3D pgtbl_cfg->arm_lpae_s1_cfg.mair[1]; - } - } -} - -static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int = idx) -{ - u32 reg; - bool stage1; - struct arm_smmu_cb *cb =3D &smmu->cbs[idx]; - struct arm_smmu_cfg *cfg =3D cb->cfg; - void __iomem *cb_base, *gr1_base; - - cb_base =3D ARM_SMMU_CB(smmu, idx); - - /* Unassigned context banks only need disabling */ - if (!cfg) { - writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR); - return; - } - - gr1_base =3D ARM_SMMU_GR1(smmu); - stage1 =3D cfg->cbar !=3D CBAR_TYPE_S2_TRANS; - - /* CBA2R */ - if (smmu->version > ARM_SMMU_V1) { - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) - reg =3D CBA2R_RW64_64BIT; - else - reg =3D CBA2R_RW64_32BIT; - /* 16-bit VMIDs live in CBA2R */ - if (smmu->features & ARM_SMMU_FEAT_VMID16) - reg |=3D cfg->vmid << CBA2R_VMID_SHIFT; - - 
writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(idx)); - } - - /* CBAR */ - reg =3D cfg->cbar; - if (smmu->version < ARM_SMMU_V2) - reg |=3D cfg->irptndx << CBAR_IRPTNDX_SHIFT; - - /* - * Use the weakest shareability/memory types, so they are - * overridden by the ttbcr/pte. - */ - if (stage1) { - reg |=3D (CBAR_S1_BPSHCFG_NSH << CBAR_S1_BPSHCFG_SHIFT) | - (CBAR_S1_MEMATTR_WB << CBAR_S1_MEMATTR_SHIFT); - } else if (!(smmu->features & ARM_SMMU_FEAT_VMID16)) { - /* 8-bit VMIDs live in CBAR */ - reg |=3D cfg->vmid << CBAR_VMID_SHIFT; - } - writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(idx)); - - /* - * TTBCR - * We must write this before the TTBRs, since it determines the - * access behaviour of some fields (in particular, ASID[15:8]). - */ - if (stage1 && smmu->version > ARM_SMMU_V1) - writel_relaxed(cb->tcr[1], cb_base + ARM_SMMU_CB_TTBCR2); - writel_relaxed(cb->tcr[0], cb_base + ARM_SMMU_CB_TTBCR); - - /* TTBRs */ - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_S) { - writel_relaxed(cfg->asid, cb_base + ARM_SMMU_CB_CONTEXTIDR); - writel_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0); - writel_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1); - } else { - writeq_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0); - if (stage1) - writeq_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1); - } - - /* MAIRs (stage-1 only) */ - if (stage1) { - writel_relaxed(cb->mair[0], cb_base + ARM_SMMU_CB_S1_MAIR0); - writel_relaxed(cb->mair[1], cb_base + ARM_SMMU_CB_S1_MAIR1); - } - - /* SCTLR */ - reg =3D SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M; - if (stage1) - reg |=3D SCTLR_S1_ASIDPNE; - if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) - reg |=3D SCTLR_E; - - writel_relaxed(reg, cb_base + ARM_SMMU_CB_SCTLR); -} - -static int arm_smmu_init_domain_context(struct iommu_domain *domain, - struct arm_smmu_device *smmu) -{ - int irq, start, ret =3D 0; - unsigned long ias, oas; - struct io_pgtable_ops *pgtbl_ops; - struct io_pgtable_cfg pgtbl_cfg; - enum io_pgtable_fmt fmt; - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; - - mutex_lock(&smmu_domain->init_mutex); - if (smmu_domain->smmu) - goto out_unlock; - - if (domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) { - smmu_domain->stage =3D ARM_SMMU_DOMAIN_BYPASS; - smmu_domain->smmu =3D smmu; - goto out_unlock; - } - - /* - * Mapping the requested stage onto what we support is surprisingly - * complicated, mainly because the spec allows S1+S2 SMMUs without - * support for nested translation. That means we end up with the - * following table: - * - * Requested Supported Actual - * S1 N S1 - * S1 S1+S2 S1 - * S1 S2 S2 - * S1 S1 S1 - * N N N - * N S1+S2 S2 - * N S2 S2 - * N S1 S1 - * - * Note that you can't actually request stage-2 mappings. - */ - if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) - smmu_domain->stage =3D ARM_SMMU_DOMAIN_S2; - if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2)) - smmu_domain->stage =3D ARM_SMMU_DOMAIN_S1; - - /* - * Choosing a suitable context format is even more fiddly. Until we - * grow some way for the caller to express a preference, and/or move - * the decision into the io-pgtable code where it arguably belongs, - * just aim for the closest thing to the rest of the system, and hope - * that the hardware isn't esoteric enough that we can't assume AArch64 - * support to be a superset of AArch32 support... 
- */ - if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_L) - cfg->fmt =3D ARM_SMMU_CTX_FMT_AARCH32_L; - if (IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) && - !IS_ENABLED(CONFIG_64BIT) && !IS_ENABLED(CONFIG_ARM_LPAE) && - (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) && - (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_S1)) - cfg->fmt =3D ARM_SMMU_CTX_FMT_AARCH32_S; - if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_NONE) &= & - (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K | - ARM_SMMU_FEAT_FMT_AARCH64_16K | - ARM_SMMU_FEAT_FMT_AARCH64_4K))) - cfg->fmt =3D ARM_SMMU_CTX_FMT_AARCH64; - - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_NONE) { - ret =3D -EINVAL; - goto out_unlock; - } - - switch (smmu_domain->stage) { - case ARM_SMMU_DOMAIN_S1: - cfg->cbar =3D CBAR_TYPE_S1_TRANS_S2_BYPASS; - start =3D smmu->num_s2_context_banks; - ias =3D smmu->va_size; - oas =3D smmu->ipa_size; - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) { - fmt =3D ARM_64_LPAE_S1; - } else if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH32_L) { - fmt =3D ARM_32_LPAE_S1; - ias =3D min(ias, 32UL); - oas =3D min(oas, 40UL); - } else { - fmt =3D ARM_V7S; - ias =3D min(ias, 32UL); - oas =3D min(oas, 32UL); - } - smmu_domain->tlb_ops =3D &arm_smmu_s1_tlb_ops; - break; - case ARM_SMMU_DOMAIN_NESTED: - /* - * We will likely want to change this if/when KVM gets - * involved. - */ - case ARM_SMMU_DOMAIN_S2: - cfg->cbar =3D CBAR_TYPE_S2_TRANS; - start =3D 0; - ias =3D smmu->ipa_size; - oas =3D smmu->pa_size; - if (cfg->fmt =3D=3D ARM_SMMU_CTX_FMT_AARCH64) { - fmt =3D ARM_64_LPAE_S2; - } else { - fmt =3D ARM_32_LPAE_S2; - ias =3D min(ias, 40UL); - oas =3D min(oas, 40UL); - } - if (smmu->version =3D=3D ARM_SMMU_V2) - smmu_domain->tlb_ops =3D &arm_smmu_s2_tlb_ops_v2; - else - smmu_domain->tlb_ops =3D &arm_smmu_s2_tlb_ops_v1; - break; - default: - ret =3D -EINVAL; - goto out_unlock; - } - ret =3D __arm_smmu_alloc_bitmap(smmu->context_map, start, - smmu->num_context_banks); - if (ret < 0) - goto out_unlock; - - cfg->cbndx =3D ret; - if (smmu->version < ARM_SMMU_V2) { - cfg->irptndx =3D atomic_inc_return(&smmu->irptndx); - cfg->irptndx %=3D smmu->num_context_irqs; - } else { - cfg->irptndx =3D cfg->cbndx; - } - - if (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_S2) - cfg->vmid =3D cfg->cbndx + 1 + smmu->cavium_id_base; - else - cfg->asid =3D cfg->cbndx + smmu->cavium_id_base; - - pgtbl_cfg =3D (struct io_pgtable_cfg) { - .pgsize_bitmap =3D smmu->pgsize_bitmap, - .ias =3D ias, - .oas =3D oas, - .tlb =3D smmu_domain->tlb_ops, - .iommu_dev =3D smmu->dev, - }; - - if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) - pgtbl_cfg.quirks =3D IO_PGTABLE_QUIRK_NO_DMA; - - if (smmu_domain->non_strict) - pgtbl_cfg.quirks |=3D IO_PGTABLE_QUIRK_NON_STRICT; - - smmu_domain->smmu =3D smmu; - pgtbl_ops =3D alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain); - if (!pgtbl_ops) { - ret =3D -ENOMEM; - goto out_clear_smmu; - } - - /* Update the domain's page sizes to reflect the page table format */ - domain->pgsize_bitmap =3D pgtbl_cfg.pgsize_bitmap; - domain->geometry.aperture_end =3D (1UL << ias) - 1; - domain->geometry.force_aperture =3D true; - - /* Initialise the context bank with our page table cfg */ - arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg); - arm_smmu_write_context_bank(smmu, cfg->cbndx); - - /* - * Request context fault interrupt. Do this last to avoid the - * handler seeing a half-initialised domain state. 
- */ - irq =3D smmu->irqs[smmu->num_global_irqs + cfg->irptndx]; - ret =3D devm_request_irq(smmu->dev, irq, arm_smmu_context_fault, - IRQF_SHARED, "arm-smmu-context-fault", domain); - if (ret < 0) { - dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n", - cfg->irptndx, irq); - cfg->irptndx =3D INVALID_IRPTNDX; - } - - mutex_unlock(&smmu_domain->init_mutex); - - /* Publish page table ops for map/unmap */ - smmu_domain->pgtbl_ops =3D pgtbl_ops; - return 0; - -out_clear_smmu: - smmu_domain->smmu =3D NULL; -out_unlock: - mutex_unlock(&smmu_domain->init_mutex); - return ret; -} - -static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - struct arm_smmu_device *smmu =3D smmu_domain->smmu; - struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; - int irq; - - if (!smmu || domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) - return; - - /* - * Disable the context bank and free the page tables before freeing - * it. - */ - smmu->cbs[cfg->cbndx].cfg =3D NULL; - arm_smmu_write_context_bank(smmu, cfg->cbndx); - - if (cfg->irptndx !=3D INVALID_IRPTNDX) { - irq =3D smmu->irqs[smmu->num_global_irqs + cfg->irptndx]; - devm_free_irq(smmu->dev, irq, domain); - } - - free_io_pgtable_ops(smmu_domain->pgtbl_ops); - __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); -} - -static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) -{ - struct arm_smmu_domain *smmu_domain; - - if (type !=3D IOMMU_DOMAIN_UNMANAGED && - type !=3D IOMMU_DOMAIN_DMA && - type !=3D IOMMU_DOMAIN_IDENTITY) - return NULL; - /* - * Allocate the domain and initialise some of its data structures. - * We can't really do anything meaningful until we've added a - * master. - */ - smmu_domain =3D kzalloc(sizeof(*smmu_domain), GFP_KERNEL); - if (!smmu_domain) - return NULL; - - if (type =3D=3D IOMMU_DOMAIN_DMA && (using_legacy_binding || - iommu_get_dma_cookie(&smmu_domain->domain))) { - kfree(smmu_domain); - return NULL; - } - - mutex_init(&smmu_domain->init_mutex); - spin_lock_init(&smmu_domain->cb_lock); - - return &smmu_domain->domain; -} - -static void arm_smmu_domain_free(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - - /* - * Free the domain resources. We assume that all devices have - * already been detached. 
- */ - iommu_put_dma_cookie(domain); - arm_smmu_destroy_domain_context(domain); - kfree(smmu_domain); -} - -static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx) -{ - struct arm_smmu_smr *smr =3D smmu->smrs + idx; - u32 reg =3D smr->id << SMR_ID_SHIFT | smr->mask << SMR_MASK_SHIFT; - - if (!(smmu->features & ARM_SMMU_FEAT_EXIDS) && smr->valid) - reg |=3D SMR_VALID; - writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx)); -} - -static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx) -{ - struct arm_smmu_s2cr *s2cr =3D smmu->s2crs + idx; - u32 reg =3D (s2cr->type & S2CR_TYPE_MASK) << S2CR_TYPE_SHIFT | - (s2cr->cbndx & S2CR_CBNDX_MASK) << S2CR_CBNDX_SHIFT | - (s2cr->privcfg & S2CR_PRIVCFG_MASK) << S2CR_PRIVCFG_SHIFT; - - if (smmu->features & ARM_SMMU_FEAT_EXIDS && smmu->smrs && - smmu->smrs[idx].valid) - reg |=3D S2CR_EXIDVALID; - writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_S2CR(idx)); -} - -static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx) -{ - arm_smmu_write_s2cr(smmu, idx); - if (smmu->smrs) - arm_smmu_write_smr(smmu, idx); -} - -/* - * The width of SMR's mask field depends on sCR0_EXIDENABLE, so this funct= ion - * should be called after sCR0 is written. - */ -static void arm_smmu_test_smr_masks(struct arm_smmu_device *smmu) -{ - void __iomem *gr0_base =3D ARM_SMMU_GR0(smmu); - u32 smr; - - if (!smmu->smrs) - return; - - /* - * SMR.ID bits may not be preserved if the corresponding MASK - * bits are set, so check each one separately. We can reject - * masters later if they try to claim IDs outside these masks. - */ - smr =3D smmu->streamid_mask << SMR_ID_SHIFT; - writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0)); - smr =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0)); - smmu->streamid_mask =3D smr >> SMR_ID_SHIFT; - - smr =3D smmu->streamid_mask << SMR_MASK_SHIFT; - writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0)); - smr =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0)); - smmu->smr_mask_mask =3D smr >> SMR_MASK_SHIFT; -} - -static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mas= k) -{ - struct arm_smmu_smr *smrs =3D smmu->smrs; - int i, free_idx =3D -ENOSPC; - - /* Stream indexing is blissfully easy */ - if (!smrs) - return id; - - /* Validating SMRs is... less so */ - for (i =3D 0; i < smmu->num_mapping_groups; ++i) { - if (!smrs[i].valid) { - /* - * Note the first free entry we come across, which - * we'll claim in the end if nothing else matches. - */ - if (free_idx < 0) - free_idx =3D i; - continue; - } - /* - * If the new entry is _entirely_ matched by an existing entry, - * then reuse that, with the guarantee that there also cannot - * be any subsequent conflicting entries. In normal use we'd - * expect simply identical entries for this case, but there's - * no harm in accommodating the generalisation. - */ - if ((mask & smrs[i].mask) =3D=3D mask && - !((id ^ smrs[i].id) & ~smrs[i].mask)) - return i; - /* - * If the new entry has any other overlap with an existing one, - * though, then there always exists at least one stream ID - * which would cause a conflict, and we can't allow that risk. 
- */ - if (!((id ^ smrs[i].id) & ~(smrs[i].mask | mask))) - return -EINVAL; - } - - return free_idx; -} - -static bool arm_smmu_free_sme(struct arm_smmu_device *smmu, int idx) -{ - if (--smmu->s2crs[idx].count) - return false; - - smmu->s2crs[idx] =3D s2cr_init_val; - if (smmu->smrs) - smmu->smrs[idx].valid =3D false; - - return true; -} - -static int arm_smmu_master_alloc_smes(struct device *dev) -{ - struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; - struct arm_smmu_master_cfg *cfg =3D fwspec->iommu_priv; - struct arm_smmu_device *smmu =3D cfg->smmu; - struct arm_smmu_smr *smrs =3D smmu->smrs; - struct iommu_group *group; - int i, idx, ret; - - mutex_lock(&smmu->stream_map_mutex); - /* Figure out a viable stream map entry allocation */ - for_each_cfg_sme(fwspec, i, idx) { - u16 sid =3D fwspec->ids[i]; - u16 mask =3D fwspec->ids[i] >> SMR_MASK_SHIFT; - - if (idx !=3D INVALID_SMENDX) { - ret =3D -EEXIST; - goto out_err; - } - - ret =3D arm_smmu_find_sme(smmu, sid, mask); - if (ret < 0) - goto out_err; - - idx =3D ret; - if (smrs && smmu->s2crs[idx].count =3D=3D 0) { - smrs[idx].id =3D sid; - smrs[idx].mask =3D mask; - smrs[idx].valid =3D true; - } - smmu->s2crs[idx].count++; - cfg->smendx[i] =3D (s16)idx; - } - - group =3D iommu_group_get_for_dev(dev); - if (!group) - group =3D ERR_PTR(-ENOMEM); - if (IS_ERR(group)) { - ret =3D PTR_ERR(group); - goto out_err; - } - iommu_group_put(group); - - /* It worked! Now, poke the actual hardware */ - for_each_cfg_sme(fwspec, i, idx) { - arm_smmu_write_sme(smmu, idx); - smmu->s2crs[idx].group =3D group; - } - - mutex_unlock(&smmu->stream_map_mutex); - return 0; - -out_err: - while (i--) { - arm_smmu_free_sme(smmu, cfg->smendx[i]); - cfg->smendx[i] =3D INVALID_SMENDX; - } - mutex_unlock(&smmu->stream_map_mutex); - return ret; -} - -static void arm_smmu_master_free_smes(struct iommu_fwspec *fwspec) -{ - struct arm_smmu_device *smmu =3D fwspec_smmu(fwspec); - struct arm_smmu_master_cfg *cfg =3D fwspec->iommu_priv; - int i, idx; - - mutex_lock(&smmu->stream_map_mutex); - for_each_cfg_sme(fwspec, i, idx) { - if (arm_smmu_free_sme(smmu, idx)) - arm_smmu_write_sme(smmu, idx); - cfg->smendx[i] =3D INVALID_SMENDX; - } - mutex_unlock(&smmu->stream_map_mutex); -} - -static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, - struct iommu_fwspec *fwspec) -{ - struct arm_smmu_device *smmu =3D smmu_domain->smmu; - struct arm_smmu_s2cr *s2cr =3D smmu->s2crs; - u8 cbndx =3D smmu_domain->cfg.cbndx; - enum arm_smmu_s2cr_type type; - int i, idx; - - if (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_BYPASS) - type =3D S2CR_TYPE_BYPASS; - else - type =3D S2CR_TYPE_TRANS; - - for_each_cfg_sme(fwspec, i, idx) { - if (type =3D=3D s2cr[idx].type && cbndx =3D=3D s2cr[idx].cbndx) - continue; - - s2cr[idx].type =3D type; - s2cr[idx].privcfg =3D S2CR_PRIVCFG_DEFAULT; - s2cr[idx].cbndx =3D cbndx; - arm_smmu_write_s2cr(smmu, idx); - } - return 0; -} - -static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device = *dev) -{ - int ret; - struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; - struct arm_smmu_device *smmu; - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - - if (!fwspec || fwspec->ops !=3D &arm_smmu_ops) { - dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n"); - return -ENXIO; - } - - /* - * FIXME: The arch/arm DMA API code tries to attach devices to its own - * domains between of_xlate() and add_device() - we have no way to cope - * with that, so until ARM gets converted to rely on groups and default - * domains, 
just say no (but more politely than by dereferencing NULL). - * This should be at least a WARN_ON once that's sorted. - */ - if (!fwspec->iommu_priv) - return -ENODEV; - - smmu =3D fwspec_smmu(fwspec); - /* Ensure that the domain is finalised */ - ret =3D arm_smmu_init_domain_context(domain, smmu); - if (ret < 0) - return ret; - - /* - * Sanity check the domain. We don't support domains across - * different SMMUs. - */ - if (smmu_domain->smmu !=3D smmu) { - dev_err(dev, - "cannot attach to SMMU %s whilst already attached to domain on SMMU %s\= n", - dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev)); - return -EINVAL; - } - - /* Looks ok, so add the device to the domain */ - return arm_smmu_domain_add_master(smmu_domain, fwspec); -} - -static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot) -{ - struct io_pgtable_ops *ops =3D to_smmu_domain(domain)->pgtbl_ops; - - if (!ops) - return -ENODEV; - - return ops->map(ops, iova, paddr, size, prot); -} - -static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long io= va, - size_t size) -{ - struct io_pgtable_ops *ops =3D to_smmu_domain(domain)->pgtbl_ops; - - if (!ops) - return 0; - - return ops->unmap(ops, iova, size); -} - -static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - - if (smmu_domain->tlb_ops) - smmu_domain->tlb_ops->tlb_flush_all(smmu_domain); -} - -static void arm_smmu_iotlb_sync(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - - if (smmu_domain->tlb_ops) - smmu_domain->tlb_ops->tlb_sync(smmu_domain); -} - -static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, - dma_addr_t iova) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - struct arm_smmu_device *smmu =3D smmu_domain->smmu; - struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; - struct io_pgtable_ops *ops=3D smmu_domain->pgtbl_ops; - struct device *dev =3D smmu->dev; - void __iomem *cb_base; - u32 tmp; - u64 phys; - unsigned long va, flags; - - cb_base =3D ARM_SMMU_CB(smmu, cfg->cbndx); - - spin_lock_irqsave(&smmu_domain->cb_lock, flags); - /* ATS1 registers can only be written atomically */ - va =3D iova & ~0xfffUL; - if (smmu->version =3D=3D ARM_SMMU_V2) - smmu_write_atomic_lq(va, cb_base + ARM_SMMU_CB_ATS1PR); - else /* Register is only 32-bit in v1 */ - writel_relaxed(va, cb_base + ARM_SMMU_CB_ATS1PR); - - if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, - !(tmp & ATSR_ACTIVE), 5, 50)) { - spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); - dev_err(dev, - "iova to phys timed out on %pad. 
Falling back to software table walk.\n= ", - &iova); - return ops->iova_to_phys(ops, iova); - } - - phys =3D readq_relaxed(cb_base + ARM_SMMU_CB_PAR); - spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); - if (phys & CB_PAR_F) { - dev_err(dev, "translation fault!\n"); - dev_err(dev, "PAR =3D 0x%llx\n", phys); - return 0; - } - - return (phys & GENMASK_ULL(39, 12)) | (iova & 0xfff); -} - -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, - dma_addr_t iova) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - struct io_pgtable_ops *ops =3D smmu_domain->pgtbl_ops; - - if (domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) - return iova; - - if (!ops) - return 0; - - if (smmu_domain->smmu->features & ARM_SMMU_FEAT_TRANS_OPS && - smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_S1) - return arm_smmu_iova_to_phys_hard(domain, iova); - - return ops->iova_to_phys(ops, iova); -} - -static bool arm_smmu_capable(enum iommu_cap cap) -{ - switch (cap) { - case IOMMU_CAP_CACHE_COHERENCY: - /* - * Return true here as the SMMU can always send out coherent - * requests. - */ - return true; - case IOMMU_CAP_NOEXEC: - return true; - default: - return false; - } -} - -static int arm_smmu_match_node(struct device *dev, void *data) -{ - return dev->fwnode =3D=3D data; -} - -static -struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnod= e) -{ - struct device *dev =3D driver_find_device(&arm_smmu_driver.driver, NULL, - fwnode, arm_smmu_match_node); - put_device(dev); - return dev ? dev_get_drvdata(dev) : NULL; -} - -static int arm_smmu_add_device(struct device *dev) -{ - struct arm_smmu_device *smmu; - struct arm_smmu_master_cfg *cfg; - struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; - int i, ret; - - if (using_legacy_binding) { - ret =3D arm_smmu_register_legacy_master(dev, &smmu); - - /* - * If dev->iommu_fwspec is initally NULL, arm_smmu_register_legacy_maste= r() - * will allocate/initialise a new one. Thus we need to update fwspec for - * later use. 
- */ - fwspec =3D dev->iommu_fwspec; - if (ret) - goto out_free; - } else if (fwspec && fwspec->ops =3D=3D &arm_smmu_ops) { - smmu =3D arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); - } else { - return -ENODEV; - } - - ret =3D -EINVAL; - for (i =3D 0; i < fwspec->num_ids; i++) { - u16 sid =3D fwspec->ids[i]; - u16 mask =3D fwspec->ids[i] >> SMR_MASK_SHIFT; - - if (sid & ~smmu->streamid_mask) { - dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n", - sid, smmu->streamid_mask); - goto out_free; - } - if (mask & ~smmu->smr_mask_mask) { - dev_err(dev, "SMR mask 0x%x out of range for SMMU (0x%x)\n", - mask, smmu->smr_mask_mask); - goto out_free; - } - } - - ret =3D -ENOMEM; - cfg =3D kzalloc(offsetof(struct arm_smmu_master_cfg, smendx[i]), - GFP_KERNEL); - if (!cfg) - goto out_free; - - cfg->smmu =3D smmu; - fwspec->iommu_priv =3D cfg; - while (i--) - cfg->smendx[i] =3D INVALID_SMENDX; - - ret =3D arm_smmu_master_alloc_smes(dev); - if (ret) - goto out_cfg_free; - - iommu_device_link(&smmu->iommu, dev); - - return 0; - -out_cfg_free: - kfree(cfg); -out_free: - iommu_fwspec_free(dev); - return ret; -} - -static void arm_smmu_remove_device(struct device *dev) -{ - struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; - struct arm_smmu_master_cfg *cfg; - struct arm_smmu_device *smmu; - - - if (!fwspec || fwspec->ops !=3D &arm_smmu_ops) - return; - - cfg =3D fwspec->iommu_priv; - smmu =3D cfg->smmu; - - iommu_device_unlink(&smmu->iommu, dev); - arm_smmu_master_free_smes(fwspec); - iommu_group_remove_device(dev); - kfree(fwspec->iommu_priv); - iommu_fwspec_free(dev); -} - -static struct iommu_group *arm_smmu_device_group(struct device *dev) -{ - struct iommu_fwspec *fwspec =3D dev->iommu_fwspec; - struct arm_smmu_device *smmu =3D fwspec_smmu(fwspec); - struct iommu_group *group =3D NULL; - int i, idx; - - for_each_cfg_sme(fwspec, i, idx) { - if (group && smmu->s2crs[idx].group && - group !=3D smmu->s2crs[idx].group) - return ERR_PTR(-EINVAL); - - group =3D smmu->s2crs[idx].group; - } - - if (group) - return iommu_group_ref_get(group); - - if (dev_is_pci(dev)) - group =3D pci_device_group(dev); - else if (dev_is_fsl_mc(dev)) - group =3D fsl_mc_device_group(dev); - else - group =3D generic_device_group(dev); - - return group; -} - -static int arm_smmu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) -{ - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - - switch(domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - switch (attr) { - case DOMAIN_ATTR_NESTING: - *(int *)data =3D (smmu_domain->stage =3D=3D ARM_SMMU_DOMAIN_NESTED); - return 0; - default: - return -ENODEV; - } - break; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data =3D smmu_domain->non_strict; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } -} - -static int arm_smmu_domain_set_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) -{ - int ret =3D 0; - struct arm_smmu_domain *smmu_domain =3D to_smmu_domain(domain); - - mutex_lock(&smmu_domain->init_mutex); - - switch(domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - switch (attr) { - case DOMAIN_ATTR_NESTING: - if (smmu_domain->smmu) { - ret =3D -EPERM; - goto out_unlock; - } - - if (*(int *)data) - smmu_domain->stage =3D ARM_SMMU_DOMAIN_NESTED; - else - smmu_domain->stage =3D ARM_SMMU_DOMAIN_S1; - break; - default: - ret =3D -ENODEV; - } - break; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - 
smmu_domain->non_strict =3D *(int *)data; - break; - default: - ret =3D -ENODEV; - } - break; - default: - ret =3D -EINVAL; - } -out_unlock: - mutex_unlock(&smmu_domain->init_mutex); - return ret; -} - -static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *a= rgs) -{ - u32 mask, fwid =3D 0; - - if (args->args_count > 0) - fwid |=3D (u16)args->args[0]; - - if (args->args_count > 1) - fwid |=3D (u16)args->args[1] << SMR_MASK_SHIFT; - else if (!of_property_read_u32(args->np, "stream-match-mask", &mask)) - fwid |=3D (u16)mask << SMR_MASK_SHIFT; - - return iommu_fwspec_add_ids(dev, &fwid, 1); -} - -static void arm_smmu_get_resv_regions(struct device *dev, - struct list_head *head) -{ - struct iommu_resv_region *region; - int prot =3D IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO; - - region =3D iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH, - prot, IOMMU_RESV_SW_MSI); - if (!region) - return; - - list_add_tail(®ion->list, head); - - iommu_dma_get_resv_regions(dev, head); -} - -static void arm_smmu_put_resv_regions(struct device *dev, - struct list_head *head) -{ - struct iommu_resv_region *entry, *next; - - list_for_each_entry_safe(entry, next, head, list) - kfree(entry); -} - -static struct iommu_ops arm_smmu_ops =3D { - .capable =3D arm_smmu_capable, - .domain_alloc =3D arm_smmu_domain_alloc, - .domain_free =3D arm_smmu_domain_free, - .attach_dev =3D arm_smmu_attach_dev, - .map =3D arm_smmu_map, - .unmap =3D arm_smmu_unmap, - .flush_iotlb_all =3D arm_smmu_flush_iotlb_all, - .iotlb_sync =3D arm_smmu_iotlb_sync, - .iova_to_phys =3D arm_smmu_iova_to_phys, - .add_device =3D arm_smmu_add_device, - .remove_device =3D arm_smmu_remove_device, - .device_group =3D arm_smmu_device_group, - .domain_get_attr =3D arm_smmu_domain_get_attr, - .domain_set_attr =3D arm_smmu_domain_set_attr, - .of_xlate =3D arm_smmu_of_xlate, - .get_resv_regions =3D arm_smmu_get_resv_regions, - .put_resv_regions =3D arm_smmu_put_resv_regions, - .pgsize_bitmap =3D -1UL, /* Restricted during device attach */ -}; - -static void arm_smmu_device_reset(struct arm_smmu_device *smmu) -{ - void __iomem *gr0_base =3D ARM_SMMU_GR0(smmu); - int i; - u32 reg, major; - - /* clear global FSR */ - reg =3D readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); - writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); - - /* - * Reset stream mapping groups: Initial values mark all SMRn as - * invalid and all S2CRn as bypass unless overridden. - */ - for (i =3D 0; i < smmu->num_mapping_groups; ++i) - arm_smmu_write_sme(smmu, i); - - if (smmu->model =3D=3D ARM_MMU500) { - /* - * Before clearing ARM_MMU500_ACTLR_CPRE, need to - * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK - * bit is only present in MMU-500r2 onwards. - */ - reg =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7); - major =3D (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK; - reg =3D readl_relaxed(gr0_base + ARM_SMMU_GR0_sACR); - if (major >=3D 2) - reg &=3D ~ARM_MMU500_ACR_CACHE_LOCK; - /* - * Allow unmatched Stream IDs to allocate bypass - * TLB entries for reduced latency. 
-		 */
-		reg |= ARM_MMU500_ACR_SMTNMB_TLBEN | ARM_MMU500_ACR_S2CRB_TLBEN;
-		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_sACR);
-	}
-
-	/* Make sure all context banks are disabled and clear CB_FSR  */
-	for (i = 0; i < smmu->num_context_banks; ++i) {
-		void __iomem *cb_base = ARM_SMMU_CB(smmu, i);
-
-		arm_smmu_write_context_bank(smmu, i);
-		writel_relaxed(FSR_FAULT, cb_base + ARM_SMMU_CB_FSR);
-		/*
-		 * Disable MMU-500's not-particularly-beneficial next-page
-		 * prefetcher for the sake of errata #841119 and #826419.
-		 */
-		if (smmu->model == ARM_MMU500) {
-			reg = readl_relaxed(cb_base + ARM_SMMU_CB_ACTLR);
-			reg &= ~ARM_MMU500_ACTLR_CPRE;
-			writel_relaxed(reg, cb_base + ARM_SMMU_CB_ACTLR);
-		}
-	}
-
-	/* Invalidate the TLB, just in case */
-	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
-	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
-
-	reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-
-	/* Enable fault reporting */
-	reg |= (sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE);
-
-	/* Disable TLB broadcasting. */
-	reg |= (sCR0_VMIDPNE | sCR0_PTM);
-
-	/* Enable client access, handling unmatched streams as appropriate */
-	reg &= ~sCR0_CLIENTPD;
-	if (disable_bypass)
-		reg |= sCR0_USFCFG;
-	else
-		reg &= ~sCR0_USFCFG;
-
-	/* Disable forced broadcasting */
-	reg &= ~sCR0_FB;
-
-	/* Don't upgrade barriers */
-	reg &= ~(sCR0_BSU_MASK << sCR0_BSU_SHIFT);
-
-	if (smmu->features & ARM_SMMU_FEAT_VMID16)
-		reg |= sCR0_VMID16EN;
-
-	if (smmu->features & ARM_SMMU_FEAT_EXIDS)
-		reg |= sCR0_EXIDENABLE;
-
-	/* Push the button */
-	arm_smmu_tlb_sync_global(smmu);
-	writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-}
-
-static int arm_smmu_id_size_to_bits(int size)
-{
-	switch (size) {
-	case 0:
-		return 32;
-	case 1:
-		return 36;
-	case 2:
-		return 40;
-	case 3:
-		return 42;
-	case 4:
-		return 44;
-	case 5:
-	default:
-		return 48;
-	}
-}
-
-static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
-{
-	unsigned long size;
-	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
-	u32 id;
-	bool cttw_reg, cttw_fw = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK;
-	int i;
-
-	dev_notice(smmu->dev, "probing hardware configuration...\n");
-	dev_notice(smmu->dev, "SMMUv%d with:\n",
-			smmu->version == ARM_SMMU_V2 ? 2 : 1);
-
-	/* ID0 */
-	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID0);
-
-	/* Restrict available stages based on module parameter */
-	if (force_stage == 1)
-		id &= ~(ID0_S2TS | ID0_NTS);
-	else if (force_stage == 2)
-		id &= ~(ID0_S1TS | ID0_NTS);
-
-	if (id & ID0_S1TS) {
-		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
-		dev_notice(smmu->dev, "\tstage 1 translation\n");
-	}
-
-	if (id & ID0_S2TS) {
-		smmu->features |= ARM_SMMU_FEAT_TRANS_S2;
-		dev_notice(smmu->dev, "\tstage 2 translation\n");
-	}
-
-	if (id & ID0_NTS) {
-		smmu->features |= ARM_SMMU_FEAT_TRANS_NESTED;
-		dev_notice(smmu->dev, "\tnested translation\n");
-	}
-
-	if (!(smmu->features &
-		(ARM_SMMU_FEAT_TRANS_S1 | ARM_SMMU_FEAT_TRANS_S2))) {
-		dev_err(smmu->dev, "\tno translation support!\n");
-		return -ENODEV;
-	}
-
-	if ((id & ID0_S1TS) &&
-		((smmu->version < ARM_SMMU_V2) || !(id & ID0_ATOSNS))) {
-		smmu->features |= ARM_SMMU_FEAT_TRANS_OPS;
-		dev_notice(smmu->dev, "\taddress translation ops\n");
-	}
-
-	/*
-	 * In order for DMA API calls to work properly, we must defer to what
-	 * the FW says about coherency, regardless of what the hardware claims.
-	 * Fortunately, this also opens up a workaround for systems where the
-	 * ID register value has ended up configured incorrectly.
-	 */
-	cttw_reg = !!(id & ID0_CTTW);
-	if (cttw_fw || cttw_reg)
-		dev_notice(smmu->dev, "\t%scoherent table walk\n",
-			   cttw_fw ? "" : "non-");
-	if (cttw_fw != cttw_reg)
-		dev_notice(smmu->dev,
-			   "\t(IDR0.CTTW overridden by FW configuration)\n");
-
-	/* Max. number of entries we have for stream matching/indexing */
-	if (smmu->version == ARM_SMMU_V2 && id & ID0_EXIDS) {
-		smmu->features |= ARM_SMMU_FEAT_EXIDS;
-		size = 1 << 16;
-	} else {
-		size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
-	}
-	smmu->streamid_mask = size - 1;
-	if (id & ID0_SMS) {
-		smmu->features |= ARM_SMMU_FEAT_STREAM_MATCH;
-		size = (id >> ID0_NUMSMRG_SHIFT) & ID0_NUMSMRG_MASK;
-		if (size == 0) {
-			dev_err(smmu->dev,
-				"stream-matching supported, but no SMRs present!\n");
-			return -ENODEV;
-		}
-
-		/* Zero-initialised to mark as invalid */
-		smmu->smrs = devm_kcalloc(smmu->dev, size, sizeof(*smmu->smrs),
-					  GFP_KERNEL);
-		if (!smmu->smrs)
-			return -ENOMEM;
-
-		dev_notice(smmu->dev,
-			   "\tstream matching with %lu register groups", size);
-	}
-	/* s2cr->type == 0 means translation, so initialise explicitly */
-	smmu->s2crs = devm_kmalloc_array(smmu->dev, size, sizeof(*smmu->s2crs),
-					 GFP_KERNEL);
-	if (!smmu->s2crs)
-		return -ENOMEM;
-	for (i = 0; i < size; i++)
-		smmu->s2crs[i] = s2cr_init_val;
-
-	smmu->num_mapping_groups = size;
-	mutex_init(&smmu->stream_map_mutex);
-	spin_lock_init(&smmu->global_sync_lock);
-
-	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
-		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
-		if (!(id & ID0_PTFS_NO_AARCH32S))
-			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_S;
-	}
-
-	/* ID1 */
-	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID1);
-	smmu->pgshift = (id & ID1_PAGESIZE) ? 16 : 12;
-
-	/* Check for size mismatch of SMMU address space from mapped region */
-	size = 1 << (((id >> ID1_NUMPAGENDXB_SHIFT) & ID1_NUMPAGENDXB_MASK) + 1);
-	size <<= smmu->pgshift;
-	if (smmu->cb_base != gr0_base + size)
-		dev_warn(smmu->dev,
-			"SMMU address space size (0x%lx) differs from mapped region size (0x%tx)!\n",
-			size * 2, (smmu->cb_base - gr0_base) * 2);
-
-	smmu->num_s2_context_banks = (id >> ID1_NUMS2CB_SHIFT) & ID1_NUMS2CB_MASK;
-	smmu->num_context_banks = (id >> ID1_NUMCB_SHIFT) & ID1_NUMCB_MASK;
-	if (smmu->num_s2_context_banks > smmu->num_context_banks) {
-		dev_err(smmu->dev, "impossible number of S2 context banks!\n");
-		return -ENODEV;
-	}
-	dev_notice(smmu->dev, "\t%u context banks (%u stage-2 only)\n",
-		   smmu->num_context_banks, smmu->num_s2_context_banks);
-	/*
-	 * Cavium CN88xx erratum #27704.
-	 * Ensure ASID and VMID allocation is unique across all SMMUs in
-	 * the system.
-	 */
-	if (smmu->model == CAVIUM_SMMUV2) {
-		smmu->cavium_id_base =
-			atomic_add_return(smmu->num_context_banks,
-					  &cavium_smmu_context_count);
-		smmu->cavium_id_base -= smmu->num_context_banks;
-		dev_notice(smmu->dev, "\tenabling workaround for Cavium erratum 27704\n");
-	}
-	smmu->cbs = devm_kcalloc(smmu->dev, smmu->num_context_banks,
-				 sizeof(*smmu->cbs), GFP_KERNEL);
-	if (!smmu->cbs)
-		return -ENOMEM;
-
-	/* ID2 */
-	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID2);
-	size = arm_smmu_id_size_to_bits((id >> ID2_IAS_SHIFT) & ID2_IAS_MASK);
-	smmu->ipa_size = size;
-
-	/* The output mask is also applied for bypass */
-	size = arm_smmu_id_size_to_bits((id >> ID2_OAS_SHIFT) & ID2_OAS_MASK);
-	smmu->pa_size = size;
-
-	if (id & ID2_VMID16)
-		smmu->features |= ARM_SMMU_FEAT_VMID16;
-
-	/*
-	 * What the page table walker can address actually depends on which
-	 * descriptor format is in use, but since a) we don't know that yet,
-	 * and b) it can vary per context bank, this will have to do...
-	 */
-	if (dma_set_mask_and_coherent(smmu->dev, DMA_BIT_MASK(size)))
-		dev_warn(smmu->dev,
-			 "failed to set DMA mask for table walker\n");
-
-	if (smmu->version < ARM_SMMU_V2) {
-		smmu->va_size = smmu->ipa_size;
-		if (smmu->version == ARM_SMMU_V1_64K)
-			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
-	} else {
-		size = (id >> ID2_UBS_SHIFT) & ID2_UBS_MASK;
-		smmu->va_size = arm_smmu_id_size_to_bits(size);
-		if (id & ID2_PTFS_4K)
-			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_4K;
-		if (id & ID2_PTFS_16K)
-			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_16K;
-		if (id & ID2_PTFS_64K)
-			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
-	}
-
-	/* Now we've corralled the various formats, what'll it do? */
-	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S)
-		smmu->pgsize_bitmap |= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
-	if (smmu->features &
-	    (ARM_SMMU_FEAT_FMT_AARCH32_L | ARM_SMMU_FEAT_FMT_AARCH64_4K))
-		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
-	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_16K)
-		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
-	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_64K)
-		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
-
-	if (arm_smmu_ops.pgsize_bitmap == -1UL)
-		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
-	else
-		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
-	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n",
-		   smmu->pgsize_bitmap);
-
-
-	if (smmu->features & ARM_SMMU_FEAT_TRANS_S1)
-		dev_notice(smmu->dev, "\tStage-1: %lu-bit VA -> %lu-bit IPA\n",
-			   smmu->va_size, smmu->ipa_size);
-
-	if (smmu->features & ARM_SMMU_FEAT_TRANS_S2)
-		dev_notice(smmu->dev, "\tStage-2: %lu-bit IPA -> %lu-bit PA\n",
-			   smmu->ipa_size, smmu->pa_size);
-
-	return 0;
-}
-
-struct arm_smmu_match_data {
-	enum arm_smmu_arch_version version;
-	enum arm_smmu_implementation model;
-};
-
-#define ARM_SMMU_MATCH_DATA(name, ver, imp)	\
-static struct arm_smmu_match_data name = { .version = ver, .model = imp }
-
 ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
@@ -1966,294 +125,6 @@ static const struct of_device_id arm_smmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
 
-#ifdef CONFIG_ACPI
-static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
-{
-	int ret = 0;
-
-	switch (model) {
-	case ACPI_IORT_SMMU_V1:
-	case ACPI_IORT_SMMU_CORELINK_MMU400:
-		smmu->version = ARM_SMMU_V1;
-		smmu->model = GENERIC_SMMU;
-		break;
-	case ACPI_IORT_SMMU_CORELINK_MMU401:
-		smmu->version = ARM_SMMU_V1_64K;
-		smmu->model = GENERIC_SMMU;
-		break;
-	case ACPI_IORT_SMMU_V2:
-		smmu->version = ARM_SMMU_V2;
-		smmu->model = GENERIC_SMMU;
-		break;
-	case ACPI_IORT_SMMU_CORELINK_MMU500:
-		smmu->version = ARM_SMMU_V2;
-		smmu->model = ARM_MMU500;
-		break;
-	case ACPI_IORT_SMMU_CAVIUM_THUNDERX:
-		smmu->version = ARM_SMMU_V2;
-		smmu->model = CAVIUM_SMMUV2;
-		break;
-	default:
-		ret = -ENODEV;
-	}
-
-	return ret;
-}
-
-static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
-				      struct arm_smmu_device *smmu)
-{
-	struct device *dev = smmu->dev;
-	struct acpi_iort_node *node =
-		*(struct acpi_iort_node **)dev_get_platdata(dev);
-	struct acpi_iort_smmu *iort_smmu;
-	int ret;
-
-	/* Retrieve SMMU1/2 specific data */
-	iort_smmu = (struct acpi_iort_smmu *)node->node_data;
-
-	ret = acpi_smmu_get_data(iort_smmu->model, smmu);
-	if (ret < 0)
-		return ret;
-
-	/* Ignore the configuration access interrupt */
-	smmu->num_global_irqs = 1;
-
-	if (iort_smmu->flags & ACPI_IORT_SMMU_COHERENT_WALK)
-		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
-
-	return 0;
-}
-#else
-static inline int arm_smmu_device_acpi_probe(struct platform_device *pdev,
-					     struct arm_smmu_device *smmu)
-{
-	return -ENODEV;
-}
-#endif
-
-static int arm_smmu_device_dt_probe(struct platform_device *pdev,
-				    struct arm_smmu_device *smmu)
-{
-	const struct arm_smmu_match_data *data;
-	struct device *dev = &pdev->dev;
-	bool legacy_binding;
-
-	if (of_property_read_u32(dev->of_node, "#global-interrupts",
-				 &smmu->num_global_irqs)) {
-		dev_err(dev, "missing #global-interrupts property\n");
-		return -ENODEV;
-	}
-
-	data = of_device_get_match_data(dev);
-	smmu->version = data->version;
-	smmu->model = data->model;
-
-	parse_driver_options(smmu);
-
-	legacy_binding = of_find_property(dev->of_node, "mmu-masters", NULL);
-	if (legacy_binding && !using_generic_binding) {
-		if (!using_legacy_binding)
-			pr_notice("deprecated \"mmu-masters\" DT property in use; DMA API support unavailable\n");
-		using_legacy_binding = true;
-	} else if (!legacy_binding && !using_legacy_binding) {
-		using_generic_binding = true;
-	} else {
-		dev_err(dev, "not probing due to mismatched DT properties\n");
-		return -ENODEV;
-	}
-
-	if (of_dma_is_coherent(dev->of_node))
-		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
-
-	return 0;
-}
-
-static void arm_smmu_bus_init(void)
-{
-	/* Oh, for a proper bus abstraction */
-	if (!iommu_present(&platform_bus_type))
-		bus_set_iommu(&platform_bus_type, &arm_smmu_ops);
-#ifdef CONFIG_ARM_AMBA
-	if (!iommu_present(&amba_bustype))
-		bus_set_iommu(&amba_bustype, &arm_smmu_ops);
-#endif
-#ifdef CONFIG_PCI
-	if (!iommu_present(&pci_bus_type)) {
-		pci_request_acs();
-		bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
-	}
-#endif
-#ifdef CONFIG_FSL_MC_BUS
-	if (!iommu_present(&fsl_mc_bus_type))
-		bus_set_iommu(&fsl_mc_bus_type, &arm_smmu_ops);
-#endif
-}
-
-static int arm_smmu_device_probe(struct platform_device *pdev)
-{
-	struct resource *res;
-	resource_size_t ioaddr;
-	struct arm_smmu_device *smmu;
-	struct device *dev = &pdev->dev;
-	int num_irqs, i, err;
-
-	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
-	if (!smmu) {
-		dev_err(dev, "failed to allocate arm_smmu_device\n");
-		return -ENOMEM;
-	}
-	smmu->dev = dev;
-
-	if (dev->of_node)
-		err = arm_smmu_device_dt_probe(pdev, smmu);
-	else
-		err = arm_smmu_device_acpi_probe(pdev, smmu);
-
-	if (err)
-		return err;
-
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	ioaddr = res->start;
-	smmu->base = devm_ioremap_resource(dev, res);
-	if (IS_ERR(smmu->base))
-		return PTR_ERR(smmu->base);
-	smmu->cb_base = smmu->base + resource_size(res) / 2;
-
-	num_irqs = 0;
-	while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
-		num_irqs++;
-		if (num_irqs > smmu->num_global_irqs)
-			smmu->num_context_irqs++;
-	}
-
-	if (!smmu->num_context_irqs) {
-		dev_err(dev, "found %d interrupts but expected at least %d\n",
-			num_irqs, smmu->num_global_irqs + 1);
-		return -ENODEV;
-	}
-
-	smmu->irqs = devm_kcalloc(dev, num_irqs, sizeof(*smmu->irqs),
-				  GFP_KERNEL);
-	if (!smmu->irqs) {
-		dev_err(dev, "failed to allocate %d irqs\n", num_irqs);
-		return -ENOMEM;
-	}
-
-	for (i = 0; i < num_irqs; ++i) {
-		int irq = platform_get_irq(pdev, i);
-
-		if (irq < 0) {
-			dev_err(dev, "failed to get irq index %d\n", i);
-			return -ENODEV;
-		}
-		smmu->irqs[i] = irq;
-	}
-
-	err = arm_smmu_device_cfg_probe(smmu);
-	if (err)
-		return err;
-
-	if (smmu->version == ARM_SMMU_V2) {
-		if (smmu->num_context_banks > smmu->num_context_irqs) {
-			dev_err(dev,
-				"found only %d context irq(s) but %d required\n",
-				smmu->num_context_irqs, smmu->num_context_banks);
-			return -ENODEV;
-		}
-
-		/* Ignore superfluous interrupts */
-		smmu->num_context_irqs = smmu->num_context_banks;
-	}
-
-	for (i = 0; i < smmu->num_global_irqs; ++i) {
-		err = devm_request_irq(smmu->dev, smmu->irqs[i],
-				       arm_smmu_global_fault,
-				       IRQF_SHARED,
-				       "arm-smmu global fault",
-				       smmu);
-		if (err) {
-			dev_err(dev, "failed to request global IRQ %d (%u)\n",
-				i, smmu->irqs[i]);
-			return err;
-		}
-	}
-
-	err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL,
-				     "smmu.%pa", &ioaddr);
-	if (err) {
-		dev_err(dev, "Failed to register iommu in sysfs\n");
-		return err;
-	}
-
-	iommu_device_set_ops(&smmu->iommu, &arm_smmu_ops);
-	iommu_device_set_fwnode(&smmu->iommu, dev->fwnode);
-
-	err = iommu_device_register(&smmu->iommu);
-	if (err) {
-		dev_err(dev, "Failed to register iommu\n");
-		return err;
-	}
-
-	platform_set_drvdata(pdev, smmu);
-	arm_smmu_device_reset(smmu);
-	arm_smmu_test_smr_masks(smmu);
-
-	/*
-	 * For ACPI and generic DT bindings, an SMMU will be probed before
-	 * any device which might need it, so we want the bus ops in place
-	 * ready to handle default domain setup as soon as any SMMU exists.
-	 */
-	if (!using_legacy_binding)
-		arm_smmu_bus_init();
-
-	return 0;
-}
-
-/*
- * With the legacy DT binding in play, though, we have no guarantees about
- * probe order, but then we're also not doing default domains, so we can
- * delay setting bus ops until we're sure every possible SMMU is ready,
- * and that way ensure that no add_device() calls get missed.
- */
-static int arm_smmu_legacy_bus_init(void)
-{
-	if (using_legacy_binding)
-		arm_smmu_bus_init();
-	return 0;
-}
-device_initcall_sync(arm_smmu_legacy_bus_init);
-
-static int arm_smmu_device_remove(struct platform_device *pdev)
-{
-	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
-
-	if (!smmu)
-		return -ENODEV;
-
-	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
-		dev_err(&pdev->dev, "removing device with active domains!\n");
-
-	/* Turn the thing off */
-	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-	return 0;
-}
-
-static void arm_smmu_device_shutdown(struct platform_device *pdev)
-{
-	arm_smmu_device_remove(pdev);
-}
-
-static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
-{
-	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
-
-	arm_smmu_device_reset(smmu);
-	return 0;
-}
-
-static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
-
 static struct platform_driver arm_smmu_driver = {
	.driver	= {
		.name		= "arm-smmu",
-- 
2.1.4
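
For reference, the arm_smmu_of_xlate() hunk quoted above packs the stream ID into the low half-word of a single 32-bit fwspec ID and the optional SMR mask into the high half-word. A minimal standalone sketch of that packing follows; it is illustrative only and not part of the patch, and it assumes SMR_MASK_SHIFT is 16 as in the driver's register layout (the helper names here are made up for the example):

	/* Illustrative sketch, not part of the patch: mirrors the fwid packing
	 * performed by arm_smmu_of_xlate(), assuming SMR_MASK_SHIFT == 16. */
	#include <stdint.h>
	#include <stdio.h>

	#define SMR_MASK_SHIFT	16	/* assumed to match the driver */

	static uint32_t pack_fwid(uint16_t sid, uint16_t mask)
	{
		/* stream ID in the low half-word, SMR mask in the high half-word */
		return (uint32_t)sid | ((uint32_t)mask << SMR_MASK_SHIFT);
	}

	int main(void)
	{
		uint32_t fwid = pack_fwid(0x42, 0x7f80);

		printf("fwid = 0x%08x, sid = 0x%04x, mask = 0x%04x\n",
		       fwid, fwid & 0xffff, fwid >> SMR_MASK_SHIFT);
		return 0;
	}

The add_device path later splits the two halves back out (fwspec->ids[i] and fwspec->ids[i] >> SMR_MASK_SHIFT) and range-checks them against streamid_mask and smr_mask_mask, as shown in the quoted diff.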