Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752684AbdIRStL (ORCPT ); Mon, 18 Sep 2017 14:49:11 -0400 Received: from mail-eopbgr30053.outbound.protection.outlook.com ([40.107.3.53]:7253 "EHLO EUR03-AM5-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751499AbdIRStI (ORCPT ); Mon, 18 Sep 2017 14:49:08 -0400 From: Roy Pledge To: Catalin Marinas CC: "mark.rutland@arm.com" , "arnd@arndb.de" , Madalin-cristian Bucur , "linux-kernel@vger.kernel.org" , Leo Li , "oss@buserror.net" , "linux@armlinux.org.uk" , "linuxppc-dev@lists.ozlabs.org" , "linux-arm-kernel@lists.infradead.org" Subject: Re: [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC Thread-Topic: [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC Thread-Index: AQHTHRjndaeFq7ULeU2tESSqfBsMJg== Date: Mon, 18 Sep 2017 18:48:57 +0000 Message-ID: References: <1503607075-28970-1-git-send-email-roy.pledge@nxp.com> <1503607075-28970-8-git-send-email-roy.pledge@nxp.com> <20170914140014.taz7qwphfqm66kw7@localhost> <20170915214902.5argyl7d7bz4wykf@localhost> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=roy.pledge@nxp.com; x-originating-ip: [192.88.168.1] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1;DB6PR04MB3048;6:Yunll1tAON8fVSw7of3dACTx3qaDBqgBaodxT2HP5BFgC77thgDPokKxbHPSNbMWA7R/muVeso2bNVlYWHZba+cs1SHaZlgiQa2rugYyYHYEVAtWWoRRmPnNLUdWzfQvgOwdbN9dOfQxByTjG38Yox+YyBSoZJ3GWPzmJ/atIakTCqyOHMsD7l1m2ofYtsA29RyRI9O/IO7Anp6rl3wK3Yq+czZrVto3G0wkpmyb7WGk+ozYtxICUT5xjEdEWii7JllSTdRwljVVmOjz+0YNjgInnyBYkkWOaasjf/cGb45IzTIOjdZk9VYT0WailztDLDVKbeIuU1Zi7dQvwakP7w==;5:HewPz91bUJI29K0BIyZtJCS75+kgFAPNJ2FDYCa+m9sOEaTYSRuZHcnB+A5vUqQHBSX3mKsERuCZ/v8DcjzqmD/myTfmEsahbGiBpxHuRqDgyhgCE9ViRwZkF0uXSKoZvlAJxTzszZb0MZXOP66yIw==;24:sGyzeK6znitJXW3Ut01sdzxFm0ufIUbG60QHfOW9PcFnNHwggAsKEfEPcOXerhPsF1SiVNmFmyt9brXeW784bMnxhlgje6a/BCLsloRNXN0=;7:bFMLpjRkKuEax9fVFRB5avQERxoUGCC8TsXktevRvJqExxgfzDkQqwiItA0amjf4i+JgVCJUYL4OSRpb39F897KTynDa6uLxg9jZpfCwHtsWjTTBxni3EBtecfKlwgyvOYfKw6G+t/+bpoOCzr4xAbaK1Y0EUTaZYIZaQ+7WQywC1jDIMr3o4rWmaMBNexR7ZJ1o4Li4++0QrCSpAb3C7cAN1WfK1t1R09aEgtDCNKI= x-ms-exchange-antispam-srfa-diagnostics: SSOS;SSOR; x-forefront-antispam-report: SFV:SKI;SCL:-1;SFV:NSPM;SFS:(10009020)(6009001)(346002)(376002)(39860400002)(51444003)(43784003)(199003)(189002)(377454003)(24454002)(6436002)(5250100002)(55016002)(99286003)(102836003)(6116002)(54906002)(305945005)(2906002)(7736002)(9686003)(68736007)(3846002)(3280700002)(93886005)(8936002)(81156014)(81166006)(8676002)(101416001)(6246003)(76176999)(478600001)(50986999)(14454004)(66066001)(54356999)(97736004)(3660700001)(53936002)(105586002)(106356001)(2900100001)(6916009)(74316002)(53546010)(7696004)(229853002)(189998001)(316002)(5660300001)(33656002)(25786009)(4326008)(6506006)(86362001);DIR:OUT;SFP:1101;SCL:1;SRVR:DB6PR04MB3048;H:DB6PR04MB2999.eurprd04.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;A:1;MX:1;LANG:en; x-ms-office365-filtering-correlation-id: 43ba2013-008c-4524-a0f1-08d4fec5eb62 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(300000500095)(300135000095)(300000501095)(300135300095)(22001)(300000502095)(300135100095)(2017030254152)(48565401081)(300000503095)(300135400095)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095);SRVR:DB6PR04MB3048; x-ms-traffictypediagnostic: DB6PR04MB3048: x-exchange-antispam-report-test: UriScan:(275809806118684); x-microsoft-antispam-prvs: x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(8121501046)(5005006)(93006095)(93001095)(10201501046)(100000703101)(100105400095)(3002001)(6055026)(6041248)(20161123558100)(20161123555025)(20161123564025)(20161123560025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123562025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095);SRVR:DB6PR04MB3048;BCL:0;PCL:0;RULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095);SRVR:DB6PR04MB3048; x-forefront-prvs: 04347F8039 spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 X-OriginatorOrg: nxp.com X-MS-Exchange-CrossTenant-originalarrivaltime: 18 Sep 2017 18:48:57.5954 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 686ea1d3-bc2b-4c6f-a92c-d99c5c301635 X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR04MB3048 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by nfs id v8IInGwI018208 Content-Length: 4581 Lines: 99 On 9/15/2017 5:49 PM, Catalin Marinas wrote: > On Thu, Sep 14, 2017 at 07:07:50PM +0000, Roy Pledge wrote: >> On 9/14/2017 10:00 AM, Catalin Marinas wrote: >>> On Thu, Aug 24, 2017 at 04:37:51PM -0400, Roy Pledge wrote: >>>> @@ -123,23 +122,34 @@ static int bman_portal_probe(struct platform_device *pdev) >>>> } >>>> pcfg->irq = irq; >>>> >>>> - va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0); >>>> - if (!va) { >>>> - dev_err(dev, "ioremap::CE failed\n"); >>>> + /* >>>> + * TODO: Ultimately we would like to use a cacheable/non-shareable >>>> + * (coherent) mapping for the portal on both architectures but that >>>> + * isn't currently available in the kernel. Because of HW differences >>>> + * PPC needs to be mapped cacheable while ARM SoCs will work with non >>>> + * cacheable mappings >>>> + */ >>> >>> This comment mentions "cacheable/non-shareable (coherent)". Was this >>> meant for ARM platforms? Because non-shareable is not coherent, nor is >>> this combination guaranteed to work with different CPUs and >>> interconnects. >> >> My wording is poor I should have been clearer that non-shareable == >> non-coherent. I will fix this. >> >> We do understand that cacheable/non shareable isn't supported on all >> CPU/interconnect combinations but we have verified with ARM that for the >> CPU/interconnects we have integrated QBMan on our use is OK. The note is >> here to try to explain why the mapping is different right now. Once we >> get the basic QBMan support integrated for ARM we do plan to try to have >> patches integrated that enable the cacheable mapping as it gives a >> significant performance boost. > > I will definitely not ack those patches (at least not in the form I've > seen, assuming certain eviction order of the bytes in a cacheline). The > reason is that it is incredibly fragile, highly dependent on the CPU > microarchitecture and interconnects. Assuming that you ever only have a > single SoC with this device, you may get away with #ifdefs in the > driver. But if you support two or more SoCs with different behaviours, > you'd have to make run-time decisions in the driver or run-time code > patching. We are very keen on single kernel binary image/drivers and > architecturally compliant code (the cacheable mapping hacks are well > outside the architecture behaviour). > Let's put this particular point on hold for now, I would like to focus on getting the basic functions merged in ASAP. I removed the comment in question (it sort of happened naturally when I applied your other comments) in the next revision of the patchset. I have submitted the patches to our automated test system for sanity checking and I will sent a new patchset once I get the results. Thanks again for your comments - they have been very useful and have improved the quality of the code for sure. >>>> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h >>>> index 81a9a5e..0a1d573 100644 >>>> --- a/drivers/soc/fsl/qbman/dpaa_sys.h >>>> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h >>>> @@ -51,12 +51,12 @@ >>>> >>>> static inline void dpaa_flush(void *p) >>>> { >>>> + /* >>>> + * Only PPC needs to flush the cache currently - on ARM the mapping >>>> + * is non cacheable >>>> + */ >>>> #ifdef CONFIG_PPC >>>> flush_dcache_range((unsigned long)p, (unsigned long)p+64); >>>> -#elif defined(CONFIG_ARM) >>>> - __cpuc_flush_dcache_area(p, 64); >>>> -#elif defined(CONFIG_ARM64) >>>> - __flush_dcache_area(p, 64); >>>> #endif >>>> } >>> >>> Dropping the private API cache maintenance is fine and the memory is WC >>> now for ARM (mapping to Normal NonCacheable). However, do you require >>> any barriers here? Normal NC doesn't guarantee any ordering. >> >> The barrier is done in the code where the command is formed. We follow >> this pattern >> a) Zero the command cache line (the device never reacts to a 0 command >> verb so a cast out of this will have no effect) >> b) Fill in everything in the command except the command verb (byte 0) >> c) Execute a memory barrier >> d) Set the command verb (byte 0) >> e) Flush the command >> If a castout happens between d) and e) doesn't matter since it was about >> to be flushed anyway . Any castout before d) will not cause HW to >> process the command because verb is still 0. The barrier at c) prevents >> reordering so the HW cannot see the verb set before the command is formed. > > I think that's fine, the dpaa_flush() can be a no-op with non-cacheable > memory (I had forgotten the details). >