by Alex Deucher

Subject: Re: [RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86

On Thu, Jan 24, 2019 at 8:57 AM Ard Biesheuvel
<[email protected]> wrote:
>
> On Thu, 24 Jan 2019 at 14:54, Alex Deucher <[email protected]> wrote:
> >
> > On Thu, Jan 24, 2019 at 6:45 AM Ard Biesheuvel
> > <[email protected]> wrote:
> > >
> > > On Thu, 24 Jan 2019 at 12:37, Koenig, Christian
> > > <[email protected]> wrote:
> > > >
> > > > Am 24.01.19 um 12:26 schrieb Ard Biesheuvel:
> > > > > On Thu, 24 Jan 2019 at 12:23, Koenig, Christian
> > > > > <[email protected]> wrote:
> > > > >> Am 24.01.19 um 10:59 schrieb Ard Biesheuvel:
> > > > >>> [SNIP]
> > > > >>> This is *exactly* my point the whole time.
> > > > >>>
> > > > >>> The current code has
> > > > >>>
> > > > >>> static inline bool drm_arch_can_wc_memory(void)
> > > > >>> {
> > > > >>> #if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
> > > > >>> return false;
> > > > >>>
> > > > >>> which means the optimization is disabled *unless the system is
> > > > >>> non-cache coherent*
> > > > >>>
> > > > >>> So if you have reports that the optimization works on some PowerPC, it
> > > > >>> must be non-cache coherent PowerPC, because that is the only place
> > > > >>> where it is enabled in the first place.
> > > > >>>
> > > > >>>> The only problematic here actually seems to be ARM, so you should
> > > > >>>> probably just add an "#ifdef .._ARM return false;".
> > > > >>>>
> > > > >>> ARM/arm64 does not have a Kconfig symbol like
> > > > >>> CONFIG_NOT_COHERENT_CACHE, so we can only disable it everywhere. If
> > > > >>> there are non-coherent ARM systems that are currently working in the
> > > > >>> same way as those non-coherent PowerPC systems, we will break them by
> > > > >>> doing this.
> > > > >> Summing the things I've read so far for ARM up I actually think it
> > > > >> depends on a runtime configuration and not on compile time one.
> > > > >>
> > > > >> So the whole idea of providing the device to the drm_*_can_wc_memory()
> > > > >> function isn't so far fetched.
> > > > >>
> > > > > Thank you.
> > > > >
> > > > >> But for now I do prefer working and slightly slower system over broken
> > > > >> one, so I think we should just disable this on ARM for now.
> > > > >>
> > > > > Again, this is not about non-cache coherent being slower without the
> > > > > optimization, it is about non-cache coherent likely not working *at
> > > > > all* unless the optimization is enabled.
> > > >
> > > > As Michel tried to explain this CAN'T happen. The optimization is a
> > > > purely optional request from userspace.
> > > >
> > >
> > > Right.
> > >
> > > So in that case, we can assume that the following test
> > >
> > > static inline bool drm_arch_can_wc_memory(void)
> > > {
> > > #if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
> > > return false;
> > >
> > > is bogus, and it was just unnecessary caution on the part of the
> > > author to disregard non-cache coherent devices.
> > > Unfortunately, those commits have no log messages whatsoever, so it is
> > > difficult to infer the intent retroactively.
> > >
> > > > > Otherwise, the driver will vmap() DMA pages with cacheable attributes,
> > > > > while the non-cache coherent device uses uncached attributes, breaking
> > > > > coherency.
> > > >
> > > > Again this is mandated by the userspace APIs anyway. E.g. we can't
> > > > vmap() pages in any other way or our userspace APIs would break.
> > > >
> > >
> > > OK,
> > >
> > > So let's just disable this for all ARM and arm64 then, given that
> > > non-cache coherent is not supported in any case
> >
> > So I think we are back to this patch:
> > https://patchwork.kernel.org/patch/10739023/
> >
>
> Apart from the fact that the issue has nothing to do with write-combining, yes.

Your patch has a better description. Let's go with that.

Alex

2019-01-24 16:05:27

by Michel Dänzer

Subject: Re: [RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86

On 2019-01-24 12:45 p.m., Ard Biesheuvel wrote:
> On Thu, 24 Jan 2019 at 12:37, Koenig, Christian
> <[email protected]> wrote:
>> Am 24.01.19 um 12:26 schrieb Ard Biesheuvel:
>>> On Thu, 24 Jan 2019 at 12:23, Koenig, Christian
>>> <[email protected]> wrote:
>>>> Am 24.01.19 um 10:59 schrieb Ard Biesheuvel:
>>>>> [SNIP]
>>>>> This is *exactly* my point the whole time.
>>>>>
>>>>> The current code has
>>>>>
>>>>> static inline bool drm_arch_can_wc_memory(void)
>>>>> {
>>>>> #if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
>>>>> return false;
>>>>>
>>>>> which means the optimization is disabled *unless the system is
>>>>> non-cache coherent*
>>>>>
>>>>> So if you have reports that the optimization works on some PowerPC, it
>>>>> must be non-cache coherent PowerPC, because that is the only place
>>>>> where it is enabled in the first place.
>>>>>
>>>>>> The only problematic here actually seems to be ARM, so you should
>>>>>> probably just add an "#ifdef .._ARM return false;".
>>>>>>
>>>>> ARM/arm64 does not have a Kconfig symbol like
>>>>> CONFIG_NOT_COHERENT_CACHE, so we can only disable it everywhere. If
>>>>> there are non-coherent ARM systems that are currently working in the
>>>>> same way as those non-coherent PowerPC systems, we will break them by
>>>>> doing this.
>>>> Summing the things I've read so far for ARM up I actually think it
>>>> depends on a runtime configuration and not on compile time one.
>>>>
>>>> So the whole idea of providing the device to the drm_*_can_wc_memory()
>>>> function isn't so far fetched.
>>>>
>>> Thank you.
>>>
>>>> But for now I do prefer working and slightly slower system over broken
>>>> one, so I think we should just disable this on ARM for now.
>>>>
>>> Again, this is not about non-cache coherent being slower without the
>>> optimization, it is about non-cache coherent likely not working *at
>>> all* unless the optimization is enabled.
>>
>> As Michel tried to explain this CAN'T happen. The optimization is a
>> purely optional request from userspace.
>>
>
> Right.
>
> So in that case, we can assume that the following test
>
> static inline bool drm_arch_can_wc_memory(void)
> {
> #if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
> return false;
>
> is bogus, and it was just unnecessary caution on the part of the
> author to disregard non-cache coherent devices.

This is driver-independent core code, and the test only means "non-snooped
PCIe transfers don't work on cache coherent PPC". It doesn't imply anything
about whether or not amdgpu or any other driver works on
non-cache-coherent PPC in general.
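
For readers following the thread, the decision under discussion can be
modelled roughly as below. This is only an illustrative user-space sketch,
not the kernel code and not the patch linked earlier: the enum, the function
name can_wc_memory() and its parameters are hypothetical, and the runtime
flag merely stands in for the compile-time CONFIG_NOT_COHERENT_CACHE symbol.
It assumes the outcome agreed on above, i.e. the optimization is turned off
on all ARM and arm64.

```c
#include <stdbool.h>

/* Hypothetical architecture tags; the kernel uses Kconfig symbols instead. */
enum arch { ARCH_X86, ARCH_PPC, ARCH_ARM, ARCH_ARM64 };

/*
 * Hypothetical runtime model of the compile-time test quoted in this thread.
 * not_coherent_cache stands in for CONFIG_NOT_COHERENT_CACHE being defined.
 */
static bool can_wc_memory(enum arch arch, bool not_coherent_cache)
{
	switch (arch) {
	case ARCH_PPC:
		/*
		 * Mirrors "defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)":
		 * disabled on cache coherent PPC, left enabled otherwise.
		 */
		return not_coherent_cache;
	case ARCH_ARM:
	case ARCH_ARM64:
		/* Assumed outcome of the thread: disabled everywhere on ARM/arm64. */
		return false;
	default:
		/* All other architectures keep the optimization. */
		return true;
	}
}
```

With this model, can_wc_memory(ARCH_PPC, false) is false and
can_wc_memory(ARCH_PPC, true) is true, matching the reading of the PPC test
given above, while both ARM variants return false regardless of the flag.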

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer