2023-11-03 19:58:26

by Andrew Cooper

Subject: Notes on BAD_APICID, Was: [PATCH 0/3] x86/apic: Misc pruning

On 02/11/2023 12:26 pm, Andrew Cooper wrote:
> Seriously, this work started out trying to fix a buggy comment. It
> escalated somewhat... Perform some simple tidying.

Another dodgy construct spotted while doing this work is

#ifdef CONFIG_X86_32
 #define BAD_APICID 0xFFu
#else
 #define BAD_APICID 0xFFFFu
#endif

considering that both of those "bad" values are legal APIC IDs in an
x2APIC system.

The majority use is as a sentinel (of varying types - int, u16
mostly), although the uses in NUM_APIC_CLUSTERS and
safe_smp_processor_id() look suspect.

In particular, safe_smp_processor_id() *will* malfunction on some legal
CPUs, and needs to use -1 (32 bits wide) to spot the intended error case
of a bad xAPIC mapping.
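
A minimal sketch of the collision described above, assuming a 32-bit
x2APIC ID compared against the 64-bit BAD_APICID value quoted earlier.
The helper names and WIDE_BAD_APICID are hypothetical and exist only for
illustration; this is not the kernel's code. It also assumes that
0xFFFFFFFF is never a CPU's own ID in x2APIC mode, since that value is
used as the broadcast destination.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value from the 64-bit branch of the #ifdef quoted above. */
#define OLD_BAD_APICID   0xFFFFu
/* Hypothetical "-1, 32 bits wide" sentinel; not an existing kernel macro. */
#define WIDE_BAD_APICID  UINT32_MAX

/* Narrow sentinel: also matches the legal x2APIC ID 0xFFFF. */
static bool old_check_is_bad(uint32_t apicid)
{
    return apicid == OLD_BAD_APICID;
}

/* Wide sentinel: only matches the all-ones value. */
static bool wide_check_is_bad(uint32_t apicid)
{
    return apicid == WIDE_BAD_APICID;
}
```

With the narrow sentinel, a CPU whose x2APIC ID happens to be 0xFFFF is
indistinguishable from the error case; with the wide one it is not.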

However, its use in amd_pmu_cpu_starting() from topology_die_id() looks
broken.  Partly because the error handling is (only) a WARN_ON_ONCE(),
and also because nb->nb_id's sentinel value is -1 of type int.
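
To illustrate the sentinel mismatch, here is a rough sketch assuming an
int-typed ID is compared against the 64-bit BAD_APICID value. The names
NB_ID_UNSET and matches_bad_apicid() are made up for this example and do
not appear in the kernel.

```c
#include <stdbool.h>

/* Value from the 64-bit branch of the #ifdef quoted above. */
#define BAD_APICID_64  0xFFFFu
/* Hypothetical name for the "-1 of type int" sentinel. */
#define NB_ID_UNSET    (-1)

/*
 * Comparing an int against an unsigned constant converts the int to
 * unsigned int first, so -1 becomes UINT_MAX rather than 0xFFFF.
 */
static bool matches_bad_apicid(int id)
{
    return (unsigned int)id == BAD_APICID_64;
}
```

Under these assumptions the two "invalid" markers never compare equal,
so a check against one cannot catch the other.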

I suspect there's a lot of cleaning to be done here too.

~Andrew


2023-11-04 19:15:33

by Thomas Gleixner

Subject: Re: Notes on BAD_APICID, Was: [PATCH 0/3] x86/apic: Misc pruning

On Fri, Nov 03 2023 at 19:58, Andrew Cooper wrote:
> On 02/11/2023 12:26 pm, Andrew Cooper wrote:
> Another dodgy construct spotted while doing this work is
>
> #ifdef CONFIG_X86_32
>  #define BAD_APICID 0xFFu
> #else
>  #define BAD_APICID 0xFFFFu
> #endif
>
> considering that both of those "bad" values are legal APIC IDs in an
> x2APIC system.
>
> The majority use is as a sentinel (of varying types - int, u16
> mostly), although the uses in NUM_APIC_CLUSTERS and
> safe_smp_processor_id() look suspect.
>
> In particular, safe_smp_processor_id() *will* malfunction on some legal
> CPUs, and needs to use -1 (32 bits wide) to spot the intended error case
> of a bad xAPIC mapping.
>
> However, its use in amd_pmu_cpu_starting() from topology_die_id() looks
> broken.  Partly because the error handling is (only) a WARN_ON_ONCE(),
> and also because nb->nb_id's sentinel value is -1 of type int.
>
> I suspect there's a lot of cleaning to be done here too.

Sigh...