2021-04-28 10:15:21

by Laurent Dufour

[permalink] [raw]
Subject: [PATCH v3] pseries/drmem: update LMBs after LPM

After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is caught by the kernel, but the memory's node is updated because
there is no way to move a memory block between nodes.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node to match the added or removed LMB. But
the LMB's associativity node has not been updated after the DT node update
and thus the node is overwritten by the Linux's topology instead of the
hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity.

Cc: Tyrel Datwyler <[email protected]>
Signed-off-by: Laurent Dufour <[email protected]>
---

V3:
- Check rd->dn->name instead of rd->dn->full_name
V2:
- Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
introducing a new hook mechanism.
---
arch/powerpc/include/asm/drmem.h | 1 +
arch/powerpc/mm/drmem.c | 35 +++++++++++++++++++
.../platforms/pseries/hotplug-memory.c | 4 +++
3 files changed, 40 insertions(+)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index bf2402fed3e0..4265d5e95c2c 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -111,6 +111,7 @@ int drmem_update_dt(void);
int __init
walk_drmem_lmbs_early(unsigned long node, void *data,
int (*func)(struct drmem_lmb *, const __be32 **, void *));
+void drmem_update_lmbs(struct property *prop);
#endif

static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 9af3832c9d8d..f0a6633132af 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
return ret;
}

+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+ __maybe_unused const __be32 **usm,
+ __maybe_unused void *data)
+{
+ struct drmem_lmb *lmb;
+
+ /*
+ * Brut force there may be better way to fetch the LMB
+ */
+ for_each_drmem_lmb(lmb) {
+ if (lmb->drc_index != updated_lmb->drc_index)
+ continue;
+
+ lmb->aa_index = updated_lmb->aa_index;
+ break;
+ }
+ return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(struct property *prop)
+{
+ if (!strcmp(prop->name, "ibm,dynamic-memory"))
+ __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
+ else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
+ __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
+}
#endif

static int init_drmem_lmb_size(struct device_node *dn)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..672ffbee2e78 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
case OF_RECONFIG_DETACH_NODE:
err = pseries_remove_mem_node(rd->dn);
break;
+ case OF_RECONFIG_UPDATE_PROPERTY:
+ if (!strcmp(rd->dn->name,
+ "ibm,dynamic-reconfiguration-memory"))
+ drmem_update_lmbs(rd->prop);
}
return notifier_from_errno(err);
}
--
2.31.1


2021-04-29 10:29:30

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Laurent Dufour <[email protected]> writes:

> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
> updated by the hypervisor in the case the NUMA topology of the LPAR's
> memory is updated.
>
> This is caught by the kernel, but the memory's node is updated because
> there is no way to move a memory block between nodes.
>
> If later a memory block is added or removed, drmem_update_dt() is called
> and it is overwriting the DT node to match the added or removed LMB. But
> the LMB's associativity node has not been updated after the DT node update
> and thus the node is overwritten by the Linux's topology instead of the
> hypervisor one.
>
> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
> updated to force an update of the LMB's associativity.
>
> Cc: Tyrel Datwyler <[email protected]>
> Signed-off-by: Laurent Dufour <[email protected]>
> ---
>
> V3:
> - Check rd->dn->name instead of rd->dn->full_name
> V2:
> - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
> introducing a new hook mechanism.
> ---
> arch/powerpc/include/asm/drmem.h | 1 +
> arch/powerpc/mm/drmem.c | 35 +++++++++++++++++++
> .../platforms/pseries/hotplug-memory.c | 4 +++
> 3 files changed, 40 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
> index bf2402fed3e0..4265d5e95c2c 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
> int __init
> walk_drmem_lmbs_early(unsigned long node, void *data,
> int (*func)(struct drmem_lmb *, const __be32 **, void *));
> +void drmem_update_lmbs(struct property *prop);
> #endif
>
> static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
> index 9af3832c9d8d..f0a6633132af 100644
> --- a/arch/powerpc/mm/drmem.c
> +++ b/arch/powerpc/mm/drmem.c
> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
> return ret;
> }
>
> +/*
> + * Update the LMB associativity index.
> + */
> +static int update_lmb(struct drmem_lmb *updated_lmb,
> + __maybe_unused const __be32 **usm,
> + __maybe_unused void *data)
> +{
> + struct drmem_lmb *lmb;
> +
> + /*
> + * Brut force there may be better way to fetch the LMB
> + */
> + for_each_drmem_lmb(lmb) {
> + if (lmb->drc_index != updated_lmb->drc_index)
> + continue;
> +
> + lmb->aa_index = updated_lmb->aa_index;
> + break;
> + }
> + return 0;
> +}
> +
> +/*
> + * Update the LMB associativity index.
> + *
> + * This needs to be called when the hypervisor is updating the
> + * dynamic-reconfiguration-memory node property.
> + */
> +void drmem_update_lmbs(struct property *prop)
> +{
> + if (!strcmp(prop->name, "ibm,dynamic-memory"))
> + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
> + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
> + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
> +}
> #endif
>
> static int init_drmem_lmb_size(struct device_node *dn)
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 8377f1f7c78e..672ffbee2e78 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
> case OF_RECONFIG_DETACH_NODE:
> err = pseries_remove_mem_node(rd->dn);
> break;
> + case OF_RECONFIG_UPDATE_PROPERTY:
> + if (!strcmp(rd->dn->name,
> + "ibm,dynamic-reconfiguration-memory"))
> + drmem_update_lmbs(rd->prop);
> }
> return notifier_from_errno(err);

How will this interact with DLPAR memory? When we dlpar memory,
ibm,configure-connector is used to fetch the new associativity details
and set drmem_lmb->aa_index correctly there. Once that is done kernel
then call drmem_update_dt() which will result in the above notifier
callback?

IIUC, the call back then will update drmem_lmb->aa_index again?

-aneesh

2021-04-29 11:34:27

by Laurent Dufour

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Le 29/04/2021 à 12:27, Aneesh Kumar K.V a écrit :
> Laurent Dufour <[email protected]> writes:
>
>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>> memory is updated.
>>
>> This is caught by the kernel, but the memory's node is updated because
>> there is no way to move a memory block between nodes.
>>
>> If later a memory block is added or removed, drmem_update_dt() is called
>> and it is overwriting the DT node to match the added or removed LMB. But
>> the LMB's associativity node has not been updated after the DT node update
>> and thus the node is overwritten by the Linux's topology instead of the
>> hypervisor one.
>>
>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>> updated to force an update of the LMB's associativity.
>>
>> Cc: Tyrel Datwyler <[email protected]>
>> Signed-off-by: Laurent Dufour <[email protected]>
>> ---
>>
>> V3:
>> - Check rd->dn->name instead of rd->dn->full_name
>> V2:
>> - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>> introducing a new hook mechanism.
>> ---
>> arch/powerpc/include/asm/drmem.h | 1 +
>> arch/powerpc/mm/drmem.c | 35 +++++++++++++++++++
>> .../platforms/pseries/hotplug-memory.c | 4 +++
>> 3 files changed, 40 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>> index bf2402fed3e0..4265d5e95c2c 100644
>> --- a/arch/powerpc/include/asm/drmem.h
>> +++ b/arch/powerpc/include/asm/drmem.h
>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>> int __init
>> walk_drmem_lmbs_early(unsigned long node, void *data,
>> int (*func)(struct drmem_lmb *, const __be32 **, void *));
>> +void drmem_update_lmbs(struct property *prop);
>> #endif
>>
>> static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>> index 9af3832c9d8d..f0a6633132af 100644
>> --- a/arch/powerpc/mm/drmem.c
>> +++ b/arch/powerpc/mm/drmem.c
>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
>> return ret;
>> }
>>
>> +/*
>> + * Update the LMB associativity index.
>> + */
>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>> + __maybe_unused const __be32 **usm,
>> + __maybe_unused void *data)
>> +{
>> + struct drmem_lmb *lmb;
>> +
>> + /*
>> + * Brut force there may be better way to fetch the LMB
>> + */
>> + for_each_drmem_lmb(lmb) {
>> + if (lmb->drc_index != updated_lmb->drc_index)
>> + continue;
>> +
>> + lmb->aa_index = updated_lmb->aa_index;
>> + break;
>> + }
>> + return 0;
>> +}
>> +
>> +/*
>> + * Update the LMB associativity index.
>> + *
>> + * This needs to be called when the hypervisor is updating the
>> + * dynamic-reconfiguration-memory node property.
>> + */
>> +void drmem_update_lmbs(struct property *prop)
>> +{
>> + if (!strcmp(prop->name, "ibm,dynamic-memory"))
>> + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>> + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>> + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>> +}
>> #endif
>>
>> static int init_drmem_lmb_size(struct device_node *dn)
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index 8377f1f7c78e..672ffbee2e78 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
>> case OF_RECONFIG_DETACH_NODE:
>> err = pseries_remove_mem_node(rd->dn);
>> break;
>> + case OF_RECONFIG_UPDATE_PROPERTY:
>> + if (!strcmp(rd->dn->name,
>> + "ibm,dynamic-reconfiguration-memory"))
>> + drmem_update_lmbs(rd->prop);
>> }
>> return notifier_from_errno(err);
>
> How will this interact with DLPAR memory? When we dlpar memory,
> ibm,configure-connector is used to fetch the new associativity details
> and set drmem_lmb->aa_index correctly there. Once that is done kernel
> then call drmem_update_dt() which will result in the above notifier
> callback?

When a memory DLPAR operation is done, the in memory DT property
"ibm,dynamic-memory" or "ibm,dynamic-memory-v2" (if existing) have to be updated
to reflect the added/removed memory part. This is done by calling drmem_update_dt().

This patch is addressing the case where the hypervisor has updated the DT
property mentioned above. In that case, the LMB tree should be updated so the
aa_index fields are matching the DT one. This way the next time a memory DLPAR
operation is done the DT properties "ibm,dynamic-memory" or
"ibm,dynamic-memory-v2" will be rebuilt correctly.

> IIUC, the call back then will update drmem_lmb->aa_index again?

drmem_update_dt() is not updating drmem_lmb->aa_index, that's the oppposite, it
is rebuilding the in memory DT property "ibm,dynamic-memory" or
"ibm,dynamic-memory-v2" based on the data stored in the LMB tree.

Laurent.

2021-04-29 11:38:20

by Laurent Dufour

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Le 29/04/2021 à 13:31, Laurent Dufour a écrit :
> Le 29/04/2021 à 12:27, Aneesh Kumar K.V a écrit :
>> Laurent Dufour <[email protected]> writes:
>>
>>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>>> memory is updated.
>>>
>>> This is caught by the kernel, but the memory's node is updated because
>>> there is no way to move a memory block between nodes.
>>>
>>> If later a memory block is added or removed, drmem_update_dt() is called
>>> and it is overwriting the DT node to match the added or removed LMB. But
>>> the LMB's associativity node has not been updated after the DT node update
>>> and thus the node is overwritten by the Linux's topology instead of the
>>> hypervisor one.
>>>
>>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>>> updated to force an update of the LMB's associativity.
>>>
>>> Cc: Tyrel Datwyler <[email protected]>
>>> Signed-off-by: Laurent Dufour <[email protected]>
>>> ---
>>>
>>> V3:
>>>   - Check rd->dn->name instead of rd->dn->full_name
>>> V2:
>>>   - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>>>   introducing a new hook mechanism.
>>> ---
>>>   arch/powerpc/include/asm/drmem.h              |  1 +
>>>   arch/powerpc/mm/drmem.c                       | 35 +++++++++++++++++++
>>>   .../platforms/pseries/hotplug-memory.c        |  4 +++
>>>   3 files changed, 40 insertions(+)
>>>
>>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>>> index bf2402fed3e0..4265d5e95c2c 100644
>>> --- a/arch/powerpc/include/asm/drmem.h
>>> +++ b/arch/powerpc/include/asm/drmem.h
>>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>>>   int __init
>>>   walk_drmem_lmbs_early(unsigned long node, void *data,
>>>                 int (*func)(struct drmem_lmb *, const __be32 **, void *));
>>> +void drmem_update_lmbs(struct property *prop);
>>>   #endif
>>>   static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
>>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>>> index 9af3832c9d8d..f0a6633132af 100644
>>> --- a/arch/powerpc/mm/drmem.c
>>> +++ b/arch/powerpc/mm/drmem.c
>>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node,
>>> void *data,
>>>       return ret;
>>>   }
>>> +/*
>>> + * Update the LMB associativity index.
>>> + */
>>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>>> +              __maybe_unused const __be32 **usm,
>>> +              __maybe_unused void *data)
>>> +{
>>> +    struct drmem_lmb *lmb;
>>> +
>>> +    /*
>>> +     * Brut force there may be better way to fetch the LMB
>>> +     */
>>> +    for_each_drmem_lmb(lmb) {
>>> +        if (lmb->drc_index != updated_lmb->drc_index)
>>> +            continue;
>>> +
>>> +        lmb->aa_index = updated_lmb->aa_index;
>>> +        break;
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +/*
>>> + * Update the LMB associativity index.
>>> + *
>>> + * This needs to be called when the hypervisor is updating the
>>> + * dynamic-reconfiguration-memory node property.
>>> + */
>>> +void drmem_update_lmbs(struct property *prop)
>>> +{
>>> +    if (!strcmp(prop->name, "ibm,dynamic-memory"))
>>> +        __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>>> +    else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>>> +        __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>>> +}
>>>   #endif
>>>   static int init_drmem_lmb_size(struct device_node *dn)
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index 8377f1f7c78e..672ffbee2e78 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block
>>> *nb,
>>>       case OF_RECONFIG_DETACH_NODE:
>>>           err = pseries_remove_mem_node(rd->dn);
>>>           break;
>>> +    case OF_RECONFIG_UPDATE_PROPERTY:
>>> +        if (!strcmp(rd->dn->name,
>>> +                "ibm,dynamic-reconfiguration-memory"))
>>> +            drmem_update_lmbs(rd->prop);
>>>       }
>>>       return notifier_from_errno(err);
>>
>> How will this interact with DLPAR memory? When we dlpar memory,
>> ibm,configure-connector is used to fetch the new associativity details
>> and set drmem_lmb->aa_index correctly there. Once that is done kernel
>> then call drmem_update_dt() which will result in the above notifier
>> callback?
>
> When a memory DLPAR operation is done, the in memory DT property
> "ibm,dynamic-memory" or "ibm,dynamic-memory-v2" (if existing) have to be updated
> to reflect the added/removed memory part. This is done by calling
> drmem_update_dt().
>
> This patch is addressing the case where the hypervisor has updated the DT
> property mentioned above. In that case, the LMB tree should be updated so the
> aa_index fields are matching the DT one. This way the next time a memory DLPAR
> operation is done the DT properties "ibm,dynamic-memory" or
> "ibm,dynamic-memory-v2" will be rebuilt correctly.
>
>> IIUC, the call back then will update drmem_lmb->aa_index again?

Oh I missed what you pointed out.
Please ignore my previous answer, I need to double check code.

> drmem_update_dt() is not updating drmem_lmb->aa_index, that's the oppposite, it
> is rebuilding the in memory DT property "ibm,dynamic-memory" or
> "ibm,dynamic-memory-v2" based on the data stored in the LMB tree.
>
> Laurent.

2021-04-29 12:40:23

by Laurent Dufour

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Le 29/04/2021 à 12:27, Aneesh Kumar K.V a écrit :
> Laurent Dufour <[email protected]> writes:
>
>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>> memory is updated.
>>
>> This is caught by the kernel, but the memory's node is updated because
>> there is no way to move a memory block between nodes.
>>
>> If later a memory block is added or removed, drmem_update_dt() is called
>> and it is overwriting the DT node to match the added or removed LMB. But
>> the LMB's associativity node has not been updated after the DT node update
>> and thus the node is overwritten by the Linux's topology instead of the
>> hypervisor one.
>>
>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>> updated to force an update of the LMB's associativity.
>>
>> Cc: Tyrel Datwyler <[email protected]>
>> Signed-off-by: Laurent Dufour <[email protected]>
>> ---
>>
>> V3:
>> - Check rd->dn->name instead of rd->dn->full_name
>> V2:
>> - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>> introducing a new hook mechanism.
>> ---
>> arch/powerpc/include/asm/drmem.h | 1 +
>> arch/powerpc/mm/drmem.c | 35 +++++++++++++++++++
>> .../platforms/pseries/hotplug-memory.c | 4 +++
>> 3 files changed, 40 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>> index bf2402fed3e0..4265d5e95c2c 100644
>> --- a/arch/powerpc/include/asm/drmem.h
>> +++ b/arch/powerpc/include/asm/drmem.h
>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>> int __init
>> walk_drmem_lmbs_early(unsigned long node, void *data,
>> int (*func)(struct drmem_lmb *, const __be32 **, void *));
>> +void drmem_update_lmbs(struct property *prop);
>> #endif
>>
>> static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>> index 9af3832c9d8d..f0a6633132af 100644
>> --- a/arch/powerpc/mm/drmem.c
>> +++ b/arch/powerpc/mm/drmem.c
>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
>> return ret;
>> }
>>
>> +/*
>> + * Update the LMB associativity index.
>> + */
>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>> + __maybe_unused const __be32 **usm,
>> + __maybe_unused void *data)
>> +{
>> + struct drmem_lmb *lmb;
>> +
>> + /*
>> + * Brut force there may be better way to fetch the LMB
>> + */
>> + for_each_drmem_lmb(lmb) {
>> + if (lmb->drc_index != updated_lmb->drc_index)
>> + continue;
>> +
>> + lmb->aa_index = updated_lmb->aa_index;
>> + break;
>> + }
>> + return 0;
>> +}
>> +
>> +/*
>> + * Update the LMB associativity index.
>> + *
>> + * This needs to be called when the hypervisor is updating the
>> + * dynamic-reconfiguration-memory node property.
>> + */
>> +void drmem_update_lmbs(struct property *prop)
>> +{
>> + if (!strcmp(prop->name, "ibm,dynamic-memory"))
>> + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>> + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>> + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>> +}
>> #endif
>>
>> static int init_drmem_lmb_size(struct device_node *dn)
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index 8377f1f7c78e..672ffbee2e78 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
>> case OF_RECONFIG_DETACH_NODE:
>> err = pseries_remove_mem_node(rd->dn);
>> break;
>> + case OF_RECONFIG_UPDATE_PROPERTY:
>> + if (!strcmp(rd->dn->name,
>> + "ibm,dynamic-reconfiguration-memory"))
>> + drmem_update_lmbs(rd->prop);
>> }
>> return notifier_from_errno(err);
>
> How will this interact with DLPAR memory? When we dlpar memory,
> ibm,configure-connector is used to fetch the new associativity details
> and set drmem_lmb->aa_index correctly there. Once that is done kernel
> then call drmem_update_dt() which will result in the above notifier
> callback?
>
> IIUC, the call back then will update drmem_lmb->aa_index again?

Thanks for pointing this Aneesh,

You're right I missed that callback and it was quite invisible during my test
because the value set back in the aa_index was the same.

When dmrem_update_dt() is called, there is no need to update the LMB back and
the DT modify notifier should be ignored.

As DLPAR operations are serialized (by lock_device_hotplug()), I'm proposing to
rely on a boolean static variable to do skip this notification, like this:

diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index f0a6633132af..3c0130720086 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -18,6 +18,7 @@ static int n_root_addr_cells, n_root_size_cells;

static struct drmem_lmb_info __drmem_info;
struct drmem_lmb_info *drmem_info = &__drmem_info;
+static bool in_drmem_update;

u64 drmem_lmb_memory_max(void)
{
@@ -178,6 +179,11 @@ int drmem_update_dt(void)
if (!memory)
return -1;

+ /*
+ * Set in_drmem_update to prevent the notifier callback to process the
+ * DT property back since the change is coming from the LMB tree.
+ */
+ in_drmem_update = true;
prop = of_find_property(memory, "ibm,dynamic-memory", NULL);
if (prop) {
rc = drmem_update_dt_v1(memory, prop);
@@ -186,6 +192,7 @@ int drmem_update_dt(void)
if (prop)
rc = drmem_update_dt_v2(memory, prop);
}
+ in_drmem_update = false;

of_node_put(memory);
return rc;
@@ -337,6 +344,12 @@ static int update_lmb(struct drmem_lmb *updated_lmb,
*/
void drmem_update_lmbs(struct property *prop)
{
+ /*
+ * Don't update the LMBs If called from the update done in
+ * drmem_update_dt().
+ */
+ if (in_drmem_update)
+ return;
if (!strcmp(prop->name, "ibm,dynamic-memory"))
__walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))

Any concern with this option?

Laurent.

2021-04-29 19:25:11

by Tyrel Datwyler

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
> Laurent Dufour <[email protected]> writes:
>
>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>> memory is updated.
>>
>> This is caught by the kernel, but the memory's node is updated because
>> there is no way to move a memory block between nodes.
>>
>> If later a memory block is added or removed, drmem_update_dt() is called
>> and it is overwriting the DT node to match the added or removed LMB. But
>> the LMB's associativity node has not been updated after the DT node update
>> and thus the node is overwritten by the Linux's topology instead of the
>> hypervisor one.
>>
>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>> updated to force an update of the LMB's associativity.
>>
>> Cc: Tyrel Datwyler <[email protected]>
>> Signed-off-by: Laurent Dufour <[email protected]>
>> ---
>>
>> V3:
>> - Check rd->dn->name instead of rd->dn->full_name
>> V2:
>> - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>> introducing a new hook mechanism.
>> ---
>> arch/powerpc/include/asm/drmem.h | 1 +
>> arch/powerpc/mm/drmem.c | 35 +++++++++++++++++++
>> .../platforms/pseries/hotplug-memory.c | 4 +++
>> 3 files changed, 40 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>> index bf2402fed3e0..4265d5e95c2c 100644
>> --- a/arch/powerpc/include/asm/drmem.h
>> +++ b/arch/powerpc/include/asm/drmem.h
>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>> int __init
>> walk_drmem_lmbs_early(unsigned long node, void *data,
>> int (*func)(struct drmem_lmb *, const __be32 **, void *));
>> +void drmem_update_lmbs(struct property *prop);
>> #endif
>>
>> static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>> index 9af3832c9d8d..f0a6633132af 100644
>> --- a/arch/powerpc/mm/drmem.c
>> +++ b/arch/powerpc/mm/drmem.c
>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
>> return ret;
>> }
>>
>> +/*
>> + * Update the LMB associativity index.
>> + */
>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>> + __maybe_unused const __be32 **usm,
>> + __maybe_unused void *data)
>> +{
>> + struct drmem_lmb *lmb;
>> +
>> + /*
>> + * Brut force there may be better way to fetch the LMB
>> + */
>> + for_each_drmem_lmb(lmb) {
>> + if (lmb->drc_index != updated_lmb->drc_index)
>> + continue;
>> +
>> + lmb->aa_index = updated_lmb->aa_index;
>> + break;
>> + }
>> + return 0;
>> +}
>> +
>> +/*
>> + * Update the LMB associativity index.
>> + *
>> + * This needs to be called when the hypervisor is updating the
>> + * dynamic-reconfiguration-memory node property.
>> + */
>> +void drmem_update_lmbs(struct property *prop)
>> +{
>> + if (!strcmp(prop->name, "ibm,dynamic-memory"))
>> + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>> + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>> + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>> +}
>> #endif
>>
>> static int init_drmem_lmb_size(struct device_node *dn)
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index 8377f1f7c78e..672ffbee2e78 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
>> case OF_RECONFIG_DETACH_NODE:
>> err = pseries_remove_mem_node(rd->dn);
>> break;
>> + case OF_RECONFIG_UPDATE_PROPERTY:
>> + if (!strcmp(rd->dn->name,
>> + "ibm,dynamic-reconfiguration-memory"))
>> + drmem_update_lmbs(rd->prop);
>> }
>> return notifier_from_errno(err);
>
> How will this interact with DLPAR memory? When we dlpar memory,
> ibm,configure-connector is used to fetch the new associativity details
> and set drmem_lmb->aa_index correctly there. Once that is done kernel
> then call drmem_update_dt() which will result in the above notifier
> callback?
>
> IIUC, the call back then will update drmem_lmb->aa_index again?

After digging through some of this code I'm a bit concerned about all the kernel
device tree manipulation around memory DLPAR both with the assoc-lookup-array
prop update and post dynamic-memory prop updating. We build a drmem_info array
of the LMBs from the device-tree at boot. I don't really understand why we are
manipulating the device tree property every time we add/remove an LMB. Not sure
the reasoning was to write back in particular the aa_index and flags for each
LMB into the device tree when we already have them in the drmem_info array. On
the other hand the assoc-lookup-array I suppose would need to have an in kernel
representation to avoid updating the device tree property every time.

Changes to the device tree should be things reported to the system from the
hypervisor through the proper interfaces, and as a result any code that cares
can register an of_reconfig_notifier to respond to device tree updates. The
memory dlpar code seems to be needlessly manipulating the device-tree which
leads to the problem here where a notifier callback is now duplicating work.

Just my two cents FWIW.

-Tyrel

>
> -aneesh
>

2021-04-30 16:15:08

by Laurent Dufour

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Le 29/04/2021 à 21:12, Tyrel Datwyler a écrit :
> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>> Laurent Dufour <[email protected]> writes:
>>
>>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>>> memory is updated.
>>>
>>> This is caught by the kernel, but the memory's node is updated because
>>> there is no way to move a memory block between nodes.
>>>
>>> If later a memory block is added or removed, drmem_update_dt() is called
>>> and it is overwriting the DT node to match the added or removed LMB. But
>>> the LMB's associativity node has not been updated after the DT node update
>>> and thus the node is overwritten by the Linux's topology instead of the
>>> hypervisor one.
>>>
>>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>>> updated to force an update of the LMB's associativity.
>>>
>>> Cc: Tyrel Datwyler <[email protected]>
>>> Signed-off-by: Laurent Dufour <[email protected]>
>>> ---
>>>
>>> V3:
>>> - Check rd->dn->name instead of rd->dn->full_name
>>> V2:
>>> - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>>> introducing a new hook mechanism.
>>> ---
>>> arch/powerpc/include/asm/drmem.h | 1 +
>>> arch/powerpc/mm/drmem.c | 35 +++++++++++++++++++
>>> .../platforms/pseries/hotplug-memory.c | 4 +++
>>> 3 files changed, 40 insertions(+)
>>>
>>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>>> index bf2402fed3e0..4265d5e95c2c 100644
>>> --- a/arch/powerpc/include/asm/drmem.h
>>> +++ b/arch/powerpc/include/asm/drmem.h
>>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>>> int __init
>>> walk_drmem_lmbs_early(unsigned long node, void *data,
>>> int (*func)(struct drmem_lmb *, const __be32 **, void *));
>>> +void drmem_update_lmbs(struct property *prop);
>>> #endif
>>>
>>> static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
>>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>>> index 9af3832c9d8d..f0a6633132af 100644
>>> --- a/arch/powerpc/mm/drmem.c
>>> +++ b/arch/powerpc/mm/drmem.c
>>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data,
>>> return ret;
>>> }
>>>
>>> +/*
>>> + * Update the LMB associativity index.
>>> + */
>>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>>> + __maybe_unused const __be32 **usm,
>>> + __maybe_unused void *data)
>>> +{
>>> + struct drmem_lmb *lmb;
>>> +
>>> + /*
>>> + * Brut force there may be better way to fetch the LMB
>>> + */
>>> + for_each_drmem_lmb(lmb) {
>>> + if (lmb->drc_index != updated_lmb->drc_index)
>>> + continue;
>>> +
>>> + lmb->aa_index = updated_lmb->aa_index;
>>> + break;
>>> + }
>>> + return 0;
>>> +}
>>> +
>>> +/*
>>> + * Update the LMB associativity index.
>>> + *
>>> + * This needs to be called when the hypervisor is updating the
>>> + * dynamic-reconfiguration-memory node property.
>>> + */
>>> +void drmem_update_lmbs(struct property *prop)
>>> +{
>>> + if (!strcmp(prop->name, "ibm,dynamic-memory"))
>>> + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>>> + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>>> + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>>> +}
>>> #endif
>>>
>>> static int init_drmem_lmb_size(struct device_node *dn)
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index 8377f1f7c78e..672ffbee2e78 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb,
>>> case OF_RECONFIG_DETACH_NODE:
>>> err = pseries_remove_mem_node(rd->dn);
>>> break;
>>> + case OF_RECONFIG_UPDATE_PROPERTY:
>>> + if (!strcmp(rd->dn->name,
>>> + "ibm,dynamic-reconfiguration-memory"))
>>> + drmem_update_lmbs(rd->prop);
>>> }
>>> return notifier_from_errno(err);
>>
>> How will this interact with DLPAR memory? When we dlpar memory,
>> ibm,configure-connector is used to fetch the new associativity details
>> and set drmem_lmb->aa_index correctly there. Once that is done kernel
>> then call drmem_update_dt() which will result in the above notifier
>> callback?
>>
>> IIUC, the call back then will update drmem_lmb->aa_index again?
>
> After digging through some of this code I'm a bit concerned about all the kernel
> device tree manipulation around memory DLPAR both with the assoc-lookup-array
> prop update and post dynamic-memory prop updating. We build a drmem_info array
> of the LMBs from the device-tree at boot. I don't really understand why we are
> manipulating the device tree property every time we add/remove an LMB. Not sure
> the reasoning was to write back in particular the aa_index and flags for each
> LMB into the device tree when we already have them in the drmem_info array. On
> the other hand the assoc-lookup-array I suppose would need to have an in kernel
> representation to avoid updating the device tree property every time.

I think the reason is to keep the device tree in sync with the current set of LMBs.

My understanding is that the kernel is not really using the
'ibm,dynamic-memory*' DT property once the boot is done. But user space tools
(like lsslot and drmgr) read it to built the LMB tree and get the DRC index for
each LMBs as it is not available in SYSFS.

> Changes to the device tree should be things reported to the system from the
> hypervisor through the proper interfaces, and as a result any code that cares
> can register an of_reconfig_notifier to resepond to device tree updates. The
> memory dlpar code seems to be needlessly manipulating the device-tree which
> leads to the problem here where a notifier callback is now duplicating work.

I don't think the hypervisor is expected to update the 'ibm,dynamic-memory' each
time a LMB is added, this is not design this way AFAIK.

Laurent.

> Just my two cents FWIW.
>
> -Tyrel
>
>>
>> -aneesh
>>
>

2021-04-30 23:59:42

by Tyrel Datwyler

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

On 4/30/21 9:13 AM, Laurent Dufour wrote:
> Le 29/04/2021 à 21:12, Tyrel Datwyler a écrit :
>> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>>> Laurent Dufour <[email protected]> writes:
>>>
>>>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>>>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>>>> memory is updated.
>>>>
>>>> This is caught by the kernel, but the memory's node is updated because
>>>> there is no way to move a memory block between nodes.
>>>>
>>>> If later a memory block is added or removed, drmem_update_dt() is called
>>>> and it is overwriting the DT node to match the added or removed LMB. But
>>>> the LMB's associativity node has not been updated after the DT node update
>>>> and thus the node is overwritten by the Linux's topology instead of the
>>>> hypervisor one.
>>>>
>>>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>>>> updated to force an update of the LMB's associativity.
>>>>
>>>> Cc: Tyrel Datwyler <[email protected]>
>>>> Signed-off-by: Laurent Dufour <[email protected]>
>>>> ---
>>>>
>>>> V3:
>>>>   - Check rd->dn->name instead of rd->dn->full_name
>>>> V2:
>>>>   - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>>>>   introducing a new hook mechanism.
>>>> ---
>>>>   arch/powerpc/include/asm/drmem.h              |  1 +
>>>>   arch/powerpc/mm/drmem.c                       | 35 +++++++++++++++++++
>>>>   .../platforms/pseries/hotplug-memory.c        |  4 +++
>>>>   3 files changed, 40 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/include/asm/drmem.h
>>>> b/arch/powerpc/include/asm/drmem.h
>>>> index bf2402fed3e0..4265d5e95c2c 100644
>>>> --- a/arch/powerpc/include/asm/drmem.h
>>>> +++ b/arch/powerpc/include/asm/drmem.h
>>>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>>>>   int __init
>>>>   walk_drmem_lmbs_early(unsigned long node, void *data,
>>>>                 int (*func)(struct drmem_lmb *, const __be32 **, void *));
>>>> +void drmem_update_lmbs(struct property *prop);
>>>>   #endif
>>>>     static inline void invalidate_lmb_associativity_index(struct drmem_lmb
>>>> *lmb)
>>>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>>>> index 9af3832c9d8d..f0a6633132af 100644
>>>> --- a/arch/powerpc/mm/drmem.c
>>>> +++ b/arch/powerpc/mm/drmem.c
>>>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node,
>>>> void *data,
>>>>       return ret;
>>>>   }
>>>>   +/*
>>>> + * Update the LMB associativity index.
>>>> + */
>>>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>>>> +              __maybe_unused const __be32 **usm,
>>>> +              __maybe_unused void *data)
>>>> +{
>>>> +    struct drmem_lmb *lmb;
>>>> +
>>>> +    /*
>>>> +     * Brut force there may be better way to fetch the LMB
>>>> +     */
>>>> +    for_each_drmem_lmb(lmb) {
>>>> +        if (lmb->drc_index != updated_lmb->drc_index)
>>>> +            continue;
>>>> +
>>>> +        lmb->aa_index = updated_lmb->aa_index;
>>>> +        break;
>>>> +    }
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Update the LMB associativity index.
>>>> + *
>>>> + * This needs to be called when the hypervisor is updating the
>>>> + * dynamic-reconfiguration-memory node property.
>>>> + */
>>>> +void drmem_update_lmbs(struct property *prop)
>>>> +{
>>>> +    if (!strcmp(prop->name, "ibm,dynamic-memory"))
>>>> +        __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>>>> +    else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>>>> +        __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>>>> +}
>>>>   #endif
>>>>     static int init_drmem_lmb_size(struct device_node *dn)
>>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> index 8377f1f7c78e..672ffbee2e78 100644
>>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct
>>>> notifier_block *nb,
>>>>       case OF_RECONFIG_DETACH_NODE:
>>>>           err = pseries_remove_mem_node(rd->dn);
>>>>           break;
>>>> +    case OF_RECONFIG_UPDATE_PROPERTY:
>>>> +        if (!strcmp(rd->dn->name,
>>>> +                "ibm,dynamic-reconfiguration-memory"))
>>>> +            drmem_update_lmbs(rd->prop);
>>>>       }
>>>>       return notifier_from_errno(err);
>>>
>>> How will this interact with DLPAR memory? When we dlpar memory,
>>> ibm,configure-connector is used to fetch the new associativity details
>>> and set drmem_lmb->aa_index correctly there. Once that is done kernel
>>> then call drmem_update_dt() which will result in the above notifier
>>> callback?
>>>
>>> IIUC, the call back then will update drmem_lmb->aa_index again?
>>
>> After digging through some of this code I'm a bit concerned about all the kernel
>> device tree manipulation around memory DLPAR both with the assoc-lookup-array
>> prop update and post dynamic-memory prop updating. We build a drmem_info array
>> of the LMBs from the device-tree at boot. I don't really understand why we are
>> manipulating the device tree property every time we add/remove an LMB. Not sure
>> the reasoning was to write back in particular the aa_index and flags for each
>> LMB into the device tree when we already have them in the drmem_info array. On
>> the other hand the assoc-lookup-array I suppose would need to have an in kernel
>> representation to avoid updating the device tree property every time.
>
> I think the reason is to keep the device tree in sync with the current set of LMBs.

I don't really think that is how the device tree is meant to be used. We have an
in memory representation of the LMBs separate from the device tree, and that is
were we should track OS specific state. The values in the device-tree property
can be updated via device node remove/add or update-properties RTAS call. These
are the means that the platform reports (OS discovers) underlying changes. The
new property is going to blow away any previous state that the OS wrote there.
This is likely, one of the culprits of memory DLPAR problems that have been
observed after LPM.

>
> My understanding is that the kernel is not really using the
> 'ibm,dynamic-memory*' DT property once the boot is done. But user space tools
> (like lsslot and drmgr) read it to built the LMB tree and get the DRC index for
> each LMBs as it is not available in SYSFS.

Yeah, but as I mentioned above the property can change as a result of an update
we process from the hypervisor in response to something like LPM (or PRRN if we
ever figure out how to make that work correctly). So, if there is some sort of
state drmgr needs to know we have to figure out a different way to expose that
information.

>
>> Changes to the device tree should be things reported to the system from the
>> hypervisor through the proper interfaces, and as a result any code that cares
>> can register an of_reconfig_notifier to resepond to device tree updates. The
>> memory dlpar code seems to be needlessly manipulating the device-tree which
>> leads to the problem here where a notifier callback is now duplicating work.
>
> I don't think the hypervisor is expected to update the 'ibm,dynamic-memory' each
> time a LMB is added, this is not design this way AFAIK.

It shouldn't need to for the most part. The only information that should change
in ibm,dynamic-memory in the first place is the aa_index as a result of an
underlying platform reassignment. The flags in ibm,dynamic-memory aren't
changing as a result of DLPAR add/remove. The aa_index could be out of date as I
mentioned above. The use of DRCONF_MEM_ASSIGNED in my opinion is actually a hack
to mark LMBs as present and owned by the partition. Its actual PAPR definition
is soley to identify LMBs that are present at boot.

As of today I don't have a problem with your patch. This was more of me pointing
out things that I think are currently wrong with our memory hotplug
implementation, and that we need to take a long hard look at it down the road.

-Tyrel

>
> Laurent.
>
>> Just my two cents FWIW.
>>
>> -Tyrel
>>
>>>
>>> -aneesh
>>>
>>
>

2021-05-03 17:30:35

by Laurent Dufour

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Le 01/05/2021 à 01:58, Tyrel Datwyler a écrit :
> On 4/30/21 9:13 AM, Laurent Dufour wrote:
>> Le 29/04/2021 à 21:12, Tyrel Datwyler a écrit :
>>> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>>>> Laurent Dufour <[email protected]> writes:
>>>>
>>>>> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
>>>>> updated by the hypervisor in the case the NUMA topology of the LPAR's
>>>>> memory is updated.
>>>>>
>>>>> This is caught by the kernel, but the memory's node is updated because
>>>>> there is no way to move a memory block between nodes.
>>>>>
>>>>> If later a memory block is added or removed, drmem_update_dt() is called
>>>>> and it is overwriting the DT node to match the added or removed LMB. But
>>>>> the LMB's associativity node has not been updated after the DT node update
>>>>> and thus the node is overwritten by the Linux's topology instead of the
>>>>> hypervisor one.
>>>>>
>>>>> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
>>>>> updated to force an update of the LMB's associativity.
>>>>>
>>>>> Cc: Tyrel Datwyler <[email protected]>
>>>>> Signed-off-by: Laurent Dufour <[email protected]>
>>>>> ---
>>>>>
>>>>> V3:
>>>>>   - Check rd->dn->name instead of rd->dn->full_name
>>>>> V2:
>>>>>   - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>>>>>   introducing a new hook mechanism.
>>>>> ---
>>>>>   arch/powerpc/include/asm/drmem.h              |  1 +
>>>>>   arch/powerpc/mm/drmem.c                       | 35 +++++++++++++++++++
>>>>>   .../platforms/pseries/hotplug-memory.c        |  4 +++
>>>>>   3 files changed, 40 insertions(+)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/drmem.h
>>>>> b/arch/powerpc/include/asm/drmem.h
>>>>> index bf2402fed3e0..4265d5e95c2c 100644
>>>>> --- a/arch/powerpc/include/asm/drmem.h
>>>>> +++ b/arch/powerpc/include/asm/drmem.h
>>>>> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>>>>>   int __init
>>>>>   walk_drmem_lmbs_early(unsigned long node, void *data,
>>>>>                 int (*func)(struct drmem_lmb *, const __be32 **, void *));
>>>>> +void drmem_update_lmbs(struct property *prop);
>>>>>   #endif
>>>>>     static inline void invalidate_lmb_associativity_index(struct drmem_lmb
>>>>> *lmb)
>>>>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>>>>> index 9af3832c9d8d..f0a6633132af 100644
>>>>> --- a/arch/powerpc/mm/drmem.c
>>>>> +++ b/arch/powerpc/mm/drmem.c
>>>>> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node,
>>>>> void *data,
>>>>>       return ret;
>>>>>   }
>>>>>   +/*
>>>>> + * Update the LMB associativity index.
>>>>> + */
>>>>> +static int update_lmb(struct drmem_lmb *updated_lmb,
>>>>> +              __maybe_unused const __be32 **usm,
>>>>> +              __maybe_unused void *data)
>>>>> +{
>>>>> +    struct drmem_lmb *lmb;
>>>>> +
>>>>> +    /*
>>>>> +     * Brut force there may be better way to fetch the LMB
>>>>> +     */
>>>>> +    for_each_drmem_lmb(lmb) {
>>>>> +        if (lmb->drc_index != updated_lmb->drc_index)
>>>>> +            continue;
>>>>> +
>>>>> +        lmb->aa_index = updated_lmb->aa_index;
>>>>> +        break;
>>>>> +    }
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Update the LMB associativity index.
>>>>> + *
>>>>> + * This needs to be called when the hypervisor is updating the
>>>>> + * dynamic-reconfiguration-memory node property.
>>>>> + */
>>>>> +void drmem_update_lmbs(struct property *prop)
>>>>> +{
>>>>> +    if (!strcmp(prop->name, "ibm,dynamic-memory"))
>>>>> +        __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
>>>>> +    else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
>>>>> +        __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
>>>>> +}
>>>>>   #endif
>>>>>     static int init_drmem_lmb_size(struct device_node *dn)
>>>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>>> b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>>> index 8377f1f7c78e..672ffbee2e78 100644
>>>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>>> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct
>>>>> notifier_block *nb,
>>>>>       case OF_RECONFIG_DETACH_NODE:
>>>>>           err = pseries_remove_mem_node(rd->dn);
>>>>>           break;
>>>>> +    case OF_RECONFIG_UPDATE_PROPERTY:
>>>>> +        if (!strcmp(rd->dn->name,
>>>>> +                "ibm,dynamic-reconfiguration-memory"))
>>>>> +            drmem_update_lmbs(rd->prop);
>>>>>       }
>>>>>       return notifier_from_errno(err);
>>>>
>>>> How will this interact with DLPAR memory? When we dlpar memory,
>>>> ibm,configure-connector is used to fetch the new associativity details
>>>> and set drmem_lmb->aa_index correctly there. Once that is done kernel
>>>> then call drmem_update_dt() which will result in the above notifier
>>>> callback?
>>>>
>>>> IIUC, the call back then will update drmem_lmb->aa_index again?
>>>
>>> After digging through some of this code I'm a bit concerned about all the kernel
>>> device tree manipulation around memory DLPAR both with the assoc-lookup-array
>>> prop update and post dynamic-memory prop updating. We build a drmem_info array
>>> of the LMBs from the device-tree at boot. I don't really understand why we are
>>> manipulating the device tree property every time we add/remove an LMB. Not sure
>>> the reasoning was to write back in particular the aa_index and flags for each
>>> LMB into the device tree when we already have them in the drmem_info array. On
>>> the other hand the assoc-lookup-array I suppose would need to have an in kernel
>>> representation to avoid updating the device tree property every time.
>>
>> I think the reason is to keep the device tree in sync with the current set of LMBs.
>
> I don't really think that is how the device tree is meant to be used. We have an
> in memory representation of the LMBs separate from the device tree, and that is
> were we should track OS specific state. The values in the device-tree property
> can be updated via device node remove/add or update-properties RTAS call. These
> are the means that the platform reports (OS discovers) underlying changes. The
> new property is going to blow away any previous state that the OS wrote there.
> This is likely, one of the culprits of memory DLPAR problems that have been
> observed after LPM.

One of the issue is that the kernel is overwritting the drmem property set by
the hypervisor once a LMB is added or removed after a LPM (PRRN). My patch
prevents this to happen after a LPM, I plan to check and fix PRRN later.

>>
>> My understanding is that the kernel is not really using the
>> 'ibm,dynamic-memory*' DT property once the boot is done. But user space tools
>> (like lsslot and drmgr) read it to built the LMB tree and get the DRC index for
>> each LMBs as it is not available in SYSFS.
>
> Yeah, but as I mentioned above the property can change as a result of an update
> we process from the hypervisor in response to something like LPM (or PRRN if we
> ever figure out how to make that work correctly). So, if there is some sort of
> state drmgr needs to know we have to figure out a different way to expose that
> information.

Triggering drmgr at the end of the LPM (and after a PRRN) will be the next thing
to do, so userspace tools can be run to handle these changes.

>>
>>> Changes to the device tree should be things reported to the system from the
>>> hypervisor through the proper interfaces, and as a result any code that cares
>>> can register an of_reconfig_notifier to resepond to device tree updates. The
>>> memory dlpar code seems to be needlessly manipulating the device-tree which
>>> leads to the problem here where a notifier callback is now duplicating work.
>>
>> I don't think the hypervisor is expected to update the 'ibm,dynamic-memory' each
>> time a LMB is added, this is not design this way AFAIK.
>
> It shouldn't need to for the most part. The only information that should change
> in ibm,dynamic-memory in the first place is the aa_index as a result of an
> underlying platform reassignment. The flags in ibm,dynamic-memory aren't
> changing as a result of DLPAR add/remove. The aa_index could be out of date as I
> mentioned above. The use of DRCONF_MEM_ASSIGNED in my opinion is actually a hack
> to mark LMBs as present and owned by the partition. Its actual PAPR definition
> is soley to identify LMBs that are present at boot.
>
> As of today I don't have a problem with your patch. This was more of me pointing
> out things that I think are currently wrong with our memory hotplug
> implementation, and that we need to take a long hard look at it down the road.

I do agree, there is a lot of odd things there to address in this area.
If you're ok with that patch, do you mind to add a reviewed-by?

> -Tyrel
>
>>
>> Laurent.
>>
>>> Just my two cents FWIW.
>>>
>>> -Tyrel
>>>
>>>>
>>>> -aneesh
>>>>
>>>
>>
>

2021-05-03 20:48:11

by Tyrel Datwyler

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

On 5/3/21 10:28 AM, Laurent Dufour wrote:
> Le 01/05/2021 à 01:58, Tyrel Datwyler a écrit :
>> On 4/30/21 9:13 AM, Laurent Dufour wrote:
>>> Le 29/04/2021 à 21:12, Tyrel Datwyler a écrit :
>>>> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>>>>> Laurent Dufour <[email protected]> writes:
>>>>>

Snip

>>
>> As of today I don't have a problem with your patch. This was more of me pointing
>> out things that I think are currently wrong with our memory hotplug
>> implementation, and that we need to take a long hard look at it down the road.
>
> I do agree, there is a lot of odd things there to address in this area.
> If you're ok with that patch, do you mind to add a reviewed-by?
>

Can you send a v4 with the fix for the duplicate update included?

-Tyrel

2021-05-04 07:06:38

by Laurent Dufour

[permalink] [raw]
Subject: Re: [PATCH v3] pseries/drmem: update LMBs after LPM

Le 03/05/2021 à 22:44, Tyrel Datwyler a écrit :
> On 5/3/21 10:28 AM, Laurent Dufour wrote:
>> Le 01/05/2021 à 01:58, Tyrel Datwyler a écrit :
>>> On 4/30/21 9:13 AM, Laurent Dufour wrote:
>>>> Le 29/04/2021 à 21:12, Tyrel Datwyler a écrit :
>>>>> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>>>>>> Laurent Dufour <[email protected]> writes:
>>>>>>
>
> Snip
>
>>>
>>> As of today I don't have a problem with your patch. This was more of me pointing
>>> out things that I think are currently wrong with our memory hotplug
>>> implementation, and that we need to take a long hard look at it down the road.
>>
>> I do agree, there is a lot of odd things there to address in this area.
>> If you're ok with that patch, do you mind to add a reviewed-by?
>>
>
> Can you send a v4 with the fix for the duplicate update included?

Of course !
I wrote it last week, but let in the to-be-sent list, my mistake.