A proper arch_remove_memory() implementation is on its way, which also
cleanly removes page tables in arch_add_memory() in case something goes
wrong.

As we want to use arch_remove_memory() in case something goes wrong
during memory hotplug after arch_add_memory() finished, let's add
a temporary hack that is sufficient until we get a proper
implementation that cleans up page table entries.

We will remove CONFIG_MEMORY_HOTREMOVE around this code in follow-up
patches.
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Chintan Pandya <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Jun Yao <[email protected]>
Cc: Yu Zhao <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
 arch/arm64/mm/mmu.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a1bfc4413982..e569a543c384 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1084,4 +1084,23 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
 			   restrictions);
 }
+#ifdef CONFIG_MEMORY_HOTREMOVE
+void arch_remove_memory(int nid, u64 start, u64 size,
+			struct vmem_altmap *altmap)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct zone *zone;
+
+	/*
+	 * FIXME: Cleanup page tables (also in arch_add_memory() in case
+	 * adding fails). Until then, this function should only be used
+	 * during memory hotplug (adding memory), not for memory
+	 * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be
+	 * unlocked yet.
+	 */
+	zone = page_zone(pfn_to_page(start_pfn));
+	__remove_pages(zone, start_pfn, nr_pages, altmap);
+}
+#endif
 #endif
--
2.20.1
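For context, the use case named in the commit message is the generic
hotplug path in mm/memory_hotplug.c: once arch_add_memory() has
succeeded, a failure in any later step needs arch_remove_memory() to
unwind the arch-specific part. A minimal sketch of such a caller,
modelled on the reworked add_memory_resource() flow of this series
(create_memory_block_devices() and the exact ordering are illustrative,
not verbatim kernel code):

	ret = arch_add_memory(nid, start, size, &restrictions);
	if (ret < 0)
		goto error;

	/* create memory block devices after memory was added */
	ret = create_memory_block_devices(start, size);
	if (ret) {
		/* unwind what arch_add_memory() set up */
		arch_remove_memory(nid, start, size, NULL);
		goto error;
	}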
On Mon, May 27, 2019 at 01:11:45PM +0200, David Hildenbrand wrote:
>A proper arch_remove_memory() implementation is on its way, which also
>cleanly removes page tables in arch_add_memory() in case something goes
>wrong.
Would this be easier to understand?
removes page tables created in arch_add_memory
>
>[...]
>+#ifdef CONFIG_MEMORY_HOTREMOVE
>+void arch_remove_memory(int nid, u64 start, u64 size,
>+			struct vmem_altmap *altmap)
>+{
>+	unsigned long start_pfn = start >> PAGE_SHIFT;
>+	unsigned long nr_pages = size >> PAGE_SHIFT;
>+	struct zone *zone;
>+
>+	/*
>+	 * FIXME: Cleanup page tables (also in arch_add_memory() in case
>+	 * adding fails). Until then, this function should only be used
>+	 * during memory hotplug (adding memory), not for memory
>+	 * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be
>+	 * unlocked yet.
>+	 */
>+	zone = page_zone(pfn_to_page(start_pfn));
Compared with arch_remove_memory() on x86: if altmap is not NULL, the zone
is retrieved from the page related to the altmap. Not sure why this is not
the same here?
>+	__remove_pages(zone, start_pfn, nr_pages, altmap);
>+}
>+#endif
> #endif
--
Wei Yang
Help you, Help me
On 03.06.19 23:41, Wei Yang wrote:
> On Mon, May 27, 2019 at 01:11:45PM +0200, David Hildenbrand wrote:
>> A proper arch_remove_memory() implementation is on its way, which also
>> cleanly removes page tables in arch_add_memory() in case something goes
>> wrong.
>
> Would this be easier to understand?
>
> removes page tables created in arch_add_memory
That's not what this sentence expresses. Have a look at
arch_add_memory(): in case __add_pages() fails, the page tables are not
removed. This will also be fixed by Anshuman in the same shot.
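For reference, the arm64 arch_add_memory() being discussed looks roughly
like this (a sketch from memory; helper names and the exact flags check
may differ slightly in the tree at hand):

	int arch_add_memory(int nid, u64 start, u64 size,
			    struct mhp_restrictions *restrictions)
	{
		int flags = 0;

		if (rodata_full || debug_pagealloc_enabled())
			flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;

		/* Page tables for the hotplugged range are created here... */
		__create_pgd_mapping(swapper_pg_dir, start,
				     __phys_to_virt(start), size, PAGE_KERNEL,
				     __pgd_pgtable_alloc, flags);

		/* ...but if __add_pages() fails, nothing tears them down. */
		return __add_pages(nid, start >> PAGE_SHIFT,
				   size >> PAGE_SHIFT, restrictions);
	}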
>
>> [...]
>> +	zone = page_zone(pfn_to_page(start_pfn));
>
> Compared with arch_remove_memory() on x86: if altmap is not NULL, the zone
> is retrieved from the page related to the altmap. Not sure why this is not
> the same here?
This is a minimal implementation, sufficient for this use case here. A
full implementation is in the works. For now, this function will not be
used with an altmap (ZONE_DEVICE is not supported for arm64 yet).
Thanks!
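For comparison, a sketch of how the x86 arch_remove_memory() of the same
era picks the zone (assuming the upstream shape: with an altmap, the
first struct pages of the range back the memmap itself, so the lookup is
offset past them):

	struct page *page = pfn_to_page(start_pfn);

	/* With altmap the first mapped page is offset from @start */
	if (altmap)
		page += vmem_altmap_offset(altmap);
	zone = page_zone(page);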
--
Thanks,
David / dhildenb
On 04/06/2019 07:56, David Hildenbrand wrote:
> On 03.06.19 23:41, Wei Yang wrote:
>> On Mon, May 27, 2019 at 01:11:45PM +0200, David Hildenbrand wrote:
>>> A proper arch_remove_memory() implementation is on its way, which also
>>> cleanly removes page tables in arch_add_memory() in case something goes
>>> wrong.
>>
>> Would this be easier to understand?
>>
>> removes page tables created in arch_add_memory
>
> That's not what this sentence expresses. Have a look at
> arch_add_memory(): in case __add_pages() fails, the page tables are not
> removed. This will also be fixed by Anshuman in the same shot.
>
>>> [...]
>>> @@ -1084,4 +1084,23 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>>  	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
>>>  			   restrictions);
>>>  }
>>> +#ifdef CONFIG_MEMORY_HOTREMOVE
>>> +void arch_remove_memory(int nid, u64 start, u64 size,
>>> +			struct vmem_altmap *altmap)
>>> +{
>>> +	unsigned long start_pfn = start >> PAGE_SHIFT;
>>> +	unsigned long nr_pages = size >> PAGE_SHIFT;
>>> +	struct zone *zone;
>>> +
>>> +	/*
>>> +	 * FIXME: Cleanup page tables (also in arch_add_memory() in case
>>> +	 * adding fails). Until then, this function should only be used
>>> +	 * during memory hotplug (adding memory), not for memory
>>> +	 * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be
>>> +	 * unlocked yet.
>>> +	 */
>>> +	zone = page_zone(pfn_to_page(start_pfn));
>>
>> Compared with arch_remove_memory() on x86: if altmap is not NULL, the zone
>> is retrieved from the page related to the altmap. Not sure why this is not
>> the same here?
>
> This is a minimal implementation, sufficient for this use case here. A
> full implementation is in the works. For now, this function will not be
> used with an altmap (ZONE_DEVICE is not supported for arm64 yet).
FWIW the other pieces of ZONE_DEVICE are now due to land in parallel,
but as long as we don't throw the ARCH_ENABLE_MEMORY_HOTREMOVE switch
then there should still be no issue. Besides, given that we should
consistently ignore the altmap everywhere at the moment, it may even
work out regardless.
One thing stands out about the failure path thing, though - if
__add_pages() did fail, can it still be guaranteed to have initialised
the memmap such that page_zone() won't return nonsense? Last time I
looked that was still a problem when removing memory which had been
successfully added, but never onlined (although I do know that
particular case was already being discussed at the time, and I've not
been paying the greatest attention since).
Robin.
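Background for the page_zone() concern above: page_zone() merely decodes
the zone/node bits cached in page->flags, roughly as defined in
include/linux/mm.h:

	static inline struct zone *page_zone(const struct page *page)
	{
		return &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)];
	}

For hotplugged memory, those bits are only filled in when the range is
onlined and moved into a zone (via move_pfn_range_to_zone()), which is
why the memmap of added-but-never-onlined memory may yield nonsense.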
On 04.06.19 19:36, Robin Murphy wrote:
> On 04/06/2019 07:56, David Hildenbrand wrote:
>> On 03.06.19 23:41, Wei Yang wrote:
>>> On Mon, May 27, 2019 at 01:11:45PM +0200, David Hildenbrand wrote:
>>>> [...]
>>>
>>> Compared with arch_remove_memory() on x86: if altmap is not NULL, the zone
>>> is retrieved from the page related to the altmap. Not sure why this is not
>>> the same here?
>>
>> This is a minimal implementation, sufficient for this use case here. A
>> full implementation is in the works. For now, this function will not be
>> used with an altmap (ZONE_DEVICE is not supported for arm64 yet).
>
> FWIW the other pieces of ZONE_DEVICE are now due to land in parallel,
> but as long as we don't throw the ARCH_ENABLE_MEMORY_HOTREMOVE switch
> then there should still be no issue. Besides, given that we should
> consistently ignore the altmap everywhere at the moment, it may even
> work out regardless.
Thanks for the info.
>
> One thing stands out about the failure path thing, though - if
> __add_pages() did fail, can it still be guaranteed to have initialised
> the memmap such that page_zone() won't return nonsense? Last time I
If __add_pages() fails, then arch_add_memory() fails and
arch_remove_memory() will not be called in the context of this series.
Only if it succeeded.
> looked that was still a problem when removing memory which had been
> successfully added, but never onlined (although I do know that
> particular case was already being discussed at the time, and I've not
> been paying the greatest attention since).
Yes, that part is next on my list. It works but is ugly. The memory
removal process should not care about zones at all.
Slowly moving in the right direction :)
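In other words, the goal is an interface where removal takes only the
pfn range and the optional altmap, with no zone involved, e.g. something
along the lines of (a sketch of the direction, not an existing signature
at this point):

	void __remove_pages(unsigned long pfn, unsigned long nr_pages,
			    struct vmem_altmap *altmap);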
--
Thanks,
David / dhildenb
On Mon 27-05-19 13:11:45, David Hildenbrand wrote:
> [...]
> We will remove CONFIG_MEMORY_HOTREMOVE around this code in follow-up
> patches.
I would drop this one as well (like s390 counterpart).
--
Michal Hocko
SUSE Labs