2022-07-04 11:42:34

by Carlo Bai

Subject: [PATCH 0/2] kexec: accumulate and release the size of crashkernel

Currently x86 and arm64 support reserving a low memory range for
crashkernel. When crashkernel=Y,low is defined, the main kernel
reserves another memblock (in addition to crashkernel=X,high, which is
stored in crashk_res) for crashkernel and stores it in crashk_low_res.

The implementations of crash_get_memory_size() and crash_shrink_memory()
do not consider the extra reserved memory range if it exists. Thus,
first add this range to the size of crashkernel and export the total
via /sys/kernel/kexec_crash_size.

When a new size is written to /sys/kernel/kexec_crash_size, both
reserved ranges may be released if the new size is smaller than the
current size. The order of release is crashk_res -> crashk_low_res.
Only if the new size defined by the user is smaller than the size of
the low memory range does the kernel continue to release the reserved
low memory range after completely releasing the high memory range.

Kaihao Bai (2):
  kexec: accumulate kexec_crash_size if crashk_low_res defined
  kexec: release reserved memory ranges to RAM if crashk_low_res defined

kernel/kexec_core.c | 77 ++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 58 insertions(+), 19 deletions(-)

--
1.8.3.1


2022-07-04 11:42:52

by Carlo Bai

Subject: [PATCH 1/2] kexec: accumulate kexec_crash_size if crashk_low_res defined

Currently x86 and arm64 support reserving a low memory range for
crashkernel. When crashkernel=Y,low is defined, the main kernel
reserves another memblock (in addition to crashkernel=X,high, which is
stored in crashk_res) for crashkernel and stores it in crashk_low_res.
But the value of /sys/kernel/kexec_crash_size only accounts for the
size of crashk_res; the size of crashk_low_res is not included.

To keep /sys/kernel/kexec_crash_size consistent, when crashk_low_res
is defined, its size needs to be added to kexec_crash_size.

Signed-off-by: Kaihao Bai <[email protected]>
---
kernel/kexec_core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 4d34c78..137f6eb 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1016,6 +1016,8 @@ size_t crash_get_memory_size(void)
 	mutex_lock(&kexec_mutex);
 	if (crashk_res.end != crashk_res.start)
 		size = resource_size(&crashk_res);
+	if (crashk_low_res.end != crashk_low_res.start)
+		size += resource_size(&crashk_low_res);
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
--
1.8.3.1
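
The accumulation in this patch can be tried outside the kernel with a
small model. The sketch below assumes a simplified stand-in for struct
resource; struct toy_resource, toy_resource_size() and
toy_crash_size() are made-up names for illustration. The only part
taken from the kernel is the arithmetic: a resource's end address is
inclusive, so its size is end - start + 1.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct resource (assumption). */
struct toy_resource {
	unsigned long long start;
	unsigned long long end;		/* inclusive, as in the kernel */
};

/* Same arithmetic as the kernel's resource_size(): end is inclusive. */
unsigned long long toy_resource_size(const struct toy_resource *res)
{
	assert(res != NULL);
	return res->end - res->start + 1;
}

/* Sum both ranges; an unset range has start == end and is skipped. */
unsigned long long toy_crash_size(const struct toy_resource *high,
				  const struct toy_resource *low)
{
	unsigned long long size = 0;

	if (high->end != high->start)
		size = toy_resource_size(high);
	if (low->end != low->start)
		size += toy_resource_size(low);
	return size;
}
```

Two 256 MiB ranges report 512 MiB in total, and a zeroed low range
leaves the total at the size of the high range alone.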

2022-07-04 11:46:58

by Carlo Bai

Subject: [PATCH 2/2] kexec: release reserved memory ranges to RAM if crashk_low_res defined

If a low memory range is reserved for crashkernel, that range can
currently never be freed back to System RAM. However, the high memory
range corresponding to crashk_res can be freed to RAM through
/sys/kernel/kexec_crash_size: writing a smaller size to
/sys/kernel/kexec_crash_size releases the part that exceeds the new
size.

To support releasing the low memory range, we first determine whether
the new size is smaller than the accumulated size. If it is, the
reserved high memory range is released first. If the new size is also
smaller than the size of the low memory range, we continue to release
the reserved low memory range after completely releasing the high
memory range.

Signed-off-by: Kaihao Bai <[email protected]>
---
kernel/kexec_core.c | 75 +++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 56 insertions(+), 19 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 137f6eb..e89c171 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1031,12 +1031,42 @@ void __weak crash_free_reserved_phys_range(unsigned long begin,
 		free_reserved_page(boot_pfn_to_page(addr >> PAGE_SHIFT));
 }

+static int __crash_shrink_memory(struct resource *crashkernel,
+				 unsigned long start, unsigned long end)
+{
+	int ret = 0;
+	struct resource *ram_res;
+
+	ram_res = kzalloc(sizeof(*ram_res), GFP_KERNEL);
+	if (!ram_res) {
+		ret = -ENOMEM;
+		return ret;
+	}
+
+	crash_free_reserved_phys_range(end, crashkernel->end);
+
+	if ((start == end) && (crashkernel->parent != NULL))
+		release_resource(crashkernel);
+
+	ram_res->start = end;
+	ram_res->end = crashkernel->end;
+	ram_res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	ram_res->name = "System RAM";
+
+	crashkernel->end = end - 1;
+
+	insert_resource(&iomem_resource, ram_res);
+
+	return ret;
+}
+
 int crash_shrink_memory(unsigned long new_size)
 {
 	int ret = 0;
 	unsigned long start, end;
+	unsigned long low_start, low_end;
 	unsigned long old_size;
-	struct resource *ram_res;
+	unsigned long low_old_size;

 	mutex_lock(&kexec_mutex);

@@ -1047,33 +1077,40 @@ int crash_shrink_memory(unsigned long new_size)
 	start = crashk_res.start;
 	end = crashk_res.end;
 	old_size = (end == 0) ? 0 : end - start + 1;
+	low_start = crashk_low_res.start;
+	low_end = crashk_low_res.end;
+	low_old_size = (low_end == 0) ? 0 : low_end - low_start + 1;
+	old_size += low_old_size;
+
 	if (new_size >= old_size) {
 		ret = (new_size == old_size) ? 0 : -EINVAL;
 		goto unlock;
 	}
+	if (start != end) {
+		start = roundup(start, KEXEC_CRASH_MEM_ALIGN);

-	ram_res = kzalloc(sizeof(*ram_res), GFP_KERNEL);
-	if (!ram_res) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
-
-	start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
-	end = roundup(start + new_size, KEXEC_CRASH_MEM_ALIGN);
-
-	crash_free_reserved_phys_range(end, crashk_res.end);
+		/*
+		 * If the new_size is smaller than the reserved lower memory
+		 * range of crashkernel, it releases all higher memory range.
+		 * Otherwise it releases part of higher range.
+		 */
+		end = (new_size <= low_old_size) ?
+			roundup(start, KEXEC_CRASH_MEM_ALIGN) :
+			roundup(start + new_size - low_old_size,
+				KEXEC_CRASH_MEM_ALIGN);

-	if ((start == end) && (crashk_res.parent != NULL))
-		release_resource(&crashk_res);
+		ret = __crash_shrink_memory(&crashk_res, start, end);

-	ram_res->start = end;
-	ram_res->end = crashk_res.end;
-	ram_res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
-	ram_res->name = "System RAM";
+		if (ret)
+			goto unlock;
+	}

-	crashk_res.end = end - 1;
+	if (new_size < low_old_size) {
+		low_start = roundup(low_start, KEXEC_CRASH_MEM_ALIGN);
+		low_end = roundup(low_start + new_size, KEXEC_CRASH_MEM_ALIGN);

-	insert_resource(&iomem_resource, ram_res);
+		ret = __crash_shrink_memory(&crashk_low_res, low_start, low_end);
+	}

 unlock:
 	mutex_unlock(&kexec_mutex);
--
1.8.3.1
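
The rounding that decides where a shrunk range ends can be checked
with a short user-space sketch. It assumes a 4 KiB alignment in place
of KEXEC_CRASH_MEM_ALIGN, whose real value depends on the
architecture; TOY_CRASH_MEM_ALIGN, toy_roundup() and
toy_release_from() are hypothetical names used only for this
illustration.

```c
#include <assert.h>

/* Assumption: 4 KiB alignment; the kernel uses KEXEC_CRASH_MEM_ALIGN. */
#define TOY_CRASH_MEM_ALIGN 4096UL

/* Round x up to the next multiple of align (align must be non-zero). */
unsigned long toy_roundup(unsigned long x, unsigned long align)
{
	assert(align != 0);
	return ((x + align - 1) / align) * align;
}

/*
 * Given the start of a reserved range and the number of bytes to keep,
 * return the first address to release. Everything from the returned
 * address up to the old inclusive end goes back to System RAM, and the
 * range's new inclusive end is the returned value minus one.
 */
unsigned long toy_release_from(unsigned long start, unsigned long keep)
{
	start = toy_roundup(start, TOY_CRASH_MEM_ALIGN);
	return toy_roundup(start + keep, TOY_CRASH_MEM_ALIGN);
}
```

Keeping 5000 bytes of a range that starts at 0x1000000 rounds up to
two pages, so release begins at 0x1002000. Keeping 0 bytes returns the
aligned start itself, which corresponds to the start == end case in
which the patch calls release_resource().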

2022-07-05 02:23:31

by Baoquan He

Subject: Re: [PATCH 0/2] kexec: accumulate and release the size of crashkernel

On 07/04/22 at 07:41pm, Kaihao Bai wrote:
> Currently x86 and arm64 support reserving a low memory range for
> crashkernel. When crashkernel=Y,low is defined, the main kernel
> reserves another memblock (in addition to crashkernel=X,high, which is
> stored in crashk_res) for crashkernel and stores it in crashk_low_res.
>
> The implementations of crash_get_memory_size() and crash_shrink_memory()
> do not consider the extra reserved memory range if it exists. Thus,
> first add this range to the size of crashkernel and export the total
> via /sys/kernel/kexec_crash_size.
>
> When a new size is written to /sys/kernel/kexec_crash_size, both
> reserved ranges may be released if the new size is smaller than the
> current size. The order of release is crashk_res -> crashk_low_res.
> Only if the new size defined by the user is smaller than the size of
> the low memory range does the kernel continue to release the reserved
> low memory range after completely releasing the high memory range.

Sorry, I don't like this patchset.

I bet you haven't encountered a real problem in your production
environment. Regarding crashkernel=,high|low, the ,low memory is for
DMA and other requirements for memory in the lower range. The ,high
memory is for kernel/initrd loading, kernel data, and running user
space programs. When you configure crashkernel= in your system, you
need to evaluate what value is suitable. /sys/kernel/kexec_crash_size
is an interface you can use to tune the memory usage. People are not
advised to free the entire crashkernel reservation via the interface.

So, please leave this as is, unless you have a real case where this
change is needed.

Thanks
Baoquan

2022-07-22 11:44:26

by Carlo Bai

Subject: Re: [PATCH 0/2] kexec: accumulate and release the size of crashkernel

On 2022/7/5 9:56, Baoquan He wrote:
> On 07/04/22 at 07:41pm, Kaihao Bai wrote:
>> Currently x86 and arm64 support reserving a low memory range for
>> crashkernel. When crashkernel=Y,low is defined, the main kernel
>> reserves another memblock (in addition to crashkernel=X,high, which is
>> stored in crashk_res) for crashkernel and stores it in crashk_low_res.
>>
>> The implementations of crash_get_memory_size() and crash_shrink_memory()
>> do not consider the extra reserved memory range if it exists. Thus,
>> first add this range to the size of crashkernel and export the total
>> via /sys/kernel/kexec_crash_size.
>>
>> When a new size is written to /sys/kernel/kexec_crash_size, both
>> reserved ranges may be released if the new size is smaller than the
>> current size. The order of release is crashk_res -> crashk_low_res.
>> Only if the new size defined by the user is smaller than the size of
>> the low memory range does the kernel continue to release the reserved
>> low memory range after completely releasing the high memory range.
>
> Sorry, I don't like this patchset.
>
> I bet you haven't encountered a real problem in your production
> environment. Regarding crashkernel=,high|low, the ,low memory is for
> DMA and other requirements for memory in the lower range. The ,high
> memory is for kernel/initrd loading, kernel data, and running user
> space programs. When you configure crashkernel= in your system, you
> need to evaluate what value is suitable. /sys/kernel/kexec_crash_size
> is an interface you can use to tune the memory usage. People are not
> advised to free the entire crashkernel reservation via the interface.
>
> So, please leave this as is, unless you have a real case where this
> change is needed.
>
> Thanks
> Baoquan

Sorry for the late reply.

Thank you for the review. I don't have a real problem that requires
releasing part or all of the reserved low memory range of crashkernel.
My only intention was to make the interface more compatible with the
reserved low memory range.

Besides, I think it's still confusing that when a low memory range has
actually been reserved for crashkernel, it is not reflected in the
value of kexec_crash_size.

Thanks,
Kaihao Bai