2006-02-17 14:17:16

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 0/7] Reducing fragmentation using zones v5

Changelog since v4
o Move x86_64 from one patch to another
o Fix for oops bug on ppc64

Changelog since v3
o Minor bugs
o ppc64 can specify kernelcore
o Ability to disable use of ZONE_EASYRCLM at boot time
o HugeTLB uses ZONE_EASYRCLM
o Add drain-percpu caches for testing
o boot-parameter documentation added

This is a zone-based approach to anti-fragmentation. It is posted in light
of the discussions of the list-based (sometimes dubbed sub-zones) approach,
where the prevailing opinion was that zones were the answer. The patches
have been tested on linux-2.6.16-rc3-mm1 with x86 and two different ppc64
machines. They have also been boot-tested on x86_64. If there are no
objections, I would like these patches to spend some time in -mm. The
patches also apply with offsets to 2.6.16-rc3.

The usage scenario I set up to test the patches is:

1. Test machine 1: 4-way x86 machine with 1.5GiB physical RAM
Test machine 2: 4-way PPC64 machine with 4GiB physical RAM
2. Boot with kernelcore=512MB. This gives the kernel 512MB to work with and
the rest is placed in ZONE_EASYRCLM (see patch 3 for more comments about
the value of kernelcore)
3. Benchmark kbuild, aim9 and high order allocations

An alternative scenario has been tested and produces similar figures. The
scenario is:

1. Test machine: 4-way x86 machine with 1.5GiB physical RAM
2. Boot with mem=512MB
3. Hot-add the remaining memory
4. Benchmark kbuild, aim9 and high order allocations

The alternative scenario requires two more patches related to hot-adding
on the x86. I can post them if people want to take a look or experiment
with hot-add instead of using kernelcore=.

Benchmark comparison between the -mm+NoOOM tree and the new zones on the
x86. The tests are run in order with no reboots between them so that the
system is in a used state when we try to allocate large pages.

KBuild
                               2.6.16-rc3-mm1-clean  2.6.16-rc3-mm1-zbuddy-v5
Time taken to extract kernel:            16                    14
Time taken to build kernel:            1110                  1110

Comparable performance there. On ppc64, zbuddy was slightly faster as well.

Aim9
2.6.16-rc3-mm1-clean 2.6.16-rc3-mm1-zbuddy-v5
1 creat-clo 13381.10 13276.69 -104.41 -0.78% File Creations and Closes/second
2 page_test 121821.06 123350.55 1529.49 1.26% System Allocations & Pages/second
3 brk_test 597817.76 589136.95 -8680.81 -1.45% System Memory Allocations/second
4 jmp_test 4372237.96 4377790.74 5552.78 0.13% Non-local gotos/second
5 signal_test 84038.65 82755.75 -1282.90 -1.53% Signal Traps/second
6 exec_test 61.99 62.21 0.22 0.35% Program Loads/second
7 fork_test 1178.82 1162.36 -16.46 -1.40% Task Creations/second
8 link_test 4801.10 4799.00 -2.10 -0.04% Link/Unlink Pairs/second

Comparable performance again. On ppc64, almost all the microbenchmarks were
within 0.2% of each other. The one exception was jmp_test, but jmp_test
results vary widely between runs on the same kernel on ppc64, so I ignored
it.

HugeTLB Allocation Capability under load
This test differs from older reports, which used a kernel module to really
try to allocate HugeTLB-sized pages; that approach is artificial. This test
uses only the proc interface to adjust nr_hugepages. The test is:

1. Build 7 kernels at the same time
2. Try to get HugeTLB pages
3. Stop the compiles, delete the trees and try to get HugeTLB pages
4. dd a file the size of physical memory, cat it to /dev/null, delete it,
then try to get HugeTLB pages.

The results for x86 were:

                                   2.6.16-rc3-mm1-clean  2.6.16-rc3-mm1-zbuddy-v5
During compile:                              7                     7
At rest before dd of large file:            56                    60
At rest after dd of large file:             79                    85

These seem comparable, but only because the HighMem zone on the clean
kernel is a similar size to the EasyRclm zone with zbuddy-v5. With PPC64,
where there was a much larger difference, the results were:

                                   2.6.16-rc3-mm1-clean  2.6.16-rc3-mm1-zbuddy-v5
During compile:                              0                     0
At rest before dd of large file:             0                   103
At rest after dd of large file:              6                   154

So, the zone can potentially make a massive difference to the availability
of HugeTLB pages on some systems. I expect a similar result on an x86 with
more physical RAM than my current test box.

Using the kernel module to really stress the allocation of pages, there
were even wider differences between the results, with the zone-based
anti-fragmentation kernel having a clear advantage. On ppc64, 191 huge
pages were available at rest with zone-based anti-fragmentation in
comparison to 33 huge pages without it. On x86, the zone-based
anti-fragmentation kernel had only slightly more huge pages, but again this
is because the HighMem zone on the clean kernel is similar in size to the
EasyRclm zone on the anti-fragmentation kernel.

Bottom line: both kernels perform similarly on two different architectures
but, when configured correctly, there can be a massive difference in the
availability of huge pages and in the regions that can be hot-removed. When
kernelcore is not configured at boot time, there are no measurable
differences between the kernels.

The final diffstat for the patches is:
Documentation/kernel-parameters.txt | 20 ++++++++++++
arch/i386/kernel/setup.c | 49 ++++++++++++++++++++++++++++++-
arch/i386/mm/init.c | 2 -
arch/powerpc/mm/mem.c | 2 -
arch/powerpc/mm/numa.c | 56 ++++++++++++++++++++++++++++++++++--
arch/x86_64/mm/init.c | 2 -
fs/compat.c | 2 -
fs/exec.c | 2 -
fs/inode.c | 2 -
include/asm-i386/page.h | 3 +
include/linux/gfp.h | 3 +
include/linux/highmem.h | 2 -
include/linux/memory_hotplug.h | 1
include/linux/mmzone.h | 12 ++++---
mm/hugetlb.c | 4 +-
mm/memory.c | 4 +-
mm/mempolicy.c | 2 -
mm/page_alloc.c | 20 +++++++++---
mm/shmem.c | 4 ++
mm/swap_state.c | 2 -
20 files changed, 165 insertions(+), 29 deletions(-)

--
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


2006-02-17 14:17:45

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 2/7] Create the ZONE_EASYRCLM zone


This patch adds the ZONE_EASYRCLM zone and updates the relevant constants
and helper functions. After this patch is applied, memory that is hot-added
on the x86 will be placed in ZONE_EASYRCLM. Memory hot-added on the ppc64
still goes to ZONE_DMA.

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-101_antifrag_flags/arch/x86_64/mm/init.c linux-2.6.16-rc3-mm1-102_addzone/arch/x86_64/mm/init.c
--- linux-2.6.16-rc3-mm1-101_antifrag_flags/arch/x86_64/mm/init.c 2006-02-16 09:50:42.000000000 +0000
+++ linux-2.6.16-rc3-mm1-102_addzone/arch/x86_64/mm/init.c 2006-02-17 09:42:01.000000000 +0000
@@ -495,7 +495,7 @@ void online_page(struct page *page)
int add_memory(u64 start, u64 size)
{
struct pglist_data *pgdat = NODE_DATA(0);
- struct zone *zone = pgdat->node_zones + MAX_NR_ZONES-2;
+ struct zone *zone = pgdat->node_zones + MAX_NR_ZONES-3;
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-101_antifrag_flags/include/linux/mmzone.h linux-2.6.16-rc3-mm1-102_addzone/include/linux/mmzone.h
--- linux-2.6.16-rc3-mm1-101_antifrag_flags/include/linux/mmzone.h 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-102_addzone/include/linux/mmzone.h 2006-02-17 09:42:01.000000000 +0000
@@ -73,9 +73,10 @@ struct per_cpu_pageset {
#define ZONE_DMA32 1
#define ZONE_NORMAL 2
#define ZONE_HIGHMEM 3
+#define ZONE_EASYRCLM 4

-#define MAX_NR_ZONES 4 /* Sync this with ZONES_SHIFT */
-#define ZONES_SHIFT 2 /* ceil(log2(MAX_NR_ZONES)) */
+#define MAX_NR_ZONES 5 /* Sync this with ZONES_SHIFT */
+#define ZONES_SHIFT 3 /* ceil(log2(MAX_NR_ZONES)) */


/*
@@ -103,7 +104,7 @@ struct per_cpu_pageset {
*
* NOTE! Make sure this matches the zones in <linux/gfp.h>
*/
-#define GFP_ZONEMASK 0x07
+#define GFP_ZONEMASK 0x0f
/* #define GFP_ZONETYPES (GFP_ZONEMASK + 1) */ /* Non-loner */
#define GFP_ZONETYPES ((GFP_ZONEMASK + 1) / 2 + 1) /* Loner */

@@ -408,7 +409,7 @@ static inline int populated_zone(struct

static inline int is_highmem_idx(int idx)
{
- return (idx == ZONE_HIGHMEM);
+ return (idx == ZONE_HIGHMEM || idx == ZONE_EASYRCLM);
}

static inline int is_normal_idx(int idx)
@@ -424,7 +425,8 @@ static inline int is_normal_idx(int idx)
*/
static inline int is_highmem(struct zone *zone)
{
- return zone == zone->zone_pgdat->node_zones + ZONE_HIGHMEM;
+ return zone == zone->zone_pgdat->node_zones + ZONE_HIGHMEM ||
+ zone == zone->zone_pgdat->node_zones + ZONE_EASYRCLM;
}

static inline int is_normal(struct zone *zone)
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/page_alloc.c linux-2.6.16-rc3-mm1-102_addzone/mm/page_alloc.c
--- linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/page_alloc.c 2006-02-16 09:50:44.000000000 +0000
+++ linux-2.6.16-rc3-mm1-102_addzone/mm/page_alloc.c 2006-02-17 09:42:01.000000000 +0000
@@ -68,7 +68,7 @@ static void __free_pages_ok(struct page
* TBD: should special case ZONE_DMA32 machines here - in those we normally
* don't need any ZONE_NORMAL reservation
*/
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = { 256, 256, 32 };
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = { 256, 256, 32, 32 };

EXPORT_SYMBOL(totalram_pages);

@@ -79,7 +79,8 @@ EXPORT_SYMBOL(totalram_pages);
struct zone *zone_table[1 << ZONETABLE_SHIFT] __read_mostly;
EXPORT_SYMBOL(zone_table);

-static char *zone_names[MAX_NR_ZONES] = { "DMA", "DMA32", "Normal", "HighMem" };
+static char *zone_names[MAX_NR_ZONES] = { "DMA", "DMA32", "Normal",
+ "HighMem", "EasyRclm" };
int min_free_kbytes = 1024;

unsigned long __initdata nr_kernel_pages;
@@ -758,6 +759,7 @@ static inline void prep_zero_page(struct
int i;

BUG_ON((gfp_flags & (__GFP_WAIT | __GFP_HIGHMEM)) == __GFP_HIGHMEM);
+ BUG_ON((gfp_flags & (__GFP_WAIT | __GFP_EASYRCLM)) == __GFP_EASYRCLM);
for(i = 0; i < (1 << order); i++)
clear_highpage(page + i);
}
@@ -1266,7 +1268,7 @@ unsigned int nr_free_buffer_pages(void)
*/
unsigned int nr_free_pagecache_pages(void)
{
- return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER));
+ return nr_free_zone_pages(gfp_zone(GFP_RCLMUSER));
}

#ifdef CONFIG_HIGHMEM
@@ -1276,7 +1278,7 @@ unsigned int nr_free_highpages (void)
unsigned int pages = 0;

for_each_pgdat(pgdat)
- pages += pgdat->node_zones[ZONE_HIGHMEM].free_pages;
+ pages += pgdat->node_zones[ZONE_EASYRCLM].free_pages;

return pages;
}
@@ -1584,7 +1586,7 @@ static int __init build_zonelists_node(p
{
struct zone *zone;

- BUG_ON(zone_type > ZONE_HIGHMEM);
+ BUG_ON(zone_type > ZONE_EASYRCLM);

do {
zone = pgdat->node_zones + zone_type;
@@ -1604,6 +1606,8 @@ static int __init build_zonelists_node(p
static inline int highest_zone(int zone_bits)
{
int res = ZONE_NORMAL;
+ if (zone_bits & (__force int)__GFP_EASYRCLM)
+ res = ZONE_EASYRCLM;
if (zone_bits & (__force int)__GFP_HIGHMEM)
res = ZONE_HIGHMEM;
if (zone_bits & (__force int)__GFP_DMA32)

2006-02-17 14:17:20

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 1/7] Add __GFP_EASYRCLM flag and update callers


This creates a zone modifier __GFP_EASYRCLM and a set of GFP flags called
GFP_RCLMUSER. The only difference between GFP_HIGHUSER and GFP_RCLMUSER is
the zone that is used. Callers for which ZONE_EASYRCLM is appropriate are
changed to use it.

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/fs/compat.c linux-2.6.16-rc3-mm1-101_antifrag_flags/fs/compat.c
--- linux-2.6.16-rc3-mm1-clean/fs/compat.c 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/fs/compat.c 2006-02-17 09:41:14.000000000 +0000
@@ -1397,7 +1397,7 @@ static int compat_copy_strings(int argc,
page = bprm->page[i];
new = 0;
if (!page) {
- page = alloc_page(GFP_HIGHUSER);
+ page = alloc_page(GFP_RCLMUSER);
bprm->page[i] = page;
if (!page) {
ret = -ENOMEM;
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/fs/exec.c linux-2.6.16-rc3-mm1-101_antifrag_flags/fs/exec.c
--- linux-2.6.16-rc3-mm1-clean/fs/exec.c 2006-02-16 09:50:43.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/fs/exec.c 2006-02-17 09:41:14.000000000 +0000
@@ -238,7 +238,7 @@ static int copy_strings(int argc, char _
page = bprm->page[i];
new = 0;
if (!page) {
- page = alloc_page(GFP_HIGHUSER);
+ page = alloc_page(GFP_RCLMUSER);
bprm->page[i] = page;
if (!page) {
ret = -ENOMEM;
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/fs/inode.c linux-2.6.16-rc3-mm1-101_antifrag_flags/fs/inode.c
--- linux-2.6.16-rc3-mm1-clean/fs/inode.c 2006-02-16 09:50:43.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/fs/inode.c 2006-02-17 09:41:14.000000000 +0000
@@ -147,7 +147,7 @@ static struct inode *alloc_inode(struct
mapping->a_ops = &empty_aops;
mapping->host = inode;
mapping->flags = 0;
- mapping_set_gfp_mask(mapping, GFP_HIGHUSER);
+ mapping_set_gfp_mask(mapping, GFP_RCLMUSER);
mapping->assoc_mapping = NULL;
mapping->backing_dev_info = &default_backing_dev_info;

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/include/asm-i386/page.h linux-2.6.16-rc3-mm1-101_antifrag_flags/include/asm-i386/page.h
--- linux-2.6.16-rc3-mm1-clean/include/asm-i386/page.h 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/include/asm-i386/page.h 2006-02-17 09:41:14.000000000 +0000
@@ -36,7 +36,8 @@
#define clear_user_page(page, vaddr, pg) clear_page(page)
#define copy_user_page(to, from, vaddr, pg) copy_page(to, from)

-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define alloc_zeroed_user_highpage(vma, vaddr) \
+ alloc_page_vma(GFP_RCLMUSER | __GFP_ZERO, vma, vaddr)
#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE

/*
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/include/linux/gfp.h linux-2.6.16-rc3-mm1-101_antifrag_flags/include/linux/gfp.h
--- linux-2.6.16-rc3-mm1-clean/include/linux/gfp.h 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/include/linux/gfp.h 2006-02-17 09:41:14.000000000 +0000
@@ -21,6 +21,7 @@ struct vm_area_struct;
#else
#define __GFP_DMA32 ((__force gfp_t)0x04) /* Has own ZONE_DMA32 */
#endif
+#define __GFP_EASYRCLM ((__force gfp_t)0x08u)

/*
* Action modifiers - doesn't change the zoning
@@ -65,6 +66,8 @@ struct vm_area_struct;
#define GFP_USER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
#define GFP_HIGHUSER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
__GFP_HIGHMEM)
+#define GFP_RCLMUSER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
+ __GFP_EASYRCLM)

/* Flag - indicates that the buffer will be suitable for DMA. Ignored on some
platforms, used as appropriate on others */
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/include/linux/highmem.h linux-2.6.16-rc3-mm1-101_antifrag_flags/include/linux/highmem.h
--- linux-2.6.16-rc3-mm1-clean/include/linux/highmem.h 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/include/linux/highmem.h 2006-02-17 09:41:14.000000000 +0000
@@ -47,7 +47,7 @@ static inline void clear_user_highpage(s
static inline struct page *
alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
{
- struct page *page = alloc_page_vma(GFP_HIGHUSER, vma, vaddr);
+ struct page *page = alloc_page_vma(GFP_RCLMUSER, vma, vaddr);

if (page)
clear_user_highpage(page, vaddr);
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/mm/memory.c linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/memory.c
--- linux-2.6.16-rc3-mm1-clean/mm/memory.c 2006-02-16 09:50:44.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/memory.c 2006-02-17 09:41:14.000000000 +0000
@@ -1480,7 +1480,7 @@ gotten:
if (!new_page)
goto oom;
} else {
- new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+ new_page = alloc_page_vma(GFP_RCLMUSER, vma, address);
if (!new_page)
goto oom;
cow_user_page(new_page, old_page, address);
@@ -2079,7 +2079,7 @@ retry:

if (unlikely(anon_vma_prepare(vma)))
goto oom;
- page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+ page = alloc_page_vma(GFP_RCLMUSER, vma, address);
if (!page)
goto oom;
copy_user_highpage(page, new_page, address);
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/mm/shmem.c linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/shmem.c
--- linux-2.6.16-rc3-mm1-clean/mm/shmem.c 2006-02-16 09:50:44.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/shmem.c 2006-02-17 09:41:14.000000000 +0000
@@ -921,6 +921,8 @@ shmem_alloc_page(gfp_t gfp, struct shmem
pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, idx);
pvma.vm_pgoff = idx;
pvma.vm_end = PAGE_SIZE;
+ if (gfp & __GFP_HIGHMEM)
+ gfp = (gfp & ~__GFP_HIGHMEM) | __GFP_EASYRCLM;
page = alloc_page_vma(gfp | __GFP_ZERO, &pvma, 0);
mpol_free(pvma.vm_policy);
return page;
@@ -936,6 +938,8 @@ shmem_swapin(struct shmem_inode_info *in
static inline struct page *
shmem_alloc_page(gfp_t gfp,struct shmem_inode_info *info, unsigned long idx)
{
+ if (gfp & __GFP_HIGHMEM)
+ gfp = (gfp & ~__GFP_HIGHMEM) | __GFP_EASYRCLM;
return alloc_page(gfp | __GFP_ZERO);
}
#endif
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-clean/mm/swap_state.c linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/swap_state.c
--- linux-2.6.16-rc3-mm1-clean/mm/swap_state.c 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-101_antifrag_flags/mm/swap_state.c 2006-02-17 09:41:14.000000000 +0000
@@ -334,7 +334,7 @@ struct page *read_swap_cache_async(swp_e
* Get a new page to read into from swap.
*/
if (!new_page) {
- new_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+ new_page = alloc_page_vma(GFP_RCLMUSER, vma, addr);
if (!new_page)
break; /* Out of memory */
}

2006-02-17 14:18:00

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 3/7] x86 - Specify amount of kernel memory at boot time


This patch was originally written by Kamezawa Hiroyuki.

It should be possible for the administrator to specify at boot-time how much
memory should be used for the kernel and how much should go to ZONE_EASYRCLM.
After this patch is applied, the boot option kernelcore= can be used to
specify how much memory should be used by the kernel.

(Note that Kamezawa called this parameter coremem=. It was renamed because
of the way ppc64 parses command-line arguments: coremem= would be confused
with mem=. A name was chosen that could be used across architectures.)

The value of kernelcore is important. If it is too small, there will be
more pressure on ZONE_NORMAL and a potential loss of performance. If it is
about 896MB, ZONE_HIGHMEM will have a size of zero. Any differences in
tests will depend on whether CONFIG_HIGHPTE is set in the standard kernel
or not. With lots of memory, the ideal is to specify a kernelcore that
gives ZONE_NORMAL its full size and leaves a ZONE_HIGHMEM for PTEs. The
right value depends, like any tunable, on the workload.

It is also important to note that if kernelcore is less than the maximum
size of ZONE_NORMAL, GFP_HIGHMEM allocations will use ZONE_NORMAL, not the
reachable portion of ZONE_EASYRCLM.

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-102_addzone/arch/i386/kernel/setup.c linux-2.6.16-rc3-mm1-103_x86coremem/arch/i386/kernel/setup.c
--- linux-2.6.16-rc3-mm1-102_addzone/arch/i386/kernel/setup.c 2006-02-16 09:50:41.000000000 +0000
+++ linux-2.6.16-rc3-mm1-103_x86coremem/arch/i386/kernel/setup.c 2006-02-17 09:42:45.000000000 +0000
@@ -112,6 +112,9 @@ int bootloader_type;
/* user-defined highmem size */
static unsigned int highmem_pages = -1;

+/* user-defined easy-reclaim-size */
+static unsigned int core_mem_pages = -1;
+static unsigned int easyrclm_pages = 0;
/*
* Setup options
*/
@@ -912,6 +915,15 @@ static void __init parse_cmdline_early (
*/
else if (!memcmp(from, "vmalloc=", 8))
__VMALLOC_RESERVE = memparse(from+8, &from);
+ /*
+ * kernelcore=size sets the amount of memory for use for
+ * kernel allocations that cannot be reclaimed easily.
+ * The remaining memory is set aside for easy reclaim
+ * for features like memory remove or huge page allocations
+ */
+ else if (!memcmp(from, "kernelcore=",11)) {
+ core_mem_pages = memparse(from+11, &from) >> PAGE_SHIFT;
+ }

next_char:
c = *(from++);
@@ -981,6 +993,17 @@ void __init find_max_pfn(void)
}
}

+unsigned long __init calculate_core_memory(unsigned long max_low_pfn)
+{
+ if (max_low_pfn < core_mem_pages) {
+ highmem_pages -= (core_mem_pages - max_low_pfn);
+ } else {
+ max_low_pfn = core_mem_pages;
+ highmem_pages = 0;
+ }
+ easyrclm_pages = max_pfn - core_mem_pages;
+ return max_low_pfn;
+}
/*
* Determine low and high memory ranges:
*/
@@ -1037,6 +1060,8 @@ unsigned long __init find_max_low_pfn(vo
printk(KERN_ERR "ignoring highmem size on non-highmem kernel!\n");
#endif
}
+ if (core_mem_pages != -1)
+ max_low_pfn = calculate_core_memory(max_low_pfn);
return max_low_pfn;
}

@@ -1157,7 +1182,8 @@ void __init zone_sizes_init(void)
zones_size[ZONE_DMA] = max_dma;
zones_size[ZONE_NORMAL] = low - max_dma;
#ifdef CONFIG_HIGHMEM
- zones_size[ZONE_HIGHMEM] = highend_pfn - low;
+ zones_size[ZONE_HIGHMEM] = highend_pfn - low - easyrclm_pages;
+ zones_size[ZONE_EASYRCLM] = easyrclm_pages;
#endif
}
free_area_init(zones_size);

2006-02-17 14:18:24

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time


This patch adds the kernelcore= parameter for ppc64.

The requested amount of memory is not reserved across all nodes. The first
node found that can accommodate the requested amount of memory and still
has memory remaining for ZONE_EASYRCLM is used. A node with memory holes
will also not be used.

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c
--- linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c 2006-02-16 09:50:42.000000000 +0000
+++ linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c 2006-02-17 09:43:30.000000000 +0000
@@ -21,6 +21,7 @@
#include <asm/lmb.h>
#include <asm/system.h>
#include <asm/smp.h>
+#include <asm/machdep.h>

static int numa_enabled = 1;

@@ -722,20 +723,54 @@ void __init paging_init(void)
unsigned long zones_size[MAX_NR_ZONES];
unsigned long zholes_size[MAX_NR_ZONES];
int nid;
+ unsigned long core_mem_size = 0;
+ unsigned long core_mem_pfn = 0;
+ char *opt;

memset(zones_size, 0, sizeof(zones_size));
memset(zholes_size, 0, sizeof(zholes_size));

+ /* Check if ZONE_EASYRCLM should be populated */
+ opt = strstr(cmd_line, "kernelcore=");
+ if (opt) {
+ opt += 11;
+ core_mem_size = memparse(opt, &opt);
+ core_mem_pfn = core_mem_size >> PAGE_SHIFT;
+ }
+
for_each_online_node(nid) {
unsigned long start_pfn, end_pfn, pages_present;

get_region(nid, &start_pfn, &end_pfn, &pages_present);

- zones_size[ZONE_DMA] = end_pfn - start_pfn;
- zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] - pages_present;
+ /*
+ * Set up a zone for EASYRCLM as long as this node is large
+ * enough to accomodate the requested size and that there
+ * are no memory holes
+ */
+ if (core_mem_pfn == 0 ||
+ end_pfn - start_pfn < core_mem_pfn ||
+ end_pfn - start_pfn != pages_present) {
+ zones_size[ZONE_DMA] = end_pfn - start_pfn;
+ zones_size[ZONE_EASYRCLM] = 0;
+ zholes_size[ZONE_DMA] =
+ zones_size[ZONE_DMA] - pages_present;
+ zholes_size[ZONE_EASYRCLM] = 0;
+ if (core_mem_pfn >= pages_present)
+ core_mem_pfn -= pages_present;
+ } else {
+ zones_size[ZONE_DMA] = core_mem_pfn;
+ zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
+ zholes_size[ZONE_DMA] = 0;
+ zholes_size[ZONE_EASYRCLM] = 0;
+ core_mem_pfn = 0;
+ }

- dbg("free_area_init node %d %lx %lx (hole: %lx)\n", nid,
+ dbg("free_area_init DMA node %d %lx %lx (hole: %lx)\n", nid,
zones_size[ZONE_DMA], start_pfn, zholes_size[ZONE_DMA]);
+ dbg("free_area_init EASYRCLM node %d %lx %lx (hole: %lx)\n",
+ nid, zones_size[ZONE_EASYRCLM], zones_size[ZONE_DMA],
+ zholes_size[ZONE_EASYRCLM]);

free_area_init_node(nid, NODE_DATA(nid), zones_size, start_pfn,
zholes_size);
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-103_x86coremem/mm/page_alloc.c linux-2.6.16-rc3-mm1-104_ppc64coremem/mm/page_alloc.c
--- linux-2.6.16-rc3-mm1-103_x86coremem/mm/page_alloc.c 2006-02-17 09:42:01.000000000 +0000
+++ linux-2.6.16-rc3-mm1-104_ppc64coremem/mm/page_alloc.c 2006-02-17 09:43:30.000000000 +0000
@@ -1592,7 +1592,11 @@ static int __init build_zonelists_node(p
zone = pgdat->node_zones + zone_type;
if (populated_zone(zone)) {
#ifndef CONFIG_HIGHMEM
- BUG_ON(zone_type > ZONE_NORMAL);
+ /*
+ * On architectures with only ZONE_DMA, it is still
+ * valid to have a ZONE_EASYRCLM
+ */
+ BUG_ON(zone_type == ZONE_HIGHMEM);
#endif
zonelist->zones[nr_zones++] = zone;
check_highest_zone(zone_type);

2006-02-17 14:18:40

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 5/7] At boot, determine what zone memory will hot-add to


Once ZONE_EASYRCLM is added, hot-added memory on the x86 goes to
ZONE_EASYRCLM by default while ppc64 uses ZONE_DMA. This patch changes the
behavior slightly on x86 and ppc64.

o By default, ZONE_DMA is used on ppc64 and ZONE_HIGHMEM is used on x86
o If kernelcore is specified at boot time, x86 and ppc64 hotadd to ZONE_EASYRCLM
o If kernelcore and noeasyrclm are both used, ppc64 will use ZONE_DMA and
x86 will use ZONE_HIGHMEM

This is a list of scenarios and what happens with different options on an
x86 with 1.5GiB of physical RAM. ./activate is a script that tries to online
all inactive physical memory.

Boot with no special parameters
- ./activate does nothing
- All high memory in HIGHMEM

Boot with mem=512MB
- Machine boots with 512MB active RAM
- ./activate adds memory to ZONE_HIGHMEM
- No memory in ZONE_EASYRCLM

Boot with kernelcore=512MB
- Machine boots with 1.5GiB RAM
- ./activate does nothing
- No memory in HIGHMEM
- Some of what would be NORMAL and all of HIGHMEM is in EASYRCLM

Boot with kernelcore=512MB mem=512MB
- Machine boots with 512MB RAM
- ./activate adds memory to ZONE_EASYRCLM
- No memory in HIGHMEM

Boot with kernelcore=512MB mem=512MB noeasyrclm
- Machine boots with 512MB RAM
- ./activate adds memory to ZONE_HIGHMEM
- No memory in ZONE_EASYRCLM
- With noeasyrclm, this is identical to booting with just mem=512MB

Boot with kernelcore=1024MB mem=1024MB
- Machine boots with 1024MB RAM
- Some memory already in ZONE_HIGHMEM
- ./activate adds memory to ZONE_EASYRCLM

Boot with kernelcore=1024MB mem=1024MB noeasyrclm
- Machine boots with 1024MB RAM
- Some memory already in ZONE_EASYRCLM
- ./activate adds memory to ZONE_HIGHMEM

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/i386/kernel/setup.c linux-2.6.16-rc3-mm1-106_zonechoose/arch/i386/kernel/setup.c
--- linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/i386/kernel/setup.c 2006-02-17 09:42:45.000000000 +0000
+++ linux-2.6.16-rc3-mm1-106_zonechoose/arch/i386/kernel/setup.c 2006-02-17 09:44:16.000000000 +0000
@@ -115,6 +115,8 @@ static unsigned int highmem_pages = -1;
/* user-defined easy-reclaim-size */
static unsigned int core_mem_pages = -1;
static unsigned int easyrclm_pages = 0;
+static int hotadd_zone_offset = -1;
+
/*
* Setup options
*/
@@ -923,6 +925,18 @@ static void __init parse_cmdline_early (
*/
else if (!memcmp(from, "kernelcore=",11)) {
core_mem_pages = memparse(from+11, &from) >> PAGE_SHIFT;
+
+ if (hotadd_zone_offset == -1)
+ hotadd_zone_offset = ZONE_EASYRCLM;
+ }
+
+ /*
+ * Once kernelcore= is specified, the default zone to add to
+ * is ZONE_EASYRCLM. This parameter allows an administrator
+ * to override that
+ */
+ else if (!memcmp(from, "noeasyrclm", 10)) {
+ hotadd_zone_offset = ZONE_HIGHMEM;
}

next_char:
@@ -1550,6 +1564,13 @@ void __init setup_arch(char **cmdline_p)
tsc_init();
}

+struct zone *get_zone_for_hotadd(struct pglist_data *pgdata) {
+ if (unlikely(hotadd_zone_offset == -1))
+ hotadd_zone_offset = ZONE_HIGHMEM;
+
+ return &pgdata->node_zones[hotadd_zone_offset];
+}
+
#include "setup_arch_post.h"
/*
* Local Variables:
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/i386/mm/init.c linux-2.6.16-rc3-mm1-106_zonechoose/arch/i386/mm/init.c
--- linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/i386/mm/init.c 2006-02-16 09:50:41.000000000 +0000
+++ linux-2.6.16-rc3-mm1-106_zonechoose/arch/i386/mm/init.c 2006-02-17 09:44:16.000000000 +0000
@@ -655,7 +655,7 @@ void __init mem_init(void)
int add_memory(u64 start, u64 size)
{
struct pglist_data *pgdata = &contig_page_data;
- struct zone *zone = pgdata->node_zones + MAX_NR_ZONES-1;
+ struct zone *zone = get_zone_for_hotadd(pgdata);
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/mem.c linux-2.6.16-rc3-mm1-106_zonechoose/arch/powerpc/mm/mem.c
--- linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/mem.c 2006-02-16 09:50:42.000000000 +0000
+++ linux-2.6.16-rc3-mm1-106_zonechoose/arch/powerpc/mm/mem.c 2006-02-17 09:44:16.000000000 +0000
@@ -129,7 +129,7 @@ int __devinit add_memory(u64 start, u64
create_section_mapping(start, start + size);

/* this should work for most non-highmem platforms */
- zone = pgdata->node_zones;
+ zone = get_zone_for_hotadd(pgdata);

return __add_pages(zone, start_pfn, nr_pages);

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c linux-2.6.16-rc3-mm1-106_zonechoose/arch/powerpc/mm/numa.c
--- linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c 2006-02-17 09:43:30.000000000 +0000
+++ linux-2.6.16-rc3-mm1-106_zonechoose/arch/powerpc/mm/numa.c 2006-02-17 09:44:16.000000000 +0000
@@ -39,6 +39,7 @@ EXPORT_SYMBOL(node_data);
static bootmem_data_t __initdata plat_node_bdata[MAX_NUMNODES];
static int min_common_depth;
static int n_mem_addr_cells, n_mem_size_cells;
+static int hotadd_zone_offset = -1;

/*
* We need somewhere to store start/end/node for each region until we have
@@ -736,6 +737,13 @@ void __init paging_init(void)
opt += 11;
core_mem_size = memparse(opt, &opt);
core_mem_pfn = core_mem_size >> PAGE_SHIFT;
+ hotadd_zone_offset = ZONE_EASYRCLM;
+ }
+
+ /* Check if the administrator requests only ZONE_DMA be used */
+ opt = strstr(cmd_line, "noeasyrclm");
+ if (opt) {
+ hotadd_zone_offset = ZONE_DMA;
}

for_each_online_node(nid) {
@@ -847,4 +855,12 @@ got_numa_domain:
}
return numa_domain;
}
+
+struct zone *get_zone_for_hotadd(struct pglist_data *pgdata)
+{
+ if (unlikely(hotadd_zone_offset == -1))
+ hotadd_zone_offset = ZONE_DMA;
+
+ return &pgdata->node_zones[hotadd_zone_offset];
+}
#endif /* CONFIG_MEMORY_HOTPLUG */
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-104_ppc64coremem/include/linux/memory_hotplug.h linux-2.6.16-rc3-mm1-106_zonechoose/include/linux/memory_hotplug.h
--- linux-2.6.16-rc3-mm1-104_ppc64coremem/include/linux/memory_hotplug.h 2006-02-13 00:27:25.000000000 +0000
+++ linux-2.6.16-rc3-mm1-106_zonechoose/include/linux/memory_hotplug.h 2006-02-17 09:44:16.000000000 +0000
@@ -57,6 +57,7 @@ extern void online_page(struct page *pag
extern int add_memory(u64 start, u64 size);
extern int remove_memory(u64 start, u64 size);
extern int online_pages(unsigned long, unsigned long);
+extern struct zone *get_zone_for_hotadd(struct pglist_data *);

/* reasonably generic interface to expand the physical pages in a zone */
extern int __add_pages(struct zone *zone, unsigned long start_pfn,

2006-02-17 14:19:31

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 7/7] Add documentation for extra boot parameters


Once all patches are applied, two new command-line parameters exist -
kernelcore and noeasyrclm. This patch adds the necessary documentation.
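
As an illustration (device paths and sizes below are made up, not recommendations), the two parameters would simply be appended to the kernel command line:

```text
# Illustrative kernel command lines:
root=/dev/sda1 ro kernelcore=512M
root=/dev/sda1 ro kernelcore=512M noeasyrclm
```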

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-107_hugetlb_use_easyrclm/Documentation/kernel-parameters.txt linux-2.6.16-rc3-mm1-108_docs/Documentation/kernel-parameters.txt
--- linux-2.6.16-rc3-mm1-107_hugetlb_use_easyrclm/Documentation/kernel-parameters.txt 2006-02-16 09:50:42.000000000 +0000
+++ linux-2.6.16-rc3-mm1-108_docs/Documentation/kernel-parameters.txt 2006-02-17 09:45:49.000000000 +0000
@@ -704,6 +704,16 @@ running once the system is up.
js= [HW,JOY] Analog joystick
See Documentation/input/joystick.txt.

+ kernelcore=nn[KMG] [KNL,IA-32,PPC] On the x86 and ppc64, this
+ parameter specifies the amount of memory usable
+ by the kernel and places the rest in an EasyRclm
+ zone. The EasyRclm zone is used for the allocation
+ of pages on behalf of a process and for HugeTLB
+ pages. On ppc64, it is likely that memory sections
+ in this zone can be offlined. Note that allocations
+ like PTEs-from-HighMem still use the HighMem zone
+ if it exists, and the Normal zone if it does not.
+
keepinitrd [HW,ARM]

kstack=N [IA-32,X86-64] Print N words from the kernel stack
@@ -1006,6 +1016,16 @@ running once the system is up.

nodisconnect [HW,SCSI,M68K] Disables SCSI disconnects.

+ noeasyrclm [IA-32,PPC] If kernelcore= is specified, the default
+ zone to add memory to for IA-32 and PPC is EasyRclm. If
+ this is undesirable, noeasyrclm can be specified to
+ force the adding of memory on IA-32 to ZONE_HIGHMEM
+ and to ZONE_DMA on PPC. This is desirable when the
+ EasyRclm zone is set up as a "soft" area for HugeTLB
+ pages to be allocated from to give the chance for
+ administrators to grow the reserved number of Huge
+ pages when the system has been running for some time.
+
noexec [IA-64]

noexec [IA-32,X86-64]

2006-02-17 14:19:09

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 6/7] Allow HugeTLB allocations to use ZONE_EASYRCLM


On ppc64 at least, a HugeTLB page is the same size as a memory section. Hence,
it causes no fragmentation worth caring about because a section can still
be offlined.

Once HugeTLB is allowed to use ZONE_EASYRCLM, the size of the zone becomes a
"soft" area where HugeTLB allocations may be satisfied. For example, take
a situation where a system administrator is not willing to reserve HugeTLB
pages at boot time. In this case, he can use kernelcore to size the EasyRclm
zone, which is still usable by normal processes. If a job starts that needs
HugeTLB pages, one could dd a file the size of physical memory, delete it
and have a good chance of getting a number of HugeTLB pages. To get all of
EasyRclm as HugeTLB pages, the ability to drain per-cpu pages is required.
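
The dd trick described above can be sketched as a small script. This is a
hedged sketch only: fill_mb is set to a tiny value here so it is safe to
run, whereas on a real system it would be sized to physical memory, and the
final hugepage-growing step is shown as a comment because it needs root and
a kernel carrying these patches.

```shell
# Fill and delete a file to turn anonymous/page-cache pages in EasyRclm
# into clean, easily reclaimed pages (tiny size here for safety).
fill_mb=1
dd if=/dev/zero of=/tmp/easyrclm-fill bs=1M count=$fill_mb 2>/dev/null
rm -f /tmp/easyrclm-fill
# On a real system, then attempt to grow the HugeTLB pool:
#   echo 32 > /proc/sys/vm/nr_hugepages
echo "done"
```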

Signed-off-by: Mel Gorman <[email protected]>
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-106_zonechoose/mm/hugetlb.c linux-2.6.16-rc3-mm1-107_hugetlb_use_easyrclm/mm/hugetlb.c
--- linux-2.6.16-rc3-mm1-106_zonechoose/mm/hugetlb.c 2006-02-16 09:50:44.000000000 +0000
+++ linux-2.6.16-rc3-mm1-107_hugetlb_use_easyrclm/mm/hugetlb.c 2006-02-17 09:44:58.000000000 +0000
@@ -49,7 +49,7 @@ static struct page *dequeue_huge_page(st

for (z = zonelist->zones; *z; z++) {
nid = (*z)->zone_pgdat->node_id;
- if (cpuset_zone_allowed(*z, GFP_HIGHUSER) &&
+ if (cpuset_zone_allowed(*z, GFP_RCLMUSER) &&
!list_empty(&hugepage_freelists[nid]))
break;
}
@@ -68,7 +68,7 @@ static int alloc_fresh_huge_page(void)
{
static int nid = 0;
struct page *page;
- page = alloc_pages_node(nid, GFP_HIGHUSER|__GFP_COMP|__GFP_NOWARN,
+ page = alloc_pages_node(nid, GFP_RCLMUSER|__GFP_COMP|__GFP_NOWARN,
HUGETLB_PAGE_ORDER);
nid = (nid + 1) % num_online_nodes();
if (page) {
diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-106_zonechoose/mm/mempolicy.c linux-2.6.16-rc3-mm1-107_hugetlb_use_easyrclm/mm/mempolicy.c
--- linux-2.6.16-rc3-mm1-106_zonechoose/mm/mempolicy.c 2006-02-16 09:50:44.000000000 +0000
+++ linux-2.6.16-rc3-mm1-107_hugetlb_use_easyrclm/mm/mempolicy.c 2006-02-17 09:44:58.000000000 +0000
@@ -1201,7 +1201,7 @@ struct zonelist *huge_zonelist(struct vm
unsigned nid;

nid = interleave_nid(pol, vma, addr, HPAGE_SHIFT);
- return NODE_DATA(nid)->node_zonelists + gfp_zone(GFP_HIGHUSER);
+ return NODE_DATA(nid)->node_zonelists + gfp_zone(GFP_RCLMUSER);
}
return zonelist_policy(GFP_HIGHUSER, pol);
}

2006-02-17 17:17:42

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Fri, 2006-02-17 at 14:17 +0000, Mel Gorman wrote:
> This patch adds the kernelcore= parameter for ppc64.
>
> The amount of memory requested will not be reserved in all nodes. The
> first node that is found that can accommodate the requested amount of memory
> and still have memory remaining for ZONE_EASYRCLM is used. If a node has memory holes,
> it also will not be used.

One thing I think we really need to see before these go into mainline is
the ability to shrink the ZONE_EASYRCLM at runtime, and give the memory
back to NORMAL/DMA.

Otherwise, any system starting off sufficiently small will end up having
lowmem starvation issues. Allowing resizing at least gives the admin a
chance to avoid those issues.

> + if (core_mem_pfn == 0 ||
> + end_pfn - start_pfn < core_mem_pfn ||
> + end_pfn - start_pfn != pages_present) {
> + zones_size[ZONE_DMA] = end_pfn - start_pfn;
> + zones_size[ZONE_EASYRCLM] = 0;
> + zholes_size[ZONE_DMA] =
> + zones_size[ZONE_DMA] - pages_present;
> + zholes_size[ZONE_EASYRCLM] = 0;
> + if (core_mem_pfn >= pages_present)
> + core_mem_pfn -= pages_present;
> + } else {
> + zones_size[ZONE_DMA] = core_mem_pfn;
> + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
> + zholes_size[ZONE_DMA] = 0;
> + zholes_size[ZONE_EASYRCLM] = 0;
> + core_mem_pfn = 0;
> + }

I'm finding this bit of code really hard to parse.

First of all, please give "core_mem_size" and "core_mem_pfn" some better
names. "core_mem_size" in _what_? Bytes? Pages? g0ats? ;)

The "pfn" in "core_mem_pfn" is usually used to denote a physical address
>> PAGE_SHIFT. However, yours is actually a _number_ of pages, not an
address, right? Actually, as I look at it closer, it appears to be a
pfn in the else{} and a nr_page in the if{} block.

core_mem_nr_pages or nr_core_mem_pages might be more appropriate.

Users will _not_ care about memory holes. They'll just want to specify
a number of pages. I think this:

> + zones_size[ZONE_DMA] = core_mem_pfn;
> + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;

is probably bogus because it doesn't deal with holes at all.

Walking those init_node_data() structures in get_region() is probably
pretty darn fast, and we don't need to be careful about how many times
we do it. I think I'd probably separate out the problem a bit.

1. make get_region() not care about holes. Have it just return the
range of the node's pages.
2. make a new function (get_region_holes()??) that, given a pfn range,
walks the init_node_data[] just like get_region() (have them share
code) and return the present_pages in that pfn range.
3. go back to paging init, and try to properly size ZONE_DMA. Find
holes with your new function, and increase its size proportionately,
set zholes_size[ZONE_DMA] at this time. Make sure the user size is
in nr_page, _NOT_ max_pfns.
4. give the rest of the space to ZONE_EASYRCLM. Call your new function
to properly size its zone hole(s).
5. Profit!

This may all belong broken out in a new function.
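
Steps 1 and 2 of the plan above can be sketched in a few lines of Python.
This is a model only: a flat list of (nid, start_pfn, end_pfn) tuples
stands in for init_node_data[], whose real layout differs, and the
function names are taken from the proposal.

```python
# Model of init_node_data[]: (nid, start_pfn, end_pfn) segments; holes are
# implied by gaps between segments belonging to the same node.
SEGMENTS = [(0, 0, 100), (0, 150, 300), (1, 300, 400)]

def get_region(nid):
    """Step 1: return the node's full pfn span, ignoring holes."""
    spans = [(s, e) for n, s, e in SEGMENTS if n == nid]
    return min(s for s, e in spans), max(e for s, e in spans)

def get_region_holes(nid, start_pfn, end_pfn):
    """Step 2: walk the same segment list and count the pages actually
    present inside [start_pfn, end_pfn)."""
    present = 0
    for n, s, e in SEGMENTS:
        if n != nid:
            continue
        lo, hi = max(s, start_pfn), min(e, end_pfn)
        if lo < hi:
            present += hi - lo
    return present

start, end = get_region(0)                 # (0, 300): span includes the hole
present = get_region_holes(0, start, end)  # 250 present pages
hole = (end - start) - present             # 50-page hole at pfns 100..150
```

zholes_size then falls out naturally as (end - start) - present for
whatever pfn range each zone is given.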

-- Dave

2006-02-17 19:04:18

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Fri, 17 Feb 2006, Dave Hansen wrote:

> On Fri, 2006-02-17 at 14:17 +0000, Mel Gorman wrote:
> > This patch adds the kernelcore= parameter for ppc64.
> >
> > The amount of memory requested will not be reserved in all nodes. The
> > first node that is found that can accommodate the requested amount of memory
> > and still have memory remaining for ZONE_EASYRCLM is used. If a node has memory holes,
> > it also will not be used.
>
> One thing I think we really need to see before these go into mainline is
> the ability to shrink the ZONE_EASYRCLM at runtime, and give the memory
> back to NORMAL/DMA.
>

I consider this to be highly desirable, but I am not convinced it is a
prerequisite for zone-based-anti-frag because it is a problem that will
only affect admins specifying kernelcore= - i.e. a limited number of
people that care about getting more HugeTLB pages at runtime or removing
memory.

> Otherwise, any system starting off sufficiently small will end up having
> lowmem starvation issues. Allowing resizing at least gives the admin a
> chance to avoid those issues.
>

The lowmem starvation issue is an existing problem for 32 bit x86 systems.
The pain is that an admin of a ppc64 machine that previously did not care
about lowmem starvation will have to start caring when kernelcore= is
used. Again, this is a new problem for ppc64 and only exists if you care
about hotplug-remove or having a soft area for HugeTLB pages.

If I think that zone-based anti-frag has a chance of getting into
mainline, I can start tackling the lowmem starvation issue as two separate
problems

1. There needs to be an ability to measure pressure on lowmem at runtime to
help an admin decide if memory needs to be moved around or not. This would
have a secondary benefit for existing 32 bit x86 machines that need to
know that it is lowmem starvation, not lack of memory, that is affecting
their loads and that an upgrade to a 64 bit machine may be a good plan.

2. The ability to hot-add to a specified zone. When pressure is detected,
an admin would have the option to hot-remove from the EasyRclm area and
add the same memory back to the DMA/Normal zone. This will only work at a
pretty coarse granularity but it would be enough.

I think that problem 1 already exists and is only tackled on the ppc64
with the introduction of EasyRclm. Problem 2 is new, but only exists when
the admin wants to remove memory. They need to be solved, but not
necessarily before EasyRclm is used.
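
As a rough userspace illustration of point 1 — an assumption-laden sketch,
not an existing interface: the field names mirror the /proc/zoneinfo layout
of this era, and the free/high ratio is just one possible pressure metric:

```python
import re

# Abbreviated /proc/zoneinfo-style sample, trimmed to the fields used here.
SAMPLE = """\
Node 0, zone      DMA
  pages free     1234
        min      32
        low      40
        high     48
Node 0, zone   Normal
  pages free     300
        min      255
        low      318
        high     382
"""

def lowmem_pressure(zoneinfo_text):
    """Return {zone_name: free/high ratio}.  A ratio near or below 1.0
    means the zone is close to its reclaim watermarks, i.e. under
    pressure; well above 1.0 means it has plenty of free pages."""
    pressure = {}
    for block in re.split(r"Node \d+, zone\s+", zoneinfo_text)[1:]:
        zone = block.splitlines()[0].strip()
        fields = dict(re.findall(r"(free|min|low|high)\s+(\d+)", block))
        pressure[zone] = int(fields["free"]) / float(fields["high"])
    return pressure
```

On a live system the text would come from reading /proc/zoneinfo; a ratio
trending toward 1.0 in DMA/Normal while EasyRclm stays high is the kind of
signal that would prompt the memory-moving step in point 2.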

> > + if (core_mem_pfn == 0 ||
> > + end_pfn - start_pfn < core_mem_pfn ||
> > + end_pfn - start_pfn != pages_present) {
> > + zones_size[ZONE_DMA] = end_pfn - start_pfn;
> > + zones_size[ZONE_EASYRCLM] = 0;
> > + zholes_size[ZONE_DMA] =
> > + zones_size[ZONE_DMA] - pages_present;
> > + zholes_size[ZONE_EASYRCLM] = 0;
> > + if (core_mem_pfn >= pages_present)
> > + core_mem_pfn -= pages_present;
> > + } else {
> > + zones_size[ZONE_DMA] = core_mem_pfn;
> > + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
> > + zholes_size[ZONE_DMA] = 0;
> > + zholes_size[ZONE_EASYRCLM] = 0;
> > + core_mem_pfn = 0;
> > + }
>
> I'm finding this bit of code really hard to parse.
>
> First of all, please give "core_mem_size" and "core_mem_pfn" some better
> names. "core_mem_size" in _what_? Bytes? Pages? g0ats? ;)
>

bytes, but I take your point. This could be clearer.

> The "pfn" in "core_mem_pfn" is usually used to denote a physical address
> >> PAGE_SHIFT. However, yours is actually a _number_ of pages, not an
> address, right? Actually, as I look at it closer, it appears to be a
> pfn in the else{} and a nr_page in the if{} block.
>

It works out as a number of pages but it really should be the "pfn after
start_pfn at which kernel memory ends and easyrclm begins". I'll rework
the code.

> core_mem_nr_pages or nr_core_mem_pages might be more appropriate.
>

It would.

> Users will _not_ care about memory holes. They'll just want to specify
> a number of pages. I think this:
>
> > + zones_size[ZONE_DMA] = core_mem_pfn;
> > + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
>
> is probably bogus because it doesn't deal with holes at all.
>

In this patch, if a region has holes in it, kernelcore is ignored because
holes would not be dealt with correctly. The check is made above with
end_pfn - start_pfn != pages_present

> Walking those init_node_data() structures in get_region() is probably
> pretty darn fast, and we don't need to be careful about how many times
> we do it. I think I'd probably separate out the problem a bit.
>
> 1. make get_region() not care about holes. Have it just return the
> range of the node's pages.
> 2. make a new function (get_region_holes()??) that, given a pfn range,
> walks the init_node_data[] just like get_region() (have them share
> code) and return the present_pages in that pfn range.
> 3. go back to paging init, and try to properly size ZONE_DMA. Find
> holes with your new function, and increase its size proportionately,
> set zholes_size[ZONE_DMA] at this time. Make sure the user size is
> in nr_page, _NOT_ max_pfns.
> 4. give the rest of the space to ZONE_EASYRCLM. Call your new function
> to properly size its zone hole(s).
> 5. Profit!
>
> This may all belong broken out in a new function.
>

Sounds reasonable enough although, as you say, it'll need a new function
at the very least. I'll start hitting it and see what I come up with.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-02-17 19:18:29

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Fri, 2006-02-17 at 19:03 +0000, Mel Gorman wrote:
> On Fri, 17 Feb 2006, Dave Hansen wrote:
> > One thing I think we really need to see before these go into mainline is
> > the ability to shrink the ZONE_EASYRCLM at runtime, and give the memory
> > back to NORMAL/DMA.
>
> I consider this to be highly desirable, but I am not convinced it is a
> prerequisite for zone-based-anti-frag because it is a problem that will
> only affect admins specifying kernelcore= - i.e. a limited number of
> people that care about getting more HugeTLB pages at runtime or removing
> memory.

Fair enough. I guess it certainly shrinks the set. But, I _really_
think we need it before it gets into the hands of too many customers. I
hate to tell people to reboot boxes, and I hate adding boot options.

If people start using kernelcore=, we're stuck with it for a long time.
Why not just make the system flexible from the beginning? I hate
introducing hacky boot options only to have them removed 2 kernel
revisions later when we fix it properly.

Maybe the kernelcore= patch is a good candidate to stay in -mm during
the transition time, but not ever follow along into mainline. Dunno.

> If I think that zone-based anti-frag has a chance of getting into
> mainline, I can start tackling the lowmem starvation issue as two separate
> problems
>
> 1. There needs to be an ability to measure pressure on lowmem at runtime to
> help an admin decide if memory needs to be moved around or not. This would
> have a secondary benefit for existing 32 bit x86 machines that need to
> know that it is lowmem starvation, not lack of memory, that is affecting
> their loads and that an upgrade to a 64 bit machine may be a good plan.
>
> 2. The ability to hot-add to a specified zone. When pressure is detected,
> an admin would have the option to hot-remove from the EasyRclm area and
> add the same memory back to the DMA/Normal zone. This will only work at a
> pretty coarse granularity but it would be enough.

But, if you get the EasyRclm to DMA transition working properly, this
problem goes away. So, if you solve a problem in the kernel, the user
gets fewer ways to screw up, _and_ you have less code to deal with that
part of the user interface.

> > Users will _not_ care about memory holes. They'll just want to specify
> > a number of pages. I think this:
> >
> > > + zones_size[ZONE_DMA] = core_mem_pfn;
> > > + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
> >
> > is probably bogus because it doesn't deal with holes at all.
> >
>
> In this patch, if a region has holes in it, kernelcore is ignored because
> holes would not be dealt with correctly. The check is made above with
> end_pfn - start_pfn != pages_present

I missed that. Is that my fault for not looking closely enough, or the
patch's fault for being a bit obtuse? ;)

I think it is pretty bogus to just punt when it sees a memory hole.
They really need to be dealt with properly. Otherwise, you'll never
hear about it until a customer complains that kernelcore= isn't working
and they can't remove memory or use hugetlb pages on two "identical"
systems.

-- Dave

2006-02-17 19:37:50

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Fri, 17 Feb 2006, Dave Hansen wrote:

> On Fri, 2006-02-17 at 19:03 +0000, Mel Gorman wrote:
> > On Fri, 17 Feb 2006, Dave Hansen wrote:
> > > One thing I think we really need to see before these go into mainline is
> > > the ability to shrink the ZONE_EASYRCLM at runtime, and give the memory
> > > back to NORMAL/DMA.
> >
> > I consider this to be highly desirable, but I am not convinced it is a
> > prerequisite for zone-based-anti-frag because it is a problem that will
> > only affect admins specifying kernelcore= - i.e. a limited number of
> > people that care about getting more HugeTLB pages at runtime or removing
> > memory.
>
> Fair enough. I guess it certainly shrinks the set. But, I _really_
> think we need it before it gets into the hands of too many customers. I
> hate to tell people to reboot boxes, and I hate adding boot options.
>

Ok, hard to argue with that logic but we knew we were going to be dealing
with some sort of tunable when zone-based-anti-frag was first discussed.

> If people start using kernelcore=, we're stuck with it for a long time.
> Why not just make the system flexible from the beginning? I hate
> introducing hacky boot options only to have them removed 2 kernel
> revisions later when we fix it properly.
>
> Maybe the kernelcore= patch is a good candidate to stay in -mm during
> the transition time, but not ever follow along into mainline. Dunno.
>

I would like to see this in -mm getting a trial while I work on the moving
pages from EasyRclm to DMA/HighMem problem. I feel that there is a logical
progression from kernelcore= boot option to a sysctl (or something
similar) that is adjustable at runtime. When the code exists to size the
zone at runtime, it can be built on top of the majority of this patchset.
They are not mutually exclusive.

Before even trying to get into -mm though, I should fix the
dealing-with-memory-holes-on-ppc first (more below).

> > If I think that zone-based anti-frag has a chance of getting into
> > mainline, I can start tackling the lowmem starvation issue as two separate
> > problems
> >
> > 1. There needs to be an ability to measure pressure on lowmem at runtime to
> > help an admin decide if memory needs to be moved around or not. This would
> > have a secondary benefit for existing 32 bit x86 machines that need to
> > know that it is lowmem starvation, not lack of memory, that is affecting
> > their loads and that an upgrade to a 64 bit machine may be a good plan.
> >
> > 2. The ability to hot-add to a specified zone. When pressure is detected,
> > an admin would have the option to hot-remove from the EasyRclm area and
> > add the same memory back to the DMA/Normal zone. This will only work at a
> > pretty coarse granularity but it would be enough.
>
> But, if you get the EasyRclm to DMA transition working properly, this
> problem goes away. So, if you solve a problem in the kernel, the user
> gets fewer ways to screw up, _and_ you have less code to deal with that
> part of the user interface.
>

Well, they'll have different ways to screw up. Once the EasyRclm zone can
be resized, they might decide to migrate it all to DMA and then cannot
hotplug anything, putting them back at square one. I need to think about
this a bit more to see how it'll work in practice.

> > > Users will _not_ care about memory holes. They'll just want to specify
> > > a number of pages. I think this:
> > >
> > > > + zones_size[ZONE_DMA] = core_mem_pfn;
> > > > + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
> > >
> > > is probably bogus because it doesn't deal with holes at all.
> > >
> >
> > In this patch, if a region has holes in it, kernelcore is ignored because
> > holes would not be dealt with correctly. The check is made above with
> > end_pfn - start_pfn != pages_present
>
> I missed that. Is that my fault for not looking closely enough, or the
> patch's fault for being a bit obtuse? ;)
>

c) all of the above. This section of the patch is not winning any prizes
for clarity.

> I think it is pretty bogus to just punt when it sees a memory hole.
> They really need to be dealt with properly.

Your previous mail about having a function similar to get_region() that
returns the number of present pages in a pfn range feels like the right
approach. I'll work on this before touching zone resizing and post another
version.

> Otherwise, you'll never
> hear about it until a customer complains that kernelcore= isn't working
> and they can't remove memory or use hugetlb pages on two "identical"
> systems.
>

Bah, you're right, I'm just not too keen on admitting it on a Friday
evening :).

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-02-17 21:32:41

by Joel Schopp

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

>> This patch adds the kernelcore= parameter for ppc64.
>>
>> The amount of memory will requested will not be reserved in all nodes. The
>> first node that is found that can accomodate the requested amount of memory
>> and have remaining more for ZONE_EASYRCLM is used. If a node has memory holes,
>> it also will not be used.
>
> One thing I think we really need to see before these go into mainline is
> the ability to shrink the ZONE_EASYRCLM at runtime, and give the memory
> back to NORMAL/DMA.
>
> Otherwise, any system starting off sufficiently small will end up having
> lowmem starvation issues. Allowing resizing at least gives the admin a
> chance to avoid those issues.
>

I'm not too keen on calling it resizing, because that term is
misleading. The resizing is one way. You can't later resize back. It's
like a window that you can only close but never reopen. We should call
it "runtime incremental disabling", or RID.

I don't think we need RID in order to merge these patches. RID can be
merged later if people decide they want a special easy reclaim zone that
could disappear at any moment. I personally fall in the camp of wanting
my zones I explicitly enabled to stay put and am opposed to RID.

If only somebody had presented a solution that was flexible enough to
dynamically resize reclaimable and non-reclaimable both ways.
http://sourceforge.net/mailarchive/message.php?msg_id=13864331




2006-02-21 14:52:08

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Fri, 17 Feb 2006, Dave Hansen wrote:

> On Fri, 2006-02-17 at 14:17 +0000, Mel Gorman wrote:
>> This patch adds the kernelcore= parameter for ppc64.
>>
>> The amount of memory requested will not be reserved in all nodes. The
>> first node that is found that can accommodate the requested amount of memory
>> and still have memory remaining for ZONE_EASYRCLM is used. If a node has memory holes,
>> it also will not be used.
>
> One thing I think we really need to see before these go into mainline is
> the ability to shrink the ZONE_EASYRCLM at runtime, and give the memory
> back to NORMAL/DMA.
>
> Otherwise, any system starting off sufficiently small will end up having
> lowmem starvation issues. Allowing resizing at least gives the admin a
> chance to avoid those issues.
>
>> + if (core_mem_pfn == 0 ||
>> + end_pfn - start_pfn < core_mem_pfn ||
>> + end_pfn - start_pfn != pages_present) {
>> + zones_size[ZONE_DMA] = end_pfn - start_pfn;
>> + zones_size[ZONE_EASYRCLM] = 0;
>> + zholes_size[ZONE_DMA] =
>> + zones_size[ZONE_DMA] - pages_present;
>> + zholes_size[ZONE_EASYRCLM] = 0;
>> + if (core_mem_pfn >= pages_present)
>> + core_mem_pfn -= pages_present;
>> + } else {
>> + zones_size[ZONE_DMA] = core_mem_pfn;
>> + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
>> + zholes_size[ZONE_DMA] = 0;
>> + zholes_size[ZONE_EASYRCLM] = 0;
>> + core_mem_pfn = 0;
>> + }
>
> I'm finding this bit of code really hard to parse.
>
> First of all, please give "core_mem_size" and "core_mem_pfn" some better
> names. "core_mem_size" in _what_? Bytes? Pages? g0ats? ;)
>
> The "pfn" in "core_mem_pfn" is usually used to denote a physical address
>>> PAGE_SHIFT. However, yours is actually a _number_ of pages, not an
> address, right? Actually, as I look at it closer, it appears to be a
> pfn in the else{} and a nr_page in the if{} block.
>
> core_mem_nr_pages or nr_core_mem_pages might be more appropriate.
>
> Users will _not_ care about memory holes. They'll just want to specify
> a number of pages. I think this:
>
>> + zones_size[ZONE_DMA] = core_mem_pfn;
>> + zones_size[ZONE_EASYRCLM] = end_pfn - core_mem_pfn;
>
> is probably bogus because it doesn't deal with holes at all.
>
> Walking those init_node_data() structures in get_region() is probably
> pretty darn fast, and we don't need to be careful about how many times
> we do it. I think I'd probably separate out the problem a bit.
>
> 1. make get_region() not care about holes. Have it just return the
> range of the node's pages.
> 2. make a new function (get_region_holes()??) that, given a pfn range,
> walks the init_node_data[] just like get_region() (have them share
> code) and return the present_pages in that pfn range.
> 3. go back to paging init, and try to properly size ZONE_DMA. Find
> holes with your new function, and increase its size proportionately,
> set zholes_size[ZONE_DMA] at this time. Make sure the user size is
> in nr_page, _NOT_ max_pfns.
> 4. give the rest of the space to ZONE_EASYRCLM. Call your new function
> to properly size its zone hole(s).
> 5. Profit!
>
> This may all belong broken out in a new function.
>

A new release of patches is a long time away but here is an early draft of
what the above currently looks like. Is this more or less what you were
thinking?

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c
--- linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c 2006-02-16 09:50:42.000000000 +0000
+++ linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c 2006-02-21 11:45:38.000000000 +0000
@@ -17,10 +17,12 @@
#include <linux/nodemask.h>
#include <linux/cpu.h>
#include <linux/notifier.h>
+#include <linux/sort.h>
#include <asm/sparsemem.h>
#include <asm/lmb.h>
#include <asm/system.h>
#include <asm/smp.h>
+#include <asm/machdep.h>

static int numa_enabled = 1;

@@ -101,22 +103,41 @@ void __init add_region(unsigned int nid,
init_node_data[i].nid = nid;
}

+/* Compare two elements in init_node_data. Assume start_pfn is at start */
+static int cmp_init_node_data(const void *a, const void *b)
+{
+ return *(unsigned long *)a - *(unsigned long *)b;
+}
+
+/*
+ * init_node_data is not necessarily in pfn order, making it difficult to
+ * determine where the EasyRclm should begin if it is requested. This sorts
+ * init_node_data by start_pfn
+ */
+void __init sort_regions(void)
+{
+ size_t num = 0;
+ size_t size_element = sizeof(init_node_data) / MAX_REGIONS;
+
+ while (init_node_data[num].end_pfn)
+ num++;
+
+ sort(init_node_data, num, size_element, cmp_init_node_data, NULL);
+}
+
/* We assume init_node_data has no overlapping regions */
void __init get_region(unsigned int nid, unsigned long *start_pfn,
- unsigned long *end_pfn, unsigned long *pages_present)
+ unsigned long *end_pfn)
{
unsigned int i;

*start_pfn = -1UL;
- *end_pfn = *pages_present = 0;
+ *end_pfn = 0;

for (i = 0; init_node_data[i].end_pfn; i++) {
if (init_node_data[i].nid != nid)
continue;

- *pages_present += init_node_data[i].end_pfn -
- init_node_data[i].start_pfn;

if (init_node_data[i].start_pfn < *start_pfn)
*start_pfn = init_node_data[i].start_pfn;

@@ -129,6 +150,108 @@ void __init get_region(unsigned int nid,
*start_pfn = 0;
}

+/*
+ * Given a pfn range and the required number of pages for the kernel, return
+ * the zone of the DMA and EASYRCLM zones and any holes that are contained
+ */
+void __init get_zones_info(unsigned int nid,
+ unsigned long start_pfn,
+ unsigned long end_region_pfn,
+ unsigned long kernelcore_pages,
+ unsigned long *zones_size,
+ unsigned long *zholes_size)
+{
+ unsigned long end_kernelcore_pfn = 0;
+ unsigned long pages_present_kernel = 0;
+ unsigned long pages_present = 0;
+ unsigned int i;
+
+ dbg("get_zones_info: nid: %u kernelcore_pages: %lu range %lu -> %lu\n",
+ nid,
+ kernelcore_pages,
+ start_pfn,
+ end_region_pfn);
+
+ /* Handle the case where this region is totally empty */
+ if (end_region_pfn == start_pfn) {
+ dbg("get_zones_info: region is empty\n");
+ zones_size[ZONE_DMA] = 0;
+ zones_size[ZONE_EASYRCLM] = 0;
+ zholes_size[ZONE_DMA] = 0;
+ zholes_size[ZONE_EASYRCLM] = 0;
+ return;
+ }
+
+ /*
+ * If kernelcore_pages were not required, make sure that this whole
+ * region will be used for ZONE_DMA
+ */
+ dbg("get_zones_info: kernelcore_pages = %lu\n", kernelcore_pages);
+ if (kernelcore_pages == 0) {
+ dbg("get_zones_info: kernel will use whole region\n");
+ kernelcore_pages = end_region_pfn - start_pfn;
+ end_kernelcore_pfn = end_region_pfn;
+ }
+
+ /*
+ * Walk through node data to find a pfn where kernelcore_pages will
+ * be in the range start_pfn -> kernelcore_pfn .
+ */
+ for (i = 0; init_node_data[i].end_pfn; i++) {
+ unsigned long segment_start_pfn, segment_end_pfn;
+ unsigned long local_pages_present;
+
+ /*
+ * If this segment does not contain pages in the required
+ * pfn range, skip it
+ */
+ if (init_node_data[i].nid != nid)
+ continue;
+ if (start_pfn >= init_node_data[i].end_pfn)
+ continue;
+ if (end_region_pfn <= init_node_data[i].start_pfn)
+ continue;
+
+ /* Find what portion of the segment is of interest */
+ segment_start_pfn = init_node_data[i].start_pfn;
+ segment_end_pfn = init_node_data[i].end_pfn;
+ if (segment_start_pfn < start_pfn)
+ segment_start_pfn = start_pfn;
+ if (segment_end_pfn > end_region_pfn)
+ segment_end_pfn = end_region_pfn;
+
+ /*
+ * Update pages_present and see if we have reached the end
+ * of the kernelcore portion of the region
+ */
+ local_pages_present = segment_end_pfn - segment_start_pfn;
+ if (local_pages_present > kernelcore_pages) {
+ end_kernelcore_pfn =
+ segment_start_pfn + kernelcore_pages;
+ pages_present_kernel =
+ pages_present + kernelcore_pages;
+ kernelcore_pages = 0;
+ }
+
+ pages_present += local_pages_present;
+ if (kernelcore_pages) {
+ kernelcore_pages -= pages_present;
+ pages_present_kernel += pages_present;
+ }
+ }
+
+ /* Set the zones value */
+ dbg("get_zones_info: nid %d ZONE_DMA: %lu -> %lu\n", nid,
+ start_pfn, end_kernelcore_pfn);
+ dbg("get_zones_info: nid %d ZONE_EASYRCLM: %lu -> %lu\n", nid,
+ end_kernelcore_pfn, end_region_pfn);
+ zones_size[ZONE_DMA] = end_kernelcore_pfn - start_pfn;
+ zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] - pages_present_kernel;
+ zones_size[ZONE_EASYRCLM] = end_region_pfn - end_kernelcore_pfn;
+ zholes_size[ZONE_EASYRCLM] = zones_size[ZONE_EASYRCLM] -
+ (pages_present - pages_present_kernel);
+}
+
static inline void map_cpu_to_node(int cpu, int node)
{
numa_cpu_lookup_table[cpu] = node;
@@ -622,11 +745,11 @@ void __init do_init_bootmem(void)
register_cpu_notifier(&ppc64_numa_nb);

for_each_online_node(nid) {
- unsigned long start_pfn, end_pfn, pages_present;
+ unsigned long start_pfn, end_pfn;
unsigned long bootmem_paddr;
unsigned long bootmap_pages;

- get_region(nid, &start_pfn, &end_pfn, &pages_present);
+ get_region(nid, &start_pfn, &end_pfn);

/* Allocate the node structure node local if possible */
NODE_DATA(nid) = careful_allocation(nid,
@@ -721,21 +844,39 @@ void __init paging_init(void)
{
unsigned long zones_size[MAX_NR_ZONES];
unsigned long zholes_size[MAX_NR_ZONES];
+ unsigned long kernelcore_pages = 0;
int nid;
+ char *opt;

memset(zones_size, 0, sizeof(zones_size));
memset(zholes_size, 0, sizeof(zholes_size));

- for_each_online_node(nid) {
- unsigned long start_pfn, end_pfn, pages_present;
+ /* Check if ZONE_EASYRCLM should be populated */
+ opt = strstr(cmd_line, "kernelcore=");
+ if (opt) {
+ opt += 11;
+ kernelcore_pages =
+ memparse(opt, &opt) >> PAGE_SHIFT;
+ }

- get_region(nid, &start_pfn, &end_pfn, &pages_present);
+ sort_regions();
+ for_each_online_node(nid) {
+ unsigned long start_pfn, end_region_pfn;

- zones_size[ZONE_DMA] = end_pfn - start_pfn;
- zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] - pages_present;
+ get_region(nid, &start_pfn, &end_region_pfn);

- dbg("free_area_init node %d %lx %lx (hole: %lx)\n", nid,
- zones_size[ZONE_DMA], start_pfn, zholes_size[ZONE_DMA]);
+ get_zones_info(nid, start_pfn, end_region_pfn,
+ kernelcore_pages,
+ &zones_size[0],
+ &zholes_size[0]);
+
+ dbg("free_area_init DMA node %d %lx %lx (hole: %lx)\n",
+ nid, zones_size[ZONE_DMA],
+ start_pfn, zholes_size[ZONE_DMA]);
+
+ dbg("free_area_init EasyRclm node %d %lx %lx (hole: %lx)\n",
+ nid, zones_size[ZONE_EASYRCLM],
+ start_pfn, zholes_size[ZONE_EASYRCLM]);

free_area_init_node(nid, NODE_DATA(nid), zones_size, start_pfn,
zholes_size);

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-02-21 17:36:05

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Tue, 2006-02-21 at 14:51 +0000, Mel Gorman wrote:
> A new release of patches is a long time away but here is an early draft of
> what the above currently looks like. Is this more or less what you were
> thinking?

I think it may be a bit harder to understand than even the other
one. :(

In a nutshell, get_zones_info() tries to do too much. Six function
arguments should be a big, red, warning light that something is really
wrong. Calling a function _info() is another bad sign. It means that
you can't discretely describe what it does.

Remember, there are 3 distinct tasks here:

1. size the node information from init_node_data[]
2. size the easy reclaim zone based on the boot parameters
3. take holes into account when doing the reclaim zone sizing

You don't have to do all of those tasks in one pass. This is not a
performance-critical path, so try to be as clear as possible, even if it
means an extra run or two through the data.

Maybe something like this?

ZONE_DMA = all memory in node
if (kernelcore was set)
while (zone size with holes of ZONE_DMA > kernelcore)
move memory into EASYRCLM
zholes[DMA] = ...
zholes[EASYRCLM] = ...
free_area_init_node()

Those are all operations I can understand.

-- Dave

2006-02-22 16:44:17

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Tue, 21 Feb 2006, Dave Hansen wrote:

> On Tue, 2006-02-21 at 14:51 +0000, Mel Gorman wrote:
>> A new release of patches is a long time away but here is an early draft of
>> what the above currently looks like. Is this more or less what you were
>> thinking?
>
> I think it may be a bit harder to understand than even the other
> one. :(
>

:/

> In a nutshell, get_zones_info() tries to do too much. Six function
> arguments should be a big, red, warning light that something is really
> wrong. Calling a function _info() is another bad sign. It means that
> you can't discretely describe what it does.
>
> Remember, there are 3 distinct tasks here:
>
> 1. size the node information from init_node_data[]
> 2. size the easy reclaim zone based on the boot parameters
> 3. take holes into account when doing the reclaim zone sizing
>

right

*some pounding on the keyboard*

Is this a bit clearer? It's built and boot tested on one ppc64 machine. I
am having trouble finding a ppc64 machine that *has* memory holes to be
100% sure it's ok.

diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c
--- linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c 2006-02-16 09:50:42.000000000 +0000
+++ linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c 2006-02-22 16:07:35.000000000 +0000
@@ -17,10 +17,12 @@
#include <linux/nodemask.h>
#include <linux/cpu.h>
#include <linux/notifier.h>
+#include <linux/sort.h>
#include <asm/sparsemem.h>
#include <asm/lmb.h>
#include <asm/system.h>
#include <asm/smp.h>
+#include <asm/machdep.h>

static int numa_enabled = 1;

@@ -101,22 +103,41 @@ void __init add_region(unsigned int nid,
init_node_data[i].nid = nid;
}

+/* Compare two elements in init_node_data. Assume start_pfn is at start */
+static int cmp_init_node_data(const void *a, const void *b)
+{
+ return (*(unsigned long *)a > *(unsigned long *)b) - (*(unsigned long *)a < *(unsigned long *)b);
+}
+
+/*
+ * init_node_data is not necessarily in pfn order, making it difficult to
+ * determine where the EasyRclm should begin if it is requested. This sorts
+ * init_node_data by start_pfn
+ */
+void __init sort_regions(void)
+{
+ size_t num = 0;
+ size_t size_element = sizeof(init_node_data) / MAX_REGIONS;
+
+ while (init_node_data[num].end_pfn)
+ num++;
+
+ sort(init_node_data, num, size_element, cmp_init_node_data, NULL);
+}
+
/* We assume init_node_data has no overlapping regions */
void __init get_region(unsigned int nid, unsigned long *start_pfn,
- unsigned long *end_pfn, unsigned long *pages_present)
+ unsigned long *end_pfn)
{
unsigned int i;

*start_pfn = -1UL;
- *end_pfn = *pages_present = 0;
+ *end_pfn = 0;

for (i = 0; init_node_data[i].end_pfn; i++) {
if (init_node_data[i].nid != nid)
continue;

- *pages_present += init_node_data[i].end_pfn -
- init_node_data[i].start_pfn;
-
if (init_node_data[i].start_pfn < *start_pfn)
*start_pfn = init_node_data[i].start_pfn;

@@ -129,6 +150,88 @@ void __init get_region(unsigned int nid,
*start_pfn = 0;
}

+/* Initialise the size of each zone in a node */
+void __init zone_sizes_init(unsigned int nid,
+ unsigned long kernelcore_pages,
+ unsigned long *zones_size)
+{
+ unsigned int i;
+ unsigned long pages_present = 0;
+
+ /* Get the number of present pages in the node */
+ for (i = 0; init_node_data[i].end_pfn; i++) {
+ if (init_node_data[i].nid != nid)
+ continue;
+
+ pages_present += init_node_data[i].end_pfn -
+ init_node_data[i].start_pfn;
+ }
+
+ if (kernelcore_pages && kernelcore_pages < pages_present) {
+ zones_size[ZONE_DMA] = kernelcore_pages;
+ zones_size[ZONE_EASYRCLM] = pages_present - kernelcore_pages;
+ } else {
+ zones_size[ZONE_DMA] = pages_present;
+ zones_size[ZONE_EASYRCLM] = 0;
+ }
+}
+
+void __init get_zholes_size(unsigned int nid, unsigned long *zones_size,
+ unsigned long *zholes_size) {
+ unsigned int i = 0;
+ unsigned int start_easyrclm_pfn;
+ unsigned long last_end_pfn, first;
+
+ /* Find where the PFN of the end of DMA is */
+ unsigned long pages_count = zones_size[ZONE_DMA];
+ for (i = 0; init_node_data[i].end_pfn; i++) {
+ unsigned long segment_size;
+ if (init_node_data[i].nid != nid)
+ continue;
+
+ /*
+ * Check if the end of ZONE_DMA is in this segment of the
+ * init_node_data
+ */
+ segment_size = init_node_data[i].end_pfn -
+ init_node_data[i].start_pfn;
+ if (pages_count > segment_size) {
+ /* End of ZONE_DMA is not here, move on */
+ pages_count -= segment_size;
+ continue;
+ }
+
+ /* End of ZONE_DMA is here */
+ start_easyrclm_pfn = init_node_data[i].start_pfn + pages_count;
+ break;
+ }
+
+ /* Walk the map again and get the size of the holes */
+ first = 1;
+ zholes_size[ZONE_DMA] = 0;
+ zholes_size[ZONE_EASYRCLM] = 0;
+ for (i = 1; init_node_data[i].end_pfn; i++) {
+ unsigned long hole_size;
+ if (init_node_data[i].nid != nid)
+ continue;
+
+ if (first) {
+ last_end_pfn = init_node_data[i].end_pfn;
+ first = 0;
+ continue;
+ }
+
+ /* Hole found */
+ hole_size = init_node_data[i].start_pfn - last_end_pfn;
+ if (init_node_data[i].start_pfn < start_easyrclm_pfn) {
+ zholes_size[ZONE_DMA] += hole_size;
+ } else {
+ zholes_size[ZONE_EASYRCLM] += hole_size;
+ }
+ last_end_pfn = init_node_data[i].end_pfn;
+ }
+}
+
static inline void map_cpu_to_node(int cpu, int node)
{
numa_cpu_lookup_table[cpu] = node;
@@ -622,11 +725,11 @@ void __init do_init_bootmem(void)
register_cpu_notifier(&ppc64_numa_nb);

for_each_online_node(nid) {
- unsigned long start_pfn, end_pfn, pages_present;
+ unsigned long start_pfn, end_pfn;
unsigned long bootmem_paddr;
unsigned long bootmap_pages;

- get_region(nid, &start_pfn, &end_pfn, &pages_present);
+ get_region(nid, &start_pfn, &end_pfn);

/* Allocate the node structure node local if possible */
NODE_DATA(nid) = careful_allocation(nid,
@@ -721,21 +824,36 @@ void __init paging_init(void)
{
unsigned long zones_size[MAX_NR_ZONES];
unsigned long zholes_size[MAX_NR_ZONES];
+ unsigned long kernelcore_pages = 0;
int nid;
+ char *opt;

memset(zones_size, 0, sizeof(zones_size));
memset(zholes_size, 0, sizeof(zholes_size));

- for_each_online_node(nid) {
- unsigned long start_pfn, end_pfn, pages_present;
-
- get_region(nid, &start_pfn, &end_pfn, &pages_present);
+ /* Check if ZONE_EASYRCLM should be populated */
+ opt = strstr(cmd_line, "kernelcore=");
+ if (opt) {
+ opt += 11;
+ kernelcore_pages =
+ memparse(opt, &opt) >> PAGE_SHIFT;
+ }

- zones_size[ZONE_DMA] = end_pfn - start_pfn;
- zholes_size[ZONE_DMA] = zones_size[ZONE_DMA] - pages_present;
+ sort_regions();
+ for_each_online_node(nid) {
+ unsigned long start_pfn, end_region_pfn;

- dbg("free_area_init node %d %lx %lx (hole: %lx)\n", nid,
- zones_size[ZONE_DMA], start_pfn, zholes_size[ZONE_DMA]);
+ get_region(nid, &start_pfn, &end_region_pfn);
+ zone_sizes_init(nid, kernelcore_pages, &zones_size[0]);
+ get_zholes_size(nid, &zones_size[0], &zholes_size[0]);
+
+ dbg("free_area_init DMA node %d %lx %lx (hole: %lx)\n",
+ nid, zones_size[ZONE_DMA],
+ start_pfn, zholes_size[ZONE_DMA]);
+
+ dbg("free_area_init EasyRclm node %d %lx %lx (hole: %lx)\n",
+ nid, zones_size[ZONE_EASYRCLM],
+ start_pfn, zholes_size[ZONE_EASYRCLM]);

free_area_init_node(nid, NODE_DATA(nid), zones_size, start_pfn,
zholes_size);

2006-02-23 16:43:05

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Wed, 2006-02-22 at 16:43 +0000, Mel Gorman wrote:
> Is this a bit clearer? It's built and boot tested on one ppc64 machine. I
> am having trouble finding a ppc64 machine that *has* memory holes to be
> 100% sure it's ok.

Yeah, it looks that way. If you need a machine, see Mike Kravetz. I
think he was working on a way to automate creating memory holes.

> diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c
> --- linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c 2006-02-16 09:50:42.000000000 +0000
> +++ linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c 2006-02-22 16:07:35.000000000 +0000
> @@ -17,10 +17,12 @@
> #include <linux/nodemask.h>
> #include <linux/cpu.h>
> #include <linux/notifier.h>
> +#include <linux/sort.h>
> #include <asm/sparsemem.h>
> #include <asm/lmb.h>
> #include <asm/system.h>
> #include <asm/smp.h>
> +#include <asm/machdep.h>

Is the email spacing getting screwed up here?

> +/* Initialise the size of each zone in a node */
> +void __init zone_sizes_init(unsigned int nid,
> + unsigned long kernelcore_pages,
> + unsigned long *zones_size)

Minor nit territory: set_zone_sizes(), maybe?

> +{
> + unsigned int i;
> + unsigned long pages_present = 0;

pages_present_in_node?

> + /* Get the number of present pages in the node */
> + for (i = 0; init_node_data[i].end_pfn; i++) {
> + if (init_node_data[i].nid != nid)
> + continue;
> +
> + pages_present += init_node_data[i].end_pfn -
> + init_node_data[i].start_pfn;
> + }
> +
> + if (kernelcore_pages && kernelcore_pages < pages_present) {
> + zones_size[ZONE_DMA] = kernelcore_pages;
> + zones_size[ZONE_EASYRCLM] = pages_present - kernelcore_pages;
> + } else {
> + zones_size[ZONE_DMA] = pages_present;
> + zones_size[ZONE_EASYRCLM] = 0;
> + }
> +}

I think there are a couple of buglets here. I think the
kernelcore_pages is going to be applied per-zone, right?

Also, how do we want to distribute kernelcore memory over each node?
The way it is coded up for now, it will all be sliced out of the first
node. I'm not sure that's a good thing.

My inclination would be to completely separate out the ZONE_EASYRCLM
into separate code. It makes it easier to set whatever policy you want
in one place. Just a suggestion.

> +void __init get_zholes_size(unsigned int nid, unsigned long *zones_size,
> + unsigned long *zholes_size) {

nid_zholes_size()? I'm not too sure about this one. Just promise me
you'll think about it a bit more. ;)

> + unsigned int i = 0;
> + unsigned int start_easyrclm_pfn;
> + unsigned long last_end_pfn, first;
> +
> + /* Find where the PFN of the end of DMA is */
> + unsigned long pages_count = zones_size[ZONE_DMA];

<tangent> This (virtually) proves that zones_size[] needs to get a
different name. Perhaps we need to make it more like the zone structure
itself and go to spanned and present pages? </tangent>

> + for (i = 0; init_node_data[i].end_pfn; i++) {
> + unsigned long segment_size;
> + if (init_node_data[i].nid != nid)
> + continue;
> +
> + /*
> + * Check if the end of ZONE_DMA is in this segment of the
> + * init_node_data
> + */
> + segment_size = init_node_data[i].end_pfn -
> + init_node_data[i].start_pfn;

"segment" is probably a bad term to use here, especially on ppc.

One other thing, I want to _know_ that variables being compared are in
the same units. When one is called "pages_" something and the other is
something "_size", I don't _know_.

> +
> + /* Walk the map again and get the size of the holes */
> + first = 1;
> + zholes_size[ZONE_DMA] = 0;
> + zholes_size[ZONE_EASYRCLM] = 0;
> + for (i = 1; init_node_data[i].end_pfn; i++) {
> + unsigned long hole_size;
> + if (init_node_data[i].nid != nid)
> + continue;
> +
> + if (first) {
> + last_end_pfn = init_node_data[i].end_pfn;
> + first = 0;
> + continue;
> + }
> +
> + /* Hole found */
> + hole_size = init_node_data[i].start_pfn - last_end_pfn;
> + if (init_node_data[i].start_pfn < start_easyrclm_pfn) {
> + zholes_size[ZONE_DMA] += hole_size;
> + } else {
> + zholes_size[ZONE_EASYRCLM] += hole_size;
> + }
> + last_end_pfn = init_node_data[i].end_pfn;
> + }
> +}

I'd probably put this loop in another function. It is pretty
self-contained, no?

-- Dave

2006-02-23 17:19:40

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, 23 Feb 2006, Dave Hansen wrote:

> On Wed, 2006-02-22 at 16:43 +0000, Mel Gorman wrote:
>> Is this a bit clearer? It's built and boot tested on one ppc64 machine. I
>> am having trouble finding a ppc64 machine that *has* memory holes to be
>> 100% sure it's ok.
>
> Yeah, it looks that way. If you need a machine, see Mike Kravetz. I
> think he was working on a way to automate creating memory holes.
>

Will do. If there is an automatic way of creating holes, I'll write it
into the current "compare two running kernels" testing script.

>> diff -rup -X /usr/src/patchset-0.5/bin//dontdiff linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c
>> --- linux-2.6.16-rc3-mm1-103_x86coremem/arch/powerpc/mm/numa.c 2006-02-16 09:50:42.000000000 +0000
>> +++ linux-2.6.16-rc3-mm1-104_ppc64coremem/arch/powerpc/mm/numa.c 2006-02-22 16:07:35.000000000 +0000
>> @@ -17,10 +17,12 @@
>> #include <linux/nodemask.h>
>> #include <linux/cpu.h>
>> #include <linux/notifier.h>
>> +#include <linux/sort.h>
>> #include <asm/sparsemem.h>
>> #include <asm/lmb.h>
>> #include <asm/system.h>
>> #include <asm/smp.h>
>> +#include <asm/machdep.h>
>
> Is the email spacing getting screwed up here?
>

Yes, mail client issue. The "real" patch is fine.

>> +/* Initialise the size of each zone in a node */
>> +void __init zone_sizes_init(unsigned int nid,
>> + unsigned long kernelcore_pages,
>> + unsigned long *zones_size)
>
> Minor nit territory: set_zone_sizes(), maybe?
>

In this case, the choice of name is to match an x86 function that does
something very similar. If one had read through the x86 code and then saw
this function, it would set their expectations of what the code is
intended to do.

>> +{
>> + unsigned int i;
>> + unsigned long pages_present = 0;
>
> pages_present_in_node?
>

That would cause >80-character violations without a lot of breaking up of
lines. I think it would end up looking worse.

>> + /* Get the number of present pages in the node */
>> + for (i = 0; init_node_data[i].end_pfn; i++) {
>> + if (init_node_data[i].nid != nid)
>> + continue;
>> +
>> + pages_present += init_node_data[i].end_pfn -
>> + init_node_data[i].start_pfn;
>> + }
>> +
>> + if (kernelcore_pages && kernelcore_pages < pages_present) {
>> + zones_size[ZONE_DMA] = kernelcore_pages;
>> + zones_size[ZONE_EASYRCLM] = pages_present - kernelcore_pages;
>> + } else {
>> + zones_size[ZONE_DMA] = pages_present;
>> + zones_size[ZONE_EASYRCLM] = 0;
>> + }
>> +}
>
> I think there are a couple of buglets here. I think the
> kernelcore_pages is going to be applied per-zone, right?
>

per-node. A node gets no ZONE_EASYRCLM pages if it is not large enough to
contain kernelcore_pages. That means that on a system with 2 nodes,
kernelcore=512MB will result in 1024MB of ZONE_DMA in total.

> Also, how do we want to distribute kernelcore memory over each node?
> The way it is coded up for now, it will all be sliced out of the first
> node. I'm not sure that's a good thing.
>

It gets set in every node.

> My inclination would be to completely separate out the ZONE_EASYRCLM
> into separate code. It makes it easier to set whatever policy you want
> in one place. Just a suggestion.
>

It's a possibility. My feeling is that it would be easier to understand
overall if all the zone-sizing code was in one place.

>> +void __init get_zholes_size(unsigned int nid, unsigned long *zones_size,
>> + unsigned long *zholes_size) {
>
> nid_zholes_size()? I'm not too sure about this one. Just promise me
> you'll think about it a bit more. ;)
>

The choice of names is again to match the name of the equivalent code from
the x86. I'm not saying it's a great name :)

>> + unsigned int i = 0;
>> + unsigned int start_easyrclm_pfn;
>> + unsigned long last_end_pfn, first;
>> +
>> + /* Find where the PFN of the end of DMA is */
>> + unsigned long pages_count = zones_size[ZONE_DMA];
>
> <tangent> This (virtually) proves that zones_size[] needs to get a
> different name. Perhaps we need to make it more like the zone structure
> itself and go to spanned and present pages? </tangent>
>

zones_size[] is what free_area_init() expects to receive so there is not a
lot of room to fiddle with its meaning without causing more trouble.

>> + for (i = 0; init_node_data[i].end_pfn; i++) {
>> + unsigned long segment_size;
>> + if (init_node_data[i].nid != nid)
>> + continue;
>> +
>> + /*
>> + * Check if the end of ZONE_DMA is in this segment of the
>> + * init_node_data
>> + */
>> + segment_size = init_node_data[i].end_pfn -
>> + init_node_data[i].start_pfn;
>
> "segment" is probably a bad term to use here, especially on ppc.
>

Good point, I forgot that segment has a very different meaning on ppc.

> One other thing, I want to _know_ that variables being compared are in
> the same units. When one is called "pages_" something and the other is
> something "_size", I don't _know_.
>

chunk_num_pages ?

>> +
>> + /* Walk the map again and get the size of the holes */
>> + first = 1;
>> + zholes_size[ZONE_DMA] = 0;
>> + zholes_size[ZONE_EASYRCLM] = 0;
>> + for (i = 1; init_node_data[i].end_pfn; i++) {
>> + unsigned long hole_size;
>> + if (init_node_data[i].nid != nid)
>> + continue;
>> +
>> + if (first) {
>> + last_end_pfn = init_node_data[i].end_pfn;
>> + first = 0;
>> + continue;
>> + }
>> +
>> + /* Hole found */
>> + hole_size = init_node_data[i].start_pfn - last_end_pfn;
>> + if (init_node_data[i].start_pfn < start_easyrclm_pfn) {
>> + zholes_size[ZONE_DMA] += hole_size;
>> + } else {
>> + zholes_size[ZONE_EASYRCLM] += hole_size;
>> + }
>> + last_end_pfn = init_node_data[i].end_pfn;
>> + }
>> +}
>
> I'd probably put this loop in another function. It is pretty
> self-contained, no?
>

yep. get_zholes_size() could be split into two functions
find_start_easyrclm_pfn() and get_nid_zholes_size(). Would that be pretty
clear-cut?

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-02-23 17:38:39

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, 2006-02-23 at 17:19 +0000, Mel Gorman wrote:
> On Thu, 23 Feb 2006, Dave Hansen wrote:
> >> +/* Initialise the size of each zone in a node */
> >> +void __init zone_sizes_init(unsigned int nid,
> >> + unsigned long kernelcore_pages,
> >> + unsigned long *zones_size)
> >
> > Minor nit territory: set_zone_sizes(), maybe?
> >
>
> In this case, the choice of name is to match an x86 function that does
> something very similar. If one had read through the x86 code and then saw
> this function, it would set their expectations of what the code is
> intended to do.

x86 is bad. Try to do better. :)

> per-node. A node gets no ZONE_EASYRCLM pages if it is not large enough to
> contain kernelcore_pages. That means that on a system with 2 nodes,
> kernelcore=512MB will result in 1024MB of ZONE_DMA in total.
>
> > Also, how do we want to distribute kernelcore memory over each node?
> > The way it is coded up for now, it will all be sliced out of the first
> > node. I'm not sure that's a good thing.
>
> It gets set in every node.

The hypervisor has a memory allocator which it uses to piece out memory
to various LPARs. When things like partition memory resizing or reboots
occur, that memory gets allocated and freed and so forth.

These machines are _also_ NUMA. Memory can effectively get lumped so
that you can't always get memory on a single NUMA node. Because of all
of the other actions of other partitions, these conditions are always
changing. The end result is that the amount of memory which each NUMA
node has *in* *a* *single* *partition* can theoretically change between
reboots.

OK, back to the hapless system admin using kernelcore. They have a
4-node system with 2GB of RAM in each node for 8GB total. They use
kernelcore=1GB. They end up with 4x1GB ZONE_DMA and 4x1GB
ZONE_EASYRCLM. Perfect. You can safely remove 4GB of RAM.

Now, imagine that the machine has been heavily used for a while, there
is only 1 node's memory available, but CPUs are available in the same
places as before. So, you start up your partition again and have 8GB of
memory in one node. Same kernelcore=1GB option. You get 1x7GB ZONE_DMA
and 1x1GB ZONE_EASYRCLM. I'd argue this is going to be a bit of a
surprise to the poor admin.

> zones_size[] is what free_area_init() expects to receive so there is not a
> lot of room to fiddle with its meaning without causing more trouble.

It is just passed in there as an argument. If you can think of a way to
make it more understandable, change it in your architecture, and send
the patch for the main one.

> > One other thing, I want to _know_ that variables being compared are in
> > the same units. When one is called "pages_" something and the other is
> > something "_size", I don't _know_.
>
> chunk_num_pages ?

No. :) The words "chunks" and "clumps" have a bit of a stigma, just
like the number "3". Ask Matt Dobson.

num_pages_WHAT? node_num_pages? silly_num_pages?

Chunk is pretty meaningless.

> yep. get_zholes_size() could be split into two functions
> find_start_easyrclm_pfn() and get_nid_zholes_size(). Would that be pretty
> clear-cut?

I think so.

-- Dave

2006-02-23 17:40:45

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, Feb 23, 2006 at 05:19:19PM +0000, Mel Gorman wrote:
> On Thu, 23 Feb 2006, Dave Hansen wrote:
>
> >On Wed, 2006-02-22 at 16:43 +0000, Mel Gorman wrote:
> >>Is this a bit clearer? It's built and boot tested on one ppc64 machine. I
> >>am having trouble finding a ppc64 machine that *has* memory holes to be
> >>100% sure it's ok.
> >
> >Yeah, it looks that way. If you need a machine, see Mike Kravetz. I
> >think he was working on a way to automate creating memory holes.
> >
>
> Will do. If there is an automatic way of creating holes, I'll write it
> into the current "compare two running kernels" testing script.

I don't really have an automatic way to create holes. It just turns out that
the system I was working with was good at creating them itself.

I've sliced and diced (made lots of partitioning changes) the system
recently and am still working on getting everything working right.
When I get everything working again, I'll give the patch set a try.

--
Mike

2006-02-23 18:01:13

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, 23 Feb 2006, Dave Hansen wrote:

> On Thu, 2006-02-23 at 17:19 +0000, Mel Gorman wrote:
>> On Thu, 23 Feb 2006, Dave Hansen wrote:
>>>> +/* Initialise the size of each zone in a node */
>>>> +void __init zone_sizes_init(unsigned int nid,
>>>> + unsigned long kernelcore_pages,
>>>> + unsigned long *zones_size)
>>>
>>> Minor nit territory: set_zone_sizes(), maybe?
>>>
>>
>> In this case, the choice of name is to match an x86 function that does
>> something very similar. If one had read through the x86 code and then saw
>> this function, it would set their expectations of what the code is
>> intended to do.
>
> x86 is bad. Try to do better. :)
>

set_zone_sizes() it is.

>> per-node. A node gets no ZONE_EASYRCLM pages if it is not large enough to
>> contain kernelcore_pages. That means that on a system with 2 nodes,
>> kernelcore=512MB will result in 1024MB of ZONE_DMA in total.
>>
>>> Also, how do we want to distribute kernelcore memory over each node?
>>> The way it is coded up for now, it will all be sliced out of the first
>>> node. I'm not sure that's a good thing.
>>
>> It gets set in every node.
>
> The hypervisor has a memory allocator which it uses to piece out memory
> to various LPARs. When things like partition memory resizing or reboots
> occur, that memory gets allocated and freed and so forth.
>
> These machines are _also_ NUMA. Memory can effectively get lumped so
> that you can't always get memory on a single NUMA node. Because of all
> of the other actions of other partitions, these conditions are always
> changing. The end result is that the amount of memory which each NUMA
> node has *in* *a* *single* *partition* can theoretically change between
> reboots.
>

Wow. OK, I didn't know that. It totally rules out any future option where
an admin can say how much kernelcore they want in each node. They won't
know for sure the size of the node in advance or the number of nodes for
that matter.

> OK, back to the hapless system admin using kernelcore. They have a
> 4-node system with 2GB of RAM in each node for 8GB total. They use
> kernelcore=1GB. They end up with 4x1GB ZONE_DMA and 4x1GB
> ZONE_EASYRCLM. Perfect. You can safely remove 4GB of RAM.
>
> Now, imagine that the machine has been heavily used for a while, there
> is only 1 node's memory available, but CPUs are available in the same
> places as before. So, you start up your partition again and have 8GB of
> memory in one node. Same kernelcore=1GB option. You get 1x7GB ZONE_DMA
> and 1x1GB ZONE_EASYRCLM. I'd argue this is going to be a bit of a
> surprise to the poor admin.
>

That sort of surprise is totally unacceptable but the behaviour of
kernelcore needs to be consistent on both the x86 and the ppc (and any
other arch). How about:

1. kernelcore=X determines the total amount of memory for !ZONE_EASYRCLM
(be it ZONE_DMA, ZONE_NORMAL or ZONE_HIGHMEM)
2. For every node that can have ZONE_EASYRCLM, split the kernelcore across
the nodes as a percentage of the node size

Example: 4 nodes, 1 GiB each, kernelcore=512MB
node 0 ZONE_DMA = 128MB
node 1 ZONE_DMA = 128MB
node 2 ZONE_DMA = 128MB
node 3 ZONE_DMA = 128MB

2 nodes, 3GiB and 1GiB, kernelcore=512MB
node 0 ZONE_DMA = 384MB
node 1 ZONE_DMA = 128MB

It gets a bit more complex on NUMA for x86 because ZONE_NORMAL is
involved but the idea would essentially be the same.

>> zones_size[] is what free_area_init() expects to receive so there is not a
>> lot of room to fiddle with its meaning without causing more trouble.
>
> It is just passed in there as an argument. If you can think of a way to
> make it more understandable, change it in your architecture, and send
> the patch for the main one.
>

I'll think about it when my head finishes crunching through the sizing of
kernelcore.

>>> One other thing, I want to _know_ that variables being compared are in
>>> the same units. When one is called "pages_" something and the other is
>>> something "_size", I don't _know_.
>>
>> chunk_num_pages ?
>
> No. :) The words "chunks" and "clumps" have a bit of a stigma, just
> like the number "3". Ask Matt Dobson.
>

It sounds like a touchy subject :)

> num_pages_WHAT? node_num_pages? silly_num_pages?
>
> Chunk is pretty meaningless.
>

thingie, bit, bob, piece? :)

I'll think of something.

>> yep. get_zholes_size() could be split into two functions
>> find_start_easyrclm_pfn() and get_nid_zholes_size(). Would that be pretty
>> clear-cut?
>
> I think so.
>

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-02-23 18:16:05

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, 2006-02-23 at 18:01 +0000, Mel Gorman wrote:
> On Thu, 23 Feb 2006, Dave Hansen wrote:
> > OK, back to the hapless system admin using kernelcore. They have a
> > 4-node system with 2GB of RAM in each node for 8GB total. They use
> > kernelcore=1GB. They end up with 4x1GB ZONE_DMA and 4x1GB
> > ZONE_EASYRCLM. Perfect. You can safely remove 4GB of RAM.
> >
> > Now, imagine that the machine has been heavily used for a while, there
> > is only 1 node's memory available, but CPUs are available in the same
> > places as before. So, you start up your partition again and have 8GB of
> > memory in one node. Same kernelcore=1GB option. You get 1x7GB ZONE_DMA
> > and 1x1GB ZONE_EASYRCLM. I'd argue this is going to be a bit of a
> > surprise to the poor admin.
> >
>
> That sort of surprise is totally unacceptable but the behaviour of
> kernelcore needs to be consistent on both the x86 and the ppc (and any
> other arch). How about:
>
> 1. kernelcore=X determines the total amount of memory for !ZONE_EASYRCLM
> (be it ZONE_DMA, ZONE_NORMAL or ZONE_HIGHMEM)

Sounds reasonable. But, if you're going to do that, should we just make
it the opposite and explicitly be easy_reclaim_mem=? Do we want the
limit to be set as "I need this much kernel memory", or "I want this
much removable memory". I dunno.

> 2. For every node that can have ZONE_EASYRCLM, split the kernelcore across
> the nodes as a percentage of the node size
>
> Example: 4 nodes, 1 GiB each, kernelcore=512MB
> node 0 ZONE_DMA = 128MB
> node 1 ZONE_DMA = 128MB
> node 2 ZONE_DMA = 128MB
> node 3 ZONE_DMA = 128MB
>
> 2 nodes, 3GiB and 1GiB, kernelcore=512MB
> node 0 ZONE_DMA = 384MB
> node 1 ZONE_DMA = 128MB
>
> It gets a bit more complex on NUMA for x86 because ZONE_NORMAL is
> involved but the idea would essentially be the same.

Yes, chopping it up seems like the right thing (or as close as we can
get) to me.

-- Dave

2006-02-24 00:14:29

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

Dave Hansen wrote:
>> That sort of surprise is totally unacceptable, but the behaviour of
>> kernelcore needs to be consistent on both x86 and ppc (and any other
>> arch). How about;
>>
>> 1. kernelcore=X determines the total amount of memory for !ZONE_EASYRCLM
>> (be it ZONE_DMA, ZONE_NORMAL or ZONE_HIGHMEM)
>
> Sounds reasonable. But, if you're going to do that, should we just make
> it the opposite and explicitly use easy_reclaim_mem=? Do we want the
> limit to be set as "I need this much kernel memory", or "I want this
> much removable memory"? I dunno.

The amount of EASYRCLM memory can change at run time, but kernelcore
memory cannot. So, I like the kernelcore= option. I think it is a clear
setting for an admin.

-- Kame

2006-02-24 09:04:50

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, 23 Feb 2006, Dave Hansen wrote:

> On Thu, 2006-02-23 at 18:01 +0000, Mel Gorman wrote:
>> On Thu, 23 Feb 2006, Dave Hansen wrote:
>>> OK, back to the hapless system admin using kernelcore. They have a
>>> 4-node system with 2GB of RAM in each node for 8GB total. They use
>>> kernelcore=1GB. They end up with 4x1GB ZONE_DMA and 4x1GB
>>> ZONE_EASYRCLM. Perfect. You can safely remove 4GB of RAM.
>>>
>>> Now, imagine that the machine has been heavily used for a while, there
>>> is only 1 node's memory available, but CPUs are available in the same
>>> places as before. So, you start up your partition again and have 8GB of
>>> memory in one node. Same kernelcore=1GB option. You get 1x7GB ZONE_DMA
>>> and 1x1GB ZONE_EASYRCLM. I'd argue this is going to be a bit of a
>>> surprise to the poor admin.
>>>
>>
>> That sort of surprise is totally unacceptable, but the behaviour of
>> kernelcore needs to be consistent on both x86 and ppc (and any other
>> arch). How about;
>>
>> 1. kernelcore=X determines the total amount of memory for !ZONE_EASYRCLM
>> (be it ZONE_DMA, ZONE_NORMAL or ZONE_HIGHMEM)
>
> Sounds reasonable. But, if you're going to do that, should we just make
> it the opposite and explicitly use easy_reclaim_mem=? Do we want the
> limit to be set as "I need this much kernel memory", or "I want this
> much removable memory"? I dunno.
>

I think we should keep it as kernelcore=. If you have too little easyrclm
memory, then hot-remove and hugetlb availability are impaired. If you
have too little kernel memory, you have a really bad day.

>> 2. For every node that can have ZONE_EASYRCLM, split the kernelcore across
>> the nodes as a percentage of the node size
>>
>> Example: 4 nodes, 1 GiB each, kernelcore=512MB
>> node 0 ZONE_DMA = 128MB
>> node 1 ZONE_DMA = 128MB
>> node 2 ZONE_DMA = 128MB
>> node 3 ZONE_DMA = 128MB
>>
>> 2 nodes, 3GiB and 1GiB, kernelcore=512MB
>> node 0 ZONE_DMA = 384MB
>> node 1 ZONE_DMA = 128MB
>>
>> It gets a bit more complex on NUMA for x86 because ZONE_NORMAL is
>> involved but the idea would essentially be the same.
>
> Yes, chopping it up seems like the right thing (or as close as we can
> get) to me.
>

Ok, will rework the code to make it happen.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab