2013-04-05 01:15:07

by Simon Jeons

Subject: Re: [RFC][PATCH 0/9] extend hugepage migration

Hi Michal,
On 03/22/2013 04:15 PM, Michal Hocko wrote:
> [getting off-list]
>
> On Fri 22-03-13 07:46:32, Simon Jeons wrote:
>> Hi Michal,
>> On 03/21/2013 08:56 PM, Michal Hocko wrote:
>>> On Thu 21-03-13 07:49:48, Simon Jeons wrote:
>>> [...]
>>>> When I hack arch/x86/mm/hugetlbpage.c like this,
>>>> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
>>>> index ae1aa71..87f34ee 100644
>>>> --- a/arch/x86/mm/hugetlbpage.c
>>>> +++ b/arch/x86/mm/hugetlbpage.c
>>>> @@ -354,14 +354,13 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>>>>
>>>>  #endif /*HAVE_ARCH_HUGETLB_UNMAPPED_AREA*/
>>>>
>>>> -#ifdef CONFIG_X86_64
>>>>  static __init int setup_hugepagesz(char *opt)
>>>>  {
>>>>          unsigned long ps = memparse(opt, &opt);
>>>>          if (ps == PMD_SIZE) {
>>>>                  hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
>>>> -        } else if (ps == PUD_SIZE && cpu_has_gbpages) {
>>>> -                hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
>>>> +        } else if (ps == PUD_SIZE) {
>>>> +                hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT+4);
>>>>          } else {
>>>>                  printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n",
>>>>                          ps >> 20);
>>>>
>>>> I set the boot parameters hugepagesz=1G hugepages=10, and then I got 10
>>>> 32MB huge pages. What's the difference between these hacked pages and
>>>> normal huge pages?
>>> How is this related to the patch set?
>>> Please _stop_ distracting discussion to unrelated topics!
>>>
>>> Nothing personal but this is just wasting our time.
>> Sorry, Michal, my bad.
>> Btw, could you explain this to me? Very sorry to waste your time.
> Your CPU has to support GB pages. You have removed the cpu_has_gbpages test
> and added an hstate for order-13 pages, which is a weird size on its
> own (32MB), because there is no page table level to support them.

But after this hack, /sys/kernel/mm/hugepages/hugepages-* exists, and it
shows the same number of 32MB huge pages that I set up in the boot parameter.
If there is no page table level to support them, how can they be present? I
can apply this hack successfully on Ubuntu, but not on Fedora.
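
For reference, here is the order arithmetic behind Michal's remark above,
assuming the usual x86-64 constants with 4KB base pages (PAGE_SHIFT = 12,
PMD_SHIFT = 21, PUD_SHIFT = 30); this is a small userspace sketch, not
kernel code:

#include <stdio.h>

/* Assumed x86-64 paging constants with 4KB base pages, mirroring the
 * kernel's definitions. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21
#define PUD_SHIFT  30

int main(void)
{
        /* A hugetlb "order" is log2(huge page size / base page size). */
        unsigned int orders[] = {
                PMD_SHIFT - PAGE_SHIFT,         /* 9:  what 2MB pages use */
                PUD_SHIFT - PAGE_SHIFT,         /* 18: what 1GB pages use */
                PMD_SHIFT - PAGE_SHIFT + 4,     /* 13: the hacked hstate  */
        };

        for (unsigned int i = 0; i < sizeof(orders) / sizeof(orders[0]); i++)
                printf("order %2u -> %4lu MB huge pages\n", orders[i],
                       (1UL << (orders[i] + PAGE_SHIFT)) >> 20);
        return 0;
}

It prints 2MB for order 9, 1024MB for order 18, and 32MB for the hacked
order 13; only orders 9 (PMD) and 18 (PUD) correspond to an actual
page-table level on x86-64, which is what "no page table level to support
them" means. An unmodified kernel on a GB-page-capable CPU accepts
hugepagesz=1G hugepages=10 as-is.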


2013-04-05 08:08:41

by Michal Hocko

Subject: Re: [RFC][PATCH 0/9] extend hugepage migration

On Fri 05-04-13 09:14:58, Simon Jeons wrote:
> Hi Michal,
> On 03/22/2013 04:15 PM, Michal Hocko wrote:
> >[getting off-list]
> >
> >On Fri 22-03-13 07:46:32, Simon Jeons wrote:
> >>Hi Michal,
> >>On 03/21/2013 08:56 PM, Michal Hocko wrote:
> >>>On Thu 21-03-13 07:49:48, Simon Jeons wrote:
> >>>[...]
> >>>>When I hack arch/x86/mm/hugetlbpage.c like this,
> >>>>[...]
> >>>>
> >>>>I set the boot parameters hugepagesz=1G hugepages=10, and then I got 10
> >>>>32MB huge pages. What's the difference between these hacked pages and
> >>>>normal huge pages?
> >>>How is this related to the patch set?
> >>>Please _stop_ distracting discussion to unrelated topics!
> >>>
> >>>Nothing personal but this is just wasting our time.
> >>Sorry, Michal, my bad.
> >>Btw, could you explain this to me? Very sorry to waste your time.
> >Your CPU has to support GB pages. You have removed the cpu_has_gbpages test
> >and added an hstate for order-13 pages, which is a weird size on its
> >own (32MB), because there is no page table level to support them.
> >
> But after this hack, /sys/kernel/mm/hugepages/hugepages-* exists, and
> it shows the same number of 32MB huge pages that I set up in the boot
> parameter.

Because hugetlb_add_hstate creates an hstate for those pages and
hugetlb_init_hstates allocates them later on.

> If there is no page table level to support them, how can
> they be present?

Because the hugetlb hstate handling code doesn't care about page tables or
the way those pages are going to be mapped _at all_. To put it another way:
nobody prevents you from allocating an order-5 page for a single pte, but
that would be a pure waste. The page fault code expects that pages of the
proper size are allocated.
--
Michal Hocko
SUSE Labs
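
To illustrate the point about registration vs. mapping: the boot-time path
just records an order and later allocates that many pages of that order;
nothing in it asks whether the MMU could ever map such a page. Below is a
toy userspace model of that flow; the names loosely echo
hugetlb_add_hstate()/hugetlb_init_hstates() from mm/hugetlb.c, but the
bodies are simplified stand-ins, not the kernel code:

#include <stdio.h>

#define PAGE_SHIFT  12
#define MAX_HSTATES 4

/* Toy stand-in for the kernel's struct hstate: just an order and a count. */
struct hstate {
        unsigned int order;
        unsigned long nr_huge_pages;
};

static struct hstate hstates[MAX_HSTATES];
static unsigned int nr_hstates;

/* Rough analogue of hugetlb_add_hstate(): record the order, nothing more.
 * Note that no page-table capability is checked here. */
static void add_hstate(unsigned int order)
{
        hstates[nr_hstates++].order = order;
}

/* Rough analogue of hugetlb_init_hstates(): "allocate" the requested
 * number of pages for every registered size. */
static void init_hstates(unsigned long nr_pages)
{
        for (unsigned int i = 0; i < nr_hstates; i++)
                hstates[i].nr_huge_pages = nr_pages;
}

int main(void)
{
        add_hstate(13);         /* the hacked 32MB size registers just fine */
        init_hstates(10);       /* ...and 10 such pages get accounted */

        for (unsigned int i = 0; i < nr_hstates; i++)
                printf("hugepages-%lukB: %lu pages\n",
                       (1UL << (hstates[i].order + PAGE_SHIFT)) / 1024,
                       hstates[i].nr_huge_pages);
        return 0;
}

Running it prints "hugepages-32768kB: 10 pages", which is essentially what
the hacked kernel shows under /sys/kernel/mm/hugepages/. The mismatch only
matters once a fault has to install such a page into a page table.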

2013-04-05 09:01:10

by Simon Jeons

Subject: Re: [RFC][PATCH 0/9] extend hugepage migration

Hi Michal,
On 04/05/2013 04:08 PM, Michal Hocko wrote:
> On Fri 05-04-13 09:14:58, Simon Jeons wrote:
>> Hi Michal,
>> On 03/22/2013 04:15 PM, Michal Hocko wrote:
>>> [getting off-list]
>>>
>>> On Fri 22-03-13 07:46:32, Simon Jeons wrote:
>>>> Hi Michal,
>>>> On 03/21/2013 08:56 PM, Michal Hocko wrote:
>>>>> On Thu 21-03-13 07:49:48, Simon Jeons wrote:
>>>>> [...]
>>>>>> When I hack arch/x86/mm/hugetlbpage.c like this,
>>>>>> [...]
>>>>>>
>>>>>> I set the boot parameters hugepagesz=1G hugepages=10, and then I got 10
>>>>>> 32MB huge pages. What's the difference between these hacked pages and
>>>>>> normal huge pages?
>>>>> How is this related to the patch set?
>>>>> Please _stop_ distracting discussion to unrelated topics!
>>>>>
>>>>> Nothing personal but this is just wasting our time.
>>>> Sorry, Michal, my bad.
>>>> Btw, could you explain this to me? Very sorry to waste your time.
>>> Your CPU has to support GB pages. You have removed the cpu_has_gbpages test
>>> and added an hstate for order-13 pages, which is a weird size on its
>>> own (32MB), because there is no page table level to support them.
>> But after this hack, /sys/kernel/mm/hugepages/hugepages-* exists, and
>> it shows the same number of 32MB huge pages that I set up in the boot
>> parameter.
> Because hugetlb_add_hstate creates an hstate for those pages and
> hugetlb_init_hstates allocates them later on.
>
>> If there is no page table level to support them, how can
>> they be present?
> Because the hugetlb hstate handling code doesn't care about page tables or
> the way those pages are going to be mapped _at all_. To put it another way:
> nobody prevents you from allocating an order-5 page for a single pte, but
> that would be a pure waste. The page fault code expects that pages of the
> proper size are allocated.
Do you mean that 32MB pages will be mapped by a pmd entry, which should map 2MB pages?

2013-04-05 09:30:39

by Michal Hocko

Subject: Re: [RFC][PATCH 0/9] extend hugepage migration

On Fri 05-04-13 17:00:58, Simon Jeons wrote:
> Hi Michal,
> On 04/05/2013 04:08 PM, Michal Hocko wrote:
> >On Fri 05-04-13 09:14:58, Simon Jeons wrote:
> >>Hi Michal,
> >>On 03/22/2013 04:15 PM, Michal Hocko wrote:
> >>>[getting off-list]
> >>>
> >>>On Fri 22-03-13 07:46:32, Simon Jeons wrote:
> >>>>Hi Michal,
> >>>>On 03/21/2013 08:56 PM, Michal Hocko wrote:
> >>>>>On Thu 21-03-13 07:49:48, Simon Jeons wrote:
> >>>>>[...]
> >>>>>>When I hack arch/x86/mm/hugetlbpage.c like this,
> >>>>>>[...]
> >>>>>>
> >>>>>>I set the boot parameters hugepagesz=1G hugepages=10, and then I got 10
> >>>>>>32MB huge pages. What's the difference between these hacked pages and
> >>>>>>normal huge pages?
> >>>>>How is this related to the patch set?
> >>>>>Please _stop_ distracting discussion to unrelated topics!
> >>>>>
> >>>>>Nothing personal but this is just wasting our time.
> >>>>Sorry, Michal, my bad.
> >>>>Btw, could you explain this to me? Very sorry to waste your time.
> >>>Your CPU has to support GB pages. You have removed the cpu_has_gbpages test
> >>>and added an hstate for order-13 pages, which is a weird size on its
> >>>own (32MB), because there is no page table level to support them.
> >>But after this hack, /sys/kernel/mm/hugepages/hugepages-* exists, and
> >>it shows the same number of 32MB huge pages that I set up in the boot
> >>parameter.
> >Because hugetlb_add_hstate creates an hstate for those pages and
> >hugetlb_init_hstates allocates them later on.
> >
> >>If there is no page table level to support them, how can
> >>they be present?
> >Because the hugetlb hstate handling code doesn't care about page tables or
> >the way those pages are going to be mapped _at all_. To put it another way:
> >nobody prevents you from allocating an order-5 page for a single pte, but
> >that would be a pure waste. The page fault code expects that pages of the
> >proper size are allocated.
> Do you mean that 32MB pages will be mapped by a pmd entry, which should map 2MB pages?
>

Please refer to hugetlb_fault for more information.

--
Michal Hocko
SUSE Labs
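
Following that pointer: hugetlb_fault() calls huge_pte_alloc() with the
hstate's page size, and the 3.9-era x86 huge_pte_alloc() only knows two
cases, handing back a PUD slot for PUD_SIZE or a PMD slot for PMD_SIZE,
with a BUG_ON() for anything else. A simplified userspace model of that
dispatch (an illustration of the logic, not the kernel source):

#include <assert.h>
#include <stdio.h>

#define PMD_SHIFT 21
#define PUD_SHIFT 30
#define PMD_SIZE  (1UL << PMD_SHIFT)    /* 2MB */
#define PUD_SIZE  (1UL << PUD_SHIFT)    /* 1GB */

/* Model of the size dispatch in x86's huge_pte_alloc(): a huge page is
 * installed either at the PUD level or at the PMD level; there is no
 * level in between that could hold a 32MB mapping. */
static const char *level_for(unsigned long sz)
{
        if (sz == PUD_SIZE)
                return "PUD entry (1GB)";
        assert(sz == PMD_SIZE); /* the real code has BUG_ON(sz != PMD_SIZE) */
        return "PMD entry (2MB)";
}

int main(void)
{
        printf("2MB  huge page -> %s\n", level_for(PMD_SIZE));
        printf("1GB  huge page -> %s\n", level_for(PUD_SIZE));
        printf("32MB huge page -> %s\n", level_for(32UL << 20)); /* aborts */
        return 0;
}

So the answer to the question above appears to be no: a pmd entry covers
exactly 2MB and cannot map a 32MB page, and the first fault against the
hacked hstate would, as far as one can tell from that code, hit the BUG_ON
rather than work well.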

2013-04-07 00:32:38

by Simon Jeons

Subject: Re: [RFC][PATCH 0/9] extend hugepage migration

Hi Michal,
On 04/05/2013 05:30 PM, Michal Hocko wrote:
> On Fri 05-04-13 17:00:58, Simon Jeons wrote:
>> Hi Michal,
>> On 04/05/2013 04:08 PM, Michal Hocko wrote:
>>> On Fri 05-04-13 09:14:58, Simon Jeons wrote:
>>>> Hi Michal,
>>>> On 03/22/2013 04:15 PM, Michal Hocko wrote:
>>>>> [getting off-list]
>>>>>
>>>>> On Fri 22-03-13 07:46:32, Simon Jeons wrote:
>>>>>> Hi Michal,
>>>>>> On 03/21/2013 08:56 PM, Michal Hocko wrote:
>>>>>>> On Thu 21-03-13 07:49:48, Simon Jeons wrote:
>>>>>>> [...]
>>>>>>>> When I hack arch/x86/mm/hugetlbpage.c like this,
>>>>>>>> [...]
>>>>>>>>
>>>>>>>> I set the boot parameters hugepagesz=1G hugepages=10, and then I got 10
>>>>>>>> 32MB huge pages. What's the difference between these hacked pages and
>>>>>>>> normal huge pages?
>>>>>>> How is this related to the patch set?
>>>>>>> Please _stop_ distracting discussion to unrelated topics!
>>>>>>>
>>>>>>> Nothing personal but this is just wasting our time.
>>>>>> Sorry, Michal, my bad.
>>>>>> Btw, could you explain this to me? Very sorry to waste your time.
>>>>> Your CPU has to support GB pages. You have removed the cpu_has_gbpages test
>>>>> and added an hstate for order-13 pages, which is a weird size on its
>>>>> own (32MB), because there is no page table level to support them.
>>>> But after this hack, /sys/kernel/mm/hugepages/hugepages-* exists, and
>>>> it shows the same number of 32MB huge pages that I set up in the boot
>>>> parameter.
>>> Because hugetlb_add_hstate creates an hstate for those pages and
>>> hugetlb_init_hstates allocates them later on.
>>>
>>>> If there is no page table level to support them, how can
>>>> they be present?
>>> Because the hugetlb hstate handling code doesn't care about page tables or
>>> the way those pages are going to be mapped _at all_. To put it another way:
>>> nobody prevents you from allocating an order-5 page for a single pte, but
>>> that would be a pure waste. The page fault code expects that pages of the
>>> proper size are allocated.
>> Do you mean that 32MB pages will be mapped by a pmd entry, which should map 2MB pages?
>>
> Please refer to hugetlb_fault for more information.

Thanks for pointing that out. So is my assumption correct? Can a pmd
entry, which supports 2MB mappings, map 32MB pages and work well?

>

2013-04-07 14:05:41

by KOSAKI Motohiro

Subject: Re: [RFC][PATCH 0/9] extend hugepage migration

>> Please refer to hugetlb_fault for more information.
>
> Thanks for pointing that out. So is my assumption correct? Can a pmd
> entry, which supports 2MB mappings, map 32MB pages and work well?

Simon, please stop hijacking unrelated threads. This is not a question-and-answer thread.