2020-07-20 06:27:09

by Baoquan He

Subject: [PATCH 0/5] mm/hugetlb: Small cleanup and improvement

Patches 1-4 are small cleanups.

Patch 5 adds a warning message for the case where writing into
/proc/sys/vm/nr_hugepages or
/sys/kernel/mm/hugepages/hugepages-xxx/nr_hugepages fails to increase or
decrease the number of persistent huge pages to the expected value.

Baoquan He (5):
mm/hugetlb.c: Fix typo of glb_reserve
mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool
mm/hugetlb.c: Remove the unnecessary non_swap_entry()
doc/vm: fix typo in the hugetlb admin documentation
mm/hugetlb.c: warn if expected count of huge pages adjustment is
not achieved

Documentation/admin-guide/mm/hugetlbpage.rst | 2 +-
mm/hugetlb.c | 49 ++++++++++++--------
2 files changed, 31 insertions(+), 20 deletions(-)

--
2.17.2


2020-07-20 06:27:17

by Baoquan He

Subject: [PATCH 2/5] mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool

Just like its neighbour is_hugetlb_entry_migration() does.

Signed-off-by: Baoquan He <[email protected]>
---
mm/hugetlb.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 191a585bb315..a58f976a9dd9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3754,17 +3754,17 @@ bool is_hugetlb_entry_migration(pte_t pte)
return false;
}

-static int is_hugetlb_entry_hwpoisoned(pte_t pte)
+static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
{
swp_entry_t swp;

if (huge_pte_none(pte) || pte_present(pte))
- return 0;
+ return false;
swp = pte_to_swp_entry(pte);
if (non_swap_entry(swp) && is_hwpoison_entry(swp))
- return 1;
+ return true;
else
- return 0;
+ return false;
}

int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
--
2.17.2

2020-07-20 06:28:27

by Baoquan He

Subject: [PATCH 3/5] mm/hugetlb.c: Remove the unnecessary non_swap_entry()

The checks is_migration_entry() and is_hwpoison_entry() are stricter
than non_swap_entry(), meaning they already cover the condition that
non_swap_entry() tests.

Hence remove the unnecessary non_swap_entry() calls in
is_hugetlb_entry_migration() and is_hugetlb_entry_hwpoisoned() to
simplify the code.
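As a sanity check of this claim, here is a small userspace C model of the swap-entry type encoding (the constant values are made up for illustration, not the kernel's real ones): any type matched by is_migration_entry() or is_hwpoison_entry() is, by construction, also a non-swap entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace model of the kernel's swap-entry encoding.
 * The constants below are illustrative, not the real kernel values. */
#define MAX_SWAPFILES       29UL
#define SWP_MIGRATION_READ  (MAX_SWAPFILES + 0)
#define SWP_MIGRATION_WRITE (MAX_SWAPFILES + 1)
#define SWP_HWPOISON        (MAX_SWAPFILES + 2)

typedef struct { unsigned long type; } swp_entry_t;

/* A "non-swap" entry is one whose type lies beyond the real swapfiles. */
static bool non_swap_entry(swp_entry_t e)
{
	return e.type >= MAX_SWAPFILES;
}

static bool is_migration_entry(swp_entry_t e)
{
	return e.type == SWP_MIGRATION_READ || e.type == SWP_MIGRATION_WRITE;
}

static bool is_hwpoison_entry(swp_entry_t e)
{
	return e.type == SWP_HWPOISON;
}

/* Every type that satisfies is_migration_entry() or is_hwpoison_entry()
 * also satisfies non_swap_entry(), so the latter adds no information. */
static bool non_swap_check_is_redundant(void)
{
	for (unsigned long t = 0; t < MAX_SWAPFILES + 8; t++) {
		swp_entry_t e = { t };

		if ((is_migration_entry(e) || is_hwpoison_entry(e)) &&
		    !non_swap_entry(e))
			return false;
	}
	return true;
}
```

In this model the `non_swap_entry(swp) &&` pre-check can be dropped without changing the result of either predicate, which is exactly the argument the patch makes.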

Signed-off-by: Baoquan He <[email protected]>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a58f976a9dd9..467894d8332a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3748,7 +3748,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
if (huge_pte_none(pte) || pte_present(pte))
return false;
swp = pte_to_swp_entry(pte);
- if (non_swap_entry(swp) && is_migration_entry(swp))
+ if (is_migration_entry(swp))
return true;
else
return false;
@@ -3761,7 +3761,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
if (huge_pte_none(pte) || pte_present(pte))
return false;
swp = pte_to_swp_entry(pte);
- if (non_swap_entry(swp) && is_hwpoison_entry(swp))
+ if (is_hwpoison_entry(swp))
return true;
else
return false;
--
2.17.2

2020-07-20 06:28:58

by Baoquan He

Subject: [PATCH 4/5] doc/vm: fix typo in the hugetlb admin documentation

Change 'pecify' to 'Specify'.

Signed-off-by: Baoquan He <[email protected]>
---
Documentation/admin-guide/mm/hugetlbpage.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 015a5f7d7854..f7b1c7462991 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -131,7 +131,7 @@ hugepages
parameter is preceded by an invalid hugepagesz parameter, it will
be ignored.
default_hugepagesz
- pecify the default huge page size. This parameter can
+ Specify the default huge page size. This parameter can
only be specified once on the command line. default_hugepagesz can
optionally be followed by the hugepages parameter to preallocate a
specific number of huge pages of default size. The number of default
--
2.17.2

2020-07-20 06:30:09

by Baoquan He

Subject: [PATCH 5/5] mm/hugetlb.c: warn if expected count of huge pages adjustment is not achieved

A customer complained that no message is printed when the kernel fails to
allocate an explicitly specified number of persistent huge pages. That
number can be specified by writing into /proc/sys/vm/nr_hugepages to
increase the count of persistent huge pages.

The current code makes a best effort to allocate the expected number of
huge pages. If it only succeeds in getting part of them, no information
is printed.

Log a warning message if the expected huge page adjustment is not
achieved, for both increasing and decreasing the count of persistent
huge pages.
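To illustrate the condition being warned on, here is a userspace sketch of the best-effort resize (the pool sizes, the spare_pages bookkeeping, and the message text are invented for illustration; the real logic lives in set_max_huge_pages() in mm/hugetlb.c):

```c
#include <assert.h>
#include <stdio.h>

/* Userspace sketch: after a best-effort resize, the pool may not reach
 * the requested size.  All names and numbers here are illustrative. */
static unsigned long pool_pages = 100;	/* current persistent pages   */
static unsigned long spare_pages = 50;	/* most we can still grow by  */

/* Grow or shrink the pool toward 'count'; growing may only partially
 * succeed once spare_pages runs out.  Returns the size reached. */
static unsigned long set_max_huge_pages(unsigned long count)
{
	unsigned long old_max = pool_pages;

	if (count > old_max) {
		unsigned long grow = count - old_max;

		if (grow > spare_pages)
			grow = spare_pages;	/* partial success */
		spare_pages -= grow;
		pool_pages = old_max + grow;
	} else {
		spare_pages += old_max - count;
		pool_pages = count;		/* shrinking succeeds here */
	}

	/* The patch's warning condition: request not fully satisfied. */
	if (pool_pages != count)
		fprintf(stderr,
			"HugeTLB: increasing by %lu pages failed, only increased %lu\n",
			count - old_max, pool_pages - old_max);

	return pool_pages;
}
```

Comparing the requested count with the size actually reached is exactly the check the patch adds after the allocation loops.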

Signed-off-by: Baoquan He <[email protected]>
---
mm/hugetlb.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 467894d8332a..1dfb5d9e4e06 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
nodemask_t *nodes_allowed)
{
- unsigned long min_count, ret;
+ unsigned long min_count, ret, old_max;
NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);

/*
@@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
* pool might be one hugepage larger than it needs to be, but
* within all the constraints specified by the sysctls.
*/
+ old_max = persistent_huge_pages(h);
while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
if (!adjust_pool_surplus(h, nodes_allowed, -1))
break;
@@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
}
out:
h->max_huge_pages = persistent_huge_pages(h);
+ if (count != h->max_huge_pages) {
+ char buf[32];
+
+ string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
+ pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
+ count > old_max ? "increasing" : "decreasing",
+ abs(count - old_max), buf,
+ count > old_max ? "increased" : "decreased",
+ abs(old_max - h->max_huge_pages));
+ }
spin_unlock(&hugetlb_lock);

NODEMASK_FREE(node_alloc_noretry);
--
2.17.2

2020-07-20 06:30:48

by Baoquan He

Subject: [PATCH 1/5] mm/hugetlb.c: Fix typo of glb_reserve

The local variable is for the global reservation of the region.

Signed-off-by: Baoquan He <[email protected]>
---
mm/hugetlb.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f24acb3af741..191a585bb315 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3649,7 +3649,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
struct resv_map *resv = vma_resv_map(vma);
struct hugepage_subpool *spool = subpool_vma(vma);
unsigned long reserve, start, end;
- long gbl_reserve;
+ long glb_reserve;

if (!resv || !is_vma_resv_set(vma, HPAGE_RESV_OWNER))
return;
@@ -3664,8 +3664,8 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
* Decrement reserve counts. The global reserve count may be
* adjusted if the subpool has a minimum size.
*/
- gbl_reserve = hugepage_subpool_put_pages(spool, reserve);
- hugetlb_acct_memory(h, -gbl_reserve);
+ glb_reserve = hugepage_subpool_put_pages(spool, reserve);
+ hugetlb_acct_memory(h, -glb_reserve);
}

kref_put(&resv->refs, resv_map_release);
@@ -5054,7 +5054,7 @@ int hugetlb_reserve_pages(struct inode *inode,
struct hugepage_subpool *spool = subpool_inode(inode);
struct resv_map *resv_map;
struct hugetlb_cgroup *h_cg = NULL;
- long gbl_reserve, regions_needed = 0;
+ long glb_reserve, regions_needed = 0;

/* This should never happen */
if (from > to) {
@@ -5121,10 +5121,10 @@ int hugetlb_reserve_pages(struct inode *inode,
/*
* There must be enough pages in the subpool for the mapping. If
* the subpool has a minimum size, there may be some global
- * reservations already in place (gbl_reserve).
+ * reservations already in place (glb_reserve).
*/
- gbl_reserve = hugepage_subpool_get_pages(spool, chg);
- if (gbl_reserve < 0) {
+ glb_reserve = hugepage_subpool_get_pages(spool, chg);
+ if (glb_reserve < 0) {
ret = -ENOSPC;
goto out_uncharge_cgroup;
}
@@ -5133,7 +5133,7 @@ int hugetlb_reserve_pages(struct inode *inode,
* Check enough hugepages are available for the reservation.
* Hand the pages back to the subpool if there are not
*/
- ret = hugetlb_acct_memory(h, gbl_reserve);
+ ret = hugetlb_acct_memory(h, glb_reserve);
if (ret < 0) {
goto out_put_pages;
}
@@ -5153,7 +5153,7 @@ int hugetlb_reserve_pages(struct inode *inode,
add = region_add(resv_map, from, to, regions_needed, h, h_cg);

if (unlikely(add < 0)) {
- hugetlb_acct_memory(h, -gbl_reserve);
+ hugetlb_acct_memory(h, -glb_reserve);
goto out_put_pages;
} else if (unlikely(chg > add)) {
/*
@@ -5200,7 +5200,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
struct resv_map *resv_map = inode_resv_map(inode);
long chg = 0;
struct hugepage_subpool *spool = subpool_inode(inode);
- long gbl_reserve;
+ long glb_reserve;

/*
* Since this routine can be called in the evict inode path for all
@@ -5225,8 +5225,8 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
* If the subpool has a minimum size, the number of global
* reservations to be released may be adjusted.
*/
- gbl_reserve = hugepage_subpool_put_pages(spool, (chg - freed));
- hugetlb_acct_memory(h, -gbl_reserve);
+ glb_reserve = hugepage_subpool_put_pages(spool, (chg - freed));
+ hugetlb_acct_memory(h, -glb_reserve);

return 0;
}
--
2.17.2

2020-07-20 22:35:30

by Mike Kravetz

Subject: Re: [PATCH 1/5] mm/hugetlb.c: Fix typo of glb_reserve

On 7/19/20 11:26 PM, Baoquan He wrote:
> The local variable is for global reservation of region.
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> mm/hugetlb.c | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f24acb3af741..191a585bb315 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3649,7 +3649,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
> struct resv_map *resv = vma_resv_map(vma);
> struct hugepage_subpool *spool = subpool_vma(vma);
> unsigned long reserve, start, end;
> - long gbl_reserve;
> + long glb_reserve;

I see both 'gbl' and 'glb' being used for global in variable names. grep will
actually return more hits for gbl than glb. Unless there is consensus that
'glb' should be used for global, I would prefer not to make this change.

--
Mike Kravetz

2020-07-20 22:38:25

by Mike Kravetz

Subject: Re: [PATCH 2/5] mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool

On 7/19/20 11:26 PM, Baoquan He wrote:
> Just like his neighbour is_hugetlb_entry_migration() has done.
>
> Signed-off-by: Baoquan He <[email protected]>

Thanks,
Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz

> ---
> mm/hugetlb.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 191a585bb315..a58f976a9dd9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3754,17 +3754,17 @@ bool is_hugetlb_entry_migration(pte_t pte)
> return false;
> }
>
> -static int is_hugetlb_entry_hwpoisoned(pte_t pte)
> +static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
> {
> swp_entry_t swp;
>
> if (huge_pte_none(pte) || pte_present(pte))
> - return 0;
> + return false;
> swp = pte_to_swp_entry(pte);
> if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> - return 1;
> + return true;
> else
> - return 0;
> + return false;
> }
>
> int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>

2020-07-20 22:52:32

by Mike Kravetz

Subject: Re: [PATCH 3/5] mm/hugetlb.c: Remove the unnecessary non_swap_entry()

On 7/19/20 11:26 PM, Baoquan He wrote:
> The checking is_migration_entry() and is_hwpoison_entry() are stricter
> than non_swap_entry(), means they have covered the conditional check
> which non_swap_entry() is doing.
>
> Hence remove the unnecessary non_swap_entry() in is_hugetlb_entry_migration()
> and is_hugetlb_entry_hwpoisoned() to simplify code.
>
> Signed-off-by: Baoquan He <[email protected]>

Agreed, we can remove the checks for non_swap_entry.

Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz

> ---
> mm/hugetlb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a58f976a9dd9..467894d8332a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3748,7 +3748,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
> if (huge_pte_none(pte) || pte_present(pte))
> return false;
> swp = pte_to_swp_entry(pte);
> - if (non_swap_entry(swp) && is_migration_entry(swp))
> + if (is_migration_entry(swp))
> return true;
> else
> return false;
> @@ -3761,7 +3761,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
> if (huge_pte_none(pte) || pte_present(pte))
> return false;
> swp = pte_to_swp_entry(pte);
> - if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> + if (is_hwpoison_entry(swp))
> return true;
> else
> return false;
>

2020-07-20 22:56:40

by Mike Kravetz

Subject: Re: [PATCH 4/5] doc/vm: fix typo in the hugetlb admin documentation

On 7/19/20 11:26 PM, Baoquan He wrote:
> Change 'pecify' to 'Specify'.
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> Documentation/admin-guide/mm/hugetlbpage.rst | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index 015a5f7d7854..f7b1c7462991 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -131,7 +131,7 @@ hugepages
> parameter is preceded by an invalid hugepagesz parameter, it will
> be ignored.
> default_hugepagesz
> - pecify the default huge page size. This parameter can
> + Specify the default huge page size. This parameter can
> only be specified once on the command line. default_hugepagesz can
> optionally be followed by the hugepages parameter to preallocate a
> specific number of huge pages of default size. The number of default
>

Unfortunately, this review comment was missed when the typo was introduced.
https://lore.kernel.org/lkml/[email protected]/

Thanks for fixing,
Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz

2020-07-21 00:42:34

by Mike Kravetz

Subject: Re: [PATCH 5/5] mm/hugetlb.c: warn if expected count of huge pages adjustment is not achieved

On 7/19/20 11:26 PM, Baoquan He wrote:
> A customer complained that no any message is printed out when failed to
> allocate explicitly specified number of persistent huge pages. That
> specifying can be done by writing into /proc/sys/vm/nr_hugepages to
> increase the persisten huge pages.
>
> In the current code, it takes the best effort way to allocate the expected
> number of huge pages. If only succeeding to get part of them, no any
> information is printed out.
>
> Here try to send out warning message if the expected number of huge pages
> adjustment is not achieved, including increasing and decreasing the count
> of persistent huge pages.

Perhaps change the wording a bit,

A customer complained that no message is logged when the number of
persistent huge pages is not changed to the exact value written to
the sysfs or proc nr_hugepages file.

In the current code, a best effort is made to satisfy requests made
via the nr_hugepages file. However, requests may be only partially
satisfied.

Log a message if the code was unsuccessful in fully satisfying a
request. This includes both increasing and decreasing the number
of persistent huge pages.

>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> mm/hugetlb.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)

I am not opposed to this patch. However, I believe the best way for a user
to determine if their request was successful is to compare the value of
nr_hugepages to the value which was written.

>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 467894d8332a..1dfb5d9e4e06 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
> static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> nodemask_t *nodes_allowed)
> {
> - unsigned long min_count, ret;
> + unsigned long min_count, ret, old_max;
> NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
>
> /*
> @@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> * pool might be one hugepage larger than it needs to be, but
> * within all the constraints specified by the sysctls.
> */
> + old_max = persistent_huge_pages(h);
> while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
> if (!adjust_pool_surplus(h, nodes_allowed, -1))
> break;
> @@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> }
> out:
> h->max_huge_pages = persistent_huge_pages(h);
> + if (count != h->max_huge_pages) {
> + char buf[32];
> +
> + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> + pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
> + count > old_max ? "increasing" : "decreasing",
> + abs(count - old_max), buf,
> + count > old_max ? "increased" : "decreased",
> + abs(old_max - h->max_huge_pages));
> + }
> spin_unlock(&hugetlb_lock);

I would prefer if we drop the lock before logging the message. That would
involve grabbing the value of h->max_huge_pages before dropping the lock.
--
Mike Kravetz

>
> NODEMASK_FREE(node_alloc_noretry);
>
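The pattern Mike suggests here — snapshot the shared value while still holding the lock, drop the lock, then log — can be sketched in userspace with a pthread mutex (the function name and the simulated partial failure below are illustrative, not kernel code):

```c
#include <assert.h>
#include <pthread.h>

/* Sketch of "grab the value under the lock, log after unlocking".
 * Names are illustrative; the kernel analogue is hugetlb_lock and
 * h->max_huge_pages in set_max_huge_pages(). */
static pthread_mutex_t hugetlb_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long max_huge_pages;

static unsigned long resize_pool(unsigned long count)
{
	unsigned long new_max;

	pthread_mutex_lock(&hugetlb_lock);
	max_huge_pages = count - count / 4;	/* pretend 25% of the request failed */
	new_max = max_huge_pages;		/* snapshot while still locked */
	pthread_mutex_unlock(&hugetlb_lock);

	/* Any logging happens here, using the snapshot rather than the
	 * shared variable, so the printing cost is not paid under the
	 * lock and the read is still consistent. */
	return new_max;
}
```

This is the shape the patch takes in the follow-up below: copy `h->max_huge_pages` into a local `new_max` before `spin_unlock(&hugetlb_lock)`, then print from the local.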

2020-07-21 09:19:35

by Baoquan He

Subject: Re: [PATCH 1/5] mm/hugetlb.c: Fix typo of glb_reserve

On 07/20/20 at 03:32pm, Mike Kravetz wrote:
> On 7/19/20 11:26 PM, Baoquan He wrote:
> > The local variable is for global reservation of region.
> >
> > Signed-off-by: Baoquan He <[email protected]>
> > ---
> > mm/hugetlb.c | 24 ++++++++++++------------
> > 1 file changed, 12 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index f24acb3af741..191a585bb315 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -3649,7 +3649,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
> > struct resv_map *resv = vma_resv_map(vma);
> > struct hugepage_subpool *spool = subpool_vma(vma);
> > unsigned long reserve, start, end;
> > - long gbl_reserve;
> > + long glb_reserve;
>
> I see both 'gbl' and 'glb' being used for global in variable names. grep will
> actually return more hits for gbl than glb. Unless there is consensus that
> 'glb' should be used for global, I would prefer not to make this change.

Ah, I thought it was a typo, but it's just another abbreviation. Then
I am fine with dropping this patch.

Thanks a lot for careful reviewing on this patchset.

Thanks
Baoquan

2020-07-21 09:56:00

by Baoquan He

Subject: Re: [PATCH 5/5] mm/hugetlb.c: warn if expected count of huge pages adjustment is not achieved

On 07/20/20 at 05:38pm, Mike Kravetz wrote:
> On 7/19/20 11:26 PM, Baoquan He wrote:
> > A customer complained that no any message is printed out when failed to
> > allocate explicitly specified number of persistent huge pages. That
> > specifying can be done by writing into /proc/sys/vm/nr_hugepages to
> > increase the persisten huge pages.
> >
> > In the current code, it takes the best effort way to allocate the expected
> > number of huge pages. If only succeeding to get part of them, no any
> > information is printed out.
> >
> > Here try to send out warning message if the expected number of huge pages
> > adjustment is not achieved, including increasing and decreasing the count
> > of persistent huge pages.
>
> Perhaps change the wording a bit,
>
> A customer complained that no message is logged when the number of
> persistent huge pages is not changed to the exact value written to
> the sysfs or proc nr_hugepages file.
>
> In the current code, a best effort is made to satisfy requests made
> via the nr_hugepages file. However, requests may be only partially
> satisfied.
>
> Log a message if the code was unsuccessful in fully satisfying a
> request. This includes both increasing and decreasing the number
> of persistent huge pages.

Thanks, that sounds much better. I will use this wording to replace the old log.

>
> >
> > Signed-off-by: Baoquan He <[email protected]>
> > ---
> > mm/hugetlb.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
>
> I am not opposed to this patch. However, I believe the best way for a user
> to determine if their request was successful is to compare the value of
> nr_hugepages to the value which was written.

Agreed. However, our customer said the log would help them 'easily
detect and analyse previous reservation failures'.

>
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 467894d8332a..1dfb5d9e4e06 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
> > static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> > nodemask_t *nodes_allowed)
> > {
> > - unsigned long min_count, ret;
> > + unsigned long min_count, ret, old_max;
> > NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
> >
> > /*
> > @@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> > * pool might be one hugepage larger than it needs to be, but
> > * within all the constraints specified by the sysctls.
> > */
> > + old_max = persistent_huge_pages(h);
> > while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
> > if (!adjust_pool_surplus(h, nodes_allowed, -1))
> > break;
> > @@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> > }
> > out:
> > h->max_huge_pages = persistent_huge_pages(h);
> > + if (count != h->max_huge_pages) {
> > + char buf[32];
> > +
> > + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> > + pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
> > + count > old_max ? "increasing" : "decreasing",
> > + abs(count - old_max), buf,
> > + count > old_max ? "increased" : "decreased",
> > + abs(old_max - h->max_huge_pages));
> > + }
> > spin_unlock(&hugetlb_lock);
>
> I would prefer if we drop the lock before logging the message. That would
> involve grabbing the value of h->max_huge_pages before dropping the lock.

Sure, will change. We should try to reduce the burden on the lock. Thanks.

2020-07-21 09:57:30

by David Hildenbrand

Subject: Re: [PATCH 3/5] mm/hugetlb.c: Remove the unnecessary non_swap_entry()

On 20.07.20 08:26, Baoquan He wrote:
> The checking is_migration_entry() and is_hwpoison_entry() are stricter
> than non_swap_entry(), means they have covered the conditional check
> which non_swap_entry() is doing.
>
> Hence remove the unnecessary non_swap_entry() in is_hugetlb_entry_migration()
> and is_hugetlb_entry_hwpoisoned() to simplify code.
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> mm/hugetlb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a58f976a9dd9..467894d8332a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3748,7 +3748,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
> if (huge_pte_none(pte) || pte_present(pte))
> return false;
> swp = pte_to_swp_entry(pte);
> - if (non_swap_entry(swp) && is_migration_entry(swp))
> + if (is_migration_entry(swp))
> return true;
> else
> return false;
> @@ -3761,7 +3761,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
> if (huge_pte_none(pte) || pte_present(pte))
> return false;
> swp = pte_to_swp_entry(pte);
> - if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> + if (is_hwpoison_entry(swp))
> return true;
> else
> return false;
>

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2020-07-21 09:59:06

by David Hildenbrand

Subject: Re: [PATCH 2/5] mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool

On 20.07.20 08:26, Baoquan He wrote:
> Just like his neighbour is_hugetlb_entry_migration() has done.
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> mm/hugetlb.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 191a585bb315..a58f976a9dd9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3754,17 +3754,17 @@ bool is_hugetlb_entry_migration(pte_t pte)
> return false;
> }
>
> -static int is_hugetlb_entry_hwpoisoned(pte_t pte)
> +static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
> {
> swp_entry_t swp;
>
> if (huge_pte_none(pte) || pte_present(pte))
> - return 0;
> + return false;
> swp = pte_to_swp_entry(pte);
> if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> - return 1;
> + return true;
> else
> - return 0;
> + return false;
> }
>
> int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2020-07-21 09:59:51

by David Hildenbrand

Subject: Re: [PATCH 4/5] doc/vm: fix typo in the hugetlb admin documentation

On 20.07.20 08:26, Baoquan He wrote:
> Change 'pecify' to 'Specify'.
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> Documentation/admin-guide/mm/hugetlbpage.rst | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index 015a5f7d7854..f7b1c7462991 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -131,7 +131,7 @@ hugepages
> parameter is preceded by an invalid hugepagesz parameter, it will
> be ignored.
> default_hugepagesz
> - pecify the default huge page size. This parameter can
> + Specify the default huge page size. This parameter can
> only be specified once on the command line. default_hugepagesz can
> optionally be followed by the hugepages parameter to preallocate a
> specific number of huge pages of default size. The number of default
>

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2020-07-22 08:50:52

by Baoquan He

Subject: Re: [PATCH 5/5] mm/hugetlb.c: warn if expected count of huge pages adjustment is not achieved

Hi Mike,

On 07/20/20 at 05:38pm, Mike Kravetz wrote:
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 467894d8332a..1dfb5d9e4e06 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
> > static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> > nodemask_t *nodes_allowed)
> > {
> > - unsigned long min_count, ret;
> > + unsigned long min_count, ret, old_max;
> > NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
> >
> > /*
> > @@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> > * pool might be one hugepage larger than it needs to be, but
> > * within all the constraints specified by the sysctls.
> > */
> > + old_max = persistent_huge_pages(h);
> > while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
> > if (!adjust_pool_surplus(h, nodes_allowed, -1))
> > break;
> > @@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> > }
> > out:
> > h->max_huge_pages = persistent_huge_pages(h);
> > + if (count != h->max_huge_pages) {
> > + char buf[32];
> > +
> > + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> > + pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
> > + count > old_max ? "increasing" : "decreasing",
> > + abs(count - old_max), buf,
> > + count > old_max ? "increased" : "decreased",
> > + abs(old_max - h->max_huge_pages));
> > + }
> > spin_unlock(&hugetlb_lock);
>
> I would prefer if we drop the lock before logging the message. That would
> involve grabbing the value of h->max_huge_pages before dropping the lock.

Does the change below look OK to you for moving the message logging
after the lock is dropped? If yes, I will repost with updated patches.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a9b7556ce5b..b5aa32a13569 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
nodemask_t *nodes_allowed)
{
- unsigned long min_count, ret, old_max;
+ unsigned long min_count, ret, old_max, new_max;
NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);

/*
@@ -2780,7 +2780,10 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
}
out:
h->max_huge_pages = persistent_huge_pages(h);
- if (count != h->max_huge_pages) {
+ new_max = h->max_huge_pages;
+ spin_unlock(&hugetlb_lock);
+
+ if (count != new_max) {
char buf[32];

string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
@@ -2788,9 +2791,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
count > old_max ? "increasing" : "decreasing",
abs(count - old_max), buf,
count > old_max ? "increased" : "decreased",
- abs(old_max - h->max_huge_pages));
+ abs(old_max - new_max));
}
- spin_unlock(&hugetlb_lock);

NODEMASK_FREE(node_alloc_noretry);


2020-07-22 16:19:11

by Mike Kravetz

Subject: Re: [PATCH 5/5] mm/hugetlb.c: warn if expected count of huge pages adjustment is not achieved

On 7/22/20 1:49 AM, Baoquan He wrote:
> On 07/20/20 at 05:38pm, Mike Kravetz wrote:
>>> + if (count != h->max_huge_pages) {
>>> + char buf[32];
>>> +
>>> + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
>>> + pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
>>> + count > old_max ? "increasing" : "decreasing",
>>> + abs(count - old_max), buf,
>>> + count > old_max ? "increased" : "decreased",
>>> + abs(old_max - h->max_huge_pages));
>>> + }
>>> spin_unlock(&hugetlb_lock);
>>
>> I would prefer if we drop the lock before logging the message. That would
>> involve grabbing the value of h->max_huge_pages before dropping the lock.
>
> Do you think the below change is OK to you to move the message logging
> after lock dropping? If yes, I will repost with updated patches.
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6a9b7556ce5b..b5aa32a13569 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
> static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> nodemask_t *nodes_allowed)
> {
> - unsigned long min_count, ret, old_max;
> + unsigned long min_count, ret, old_max, new_max;
> NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
>
> /*
> @@ -2780,7 +2780,10 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> }
> out:
> h->max_huge_pages = persistent_huge_pages(h);
> - if (count != h->max_huge_pages) {
> + new_max = h->max_huge_pages;
> + spin_unlock(&hugetlb_lock);
> +
> + if (count != new_max) {
> char buf[32];
>
> string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> @@ -2788,9 +2791,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> count > old_max ? "increasing" : "decreasing",
> abs(count - old_max), buf,
> count > old_max ? "increased" : "decreased",
> - abs(old_max - h->max_huge_pages));
> + abs(old_max - new_max));
> }
> - spin_unlock(&hugetlb_lock);
>
> NODEMASK_FREE(node_alloc_noretry);

Yes, that looks better. Thank you.

--
Mike Kravetz