2013-08-02 02:08:04

by Joonsoo Kim

[permalink] [raw]
Subject: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
in slow path. For making fast path more faster, add likely macro to
help compiler optimization.

Signed-off-by: Joonsoo Kim <[email protected]>

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b100255..86ad44b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1901,7 +1901,7 @@ zonelist_scan:
goto this_zone_full;

BUILD_BUG_ON(ALLOC_NO_WATERMARKS < NR_WMARK);
- if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
+ if (likely(!(alloc_flags & ALLOC_NO_WATERMARKS))) {
unsigned long mark;
int ret;

--
1.7.9.5


2013-08-02 02:08:07

by Joonsoo Kim

[permalink] [raw]
Subject: [PATCH 3/4] mm: move pgtable related functions to right place

pgtable related functions are mostly in pgtable-generic.c.
So move remaining functions from memory.c to pgtable-generic.c.

Signed-off-by: Joonsoo Kim <[email protected]>

diff --git a/mm/memory.c b/mm/memory.c
index 1ce2e2a..26bce51 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -374,30 +374,6 @@ void tlb_remove_table(struct mmu_gather *tlb, void *table)
#endif /* CONFIG_HAVE_RCU_TABLE_FREE */

/*
- * If a p?d_bad entry is found while walking page tables, report
- * the error, before resetting entry to p?d_none. Usually (but
- * very seldom) called out from the p?d_none_or_clear_bad macros.
- */
-
-void pgd_clear_bad(pgd_t *pgd)
-{
- pgd_ERROR(*pgd);
- pgd_clear(pgd);
-}
-
-void pud_clear_bad(pud_t *pud)
-{
- pud_ERROR(*pud);
- pud_clear(pud);
-}
-
-void pmd_clear_bad(pmd_t *pmd)
-{
- pmd_ERROR(*pmd);
- pmd_clear(pmd);
-}
-
-/*
* Note: this doesn't free the actual pages themselves. That
* has been handled earlier when unmapping all the memory regions.
*/
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index e1a6e4f..3929a40 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -10,6 +10,30 @@
#include <asm/tlb.h>
#include <asm-generic/pgtable.h>

+/*
+ * If a p?d_bad entry is found while walking page tables, report
+ * the error, before resetting entry to p?d_none. Usually (but
+ * very seldom) called out from the p?d_none_or_clear_bad macros.
+ */
+
+void pgd_clear_bad(pgd_t *pgd)
+{
+ pgd_ERROR(*pgd);
+ pgd_clear(pgd);
+}
+
+void pud_clear_bad(pud_t *pud)
+{
+ pud_ERROR(*pud);
+ pud_clear(pud);
+}
+
+void pmd_clear_bad(pmd_t *pmd)
+{
+ pmd_ERROR(*pmd);
+ pmd_clear(pmd);
+}
+
#ifndef __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
/*
* Only sets the access flags (dirty, accessed), as well as write
--
1.7.9.5

2013-08-02 02:08:25

by Joonsoo Kim

[permalink] [raw]
Subject: [PATCH 4/4] swap: clean-up #ifdef in page_mapping()

PageSwapCache() is always false when !CONFIG_SWAP, so compiler
properly discard related code. Therefore, we don't need #ifdef explicitly.

Signed-off-by: Joonsoo Kim <[email protected]>

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d95cde5..c638a71 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -414,6 +414,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)

#else /* CONFIG_SWAP */

+#define swap_address_space(entry) (NULL)
#define get_nr_swap_pages() 0L
#define total_swap_pages 0L
#define total_swapcache_pages() 0UL
diff --git a/mm/util.c b/mm/util.c
index 7441c41..eaf63fc2 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -388,15 +388,12 @@ struct address_space *page_mapping(struct page *page)
struct address_space *mapping = page->mapping;

VM_BUG_ON(PageSlab(page));
-#ifdef CONFIG_SWAP
if (unlikely(PageSwapCache(page))) {
swp_entry_t entry;

entry.val = page_private(page);
mapping = swap_address_space(entry);
- } else
-#endif
- if ((unsigned long)mapping & PAGE_MAPPING_ANON)
+ } else if ((unsigned long)mapping & PAGE_MAPPING_ANON)
mapping = NULL;
return mapping;
}
--
1.7.9.5

2013-08-02 02:08:47

by Joonsoo Kim

[permalink] [raw]
Subject: [PATCH 2/4] mm, migrate: allocation new page lazyily in unmap_and_move()

We don't need a new page and then go out immediately if some condition
is met. Allocation has overhead in comparison with some condition check,
so allocating lazyily is preferable solution.

Signed-off-by: Joonsoo Kim <[email protected]>

diff --git a/mm/migrate.c b/mm/migrate.c
index 6f0c244..86db87e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -864,10 +864,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
{
int rc = 0;
int *result = NULL;
- struct page *newpage = get_new_page(page, private, &result);
-
- if (!newpage)
- return -ENOMEM;
+ struct page *newpage = NULL;

if (page_count(page) == 1) {
/* page was freed from under us. So we are done. */
@@ -878,6 +875,10 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
if (unlikely(split_huge_page(page)))
goto out;

+ newpage = get_new_page(page, private, &result);
+ if (!newpage)
+ return -ENOMEM;
+
rc = __unmap_and_move(page, newpage, force, mode);

if (unlikely(rc == MIGRATEPAGE_BALLOON_SUCCESS)) {
@@ -908,7 +909,8 @@ out:
* Move the new page to the LRU. If migration was not successful
* then this will free the page.
*/
- putback_lru_page(newpage);
+ if (newpage)
+ putback_lru_page(newpage);
if (result) {
if (rc)
*result = rc;
--
1.7.9.5

2013-08-02 16:27:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Fri 02-08-13 11:07:56, Joonsoo Kim wrote:
> We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> in slow path. For making fast path more faster, add likely macro to
> help compiler optimization.

The code is different in mmotm tree (see mm: page_alloc: rearrange
watermark checking in get_page_from_freelist)

Besides that, make sure you provide numbers which prove your claims
about performance optimizations.

> Signed-off-by: Joonsoo Kim <[email protected]>
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b100255..86ad44b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1901,7 +1901,7 @@ zonelist_scan:
> goto this_zone_full;
>
> BUILD_BUG_ON(ALLOC_NO_WATERMARKS < NR_WMARK);
> - if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
> + if (likely(!(alloc_flags & ALLOC_NO_WATERMARKS))) {
> unsigned long mark;
> int ret;
>
> --
> 1.7.9.5
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>

--
Michal Hocko
SUSE Labs

2013-08-02 19:26:57

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Fri, Aug 02, 2013 at 11:07:56AM +0900, Joonsoo Kim wrote:
> We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> in slow path. For making fast path more faster, add likely macro to
> help compiler optimization.
>
> Signed-off-by: Joonsoo Kim <[email protected]>

Acked-by: Johannes Weiner <[email protected]>

2013-08-02 19:41:13

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm, migrate: allocation new page lazyily in unmap_and_move()

On Fri, Aug 02, 2013 at 11:07:57AM +0900, Joonsoo Kim wrote:
> We don't need a new page and then go out immediately if some condition
> is met. Allocation has overhead in comparison with some condition check,
> so allocating lazyily is preferable solution.
>
> Signed-off-by: Joonsoo Kim <[email protected]>
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 6f0c244..86db87e 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -864,10 +864,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
> {
> int rc = 0;
> int *result = NULL;
> - struct page *newpage = get_new_page(page, private, &result);
> -
> - if (!newpage)
> - return -ENOMEM;
> + struct page *newpage = NULL;
>
> if (page_count(page) == 1) {
> /* page was freed from under us. So we are done. */
> @@ -878,6 +875,10 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
> if (unlikely(split_huge_page(page)))
> goto out;
>
> + newpage = get_new_page(page, private, &result);
> + if (!newpage)
> + return -ENOMEM;

get_new_page() sets up result to communicate error codes from the
following checks. While the existing ones (page freed and thp split
failed) don't change rc, somebody else might add a condition whose
error code should be propagated back into *result but miss it.

Please leave get_new_page() where it is. The win from this change is
not big enough to risk these problems.

Thanks!

2013-08-02 19:43:39

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 4/4] swap: clean-up #ifdef in page_mapping()

On Fri, Aug 02, 2013 at 11:07:59AM +0900, Joonsoo Kim wrote:
> PageSwapCache() is always false when !CONFIG_SWAP, so compiler
> properly discard related code. Therefore, we don't need #ifdef explicitly.
>
> Signed-off-by: Joonsoo Kim <[email protected]>

Acked-by: Johannes Weiner <[email protected]>

2013-08-02 20:47:22

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote:
> On Fri 02-08-13 11:07:56, Joonsoo Kim wrote:
> > We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> > in slow path. For making fast path more faster, add likely macro to
> > help compiler optimization.
>
> The code is different in mmotm tree (see mm: page_alloc: rearrange
> watermark checking in get_page_from_freelist)

Yes, please rebase this on top.

> Besides that, make sure you provide numbers which prove your claims
> about performance optimizations.

Isn't that a bit overkill? We know it's a likely path (we would
deadlock constantly if a sizable portion of allocations were to ignore
the watermarks). Does he have to justify that likely in general makes
sense?

2013-08-02 21:36:12

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Fri 02-08-13 16:47:10, Johannes Weiner wrote:
> On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote:
> > On Fri 02-08-13 11:07:56, Joonsoo Kim wrote:
> > > We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> > > in slow path. For making fast path more faster, add likely macro to
> > > help compiler optimization.
> >
> > The code is different in mmotm tree (see mm: page_alloc: rearrange
> > watermark checking in get_page_from_freelist)
>
> Yes, please rebase this on top.
>
> > Besides that, make sure you provide numbers which prove your claims
> > about performance optimizations.
>
> Isn't that a bit overkill? We know it's a likely path (we would
> deadlock constantly if a sizable portion of allocations were to ignore
> the watermarks). Does he have to justify that likely in general makes
> sense?

That was more a generic comment. If there is a claim that something
would be faster it would be nice to back that claim by some numbers
(e.g. smaller hot path).

In this particular case, unlikely(alloc_flags & ALLOC_NO_WATERMARKS)
doesn't make any change to the generated code with gcc 4.8.1 resp.
4.3.4 I have here.
Maybe other versions of gcc would benefit from the hint but changelog
didn't tell us. I wouldn't add the anotation if it doesn't make any
difference for the resulting code.
--
Michal Hocko
SUSE Labs

2013-08-05 07:40:52

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm, migrate: allocation new page lazyily in unmap_and_move()

> get_new_page() sets up result to communicate error codes from the
> following checks. While the existing ones (page freed and thp split
> failed) don't change rc, somebody else might add a condition whose
> error code should be propagated back into *result but miss it.
>
> Please leave get_new_page() where it is. The win from this change is
> not big enough to risk these problems.

Hello, Johannes.

Okay. You are right. I will omit this patch next time.

Thanks.

2013-08-05 08:10:01

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

Hello, Michal.

On Fri, Aug 02, 2013 at 11:36:07PM +0200, Michal Hocko wrote:
> On Fri 02-08-13 16:47:10, Johannes Weiner wrote:
> > On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote:
> > > On Fri 02-08-13 11:07:56, Joonsoo Kim wrote:
> > > > We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> > > > in slow path. For making fast path more faster, add likely macro to
> > > > help compiler optimization.
> > >
> > > The code is different in mmotm tree (see mm: page_alloc: rearrange
> > > watermark checking in get_page_from_freelist)
> >
> > Yes, please rebase this on top.
> >
> > > Besides that, make sure you provide numbers which prove your claims
> > > about performance optimizations.
> >
> > Isn't that a bit overkill? We know it's a likely path (we would
> > deadlock constantly if a sizable portion of allocations were to ignore
> > the watermarks). Does he have to justify that likely in general makes
> > sense?
>
> That was more a generic comment. If there is a claim that something
> would be faster it would be nice to back that claim by some numbers
> (e.g. smaller hot path).
>
> In this particular case, unlikely(alloc_flags & ALLOC_NO_WATERMARKS)
> doesn't make any change to the generated code with gcc 4.8.1 resp.
> 4.3.4 I have here.
> Maybe other versions of gcc would benefit from the hint but changelog
> didn't tell us. I wouldn't add the anotation if it doesn't make any
> difference for the resulting code.

Hmm, Is there no change with gcc 4.8.1 and 4.3.4?

I found a change with gcc 4.6.3 and v3.10 kernel.

text data bss dec hex filename
35683 1461 644 37788 939c page_alloc_base.o
35715 1461 644 37820 93bc page_alloc_patch.o

Slightly larger (32 bytes) than before.
And assembly code looks different as I expected.

* Original code

17126 .LVL1518:
17127 .loc 2 1904 0 is_stmt 1
17128 testb $4, -116(%rbp) #, %sfp
17129 je .L866 #,

(snip)

17974 .L866:
17975 .LBE6053:
17976 .LBE6052:
17977 .LBE6051:
17978 .LBE6073:
17979 .LBE6093:
17980 .LBB6094:
17981 .loc 2 1908 0
17982 movl -116(%rbp), %r14d # %sfp, D.42080
17983 .loc 2 1909 0
17984 movl -116(%rbp), %r8d # %sfp,
17985 movq %rbx, %rdi # prephitmp.1723,
17986 movl -212(%rbp), %ecx # %sfp,
17987 movl -80(%rbp), %esi # %sfp,
17988 .loc 2 1908 0
17989 andl $3, %r14d #, D.42080
17990 movslq %r14d, %rax # D.42080, D.42080
17991 movq (%rbx,%rax,8), %r13 # prephitmp.1723_268->watermark, mark
17992 .LVL1591:
17993 .loc 2 1909 0
17994 movq %r13, %rdx # mark,
17995 call zone_watermark_ok #

On 17129 line, we check ALLOC_NO_WATERMARKS and if not matched, then jump to L866.
L866 is on 17981 line.

* Patched code

17122 .L807:
17123 .LVL1513:
17124 .loc 2 1904 0 is_stmt 1
17125 testb $4, -88(%rbp) #, %sfp
17126 jne .L811 #,
17127 .LBB6092:
17128 .loc 2 1908 0
17129 movl -88(%rbp), %r13d # %sfp, D.42082
17130 .loc 2 1909 0
17131 movl -88(%rbp), %r8d # %sfp,
17132 movq %rbx, %rdi # prephitmp.1723,
17133 movl -160(%rbp), %ecx # %sfp,
17134 movl -80(%rbp), %esi # %sfp,
17135 .loc 2 1908 0
17136 andl $3, %r13d #, D.42082
17137 movslq %r13d, %rax # D.42082, D.42082
17138 movq (%rbx,%rax,8), %r12 # prephitmp.1723_270->watermark, mark
17139 .LVL1514:
17140 .loc 2 1909 0
17141 movq %r12, %rdx # mark,
17142 call zone_watermark_ok #

On 17124 line, we check ALLOC_NO_WATERMARKS (0x4) and if not matched,
execute following code without jumping. This is effect of 'likely' macro.
Isn't it reasonable?

Thanks.

2013-08-05 08:50:34

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Mon, Aug 05, 2013 at 05:10:08PM +0900, Joonsoo Kim wrote:
> Hello, Michal.
>
> On Fri, Aug 02, 2013 at 11:36:07PM +0200, Michal Hocko wrote:
> > On Fri 02-08-13 16:47:10, Johannes Weiner wrote:
> > > On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote:
> > > > On Fri 02-08-13 11:07:56, Joonsoo Kim wrote:
> > > > > We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> > > > > in slow path. For making fast path more faster, add likely macro to
> > > > > help compiler optimization.
> > > >
> > > > The code is different in mmotm tree (see mm: page_alloc: rearrange
> > > > watermark checking in get_page_from_freelist)
> > >
> > > Yes, please rebase this on top.
> > >
> > > > Besides that, make sure you provide numbers which prove your claims
> > > > about performance optimizations.
> > >
> > > Isn't that a bit overkill? We know it's a likely path (we would
> > > deadlock constantly if a sizable portion of allocations were to ignore
> > > the watermarks). Does he have to justify that likely in general makes
> > > sense?
> >
> > That was more a generic comment. If there is a claim that something
> > would be faster it would be nice to back that claim by some numbers
> > (e.g. smaller hot path).
> >
> > In this particular case, unlikely(alloc_flags & ALLOC_NO_WATERMARKS)
> > doesn't make any change to the generated code with gcc 4.8.1 resp.
> > 4.3.4 I have here.
> > Maybe other versions of gcc would benefit from the hint but changelog
> > didn't tell us. I wouldn't add the anotation if it doesn't make any
> > difference for the resulting code.
>
> Hmm, Is there no change with gcc 4.8.1 and 4.3.4?
>
> I found a change with gcc 4.6.3 and v3.10 kernel.

Ah... I did a test on mmotm (Johannes's git) and found that this patch
doesn't make any effect. I guess, a change from Johannes ('rearrange
watermark checking in get_page_from_freelist') already makes better code
for !ALLOC_NO_WATERMARKS case. IMHO, although there is no effect, it is
better to add likely macro, because arrangement can be changed from time
to time without any consideration of assembly code generation. How about
your opinion, Johannes and Michal?

Thanks.

2013-08-05 08:59:25

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Mon 05-08-13 17:50:41, Joonsoo Kim wrote:
[...]
> IMHO, although there is no effect, it is better to add likely macro,
> because arrangement can be changed from time to time without any
> consideration of assembly code generation. How about your opinion,
> Johannes and Michal?

This is a matter of taste. I do not like new *likely annotations if they
do not make difference. But no strong objections if others like it.
--
Michal Hocko
SUSE Labs

2013-08-05 20:52:21

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

On Mon, Aug 05, 2013 at 05:50:41PM +0900, Joonsoo Kim wrote:
> On Mon, Aug 05, 2013 at 05:10:08PM +0900, Joonsoo Kim wrote:
> > Hello, Michal.
> >
> > On Fri, Aug 02, 2013 at 11:36:07PM +0200, Michal Hocko wrote:
> > > On Fri 02-08-13 16:47:10, Johannes Weiner wrote:
> > > > On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote:
> > > > > On Fri 02-08-13 11:07:56, Joonsoo Kim wrote:
> > > > > > We rarely allocate a page with ALLOC_NO_WATERMARKS and it is used
> > > > > > in slow path. For making fast path more faster, add likely macro to
> > > > > > help compiler optimization.
> > > > >
> > > > > The code is different in mmotm tree (see mm: page_alloc: rearrange
> > > > > watermark checking in get_page_from_freelist)
> > > >
> > > > Yes, please rebase this on top.
> > > >
> > > > > Besides that, make sure you provide numbers which prove your claims
> > > > > about performance optimizations.
> > > >
> > > > Isn't that a bit overkill? We know it's a likely path (we would
> > > > deadlock constantly if a sizable portion of allocations were to ignore
> > > > the watermarks). Does he have to justify that likely in general makes
> > > > sense?
> > >
> > > That was more a generic comment. If there is a claim that something
> > > would be faster it would be nice to back that claim by some numbers
> > > (e.g. smaller hot path).
> > >
> > > In this particular case, unlikely(alloc_flags & ALLOC_NO_WATERMARKS)
> > > doesn't make any change to the generated code with gcc 4.8.1 resp.
> > > 4.3.4 I have here.
> > > Maybe other versions of gcc would benefit from the hint but changelog
> > > didn't tell us. I wouldn't add the anotation if it doesn't make any
> > > difference for the resulting code.
> >
> > Hmm, Is there no change with gcc 4.8.1 and 4.3.4?
> >
> > I found a change with gcc 4.6.3 and v3.10 kernel.
>
> Ah... I did a test on mmotm (Johannes's git) and found that this patch
> doesn't make any effect. I guess, a change from Johannes ('rearrange
> watermark checking in get_page_from_freelist') already makes better code
> for !ALLOC_NO_WATERMARKS case. IMHO, although there is no effect, it is
> better to add likely macro, because arrangement can be changed from time
> to time without any consideration of assembly code generation. How about
> your opinion, Johannes and Michal?

I'm not against it. It does not really matter if gcc gets it right by
accident right now and the annotation does not have an immediate
effect. It's a compiler hint and we want gcc to know it so it doesn't
ever assume otherwise in the future. Yes, nobody will re-evaluate if
the conditional still generates the jumps correctly after shifting the
code around.

Thanks!