2009-03-20 10:03:16

by Mel Gorman

Subject: [PATCH 00/25] Cleanup and optimise the page allocator V5

Here is V5 of the cleanup and optimisation of the page allocator and it should
be ready for wider testing. Please consider it for merging as Pass 1 of
making the page allocator faster. Further passes will follow once this one
has had a bit of exercise. The patchset completed a series of tests based on
the latest MMOTM.

Performance is improved in a variety of cases but note it is not universal,
due to lock contention which I'll explain later. Text is reduced by 497 bytes
on the x86-64 config I checked. 18.78% fewer clock cycles were sampled in the
page allocator paths, excluding zeroing which is roughly the same in either
kernel. L1 cache misses are reduced by about 7.36% and L2 cache misses by
17.91%, so cache misses incurred within the allocator itself are reduced.

Lock contention on some machines goes up for the zone->lru_lock and
zone->lock locks, which can regress some workloads even though others on
the same machine still go faster. For netperf, a lock called slock-AF_INET
seemed very important although I didn't look too closely beyond noting that
contention went up. The zone->lock gets hammered a lot by high-order allocs
and frees coming from SLUB which are not covered by the PCP allocator in
this patchset. Why zone->lru_lock contention goes up is less clear as it is
taken for page cache releases, but overall contention may be up because CPUs
are spending less time with interrupts disabled and more time trying to do
real work while contending on the locks.

Changes since V4
o Drop the more controversial patches for now and focus on the "obvious win"
material.
o Add reviewed-by notes
o Fix changelog entry to say __rmqueue_fallback instead of __rmqueue
o Add unlikely() for the clearMlocked check
o Change where PGFREE is accounted in free_hot_cold_page() to have symmetry
with __free_pages_ok()
o Convert num_online_nodes() to use a static value so that callers do
not have to be individually updated (see the sketch after this list)
o Rebase to mmotm-2009-03-13
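
As a rough, hypothetical sketch of the num_online_nodes() idea (the
identifiers below are made up for illustration and are not the actual
symbols used by the patch):

/*
 * Cache the online node count so the common query is a plain load
 * rather than a bitmap weight on every call; refresh it in the rare
 * path that marks a node online.
 */
static int nr_online_nodes_cached __read_mostly = 1;

static inline int example_num_online_nodes(void)
{
	return nr_online_nodes_cached;	/* no nodes_weight() in fast paths */
}

static void example_node_set_online(int nid)
{
	node_set_state(nid, N_ONLINE);			/* existing bitmap update */
	nr_online_nodes_cached = num_online_nodes();	/* recount once here */
}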

Changes since V3
o Drop the more controversial patches for now and focus on the "obvious win"
material
o Add reviewed-by notes
o Fix changelog entry to say __rmqueue_fallback instead of __rmqueue
o Add unlikely() for the clearMlocked check
o Change where PGFREE is accounted in free_hot_cold_page() to have symmetry
with __free_pages_ok()

Changes since V2
o Remove branches by treating watermark flags as array indices (see the
sketch after this list)
o Remove branch by assuming __GFP_HIGH == ALLOC_HIGH
o Do not check for compound on every page free
o Remove branch by always ensuring the migratetype is known on free
o Simplify buffered_rmqueue further
o Reintroduce improved version of batched bulk free of pcp pages
o Use allocation flags as an index to zone watermarks
o Work out __GFP_COLD only once
o Reduce the number of times zone stats are updated
o Do not dump reserve pages back into the allocator. Instead treat them
as MOVABLE so that MIGRATE_RESERVE gets used on the max-order-overlapped
boundaries without causing trouble
o Allow pages up to PAGE_ALLOC_COSTLY_ORDER to use the per-cpu allocator.
order-1 allocations in particular are frequent enough to justify this
o Rearrange inlining such that the hot-path is inlined but not in a way
that increases the text size of the page allocator
o Make the check for needing additional zonelist filtering due to NUMA
or cpusets as light as possible
o Do not destroy compound pages going to the PCP lists
o Delay the merging of buddies until a high-order allocation needs them
or anti-fragmentation is being forced to fallback
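
As an aside, the watermark-as-index trick above works roughly as in the
following sketch. The identifiers are hypothetical and not the ones used by
the series; the point is only that the low bits of the allocation flags can
select the watermark directly instead of going through an if/else chain.

/*
 * Hypothetical sketch only: encode the watermark selection in the low
 * bits of the allocation flags so they double as an array index.
 */
#define EXAMPLE_WMARK_MIN	0x00
#define EXAMPLE_WMARK_LOW	0x01
#define EXAMPLE_WMARK_HIGH	0x02
#define EXAMPLE_WMARK_MASK	0x03	/* strip the non-index bits */

struct example_zone {
	unsigned long watermark[3];	/* min, low and high thresholds */
};

static unsigned long example_wmark_pages(struct example_zone *z, int alloc_flags)
{
	/* no branch on which watermark applies */
	return z->watermark[alloc_flags & EXAMPLE_WMARK_MASK];
}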

Changes since V1
o Remove the ifdef CONFIG_CPUSETS from inside get_page_from_freelist()
o Use non-lock bit operations for clearing the mlock flag
o Factor out alloc_flags calculation so it is only done once (Peter)
o Make gfp.h a bit prettier and clear-cut (Peter)
o Instead of deleting a debugging check, replace page_count() in the
free path with a version that does not check for compound pages (Nick)
o Drop the alteration for hot/cold page freeing until we know if it
helps or not


2009-03-20 10:02:53

by Mel Gorman

Subject: [PATCH 02/25] Do not sanity check order in the fast path

No user of the allocator API should be passing in an order >= MAX_ORDER
but we check for it on each and every allocation. Delete this check and
make it a VM_BUG_ON check further down the call path.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
include/linux/gfp.h | 6 ------
mm/page_alloc.c | 2 ++
2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index dcf0ab8..8736047 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -181,9 +181,6 @@ __alloc_pages(gfp_t gfp_mask, unsigned int order,
static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
unsigned int order)
{
- if (unlikely(order >= MAX_ORDER))
- return NULL;
-
/* Unknown node is current node */
if (nid < 0)
nid = numa_node_id();
@@ -197,9 +194,6 @@ extern struct page *alloc_pages_current(gfp_t gfp_mask, unsigned order);
static inline struct page *
alloc_pages(gfp_t gfp_mask, unsigned int order)
{
- if (unlikely(order >= MAX_ORDER))
- return NULL;
-
return alloc_pages_current(gfp_mask, order);
}
extern struct page *alloc_page_vma(gfp_t gfp_mask,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0671b3f..dd87dad 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1407,6 +1407,8 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,

classzone_idx = zone_idx(preferred_zone);

+ VM_BUG_ON(order >= MAX_ORDER);
+
zonelist_scan:
/*
* Scan zonelist, looking for a zone with enough free.
--
1.5.6.5

2009-03-20 10:03:35

by Mel Gorman

Subject: [PATCH 01/25] Replace __alloc_pages_internal() with __alloc_pages_nodemask()

__alloc_pages_internal is the core page allocator function but
essentially it is an alias of __alloc_pages_nodemask. Naming a publicly
available and exported function "internal" is also a bit ugly. This
patch renames __alloc_pages_internal() to __alloc_pages_nodemask() and
deletes the old __alloc_pages_nodemask() wrapper.

Warning - This patch renames an exported symbol. No in-tree driver is
affected, but external drivers calling __alloc_pages_internal() should
change the call to __alloc_pages_nodemask() without any alteration of
parameters.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
include/linux/gfp.h | 12 ++----------
mm/page_alloc.c | 4 ++--
2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index dd20cd7..dcf0ab8 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -168,24 +168,16 @@ static inline void arch_alloc_page(struct page *page, int order) { }
#endif

struct page *
-__alloc_pages_internal(gfp_t gfp_mask, unsigned int order,
+__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, nodemask_t *nodemask);

static inline struct page *
__alloc_pages(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist)
{
- return __alloc_pages_internal(gfp_mask, order, zonelist, NULL);
+ return __alloc_pages_nodemask(gfp_mask, order, zonelist, NULL);
}

-static inline struct page *
-__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
- struct zonelist *zonelist, nodemask_t *nodemask)
-{
- return __alloc_pages_internal(gfp_mask, order, zonelist, nodemask);
-}
-
-
static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
unsigned int order)
{
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5c44ed4..0671b3f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1464,7 +1464,7 @@ try_next_zone:
* This is the 'heart' of the zoned buddy allocator.
*/
struct page *
-__alloc_pages_internal(gfp_t gfp_mask, unsigned int order,
+__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, nodemask_t *nodemask)
{
const gfp_t wait = gfp_mask & __GFP_WAIT;
@@ -1670,7 +1670,7 @@ nopage:
got_pg:
return page;
}
-EXPORT_SYMBOL(__alloc_pages_internal);
+EXPORT_SYMBOL(__alloc_pages_nodemask);

/*
* Common helper functions.
--
1.5.6.5

2009-03-20 10:03:52

by Mel Gorman

Subject: [PATCH 03/25] Do not check NUMA node ID when the caller knows the node is valid

Callers of alloc_pages_node() can optionally specify -1 as a node to mean
"allocate from the current node". However, a number of the callers in fast
paths know for a fact their node is valid. To avoid a comparison and branch,
this patch adds alloc_pages_exact_node() that only checks the nid with
VM_BUG_ON(). Callers that know their node is valid are then converted.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
arch/ia64/hp/common/sba_iommu.c | 2 +-
arch/ia64/kernel/mca.c | 3 +--
arch/ia64/kernel/uncached.c | 3 ++-
arch/ia64/sn/pci/pci_dma.c | 3 ++-
arch/powerpc/platforms/cell/ras.c | 2 +-
arch/x86/kvm/vmx.c | 2 +-
drivers/misc/sgi-gru/grufile.c | 2 +-
drivers/misc/sgi-xp/xpc_uv.c | 2 +-
include/linux/gfp.h | 9 +++++++++
include/linux/mm.h | 1 -
kernel/profile.c | 8 ++++----
mm/filemap.c | 2 +-
mm/hugetlb.c | 4 ++--
mm/mempolicy.c | 2 +-
mm/migrate.c | 2 +-
mm/slab.c | 4 ++--
mm/slob.c | 4 ++--
17 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 6d5e6c5..66a3257 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -1116,7 +1116,7 @@ sba_alloc_coherent (struct device *dev, size_t size, dma_addr_t *dma_handle, gfp
#ifdef CONFIG_NUMA
{
struct page *page;
- page = alloc_pages_node(ioc->node == MAX_NUMNODES ?
+ page = alloc_pages_exact_node(ioc->node == MAX_NUMNODES ?
numa_node_id() : ioc->node, flags,
get_order(size));

diff --git a/arch/ia64/kernel/mca.c b/arch/ia64/kernel/mca.c
index bab1de2..2e614bd 100644
--- a/arch/ia64/kernel/mca.c
+++ b/arch/ia64/kernel/mca.c
@@ -1829,8 +1829,7 @@ ia64_mca_cpu_init(void *cpu_data)
data = mca_bootmem();
first_time = 0;
} else
- data = page_address(alloc_pages_node(numa_node_id(),
- GFP_KERNEL, get_order(sz)));
+ data = __get_free_pages(GFP_KERNEL, get_order(sz));
if (!data)
panic("Could not allocate MCA memory for cpu %d\n",
cpu);
diff --git a/arch/ia64/kernel/uncached.c b/arch/ia64/kernel/uncached.c
index 8eff8c1..6ba72ab 100644
--- a/arch/ia64/kernel/uncached.c
+++ b/arch/ia64/kernel/uncached.c
@@ -98,7 +98,8 @@ static int uncached_add_chunk(struct uncached_pool *uc_pool, int nid)

/* attempt to allocate a granule's worth of cached memory pages */

- page = alloc_pages_node(nid, GFP_KERNEL | __GFP_ZERO | GFP_THISNODE,
+ page = alloc_pages_exact_node(nid,
+ GFP_KERNEL | __GFP_ZERO | GFP_THISNODE,
IA64_GRANULE_SHIFT-PAGE_SHIFT);
if (!page) {
mutex_unlock(&uc_pool->add_chunk_mutex);
diff --git a/arch/ia64/sn/pci/pci_dma.c b/arch/ia64/sn/pci/pci_dma.c
index 863f501..2aa52de 100644
--- a/arch/ia64/sn/pci/pci_dma.c
+++ b/arch/ia64/sn/pci/pci_dma.c
@@ -91,7 +91,8 @@ void *sn_dma_alloc_coherent(struct device *dev, size_t size,
*/
node = pcibus_to_node(pdev->bus);
if (likely(node >=0)) {
- struct page *p = alloc_pages_node(node, flags, get_order(size));
+ struct page *p = alloc_pages_exact_node(node,
+ flags, get_order(size));

if (likely(p))
cpuaddr = page_address(p);
diff --git a/arch/powerpc/platforms/cell/ras.c b/arch/powerpc/platforms/cell/ras.c
index 5f961c4..16ba671 100644
--- a/arch/powerpc/platforms/cell/ras.c
+++ b/arch/powerpc/platforms/cell/ras.c
@@ -122,7 +122,7 @@ static int __init cbe_ptcal_enable_on_node(int nid, int order)

area->nid = nid;
area->order = order;
- area->pages = alloc_pages_node(area->nid, GFP_KERNEL, area->order);
+ area->pages = alloc_pages_exact_node(area->nid, GFP_KERNEL, area->order);

if (!area->pages)
goto out_free_area;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7611af5..cca119a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1244,7 +1244,7 @@ static struct vmcs *alloc_vmcs_cpu(int cpu)
struct page *pages;
struct vmcs *vmcs;

- pages = alloc_pages_node(node, GFP_KERNEL, vmcs_config.order);
+ pages = alloc_pages_exact_node(node, GFP_KERNEL, vmcs_config.order);
if (!pages)
return NULL;
vmcs = page_address(pages);
diff --git a/drivers/misc/sgi-gru/grufile.c b/drivers/misc/sgi-gru/grufile.c
index 6509838..52d4160 100644
--- a/drivers/misc/sgi-gru/grufile.c
+++ b/drivers/misc/sgi-gru/grufile.c
@@ -309,7 +309,7 @@ static int gru_init_tables(unsigned long gru_base_paddr, void *gru_base_vaddr)
pnode = uv_node_to_pnode(nid);
if (gru_base[bid])
continue;
- page = alloc_pages_node(nid, GFP_KERNEL, order);
+ page = alloc_pages_exact_node(nid, GFP_KERNEL, order);
if (!page)
goto fail;
gru_base[bid] = page_address(page);
diff --git a/drivers/misc/sgi-xp/xpc_uv.c b/drivers/misc/sgi-xp/xpc_uv.c
index 29c0502..0563350 100644
--- a/drivers/misc/sgi-xp/xpc_uv.c
+++ b/drivers/misc/sgi-xp/xpc_uv.c
@@ -184,7 +184,7 @@ xpc_create_gru_mq_uv(unsigned int mq_size, int cpu, char *irq_name,
mq->mmr_blade = uv_cpu_to_blade_id(cpu);

nid = cpu_to_node(cpu);
- page = alloc_pages_node(nid, GFP_KERNEL | __GFP_ZERO | GFP_THISNODE,
+ page = alloc_pages_exact_node(nid, GFP_KERNEL | __GFP_ZERO | GFP_THISNODE,
pg_order);
if (page == NULL) {
dev_err(xpc_part, "xpc_create_gru_mq_uv() failed to alloc %d "
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 8736047..59eb093 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -4,6 +4,7 @@
#include <linux/mmzone.h>
#include <linux/stddef.h>
#include <linux/linkage.h>
+#include <linux/mmdebug.h>

struct vm_area_struct;

@@ -188,6 +189,14 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
}

+static inline struct page *alloc_pages_exact_node(int nid, gfp_t gfp_mask,
+ unsigned int order)
+{
+ VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
+
+ return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
+}
+
#ifdef CONFIG_NUMA
extern struct page *alloc_pages_current(gfp_t gfp_mask, unsigned order);

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 065cdf8..565e7b2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -7,7 +7,6 @@

#include <linux/gfp.h>
#include <linux/list.h>
-#include <linux/mmdebug.h>
#include <linux/mmzone.h>
#include <linux/rbtree.h>
#include <linux/prio_tree.h>
diff --git a/kernel/profile.c b/kernel/profile.c
index 7724e04..62e08db 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -371,7 +371,7 @@ static int __cpuinit profile_cpu_callback(struct notifier_block *info,
node = cpu_to_node(cpu);
per_cpu(cpu_profile_flip, cpu) = 0;
if (!per_cpu(cpu_profile_hits, cpu)[1]) {
- page = alloc_pages_node(node,
+ page = alloc_pages_exact_node(node,
GFP_KERNEL | __GFP_ZERO,
0);
if (!page)
@@ -379,7 +379,7 @@ static int __cpuinit profile_cpu_callback(struct notifier_block *info,
per_cpu(cpu_profile_hits, cpu)[1] = page_address(page);
}
if (!per_cpu(cpu_profile_hits, cpu)[0]) {
- page = alloc_pages_node(node,
+ page = alloc_pages_exact_node(node,
GFP_KERNEL | __GFP_ZERO,
0);
if (!page)
@@ -570,14 +570,14 @@ static int create_hash_tables(void)
int node = cpu_to_node(cpu);
struct page *page;

- page = alloc_pages_node(node,
+ page = alloc_pages_exact_node(node,
GFP_KERNEL | __GFP_ZERO | GFP_THISNODE,
0);
if (!page)
goto out_cleanup;
per_cpu(cpu_profile_hits, cpu)[1]
= (struct profile_hit *)page_address(page);
- page = alloc_pages_node(node,
+ page = alloc_pages_exact_node(node,
GFP_KERNEL | __GFP_ZERO | GFP_THISNODE,
0);
if (!page)
diff --git a/mm/filemap.c b/mm/filemap.c
index 23acefe..2523d95 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -519,7 +519,7 @@ struct page *__page_cache_alloc(gfp_t gfp)
{
if (cpuset_do_page_mem_spread()) {
int n = cpuset_mem_spread_node();
- return alloc_pages_node(n, gfp, 0);
+ return alloc_pages_exact_node(n, gfp, 0);
}
return alloc_pages(gfp, 0);
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 107da3d..1e99997 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -630,7 +630,7 @@ static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
if (h->order >= MAX_ORDER)
return NULL;

- page = alloc_pages_node(nid,
+ page = alloc_pages_exact_node(nid,
htlb_alloc_mask|__GFP_COMP|__GFP_THISNODE|
__GFP_REPEAT|__GFP_NOWARN,
huge_page_order(h));
@@ -649,7 +649,7 @@ static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
* Use a helper variable to find the next node and then
* copy it back to hugetlb_next_nid afterwards:
* otherwise there's a window in which a racer might
- * pass invalid nid MAX_NUMNODES to alloc_pages_node.
+ * pass invalid nid MAX_NUMNODES to alloc_pages_exact_node.
* But we don't need to use a spin_lock here: it really
* doesn't matter if occasionally a racer chooses the
* same nid as we do. Move nid forward in the mask even
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3eb4a6f..341fbca 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -767,7 +767,7 @@ static void migrate_page_add(struct page *page, struct list_head *pagelist,

static struct page *new_node_page(struct page *page, unsigned long node, int **x)
{
- return alloc_pages_node(node, GFP_HIGHUSER_MOVABLE, 0);
+ return alloc_pages_exact_node(node, GFP_HIGHUSER_MOVABLE, 0);
}

/*
diff --git a/mm/migrate.c b/mm/migrate.c
index a9eff3f..6bda9c2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -802,7 +802,7 @@ static struct page *new_page_node(struct page *p, unsigned long private,

*result = &pm->status;

- return alloc_pages_node(pm->node,
+ return alloc_pages_exact_node(pm->node,
GFP_HIGHUSER_MOVABLE | GFP_THISNODE, 0);
}

diff --git a/mm/slab.c b/mm/slab.c
index 4d00855..e7f1ded 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1680,7 +1680,7 @@ static void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags, int nodeid)
if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
flags |= __GFP_RECLAIMABLE;

- page = alloc_pages_node(nodeid, flags, cachep->gfporder);
+ page = alloc_pages_exact_node(nodeid, flags, cachep->gfporder);
if (!page)
return NULL;

@@ -3210,7 +3210,7 @@ retry:
if (local_flags & __GFP_WAIT)
local_irq_enable();
kmem_flagcheck(cache, flags);
- obj = kmem_getpages(cache, local_flags, -1);
+ obj = kmem_getpages(cache, local_flags, numa_node_id());
if (local_flags & __GFP_WAIT)
local_irq_disable();
if (obj) {
diff --git a/mm/slob.c b/mm/slob.c
index 52bc8a2..d646a4c 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -46,7 +46,7 @@
* NUMA support in SLOB is fairly simplistic, pushing most of the real
* logic down to the page allocator, and simply doing the node accounting
* on the upper levels. In the event that a node id is explicitly
- * provided, alloc_pages_node() with the specified node id is used
+ * provided, alloc_pages_exact_node() with the specified node id is used
* instead. The common case (or when the node id isn't explicitly provided)
* will default to the current node, as per numa_node_id().
*
@@ -236,7 +236,7 @@ static void *slob_new_page(gfp_t gfp, int order, int node)

#ifdef CONFIG_NUMA
if (node != -1)
- page = alloc_pages_node(node, gfp, order);
+ page = alloc_pages_exact_node(node, gfp, order);
else
#endif
page = alloc_pages(gfp, order);
--
1.5.6.5

2009-03-20 10:04:40

by Mel Gorman

Subject: [PATCH 05/25] Break up the allocator entry point into fast and slow paths

The core of the page allocator is one giant function which allocates memory
on the stack and makes calculations that may not be needed for every
allocation. This patch breaks up the allocator path into fast and slow
paths for clarity. Note the slow paths are still inlined but the entry is
marked unlikely. If they were not inlined, it would actually increase text
size as there is only one call site.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 348 ++++++++++++++++++++++++++++++++++---------------------
1 files changed, 218 insertions(+), 130 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8024abc..7ba7705 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1462,45 +1462,171 @@ try_next_zone:
return page;
}

-/*
- * This is the 'heart' of the zoned buddy allocator.
- */
-struct page *
-__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
- struct zonelist *zonelist, nodemask_t *nodemask)
+static inline int
+should_alloc_retry(gfp_t gfp_mask, unsigned int order,
+ unsigned long pages_reclaimed)
{
- const gfp_t wait = gfp_mask & __GFP_WAIT;
- enum zone_type high_zoneidx = gfp_zone(gfp_mask);
- struct zoneref *z;
- struct zone *zone;
- struct page *page;
- struct reclaim_state reclaim_state;
- struct task_struct *p = current;
- int do_retry;
- int alloc_flags;
- unsigned long did_some_progress;
- unsigned long pages_reclaimed = 0;
+ /* Do not loop if specifically requested */
+ if (gfp_mask & __GFP_NORETRY)
+ return 0;

- might_sleep_if(wait);
+ /*
+ * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
+ * means __GFP_NOFAIL, but that may not be true in other
+ * implementations.
+ */
+ if (order <= PAGE_ALLOC_COSTLY_ORDER)
+ return 1;

- if (should_fail_alloc_page(gfp_mask, order))
- return NULL;
+ /*
+ * For order > PAGE_ALLOC_COSTLY_ORDER, if __GFP_REPEAT is
+ * specified, then we retry until we no longer reclaim any pages
+ * (above), or we've reclaimed an order of pages at least as
+ * large as the allocation's order. In both cases, if the
+ * allocation still fails, we stop retrying.
+ */
+ if (gfp_mask & __GFP_REPEAT && pages_reclaimed < (1 << order))
+ return 1;

- /* the list of zones suitable for gfp_mask */
- z = zonelist->_zonerefs;
- if (unlikely(!z->zone)) {
- /*
- * Happens if we have an empty zonelist as a result of
- * GFP_THISNODE being used on a memoryless node
- */
+ /*
+ * Don't let big-order allocations loop unless the caller
+ * explicitly requests that.
+ */
+ if (gfp_mask & __GFP_NOFAIL)
+ return 1;
+
+ return 0;
+}
+
+static inline struct page *
+__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
+ struct zonelist *zonelist, enum zone_type high_zoneidx,
+ nodemask_t *nodemask)
+{
+ struct page *page;
+
+ /* Acquire the OOM killer lock for the zones in zonelist */
+ if (!try_set_zone_oom(zonelist, gfp_mask)) {
+ schedule_timeout_uninterruptible(1);
return NULL;
}

-restart:
- page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
- zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET);
+ /*
+ * Go through the zonelist yet one more time, keep very high watermark
+ * here, this is only to catch a parallel oom killing, we must fail if
+ * we're still under heavy pressure.
+ */
+ page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask,
+ order, zonelist, high_zoneidx,
+ ALLOC_WMARK_HIGH|ALLOC_CPUSET);
if (page)
- goto got_pg;
+ goto out;
+
+ /* The OOM killer will not help higher order allocs */
+ if (order > PAGE_ALLOC_COSTLY_ORDER)
+ goto out;
+
+ /* Exhausted what can be done so it's blamo time */
+ out_of_memory(zonelist, gfp_mask, order);
+
+out:
+ clear_zonelist_oom(zonelist, gfp_mask);
+ return page;
+}
+
+/* The really slow allocator path where we enter direct reclaim */
+static inline struct page *
+__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
+ struct zonelist *zonelist, enum zone_type high_zoneidx,
+ nodemask_t *nodemask, int alloc_flags, unsigned long *did_some_progress)
+{
+ struct page *page = NULL;
+ struct reclaim_state reclaim_state;
+ struct task_struct *p = current;
+
+ cond_resched();
+
+ /* We now go into synchronous reclaim */
+ cpuset_memory_pressure_bump();
+
+ /*
+ * The task's cpuset might have expanded its set of allowable nodes
+ */
+ cpuset_update_task_memory_state();
+ p->flags |= PF_MEMALLOC;
+ reclaim_state.reclaimed_slab = 0;
+ p->reclaim_state = &reclaim_state;
+
+ *did_some_progress = try_to_free_pages(zonelist, order, gfp_mask);
+
+ p->reclaim_state = NULL;
+ p->flags &= ~PF_MEMALLOC;
+
+ cond_resched();
+
+ if (order != 0)
+ drain_all_pages();
+
+ if (likely(*did_some_progress))
+ page = get_page_from_freelist(gfp_mask, nodemask, order,
+ zonelist, high_zoneidx, alloc_flags);
+ return page;
+}
+
+static inline int
+is_allocation_high_priority(struct task_struct *p, gfp_t gfp_mask)
+{
+ if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
+ && !in_interrupt())
+ if (!(gfp_mask & __GFP_NOMEMALLOC))
+ return 1;
+ return 0;
+}
+
+/*
+ * This is called in the allocator slow-path if the allocation request is of
+ * sufficient urgency to ignore watermarks and take other desperate measures
+ */
+static inline struct page *
+__alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
+ struct zonelist *zonelist, enum zone_type high_zoneidx,
+ nodemask_t *nodemask)
+{
+ struct page *page;
+
+ do {
+ page = get_page_from_freelist(gfp_mask, nodemask, order,
+ zonelist, high_zoneidx, ALLOC_NO_WATERMARKS);
+
+ if (!page && gfp_mask & __GFP_NOFAIL)
+ congestion_wait(WRITE, HZ/50);
+ } while (!page && (gfp_mask & __GFP_NOFAIL));
+
+ return page;
+}
+
+static inline
+void wake_all_kswapd(unsigned int order, struct zonelist *zonelist,
+ enum zone_type high_zoneidx)
+{
+ struct zoneref *z;
+ struct zone *zone;
+
+ for_each_zone_zonelist(zone, z, zonelist, high_zoneidx)
+ wakeup_kswapd(zone, order);
+}
+
+static inline struct page *
+__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
+ struct zonelist *zonelist, enum zone_type high_zoneidx,
+ nodemask_t *nodemask)
+{
+ const gfp_t wait = gfp_mask & __GFP_WAIT;
+ struct page *page = NULL;
+ int alloc_flags;
+ unsigned long pages_reclaimed = 0;
+ unsigned long did_some_progress;
+ struct task_struct *p = current;

/*
* GFP_THISNODE (meaning __GFP_THISNODE, __GFP_NORETRY and
@@ -1513,8 +1639,7 @@ restart:
if (NUMA_BUILD && (gfp_mask & GFP_THISNODE) == GFP_THISNODE)
goto nopage;

- for_each_zone_zonelist(zone, z, zonelist, high_zoneidx)
- wakeup_kswapd(zone, order);
+ wake_all_kswapd(order, zonelist, high_zoneidx);

/*
* OK, we're below the kswapd watermark and have kicked background
@@ -1534,6 +1659,7 @@ restart:
if (wait)
alloc_flags |= ALLOC_CPUSET;

+restart:
/*
* Go through the zonelist again. Let __GFP_HIGH and allocations
* coming from realtime tasks go deeper into reserves.
@@ -1547,118 +1673,47 @@ restart:
if (page)
goto got_pg;

- /* This allocation should allow future memory freeing. */
-
-rebalance:
- if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
- && !in_interrupt()) {
- if (!(gfp_mask & __GFP_NOMEMALLOC)) {
-nofail_alloc:
- /* go through the zonelist yet again, ignoring mins */
- page = get_page_from_freelist(gfp_mask, nodemask, order,
- zonelist, high_zoneidx, ALLOC_NO_WATERMARKS);
- if (page)
- goto got_pg;
- if (gfp_mask & __GFP_NOFAIL) {
- congestion_wait(WRITE, HZ/50);
- goto nofail_alloc;
- }
- }
- goto nopage;
- }
+ /* Allocate without watermarks if the context allows */
+ if (is_allocation_high_priority(p, gfp_mask))
+ page = __alloc_pages_high_priority(gfp_mask, order,
+ zonelist, high_zoneidx, nodemask);
+ if (page)
+ goto got_pg;

/* Atomic allocations - we can't balance anything */
if (!wait)
goto nopage;

- cond_resched();
+ /* Try direct reclaim and then allocating */
+ page = __alloc_pages_direct_reclaim(gfp_mask, order,
+ zonelist, high_zoneidx,
+ nodemask,
+ alloc_flags, &did_some_progress);
+ if (page)
+ goto got_pg;

- /* We now go into synchronous reclaim */
- cpuset_memory_pressure_bump();
/*
- * The task's cpuset might have expanded its set of allowable nodes
+ * If we failed to make any progress reclaiming, then we are
+ * running out of options and have to consider going OOM
*/
- cpuset_update_task_memory_state();
- p->flags |= PF_MEMALLOC;
- reclaim_state.reclaimed_slab = 0;
- p->reclaim_state = &reclaim_state;
-
- did_some_progress = try_to_free_pages(zonelist, order, gfp_mask);
-
- p->reclaim_state = NULL;
- p->flags &= ~PF_MEMALLOC;
-
- cond_resched();
-
- if (order != 0)
- drain_all_pages();
+ if (!did_some_progress) {
+ if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
+ page = __alloc_pages_may_oom(gfp_mask, order,
+ zonelist, high_zoneidx,
+ nodemask);
+ if (page)
+ goto got_pg;

- if (likely(did_some_progress)) {
- page = get_page_from_freelist(gfp_mask, nodemask, order,
- zonelist, high_zoneidx, alloc_flags);
- if (page)
- goto got_pg;
- } else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
- if (!try_set_zone_oom(zonelist, gfp_mask)) {
- schedule_timeout_uninterruptible(1);
goto restart;
}
-
- /*
- * Go through the zonelist yet one more time, keep
- * very high watermark here, this is only to catch
- * a parallel oom killing, we must fail if we're still
- * under heavy pressure.
- */
- page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask,
- order, zonelist, high_zoneidx,
- ALLOC_WMARK_HIGH|ALLOC_CPUSET);
- if (page) {
- clear_zonelist_oom(zonelist, gfp_mask);
- goto got_pg;
- }
-
- /* The OOM killer will not help higher order allocs so fail */
- if (order > PAGE_ALLOC_COSTLY_ORDER) {
- clear_zonelist_oom(zonelist, gfp_mask);
- goto nopage;
- }
-
- out_of_memory(zonelist, gfp_mask, order);
- clear_zonelist_oom(zonelist, gfp_mask);
- goto restart;
}

- /*
- * Don't let big-order allocations loop unless the caller explicitly
- * requests that. Wait for some write requests to complete then retry.
- *
- * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
- * means __GFP_NOFAIL, but that may not be true in other
- * implementations.
- *
- * For order > PAGE_ALLOC_COSTLY_ORDER, if __GFP_REPEAT is
- * specified, then we retry until we no longer reclaim any pages
- * (above), or we've reclaimed an order of pages at least as
- * large as the allocation's order. In both cases, if the
- * allocation still fails, we stop retrying.
- */
+ /* Check if we should retry the allocation */
pages_reclaimed += did_some_progress;
- do_retry = 0;
- if (!(gfp_mask & __GFP_NORETRY)) {
- if (order <= PAGE_ALLOC_COSTLY_ORDER) {
- do_retry = 1;
- } else {
- if (gfp_mask & __GFP_REPEAT &&
- pages_reclaimed < (1 << order))
- do_retry = 1;
- }
- if (gfp_mask & __GFP_NOFAIL)
- do_retry = 1;
- }
- if (do_retry) {
+ if (should_alloc_retry(gfp_mask, order, pages_reclaimed)) {
+ /* Wait for some write requests to complete then retry */
congestion_wait(WRITE, HZ/50);
- goto rebalance;
+ goto restart;
}

nopage:
@@ -1671,6 +1726,39 @@ nopage:
}
got_pg:
return page;
+
+}
+
+/*
+ * This is the 'heart' of the zoned buddy allocator.
+ */
+struct page *
+__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
+ struct zonelist *zonelist, nodemask_t *nodemask)
+{
+ enum zone_type high_zoneidx = gfp_zone(gfp_mask);
+ struct page *page;
+
+ might_sleep_if(gfp_mask & __GFP_WAIT);
+
+ if (should_fail_alloc_page(gfp_mask, order))
+ return NULL;
+
+ /*
+ * Check the zones suitable for the gfp_mask contain at least one
+ * valid zone. It's possible to have an empty zonelist as a result
+ * of GFP_THISNODE and a memoryless node
+ */
+ if (unlikely(!zonelist->_zonerefs->zone))
+ return NULL;
+
+ page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
+ zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET);
+ if (unlikely(!page))
+ page = __alloc_pages_slowpath(gfp_mask, order,
+ zonelist, high_zoneidx, nodemask);
+
+ return page;
}
EXPORT_SYMBOL(__alloc_pages_nodemask);

--
1.5.6.5

2009-03-20 10:04:18

by Mel Gorman

Subject: [PATCH 04/25] Check only once if the zonelist is suitable for the allocation

It is possible with __GFP_THISNODE that no zones are suitable. This
patch makes sure the check is only made once.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dd87dad..8024abc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1486,9 +1486,8 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return NULL;

-restart:
- z = zonelist->_zonerefs; /* the list of zones suitable for gfp_mask */
-
+ /* the list of zones suitable for gfp_mask */
+ z = zonelist->_zonerefs;
if (unlikely(!z->zone)) {
/*
* Happens if we have an empty zonelist as a result of
@@ -1497,6 +1496,7 @@ restart:
return NULL;
}

+restart:
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET);
if (page)
--
1.5.6.5

2009-03-20 10:05:04

by Mel Gorman

Subject: [PATCH 06/25] Move check for disabled anti-fragmentation out of fastpath

On low-memory systems, anti-fragmentation gets disabled as there is nothing
it can do and it would just incur overhead shuffling pages between lists
constantly. Currently the check is made in the free page fast path for every
page. This patch moves it to a slow path. On machines with low memory,
there will be a small amount of additional overhead as pages get shuffled
between lists but it should quickly settle.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
include/linux/mmzone.h | 3 ---
mm/page_alloc.c | 4 ++++
2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1aca6ce..ca000b8 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -50,9 +50,6 @@ extern int page_group_by_mobility_disabled;

static inline int get_pageblock_migratetype(struct page *page)
{
- if (unlikely(page_group_by_mobility_disabled))
- return MIGRATE_UNMOVABLE;
-
return get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
}

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7ba7705..d815c8f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -171,6 +171,10 @@ int page_group_by_mobility_disabled __read_mostly;

static void set_pageblock_migratetype(struct page *page, int migratetype)
{
+
+ if (unlikely(page_group_by_mobility_disabled))
+ migratetype = MIGRATE_UNMOVABLE;
+
set_pageblock_flags_group(page, (unsigned long)migratetype,
PB_migrate, PB_migrate_end);
}
--
1.5.6.5

2009-03-20 10:06:22

by Mel Gorman

Subject: [PATCH 09/25] Calculate the migratetype for allocation only once

The GFP mask is converted into a migratetype when deciding which pagelist to
take a page from. However, this happens multiple times per allocation, at
least once per zone traversed. Calculate it once.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 43 ++++++++++++++++++++++++++-----------------
1 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 78e1d8e..8771de3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1067,13 +1067,13 @@ void split_page(struct page *page, unsigned int order)
* or two.
*/
static struct page *buffered_rmqueue(struct zone *preferred_zone,
- struct zone *zone, int order, gfp_t gfp_flags)
+ struct zone *zone, int order, gfp_t gfp_flags,
+ int migratetype)
{
unsigned long flags;
struct page *page;
int cold = !!(gfp_flags & __GFP_COLD);
int cpu;
- int migratetype = allocflags_to_migratetype(gfp_flags);

again:
cpu = get_cpu();
@@ -1399,7 +1399,7 @@ static void zlc_mark_zone_full(struct zonelist *zonelist, struct zoneref *z)
static struct page *
get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
struct zonelist *zonelist, int high_zoneidx, int alloc_flags,
- struct zone *preferred_zone)
+ struct zone *preferred_zone, int migratetype)
{
struct zoneref *z;
struct page *page = NULL;
@@ -1451,7 +1451,8 @@ zonelist_scan:
}
}

- page = buffered_rmqueue(preferred_zone, zone, order, gfp_mask);
+ page = buffered_rmqueue(preferred_zone, zone, order,
+ gfp_mask, migratetype);
if (page)
break;
this_zone_full:
@@ -1515,7 +1516,8 @@ should_alloc_retry(gfp_t gfp_mask, unsigned int order,
static inline struct page *
__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask, struct zone *preferred_zone)
+ nodemask_t *nodemask, struct zone *preferred_zone,
+ int migratetype)
{
struct page *page;

@@ -1533,7 +1535,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask,
order, zonelist, high_zoneidx,
ALLOC_WMARK_HIGH|ALLOC_CPUSET,
- preferred_zone);
+ preferred_zone, migratetype);
if (page)
goto out;

@@ -1554,7 +1556,7 @@ static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
nodemask_t *nodemask, int alloc_flags, struct zone *preferred_zone,
- unsigned long *did_some_progress)
+ int migratetype, unsigned long *did_some_progress)
{
struct page *page = NULL;
struct reclaim_state reclaim_state;
@@ -1586,7 +1588,8 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
if (likely(*did_some_progress))
page = get_page_from_freelist(gfp_mask, nodemask, order,
zonelist, high_zoneidx,
- alloc_flags, preferred_zone);
+ alloc_flags, preferred_zone,
+ migratetype);
return page;
}

@@ -1607,14 +1610,15 @@ is_allocation_high_priority(struct task_struct *p, gfp_t gfp_mask)
static inline struct page *
__alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask, struct zone *preferred_zone)
+ nodemask_t *nodemask, struct zone *preferred_zone,
+ int migratetype)
{
struct page *page;

do {
page = get_page_from_freelist(gfp_mask, nodemask, order,
zonelist, high_zoneidx, ALLOC_NO_WATERMARKS,
- preferred_zone);
+ preferred_zone, migratetype);

if (!page && gfp_mask & __GFP_NOFAIL)
congestion_wait(WRITE, HZ/50);
@@ -1637,7 +1641,8 @@ void wake_all_kswapd(unsigned int order, struct zonelist *zonelist,
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask, struct zone *preferred_zone)
+ nodemask_t *nodemask, struct zone *preferred_zone,
+ int migratetype)
{
const gfp_t wait = gfp_mask & __GFP_WAIT;
struct page *page = NULL;
@@ -1688,14 +1693,16 @@ restart:
*/
page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
high_zoneidx, alloc_flags,
- preferred_zone);
+ preferred_zone,
+ migratetype);
if (page)
goto got_pg;

/* Allocate without watermarks if the context allows */
if (is_allocation_high_priority(p, gfp_mask))
page = __alloc_pages_high_priority(gfp_mask, order,
- zonelist, high_zoneidx, nodemask, preferred_zone);
+ zonelist, high_zoneidx, nodemask, preferred_zone,
+ migratetype);
if (page)
goto got_pg;

@@ -1708,7 +1715,7 @@ restart:
zonelist, high_zoneidx,
nodemask,
alloc_flags, preferred_zone,
- &did_some_progress);
+ migratetype, &did_some_progress);
if (page)
goto got_pg;

@@ -1720,7 +1727,8 @@ restart:
if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
page = __alloc_pages_may_oom(gfp_mask, order,
zonelist, high_zoneidx,
- nodemask, preferred_zone);
+ nodemask, preferred_zone,
+ migratetype);
if (page)
goto got_pg;

@@ -1759,6 +1767,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
enum zone_type high_zoneidx = gfp_zone(gfp_mask);
struct zone *preferred_zone;
struct page *page;
+ int migratetype = allocflags_to_migratetype(gfp_mask);

might_sleep_if(gfp_mask & __GFP_WAIT);

@@ -1782,11 +1791,11 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
/* First allocation attempt */
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET,
- preferred_zone);
+ preferred_zone, migratetype);
if (unlikely(!page))
page = __alloc_pages_slowpath(gfp_mask, order,
zonelist, high_zoneidx, nodemask,
- preferred_zone);
+ preferred_zone, migratetype);

return page;
}
--
1.5.6.5

2009-03-20 10:05:52

by Mel Gorman

Subject: [PATCH 08/25] Calculate the preferred zone for allocation only once

get_page_from_freelist() can be called multiple times for an allocation.
Part of this calculates the preferred_zone which is the first usable
zone in the zonelist. This patch calculates preferred_zone once.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 53 ++++++++++++++++++++++++++++++++---------------------
1 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fe71147..78e1d8e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1398,24 +1398,19 @@ static void zlc_mark_zone_full(struct zonelist *zonelist, struct zoneref *z)
*/
static struct page *
get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
- struct zonelist *zonelist, int high_zoneidx, int alloc_flags)
+ struct zonelist *zonelist, int high_zoneidx, int alloc_flags,
+ struct zone *preferred_zone)
{
struct zoneref *z;
struct page *page = NULL;
int classzone_idx;
- struct zone *zone, *preferred_zone;
+ struct zone *zone;
nodemask_t *allowednodes = NULL;/* zonelist_cache approximation */
int zlc_active = 0; /* set if using zonelist_cache */
int did_zlc_setup = 0; /* just call zlc_setup() one time */
int zonelist_filter = 0;

- (void)first_zones_zonelist(zonelist, high_zoneidx, nodemask,
- &preferred_zone);
- if (!preferred_zone)
- return NULL;
-
classzone_idx = zone_idx(preferred_zone);
-
VM_BUG_ON(order >= MAX_ORDER);

/* Determine in advance if the zonelist needs filtering */
@@ -1520,7 +1515,7 @@ should_alloc_retry(gfp_t gfp_mask, unsigned int order,
static inline struct page *
__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask)
+ nodemask_t *nodemask, struct zone *preferred_zone)
{
struct page *page;

@@ -1537,7 +1532,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
*/
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask,
order, zonelist, high_zoneidx,
- ALLOC_WMARK_HIGH|ALLOC_CPUSET);
+ ALLOC_WMARK_HIGH|ALLOC_CPUSET,
+ preferred_zone);
if (page)
goto out;

@@ -1557,7 +1553,8 @@ out:
static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask, int alloc_flags, unsigned long *did_some_progress)
+ nodemask_t *nodemask, int alloc_flags, struct zone *preferred_zone,
+ unsigned long *did_some_progress)
{
struct page *page = NULL;
struct reclaim_state reclaim_state;
@@ -1588,7 +1585,8 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,

if (likely(*did_some_progress))
page = get_page_from_freelist(gfp_mask, nodemask, order,
- zonelist, high_zoneidx, alloc_flags);
+ zonelist, high_zoneidx,
+ alloc_flags, preferred_zone);
return page;
}

@@ -1609,13 +1607,14 @@ is_allocation_high_priority(struct task_struct *p, gfp_t gfp_mask)
static inline struct page *
__alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask)
+ nodemask_t *nodemask, struct zone *preferred_zone)
{
struct page *page;

do {
page = get_page_from_freelist(gfp_mask, nodemask, order,
- zonelist, high_zoneidx, ALLOC_NO_WATERMARKS);
+ zonelist, high_zoneidx, ALLOC_NO_WATERMARKS,
+ preferred_zone);

if (!page && gfp_mask & __GFP_NOFAIL)
congestion_wait(WRITE, HZ/50);
@@ -1638,7 +1637,7 @@ void wake_all_kswapd(unsigned int order, struct zonelist *zonelist,
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
- nodemask_t *nodemask)
+ nodemask_t *nodemask, struct zone *preferred_zone)
{
const gfp_t wait = gfp_mask & __GFP_WAIT;
struct page *page = NULL;
@@ -1688,14 +1687,15 @@ restart:
* See also cpuset_zone_allowed() comment in kernel/cpuset.c.
*/
page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
- high_zoneidx, alloc_flags);
+ high_zoneidx, alloc_flags,
+ preferred_zone);
if (page)
goto got_pg;

/* Allocate without watermarks if the context allows */
if (is_allocation_high_priority(p, gfp_mask))
page = __alloc_pages_high_priority(gfp_mask, order,
- zonelist, high_zoneidx, nodemask);
+ zonelist, high_zoneidx, nodemask, preferred_zone);
if (page)
goto got_pg;

@@ -1707,7 +1707,8 @@ restart:
page = __alloc_pages_direct_reclaim(gfp_mask, order,
zonelist, high_zoneidx,
nodemask,
- alloc_flags, &did_some_progress);
+ alloc_flags, preferred_zone,
+ &did_some_progress);
if (page)
goto got_pg;

@@ -1719,7 +1720,7 @@ restart:
if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
page = __alloc_pages_may_oom(gfp_mask, order,
zonelist, high_zoneidx,
- nodemask);
+ nodemask, preferred_zone);
if (page)
goto got_pg;

@@ -1756,6 +1757,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, nodemask_t *nodemask)
{
enum zone_type high_zoneidx = gfp_zone(gfp_mask);
+ struct zone *preferred_zone;
struct page *page;

might_sleep_if(gfp_mask & __GFP_WAIT);
@@ -1771,11 +1773,20 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
if (unlikely(!zonelist->_zonerefs->zone))
return NULL;

+ /* The preferred zone is used for statistics later */
+ (void)first_zones_zonelist(zonelist, high_zoneidx, nodemask,
+ &preferred_zone);
+ if (!preferred_zone)
+ return NULL;
+
+ /* First allocation attempt */
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
- zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET);
+ zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET,
+ preferred_zone);
if (unlikely(!page))
page = __alloc_pages_slowpath(gfp_mask, order,
- zonelist, high_zoneidx, nodemask);
+ zonelist, high_zoneidx, nodemask,
+ preferred_zone);

return page;
}
--
1.5.6.5

2009-03-20 10:05:35

by Mel Gorman

Subject: [PATCH 07/25] Check in advance if the zonelist needs additional filtering

Zonelists are normally filtered based on nodemasks for memory policies.
They can additionally be filtered on cpusets if they exist, as well as by
noting when zones are full. These simple checks are expensive enough to
be noticed in profiles. This patch checks in advance if zonelist
filtering will ever be needed. If not, the bulk of the checks are
skipped.

Signed-off-by: Mel Gorman <[email protected]>
---
include/linux/cpuset.h | 2 ++
mm/page_alloc.c | 37 ++++++++++++++++++++++++++-----------
2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 90c6074..6051082 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -83,6 +83,8 @@ extern void cpuset_print_task_mems_allowed(struct task_struct *p);

#else /* !CONFIG_CPUSETS */

+#define number_of_cpusets (0)
+
static inline int cpuset_init_early(void) { return 0; }
static inline int cpuset_init(void) { return 0; }
static inline void cpuset_init_smp(void) {}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d815c8f..fe71147 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1139,7 +1139,11 @@ failed:
#define ALLOC_WMARK_HIGH 0x08 /* use pages_high watermark */
#define ALLOC_HARDER 0x10 /* try to alloc harder */
#define ALLOC_HIGH 0x20 /* __GFP_HIGH set */
+#ifdef CONFIG_CPUSETS
#define ALLOC_CPUSET 0x40 /* check for correct cpuset */
+#else
+#define ALLOC_CPUSET 0x00
+#endif /* CONFIG_CPUSETS */

#ifdef CONFIG_FAIL_PAGE_ALLOC

@@ -1403,6 +1407,7 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
nodemask_t *allowednodes = NULL;/* zonelist_cache approximation */
int zlc_active = 0; /* set if using zonelist_cache */
int did_zlc_setup = 0; /* just call zlc_setup() one time */
+ int zonelist_filter = 0;

(void)first_zones_zonelist(zonelist, high_zoneidx, nodemask,
&preferred_zone);
@@ -1413,6 +1418,10 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,

VM_BUG_ON(order >= MAX_ORDER);

+ /* Determine in advance if the zonelist needs filtering */
+ if ((alloc_flags & ALLOC_CPUSET) && unlikely(number_of_cpusets > 1))
+ zonelist_filter = 1;
+
zonelist_scan:
/*
* Scan zonelist, looking for a zone with enough free.
@@ -1420,12 +1429,16 @@ zonelist_scan:
*/
for_each_zone_zonelist_nodemask(zone, z, zonelist,
high_zoneidx, nodemask) {
- if (NUMA_BUILD && zlc_active &&
- !zlc_zone_worth_trying(zonelist, z, allowednodes))
- continue;
- if ((alloc_flags & ALLOC_CPUSET) &&
- !cpuset_zone_allowed_softwall(zone, gfp_mask))
- goto try_next_zone;
+
+ /* Ignore the additional zonelist filter checks if possible */
+ if (zonelist_filter) {
+ if (NUMA_BUILD && zlc_active &&
+ !zlc_zone_worth_trying(zonelist, z, allowednodes))
+ continue;
+ if ((alloc_flags & ALLOC_CPUSET) &&
+ !cpuset_zone_allowed_softwall(zone, gfp_mask))
+ goto try_next_zone;
+ }

if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
unsigned long mark;
@@ -1447,13 +1460,15 @@ zonelist_scan:
if (page)
break;
this_zone_full:
- if (NUMA_BUILD)
+ if (NUMA_BUILD && zonelist_filter)
zlc_mark_zone_full(zonelist, z);
try_next_zone:
- if (NUMA_BUILD && !did_zlc_setup) {
- /* we do zlc_setup after the first zone is tried */
- allowednodes = zlc_setup(zonelist, alloc_flags);
- zlc_active = 1;
+ if (NUMA_BUILD && zonelist_filter) {
+ if (!did_zlc_setup) {
+ /* do zlc_setup after the first zone is tried */
+ allowednodes = zlc_setup(zonelist, alloc_flags);
+ zlc_active = 1;
+ }
did_zlc_setup = 1;
}
}
--
1.5.6.5

2009-03-20 10:06:38

by Mel Gorman

Subject: [PATCH 10/25] Calculate the alloc_flags for allocation only once

Factor out the mapping between GFP flags and alloc_flags. Once factored
out, it only needs to be calculated once per allocation, but some care must
be taken.

[[email protected] says]
As the test:

- if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
- && !in_interrupt()) {
- if (!(gfp_mask & __GFP_NOMEMALLOC)) {

has been replaced with a slightly weaker one:

+ if (alloc_flags & ALLOC_NO_WATERMARKS) {

we need to ensure we don't recurse when PF_MEMALLOC is set.

From: Peter Zijlstra <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
---
mm/page_alloc.c | 88 +++++++++++++++++++++++++++++++-----------------------
1 files changed, 50 insertions(+), 38 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8771de3..0558eb4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1593,16 +1593,6 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
return page;
}

-static inline int
-is_allocation_high_priority(struct task_struct *p, gfp_t gfp_mask)
-{
- if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
- && !in_interrupt())
- if (!(gfp_mask & __GFP_NOMEMALLOC))
- return 1;
- return 0;
-}
-
/*
* This is called in the allocator slow-path if the allocation request is of
* sufficient urgency to ignore watermarks and take other desperate measures
@@ -1638,6 +1628,42 @@ void wake_all_kswapd(unsigned int order, struct zonelist *zonelist,
wakeup_kswapd(zone, order);
}

+static inline int
+gfp_to_alloc_flags(gfp_t gfp_mask)
+{
+ struct task_struct *p = current;
+ int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
+ const gfp_t wait = gfp_mask & __GFP_WAIT;
+
+ /*
+ * The caller may dip into page reserves a bit more if the caller
+ * cannot run direct reclaim, or if the caller has realtime scheduling
+ * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
+ * set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH).
+ */
+ if (gfp_mask & __GFP_HIGH)
+ alloc_flags |= ALLOC_HIGH;
+
+ if (!wait) {
+ alloc_flags |= ALLOC_HARDER;
+ /*
+ * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc.
+ * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
+ */
+ alloc_flags &= ~ALLOC_CPUSET;
+ } else if (unlikely(rt_task(p)) && !in_interrupt())
+ alloc_flags |= ALLOC_HARDER;
+
+ if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
+ if (!in_interrupt() &&
+ ((p->flags & PF_MEMALLOC) ||
+ unlikely(test_thread_flag(TIF_MEMDIE))))
+ alloc_flags |= ALLOC_NO_WATERMARKS;
+ }
+
+ return alloc_flags;
+}
+
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
@@ -1668,48 +1694,34 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
* OK, we're below the kswapd watermark and have kicked background
* reclaim. Now things get more complex, so set up alloc_flags according
* to how we want to proceed.
- *
- * The caller may dip into page reserves a bit more if the caller
- * cannot run direct reclaim, or if the caller has realtime scheduling
- * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
- * set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH).
*/
- alloc_flags = ALLOC_WMARK_MIN;
- if ((unlikely(rt_task(p)) && !in_interrupt()) || !wait)
- alloc_flags |= ALLOC_HARDER;
- if (gfp_mask & __GFP_HIGH)
- alloc_flags |= ALLOC_HIGH;
- if (wait)
- alloc_flags |= ALLOC_CPUSET;
+ alloc_flags = gfp_to_alloc_flags(gfp_mask);

restart:
- /*
- * Go through the zonelist again. Let __GFP_HIGH and allocations
- * coming from realtime tasks go deeper into reserves.
- *
- * This is the last chance, in general, before the goto nopage.
- * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc.
- * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
- */
+ /* This is the last chance, in general, before the goto nopage. */
page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
- high_zoneidx, alloc_flags,
- preferred_zone,
- migratetype);
+ high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
+ preferred_zone, migratetype);
if (page)
goto got_pg;

/* Allocate without watermarks if the context allows */
- if (is_allocation_high_priority(p, gfp_mask))
+ if (alloc_flags & ALLOC_NO_WATERMARKS) {
page = __alloc_pages_high_priority(gfp_mask, order,
- zonelist, high_zoneidx, nodemask, preferred_zone,
- migratetype);
- if (page)
- goto got_pg;
+ zonelist, high_zoneidx, nodemask,
+ preferred_zone, migratetype);
+ if (page)
+ goto got_pg;
+ }

/* Atomic allocations - we can't balance anything */
if (!wait)
goto nopage;

+ /* Avoid recursion of direct reclaim */
+ if (p->flags & PF_MEMALLOC)
+ goto nopage;
+
/* Try direct reclaim and then allocating */
page = __alloc_pages_direct_reclaim(gfp_mask, order,
zonelist, high_zoneidx,
--
1.5.6.5

2009-03-20 10:06:58

by Mel Gorman

Subject: [PATCH 11/25] Calculate the cold parameter for allocation only once

The GFP mask is checked for whether __GFP_COLD has been specified when
deciding which end of the PCP lists to use. However, this happens multiple
times per allocation, at least once per zone traversed. Calculate it once.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 35 ++++++++++++++++++-----------------
1 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0558eb4..ad26052 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1068,11 +1068,10 @@ void split_page(struct page *page, unsigned int order)
*/
static struct page *buffered_rmqueue(struct zone *preferred_zone,
struct zone *zone, int order, gfp_t gfp_flags,
- int migratetype)
+ int migratetype, int cold)
{
unsigned long flags;
struct page *page;
- int cold = !!(gfp_flags & __GFP_COLD);
int cpu;

again:
@@ -1399,7 +1398,7 @@ static void zlc_mark_zone_full(struct zonelist *zonelist, struct zoneref *z)
static struct page *
get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
struct zonelist *zonelist, int high_zoneidx, int alloc_flags,
- struct zone *preferred_zone, int migratetype)
+ struct zone *preferred_zone, int migratetype, int cold)
{
struct zoneref *z;
struct page *page = NULL;
@@ -1452,7 +1451,7 @@ zonelist_scan:
}

page = buffered_rmqueue(preferred_zone, zone, order,
- gfp_mask, migratetype);
+ gfp_mask, migratetype, cold);
if (page)
break;
this_zone_full:
@@ -1517,7 +1516,7 @@ static inline struct page *
__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
nodemask_t *nodemask, struct zone *preferred_zone,
- int migratetype)
+ int migratetype, int cold)
{
struct page *page;

@@ -1535,7 +1534,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask,
order, zonelist, high_zoneidx,
ALLOC_WMARK_HIGH|ALLOC_CPUSET,
- preferred_zone, migratetype);
+ preferred_zone, migratetype, cold);
if (page)
goto out;

@@ -1556,7 +1555,7 @@ static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
nodemask_t *nodemask, int alloc_flags, struct zone *preferred_zone,
- int migratetype, unsigned long *did_some_progress)
+ int migratetype, int cold, unsigned long *did_some_progress)
{
struct page *page = NULL;
struct reclaim_state reclaim_state;
@@ -1589,7 +1588,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
page = get_page_from_freelist(gfp_mask, nodemask, order,
zonelist, high_zoneidx,
alloc_flags, preferred_zone,
- migratetype);
+ migratetype, cold);
return page;
}

@@ -1601,14 +1600,14 @@ static inline struct page *
__alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
nodemask_t *nodemask, struct zone *preferred_zone,
- int migratetype)
+ int migratetype, int cold)
{
struct page *page;

do {
page = get_page_from_freelist(gfp_mask, nodemask, order,
zonelist, high_zoneidx, ALLOC_NO_WATERMARKS,
- preferred_zone, migratetype);
+ preferred_zone, migratetype, cold);

if (!page && gfp_mask & __GFP_NOFAIL)
congestion_wait(WRITE, HZ/50);
@@ -1668,7 +1667,7 @@ static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
nodemask_t *nodemask, struct zone *preferred_zone,
- int migratetype)
+ int migratetype, int cold)
{
const gfp_t wait = gfp_mask & __GFP_WAIT;
struct page *page = NULL;
@@ -1701,7 +1700,7 @@ restart:
/* This is the last chance, in general, before the goto nopage. */
page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
- preferred_zone, migratetype);
+ preferred_zone, migratetype, cold);
if (page)
goto got_pg;

@@ -1709,7 +1708,7 @@ restart:
if (alloc_flags & ALLOC_NO_WATERMARKS) {
page = __alloc_pages_high_priority(gfp_mask, order,
zonelist, high_zoneidx, nodemask,
- preferred_zone, migratetype);
+ preferred_zone, migratetype, cold);
if (page)
goto got_pg;
}
@@ -1727,7 +1726,8 @@ restart:
zonelist, high_zoneidx,
nodemask,
alloc_flags, preferred_zone,
- migratetype, &did_some_progress);
+ migratetype, cold,
+ &did_some_progress);
if (page)
goto got_pg;

@@ -1740,7 +1740,7 @@ restart:
page = __alloc_pages_may_oom(gfp_mask, order,
zonelist, high_zoneidx,
nodemask, preferred_zone,
- migratetype);
+ migratetype, cold);
if (page)
goto got_pg;

@@ -1780,6 +1780,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
struct zone *preferred_zone;
struct page *page;
int migratetype = allocflags_to_migratetype(gfp_mask);
+ int cold = gfp_mask & __GFP_COLD;

might_sleep_if(gfp_mask & __GFP_WAIT);

@@ -1803,11 +1804,11 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
/* First allocation attempt */
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET,
- preferred_zone, migratetype);
+ preferred_zone, migratetype, cold);
if (unlikely(!page))
page = __alloc_pages_slowpath(gfp_mask, order,
zonelist, high_zoneidx, nodemask,
- preferred_zone, migratetype);
+ preferred_zone, migratetype, cold);

return page;
}
--
1.5.6.5
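
The structural change here is simply hoisting a loop-invariant test out of
the zonelist walk and passing the result down. A minimal standalone C sketch
of that shape, using a made-up flag value rather than the kernel's
__GFP_COLD:

#include <stdio.h>

#define SKETCH_GFP_COLD 0x100u			/* illustrative value only */

static void per_zone_attempt(int zone_idx, int cold)
{
	printf("zone %d, cold=%d\n", zone_idx, cold);
}

static void allocate_sketch(unsigned int gfp_flags, int nr_zones)
{
	int cold = !!(gfp_flags & SKETCH_GFP_COLD);	/* worked out once */
	int i;

	for (i = 0; i < nr_zones; i++)
		per_zone_attempt(i, cold);	/* passed down, not re-derived per zone */
}

int main(void)
{
	allocate_sketch(SKETCH_GFP_COLD, 3);
	return 0;
}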

2009-03-20 10:07:31

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 12/25] Remove a branch by assuming __GFP_HIGH == ALLOC_HIGH

Allocations that specify __GFP_HIGH get the ALLOC_HIGH flag. If these
flags are equal to each other, we can eliminate a branch.

[[email protected]: Suggested the hack]
Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ad26052..1e8b4b6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1640,8 +1640,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
* policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
* set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH).
*/
- if (gfp_mask & __GFP_HIGH)
- alloc_flags |= ALLOC_HIGH;
+ VM_BUG_ON(__GFP_HIGH != ALLOC_HIGH);
+ alloc_flags |= (gfp_mask & __GFP_HIGH);

if (!wait) {
alloc_flags |= ALLOC_HARDER;
--
1.5.6.5
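
The trick generalises to any two flag namespaces that deliberately share a
bit value: the conversion becomes a mask-and-OR instead of a test-and-branch.
A standalone sketch under that assumption, where the MY_* names and values
are illustrative and not the kernel's:

#include <assert.h>
#include <stdio.h>

#define MY_GFP_HIGH   0x20u	/* made-up gfp bit */
#define MY_ALLOC_HIGH 0x20u	/* internal flag deliberately given the same value */

static unsigned int gfp_to_alloc_flags_sketch(unsigned int gfp_mask)
{
	unsigned int alloc_flags = 0;

	/* stands in for the VM_BUG_ON() guarding the assumption */
	assert(MY_GFP_HIGH == MY_ALLOC_HIGH);

	/* no branch: the bit is carried across directly */
	alloc_flags |= (gfp_mask & MY_GFP_HIGH);
	return alloc_flags;
}

int main(void)
{
	printf("0x%x\n", gfp_to_alloc_flags_sketch(MY_GFP_HIGH));	/* 0x20 */
	printf("0x%x\n", gfp_to_alloc_flags_sketch(0));			/* 0x0 */
	return 0;
}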

2009-03-20 10:07:54

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 13/25] Inline __rmqueue_smallest()

Inline __rmqueue_smallest by altering flow very slightly so that there
is only one call site. This allows the function to be inlined without
additional text bloat.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 23 ++++++++++++++++++-----
1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1e8b4b6..a3ca80d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -664,7 +664,8 @@ static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
* Go through the free lists for the given migratetype and remove
* the smallest available page from the freelists
*/
-static struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
+static inline
+struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
int migratetype)
{
unsigned int current_order;
@@ -834,24 +835,36 @@ static struct page *__rmqueue_fallback(struct zone *zone, int order,
}
}

- /* Use MIGRATE_RESERVE rather than fail an allocation */
- return __rmqueue_smallest(zone, order, MIGRATE_RESERVE);
+ return NULL;
}

/*
* Do the hard work of removing an element from the buddy allocator.
* Call me with the zone->lock already held.
*/
-static struct page *__rmqueue(struct zone *zone, unsigned int order,
+static inline
+struct page *__rmqueue(struct zone *zone, unsigned int order,
int migratetype)
{
struct page *page;

+retry_reserve:
page = __rmqueue_smallest(zone, order, migratetype);

- if (unlikely(!page))
+ if (unlikely(!page) && migratetype != MIGRATE_RESERVE) {
page = __rmqueue_fallback(zone, order, migratetype);

+ /*
+ * Use MIGRATE_RESERVE rather than fail an allocation. goto
+ * is used because __rmqueue_smallest is an inline function
+ * and we want just one call site
+ */
+ if (!page) {
+ migratetype = MIGRATE_RESERVE;
+ goto retry_reserve;
+ }
+ }
+
return page;
}

--
1.5.6.5
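
The reason the goto works is that it keeps a single call site for the
inlined helper: the fallback is handled by changing the argument and jumping
back rather than by a second call. A simplified standalone sketch of that
control flow, with illustrative names and pools:

#include <stdio.h>

#define TYPE_NORMAL  0
#define TYPE_RESERVE 1

static inline int take_from(int type)
{
	/* pretend the normal pool is empty and only the reserve can satisfy us */
	return type == TYPE_RESERVE ? 42 : -1;
}

static int rmqueue_sketch(int type)
{
	int ret;

retry_reserve:
	ret = take_from(type);			/* the only call site */
	if (ret < 0 && type != TYPE_RESERVE) {
		type = TYPE_RESERVE;		/* fall back rather than fail */
		goto retry_reserve;
	}
	return ret;
}

int main(void)
{
	printf("%d\n", rmqueue_sketch(TYPE_NORMAL));	/* 42 */
	return 0;
}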

2009-03-20 10:08:39

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 15/25] Inline __rmqueue_fallback()

__rmqueue_fallback() is in the slow path but has only one call site. It
actually reduces text if it's inlined.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f7631e..0ba9e4f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -774,8 +774,8 @@ static int move_freepages_block(struct zone *zone, struct page *page,
}

/* Remove an element from the buddy allocator from the fallback list */
-static struct page *__rmqueue_fallback(struct zone *zone, int order,
- int start_migratetype)
+static inline struct page *
+__rmqueue_fallback(struct zone *zone, int order, int start_migratetype)
{
struct free_area * area;
int current_order;
--
1.5.6.5

2009-03-20 10:08:20

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 14/25] Inline buffered_rmqueue()

buffered_rmqueue() is in the fast path so inline it. Because it has only
one call site, inlining should reduce text size rather than increase it.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a3ca80d..9f7631e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1079,7 +1079,8 @@ void split_page(struct page *page, unsigned int order)
* we cheat by calling it from here, in the order > 0 path. Saves a branch
* or two.
*/
-static struct page *buffered_rmqueue(struct zone *preferred_zone,
+static inline
+struct page *buffered_rmqueue(struct zone *preferred_zone,
struct zone *zone, int order, gfp_t gfp_flags,
int migratetype, int cold)
{
--
1.5.6.5

2009-03-20 10:08:54

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 16/25] Save text by reducing call sites of __rmqueue()

__rmqueue is inlined in the fast path but it has two call sites, the
low-order and high-order paths. However, a slight modification to the
high-order path reduces the call sites of __rmqueue to one. This reduces
text at the cost of a slight increase in complexity in the high-order
allocation path.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0ba9e4f..795cfc5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1123,11 +1123,14 @@ again:
list_del(&page->lru);
pcp->count--;
} else {
- spin_lock_irqsave(&zone->lock, flags);
- page = __rmqueue(zone, order, migratetype);
- spin_unlock(&zone->lock);
- if (!page)
+ LIST_HEAD(list);
+ local_irq_save(flags);
+
+ /* Calling __rmqueue would bloat text, hence this */
+ if (!rmqueue_bulk(zone, order, 1, &list, migratetype))
goto failed;
+ page = list_entry(list.next, struct page, lru);
+ list_del(&page->lru);
}

__count_zone_vm_events(PGALLOC, zone, 1 << order);
--
1.5.6.5

2009-03-20 10:09:20

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 17/25] Do not call get_pageblock_migratetype() more than necessary

get_pageblock_migratetype() is potentially called twice for every page
free: once when the page is freed to the pcp lists and once when it is
freed back to the buddy allocator. When draining the pcp lists, the
pageblock type at the time of the original free is already known, so use
it rather than rechecking. Under memory pressure this might skew
anti-fragmentation slightly, but the interference is minimal and decisions
that fragment memory are being made anyway.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 16 ++++++++++------
1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 795cfc5..349c64d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -455,16 +455,18 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
*/

static inline void __free_one_page(struct page *page,
- struct zone *zone, unsigned int order)
+ struct zone *zone, unsigned int order,
+ int migratetype)
{
unsigned long page_idx;
int order_size = 1 << order;
- int migratetype = get_pageblock_migratetype(page);

if (unlikely(PageCompound(page)))
if (unlikely(destroy_compound_page(page, order)))
return;

+ VM_BUG_ON(migratetype == -1);
+
page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);

VM_BUG_ON(page_idx & (order_size - 1));
@@ -533,17 +535,18 @@ static void free_pages_bulk(struct zone *zone, int count,
page = list_entry(list->prev, struct page, lru);
/* have to delete it as __free_one_page list manipulates */
list_del(&page->lru);
- __free_one_page(page, zone, order);
+ __free_one_page(page, zone, order, page_private(page));
}
spin_unlock(&zone->lock);
}

-static void free_one_page(struct zone *zone, struct page *page, int order)
+static void free_one_page(struct zone *zone, struct page *page, int order,
+ int migratetype)
{
spin_lock(&zone->lock);
zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
zone->pages_scanned = 0;
- __free_one_page(page, zone, order);
+ __free_one_page(page, zone, order, migratetype);
spin_unlock(&zone->lock);
}

@@ -568,7 +571,8 @@ static void __free_pages_ok(struct page *page, unsigned int order)

local_irq_save(flags);
__count_vm_events(PGFREE, 1 << order);
- free_one_page(page_zone(page), page, order);
+ free_one_page(page_zone(page), page, order,
+ get_pageblock_migratetype(page));
local_irq_restore(flags);
}

--
1.5.6.5
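
The pattern is to look the value up once when the page is queued and stash
it with the page (via set_page_private()/page_private()) so the drain path
does not repeat the lookup. A rough standalone sketch of that pattern, with
illustrative names in place of the real page and migratetype code:

#include <stdio.h>

struct item {
	int private;			/* stands in for page_private() */
};

static int expensive_lookup(int id)	/* stands in for get_pageblock_migratetype() */
{
	return id % 3;
}

static void queue_free_sketch(struct item *it, int id)
{
	it->private = expensive_lookup(id);	/* looked up once, at queue time */
}

static void drain_free_sketch(struct item *it)
{
	/* the buddy-style free reuses the stashed value instead of recomputing */
	printf("freeing to list %d\n", it->private);
}

int main(void)
{
	struct item it;

	queue_free_sketch(&it, 7);
	drain_free_sketch(&it);
	return 0;
}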

2009-03-20 10:09:55

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 19/25] Do not setup zonelist cache when there is only one node

There is a zonelist cache which is used to track zones that are not in
the allowed cpuset or found to be recently full. This is to reduce cache
footprint on large machines. On smaller machines, it just incurs cost
for no gain. This patch only sets up and uses the zonelist cache when
more than one node is online.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c4eb295..01cd489 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1442,6 +1442,8 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
/* Determine in advance if the zonelist needs filtering */
if ((alloc_flags & ALLOC_CPUSET) && unlikely(number_of_cpusets > 1))
zonelist_filter = 1;
+ if (num_online_nodes() > 1)
+ zonelist_filter = 1;

zonelist_scan:
/*
@@ -1486,8 +1488,12 @@ this_zone_full:
zlc_mark_zone_full(zonelist, z);
try_next_zone:
if (NUMA_BUILD && zonelist_filter) {
- if (!did_zlc_setup) {
- /* do zlc_setup after the first zone is tried */
+ if (!did_zlc_setup && num_online_nodes() > 1) {
+ /*
+ * do zlc_setup after the first zone is tried
+ * but only if there are multiple nodes to make
+ * it worthwhile
+ */
allowednodes = zlc_setup(zonelist, alloc_flags);
zlc_active = 1;
}
--
1.5.6.5

2009-03-20 10:09:37

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 18/25] Do not disable interrupts in free_page_mlock()

free_page_mlock() tests and clears PG_mlocked using locked versions of the
bit operations. If set, it disables interrupts to update counters and this
happens on every page free even though interrupts will be disabled again
very shortly afterwards anyway. This is wasteful.

This patch splits what free_page_mlock() does. The bit check is still
made. However, the update of the counters is delayed until interrupts are
disabled anyway and the non-locked version of the bit clear is used. One potential
weirdness with this split is that the counters do not get updated if the
bad_page() check is triggered but a system showing bad pages is getting
screwed already.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/internal.h | 11 +++--------
mm/page_alloc.c | 8 +++++++-
2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 478223b..7f775a1 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -155,14 +155,9 @@ static inline void mlock_migrate_page(struct page *newpage, struct page *page)
*/
static inline void free_page_mlock(struct page *page)
{
- if (unlikely(TestClearPageMlocked(page))) {
- unsigned long flags;
-
- local_irq_save(flags);
- __dec_zone_page_state(page, NR_MLOCK);
- __count_vm_event(UNEVICTABLE_MLOCKFREED);
- local_irq_restore(flags);
- }
+ __ClearPageMlocked(page);
+ __dec_zone_page_state(page, NR_MLOCK);
+ __count_vm_event(UNEVICTABLE_MLOCKFREED);
}

#else /* CONFIG_UNEVICTABLE_LRU */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 349c64d..c4eb295 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -498,7 +498,6 @@ static inline void __free_one_page(struct page *page,

static inline int free_pages_check(struct page *page)
{
- free_page_mlock(page);
if (unlikely(page_mapcount(page) |
(page->mapping != NULL) |
(page_count(page) != 0) |
@@ -555,6 +554,7 @@ static void __free_pages_ok(struct page *page, unsigned int order)
unsigned long flags;
int i;
int bad = 0;
+ int clearMlocked = PageMlocked(page);

for (i = 0 ; i < (1 << order) ; ++i)
bad += free_pages_check(page + i);
@@ -570,6 +570,8 @@ static void __free_pages_ok(struct page *page, unsigned int order)
kernel_map_pages(page, 1 << order, 0);

local_irq_save(flags);
+ if (unlikely(clearMlocked))
+ free_page_mlock(page);
__count_vm_events(PGFREE, 1 << order);
free_one_page(page_zone(page), page, order,
get_pageblock_migratetype(page));
@@ -1020,6 +1022,7 @@ static void free_hot_cold_page(struct page *page, int cold)
struct zone *zone = page_zone(page);
struct per_cpu_pages *pcp;
unsigned long flags;
+ int clearMlocked = PageMlocked(page);

if (PageAnon(page))
page->mapping = NULL;
@@ -1035,7 +1038,10 @@ static void free_hot_cold_page(struct page *page, int cold)

pcp = &zone_pcp(zone, get_cpu())->pcp;
local_irq_save(flags);
+ if (unlikely(clearMlocked))
+ free_page_mlock(page);
__count_vm_event(PGFREE);
+
if (cold)
list_add_tail(&page->lru, &pcp->list);
else
--
1.5.6.5

2009-03-20 10:10:28

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 20/25] Do not check for compound pages during the page allocator sanity checks

A number of sanity checks are made on each page allocation and free
including that the page count is zero. page_count() checks for
compound pages and checks the count of the head page if true. However,
in these paths, we do not care if the page is compound or not as the
count of each tail page should also be zero.

This patch makes two changes to the use of page_count() in the free path. It
converts one check of page_count() to a VM_BUG_ON() as the count should
have been unconditionally checked earlier in the free path. It also avoids
checking for compound pages.

[[email protected]: Wrote changelog]
Signed-off-by: Nick Piggin <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 01cd489..fb5c4da 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -424,7 +424,7 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
return 0;

if (PageBuddy(buddy) && page_order(buddy) == order) {
- BUG_ON(page_count(buddy) != 0);
+ VM_BUG_ON(page_count(buddy) != 0);
return 1;
}
return 0;
@@ -500,7 +500,7 @@ static inline int free_pages_check(struct page *page)
{
if (unlikely(page_mapcount(page) |
(page->mapping != NULL) |
- (page_count(page) != 0) |
+ (atomic_read(&page->_count) != 0) |
(page->flags & PAGE_FLAGS_CHECK_AT_FREE))) {
bad_page(page);
return 1;
@@ -645,7 +645,7 @@ static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
{
if (unlikely(page_mapcount(page) |
(page->mapping != NULL) |
- (page_count(page) != 0) |
+ (atomic_read(&page->_count) != 0) |
(page->flags & PAGE_FLAGS_CHECK_AT_PREP))) {
bad_page(page);
return 1;
--
1.5.6.5

2009-03-20 10:10:51

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 21/25] Use allocation flags as an index to the zone watermark

ALLOC_WMARK_MIN, ALLOC_WMARK_LOW and ALLOC_WMARK_HIGH determine whether
pages_min, pages_low or pages_high is used as the zone watermark when
allocating the pages. Two branches in the allocator hotpath determine
which watermark to use. This patch places the three watermarks in a union
with an array so that the watermark flags can be used directly as an array
index, reducing the branches taken.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
include/linux/mmzone.h | 8 +++++++-
mm/page_alloc.c | 18 ++++++++----------
2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ca000b8..c20c662 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -275,7 +275,13 @@ struct zone_reclaim_stat {

struct zone {
/* Fields commonly accessed by the page allocator */
- unsigned long pages_min, pages_low, pages_high;
+ union {
+ struct {
+ unsigned long pages_min, pages_low, pages_high;
+ };
+ unsigned long pages_mark[3];
+ };
+
/*
* We don't know if the memory that we're going to allocate will be freeable
* or/and it will be released eventually, so to avoid totally wasting several
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fb5c4da..3117209 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1159,10 +1159,13 @@ failed:
return NULL;
}

-#define ALLOC_NO_WATERMARKS 0x01 /* don't check watermarks at all */
-#define ALLOC_WMARK_MIN 0x02 /* use pages_min watermark */
-#define ALLOC_WMARK_LOW 0x04 /* use pages_low watermark */
-#define ALLOC_WMARK_HIGH 0x08 /* use pages_high watermark */
+/* The WMARK bits are used as an index zone->pages_mark */
+#define ALLOC_WMARK_MIN 0x00 /* use pages_min watermark */
+#define ALLOC_WMARK_LOW 0x01 /* use pages_low watermark */
+#define ALLOC_WMARK_HIGH 0x02 /* use pages_high watermark */
+#define ALLOC_NO_WATERMARKS 0x08 /* don't check watermarks at all */
+#define ALLOC_WMARK_MASK 0x07 /* Mask to get the watermark bits */
+
#define ALLOC_HARDER 0x10 /* try to alloc harder */
#define ALLOC_HIGH 0x20 /* __GFP_HIGH set */
#ifdef CONFIG_CPUSETS
@@ -1465,12 +1468,7 @@ zonelist_scan:

if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
unsigned long mark;
- if (alloc_flags & ALLOC_WMARK_MIN)
- mark = zone->pages_min;
- else if (alloc_flags & ALLOC_WMARK_LOW)
- mark = zone->pages_low;
- else
- mark = zone->pages_high;
+ mark = zone->pages_mark[alloc_flags & ALLOC_WMARK_MASK];
if (!zone_watermark_ok(zone, order, mark,
classzone_idx, alloc_flags)) {
if (!zone_reclaim_mode ||
--
1.5.6.5
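
The union-plus-array shape can be shown standalone; the WMARK_* values
below mirror the new defines but the rest is illustrative, and the
anonymous union needs C11 or the GNU extension the kernel already relies on:

#include <stdio.h>

#define WMARK_MIN  0x00
#define WMARK_LOW  0x01
#define WMARK_HIGH 0x02
#define WMARK_MASK 0x07

struct zone_sketch {
	union {
		struct {
			unsigned long pages_min, pages_low, pages_high;
		};
		unsigned long pages_mark[3];
	};
};

int main(void)
{
	struct zone_sketch z = { .pages_min = 10, .pages_low = 20, .pages_high = 30 };
	int alloc_flags = WMARK_LOW;

	/* one array lookup replaces the if/else-if/else chain */
	unsigned long mark = z.pages_mark[alloc_flags & WMARK_MASK];

	printf("%lu\n", mark);	/* 20 */
	return 0;
}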

2009-03-20 10:11:16

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 22/25] Update NR_FREE_PAGES only as necessary

When pages are being freed to the buddy allocator, the zone
NR_FREE_PAGES counter must be updated. In the case of bulk per-cpu page
freeing, it's updated once per page. This retouches cache lines more
than necessary. Update the counter once per per-cpu bulk free instead.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
---
mm/page_alloc.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3117209..6584465 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -459,7 +459,6 @@ static inline void __free_one_page(struct page *page,
int migratetype)
{
unsigned long page_idx;
- int order_size = 1 << order;

if (unlikely(PageCompound(page)))
if (unlikely(destroy_compound_page(page, order)))
@@ -469,10 +468,9 @@ static inline void __free_one_page(struct page *page,

page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);

- VM_BUG_ON(page_idx & (order_size - 1));
+ VM_BUG_ON(page_idx & ((1 << order) - 1));
VM_BUG_ON(bad_range(zone, page));

- __mod_zone_page_state(zone, NR_FREE_PAGES, order_size);
while (order < MAX_ORDER-1) {
unsigned long combined_idx;
struct page *buddy;
@@ -527,6 +525,8 @@ static void free_pages_bulk(struct zone *zone, int count,
spin_lock(&zone->lock);
zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
zone->pages_scanned = 0;
+
+ __mod_zone_page_state(zone, NR_FREE_PAGES, count);
while (count--) {
struct page *page;

@@ -545,6 +545,8 @@ static void free_one_page(struct zone *zone, struct page *page, int order,
spin_lock(&zone->lock);
zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
zone->pages_scanned = 0;
+
+ __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
__free_one_page(page, zone, order, migratetype);
spin_unlock(&zone->lock);
}
@@ -689,7 +691,6 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
list_del(&page->lru);
rmv_page_order(page);
area->nr_free--;
- __mod_zone_page_state(zone, NR_FREE_PAGES, - (1UL << order));
expand(zone, page, order, current_order, area, migratetype);
return page;
}
@@ -829,8 +830,6 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype)
/* Remove the page from the freelists */
list_del(&page->lru);
rmv_page_order(page);
- __mod_zone_page_state(zone, NR_FREE_PAGES,
- -(1UL << order));

if (current_order == pageblock_order)
set_pageblock_migratetype(page,
@@ -904,6 +903,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
set_page_private(page, migratetype);
list = &page->lru;
}
+ __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
spin_unlock(&zone->lock);
return i;
}
--
1.5.6.5
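
The counter change is just moving the accumulation out of the per-page
loop. A trivial standalone sketch of the shape, with illustrative names:

#include <stdio.h>

static long nr_free_pages_sketch;

static void free_one_sketch(int page)
{
	/* previously the counter was bumped in here, once per page */
	(void)page;
}

static void free_pages_bulk_sketch(int count)
{
	int i;

	nr_free_pages_sketch += count;	/* one counter update for the whole batch */
	for (i = 0; i < count; i++)
		free_one_sketch(i);
}

int main(void)
{
	free_pages_bulk_sketch(31);
	printf("%ld\n", nr_free_pages_sketch);	/* 31 */
	return 0;
}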

2009-03-20 10:11:32

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 23/25] Get the pageblock migratetype without disabling interrupts

Local interrupts are disabled when freeing pages to the PCP list. Part of
that free path looks up the migratetype of the pageblock the page belongs
to, but it does so with interrupts disabled. This patch looks up the
pageblock type with interrupts enabled. The impact is that a page may be
freed to the wrong list when a pageblock changes type, but as that block
is already considered mixed from an anti-fragmentation perspective, it is
not of vital importance.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/page_alloc.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6584465..18f0b56 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1037,6 +1037,7 @@ static void free_hot_cold_page(struct page *page, int cold)
kernel_map_pages(page, 1, 0);

pcp = &zone_pcp(zone, get_cpu())->pcp;
+ set_page_private(page, get_pageblock_migratetype(page));
local_irq_save(flags);
if (unlikely(clearMlocked))
free_page_mlock(page);
@@ -1046,7 +1047,6 @@ static void free_hot_cold_page(struct page *page, int cold)
list_add_tail(&page->lru, &pcp->list);
else
list_add(&page->lru, &pcp->list);
- set_page_private(page, get_pageblock_migratetype(page));
pcp->count++;
if (pcp->count >= pcp->high) {
free_pages_bulk(zone, pcp->batch, &pcp->list, 0);
--
1.5.6.5

2009-03-20 10:12:13

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 25/25] Use a pre-calculated value instead of num_online_nodes() in fast paths

From: Christoph Lameter <[email protected]>

num_online_nodes() is called in a number of places but most often by the
page allocator when deciding whether the zonelist needs to be filtered based
on cpusets or the zonelist cache. This is actually a heavy function and
touches a number of cache lines.

This patch stores the number of online nodes at boot time and updates the
value when nodes get onlined and offlined. The value is then used in a
number of important paths in place of num_online_nodes().

Signed-off-by: Christoph Lameter <[email protected]>
Signed-off-by: Mel Gorman <[email protected]>
---
include/linux/nodemask.h | 15 ++++++++++++++-
mm/hugetlb.c | 4 ++--
mm/page_alloc.c | 12 +++++++-----
mm/slab.c | 2 +-
mm/slub.c | 2 +-
net/sunrpc/svc.c | 2 +-
6 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 848025c..474e73e 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -408,6 +408,19 @@ static inline int num_node_state(enum node_states state)
#define next_online_node(nid) next_node((nid), node_states[N_ONLINE])

extern int nr_node_ids;
+extern int nr_online_nodes;
+
+static inline void node_set_online(int nid)
+{
+ node_set_state(nid, N_ONLINE);
+ nr_online_nodes = num_node_state(N_ONLINE);
+}
+
+static inline void node_set_offline(int nid)
+{
+ node_clear_state(nid, N_ONLINE);
+ nr_online_nodes = num_node_state(N_ONLINE);
+}
#else

static inline int node_state(int node, enum node_states state)
@@ -434,7 +447,7 @@ static inline int num_node_state(enum node_states state)
#define first_online_node 0
#define next_online_node(nid) (MAX_NUMNODES)
#define nr_node_ids 1
-
+#define nr_online_nodes 1
#endif

#define node_online_map node_states[N_ONLINE]
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1e99997..210e28c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -875,7 +875,7 @@ static void return_unused_surplus_pages(struct hstate *h,
* can no longer free unreserved surplus pages. This occurs when
* the nodes with surplus pages have no free pages.
*/
- unsigned long remaining_iterations = num_online_nodes();
+ unsigned long remaining_iterations = nr_online_nodes;

/* Uncommit the reservation */
h->resv_huge_pages -= unused_resv_pages;
@@ -904,7 +904,7 @@ static void return_unused_surplus_pages(struct hstate *h,
h->surplus_huge_pages--;
h->surplus_huge_pages_node[nid]--;
nr_pages--;
- remaining_iterations = num_online_nodes();
+ remaining_iterations = nr_online_nodes;
}
}
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 18f0b56..4131ff9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -164,7 +164,9 @@ static unsigned long __meminitdata dma_reserve;

#if MAX_NUMNODES > 1
int nr_node_ids __read_mostly = MAX_NUMNODES;
+int nr_online_nodes __read_mostly = 1;
EXPORT_SYMBOL(nr_node_ids);
+EXPORT_SYMBOL(nr_online_nodes);
#endif

int page_group_by_mobility_disabled __read_mostly;
@@ -1445,7 +1447,7 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
/* Determine in advance if the zonelist needs filtering */
if ((alloc_flags & ALLOC_CPUSET) && unlikely(number_of_cpusets > 1))
zonelist_filter = 1;
- if (num_online_nodes() > 1)
+ if (nr_online_nodes > 1)
zonelist_filter = 1;

zonelist_scan:
@@ -1486,7 +1488,7 @@ this_zone_full:
zlc_mark_zone_full(zonelist, z);
try_next_zone:
if (NUMA_BUILD && zonelist_filter) {
- if (!did_zlc_setup && num_online_nodes() > 1) {
+ if (!did_zlc_setup && nr_online_nodes > 1) {
/*
* do zlc_setup after the first zone is tried
* but only if there are multiple nodes to make
@@ -2285,7 +2287,7 @@ int numa_zonelist_order_handler(ctl_table *table, int write,
}


-#define MAX_NODE_LOAD (num_online_nodes())
+#define MAX_NODE_LOAD (nr_online_nodes)
static int node_load[MAX_NUMNODES];

/**
@@ -2494,7 +2496,7 @@ static void build_zonelists(pg_data_t *pgdat)

/* NUMA-aware ordering of nodes */
local_node = pgdat->node_id;
- load = num_online_nodes();
+ load = nr_online_nodes;
prev_node = local_node;
nodes_clear(used_mask);

@@ -2645,7 +2647,7 @@ void build_all_zonelists(void)

printk("Built %i zonelists in %s order, mobility grouping %s. "
"Total pages: %ld\n",
- num_online_nodes(),
+ nr_online_nodes,
zonelist_order_name[current_zonelist_order],
page_group_by_mobility_disabled ? "off" : "on",
vm_total_pages);
diff --git a/mm/slab.c b/mm/slab.c
index e7f1ded..e6157a0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3526,7 +3526,7 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp)
* variable to skip the call, which is mostly likely to be present in
* the cache.
*/
- if (numa_platform && cache_free_alien(cachep, objp))
+ if (numa_platform > 1 && cache_free_alien(cachep, objp))
return;

if (likely(ac->avail < ac->limit)) {
diff --git a/mm/slub.c b/mm/slub.c
index 0280eee..8ce6be8 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3648,7 +3648,7 @@ static int list_locations(struct kmem_cache *s, char *buf,
to_cpumask(l->cpus));
}

- if (num_online_nodes() > 1 && !nodes_empty(l->nodes) &&
+ if (nr_online_nodes > 1 && !nodes_empty(l->nodes) &&
len < PAGE_SIZE - 60) {
len += sprintf(buf + len, " nodes=");
len += nodelist_scnprintf(buf + len, PAGE_SIZE - len - 50,
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index c51fed4..ba9eb8f 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -124,7 +124,7 @@ svc_pool_map_choose_mode(void)
{
unsigned int node;

- if (num_online_nodes() > 1) {
+ if (nr_online_nodes > 1) {
/*
* Actually have multiple NUMA nodes,
* so split pools on NUMA node boundaries
--
1.5.6.5
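
The idea is to pay for the expensive walk only when the node set actually
changes and to read a plain integer everywhere else. A standalone sketch of
that idea, where MAX_NODES and the names are made up for illustration:

#include <stdio.h>

#define MAX_NODES 8

static int node_online[MAX_NODES];
static int nr_online_nodes_sketch;

static int count_online_nodes(void)	/* stands in for the "heavy" bitmap walk */
{
	int nid, n = 0;

	for (nid = 0; nid < MAX_NODES; nid++)
		n += node_online[nid];
	return n;
}

static void node_set_online_sketch(int nid)
{
	node_online[nid] = 1;
	nr_online_nodes_sketch = count_online_nodes();
}

static void node_set_offline_sketch(int nid)
{
	node_online[nid] = 0;
	nr_online_nodes_sketch = count_online_nodes();
}

int main(void)
{
	node_set_online_sketch(0);
	node_set_online_sketch(3);
	/* hot paths read the cached value instead of walking the node map */
	printf("%d\n", nr_online_nodes_sketch);	/* 2 */
	node_set_offline_sketch(3);
	printf("%d\n", nr_online_nodes_sketch);	/* 1 */
	return 0;
}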

2009-03-20 10:11:47

by Mel Gorman

[permalink] [raw]
Subject: [PATCH 24/25] Re-sort GFP flags and fix whitespace alignment for easier reading.

Re-sort the GFP flags after __GFP_MOVABLE got redefined so that how the
bits are used is a bit clearer.

From: Peter Zijlstra <[email protected]>
Signed-off-by: Mel Gorman <[email protected]>
---
include/linux/gfp.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 59eb093..3109a6b 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -46,11 +46,11 @@ struct vm_area_struct;
#define __GFP_NORETRY ((__force gfp_t)0x1000u)/* See above */
#define __GFP_COMP ((__force gfp_t)0x4000u)/* Add compound page metadata */
#define __GFP_ZERO ((__force gfp_t)0x8000u)/* Return zeroed page on success */
-#define __GFP_NOMEMALLOC ((__force gfp_t)0x10000u) /* Don't use emergency reserves */
-#define __GFP_HARDWALL ((__force gfp_t)0x20000u) /* Enforce hardwall cpuset memory allocs */
-#define __GFP_THISNODE ((__force gfp_t)0x40000u)/* No fallback, no policies */
+#define __GFP_NOMEMALLOC ((__force gfp_t)0x10000u) /* Don't use emergency reserves */
+#define __GFP_HARDWALL ((__force gfp_t)0x20000u) /* Enforce hardwall cpuset memory allocs */
+#define __GFP_THISNODE ((__force gfp_t)0x40000u) /* No fallback, no policies */
#define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
-#define __GFP_MOVABLE ((__force gfp_t)0x100000u) /* Page is movable */
+#define __GFP_MOVABLE ((__force gfp_t)0x100000u)/* Page is movable */

#define __GFP_BITS_SHIFT 21 /* Room for 21 __GFP_FOO bits */
#define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
--
1.5.6.5

2009-03-20 15:03:29

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, 20 Mar 2009, Mel Gorman wrote:

> The lock contention on some machines goes up for the the zone->lru_lock
> and zone->lock locks which can regress some workloads even though others on
> the same machine still go faster. For netperf, a lock called slock-AF_INET
> seemed very important although I didn't look too closely other than noting
> contention went up. The zone->lock gets hammered a lot by high order allocs
> and frees coming from SLUB which are not covered by the PCP allocator in
> this patchset. zone->lru_lock goes up is less clear but as it's page cache
> releases but overall contention may be up because CPUs are spending less
> time with interrupts disabled and more time trying to do real work but
> contending on the locks.

We can tune SLUB to buffer more pages if the lru lock becomes too hot.

2009-03-20 15:09:40

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 08/25] Calculate the preferred zone for allocation only once

On Fri, 20 Mar 2009, Mel Gorman wrote:

> get_page_from_freelist() can be called multiple times for an allocation.
> Part of this calculates the preferred_zone which is the first usable
> zone in the zonelist. This patch calculates preferred_zone once.

Isn't this adding an additional pass over the zonelist? Maybe mitigated by
the first zone usually being the preferred zone.

2009-03-20 15:29:36

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 08/25] Calculate the preferred zone for allocation only once

On Fri, Mar 20, 2009 at 11:06:46AM -0400, Christoph Lameter wrote:
> On Fri, 20 Mar 2009, Mel Gorman wrote:
>
> > get_page_from_freelist() can be called multiple times for an allocation.
> > Part of this calculates the preferred_zone which is the first usable
> > zone in the zonelist. This patch calculates preferred_zone once.
>
> Isn't this adding an additional pass over the zonelist? Maybe mitigated by
> the first zone usually being the preferred zone.
>

The alternative is uglifying the iterator quite a bit and making the code a
bit impenetrable. The walk to the first preferred zone should be very short.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-03-20 15:37:35

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, Mar 20, 2009 at 11:00:42AM -0400, Christoph Lameter wrote:
> On Fri, 20 Mar 2009, Mel Gorman wrote:
>
> > The lock contention on some machines goes up for the the zone->lru_lock
> > and zone->lock locks which can regress some workloads even though others on
> > the same machine still go faster. For netperf, a lock called slock-AF_INET
> > seemed very important although I didn't look too closely other than noting
> > contention went up. The zone->lock gets hammered a lot by high order allocs
> > and frees coming from SLUB which are not covered by the PCP allocator in
> > this patchset. zone->lru_lock goes up is less clear but as it's page cache
> > releases but overall contention may be up because CPUs are spending less
> > time with interrupts disabled and more time trying to do real work but
> > contending on the locks.
>
> We can tune SLUB to buffer more pages if the lru lock becomes too hot.
>

hmm, I'm missing something in your reasoning. The contention I saw for
zone->lru_lock

&zone->lru_lock 37350 [<ffffffff8029d6fe>] ____pagevec_lru_add+0x9c/0x172
&zone->lru_lock 55423 [<ffffffff8029d377>] release_pages+0x10a/0x21b
&zone->lru_lock 402 [<ffffffff8029d9d9>] activate_page+0x4f/0x147
&zone->lru_lock 6 [<ffffffff8029dbbd>] put_page+0x94/0x122

So I just assumed it was LRU pages being taken off and freed that was
causing the contention. Can SLUB affect that?

Maybe you meant zone->lock and SLUB could tune buffers more to avoid
that if that lock was hot. That is one alternative but the later patches
proposed an alternative whereby high-order and compound pages could be
stored on the PCP lists. Compound only really helps SLUB but high-order
also helped stacks, signal handlers and the like so it seemed like a
good idea one way or the other. Course, this meant a search of the PCP
lists or increasing the size of the PCP structure - swings and
roundabouts :/

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-03-20 16:08:03

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, 20 Mar 2009, Mel Gorman wrote:

> hmm, I'm missing something in your reasoning. The contention I saw for
> zone->lru_lock
>
> &zone->lru_lock 37350 [<ffffffff8029d6fe>] ____pagevec_lru_add+0x9c/0x172
> &zone->lru_lock 55423 [<ffffffff8029d377>] release_pages+0x10a/0x21b
> &zone->lru_lock 402 [<ffffffff8029d9d9>] activate_page+0x4f/0x147
> &zone->lru_lock 6 [<ffffffff8029dbbd>] put_page+0x94/0x122
>
> So I just assumed it was LRU pages being taken off and freed that was
> causing the contention. Can SLUB affect that?

No. But it can affect the taking of the zone lock.

> Maybe you meant zone->lock and SLUB could tune buffers more to avoid
> that if that lock was hot. That is one alternative but the later patches
> proposed an alternative whereby high-order and compound pages could be
> stored on the PCP lists. Compound only really helps SLUB but high-order
> also helped stacks, signal handlers and the like so it seemed like a
> good idea one way or the other. Course, this meant a search of the PCP
> lists or increasing the size of the PCP structure - swings and
> roundabouts :/

Maybe include those as well? Its good stuff.

2009-03-20 16:10:05

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, 20 Mar 2009, Mel Gorman wrote:

> good idea one way or the other. Course, this meant a search of the PCP
> lists or increasing the size of the PCP structure - swings and
> roundabouts :/

The PCP list structure irks me a bit. Manipulating doubly linked lists
means touching at least 3 cachelines. Is it possible to go to a simple
linked list (one cacheline to be touched)? Or an array of pointers to
pages instead (one cacheline may contain multiple pointers to pcp pages
which means multiple pages could be handled with a single cacheline)?
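
For what it's worth, the array-of-pointers idea can be sketched in
isolation: a fixed array of page pointers packs eight entries into a
64-byte line, where a list_head-based list has to touch the linkage inside
each page. A rough userspace sketch, with made-up sizes and names and no
real struct page behind the pointers:

#include <stdio.h>

struct page;				/* opaque stand-in, never dereferenced */

#define PCP_BATCH 16

struct pcp_array_sketch {
	int count;
	struct page *pages[PCP_BATCH];	/* eight pointers per 64-byte cacheline */
};

static struct page *pcp_alloc_sketch(struct pcp_array_sketch *pcp)
{
	if (!pcp->count)
		return NULL;		/* the caller would refill from the buddy lists */
	return pcp->pages[--pcp->count];
}

static int pcp_free_sketch(struct pcp_array_sketch *pcp, struct page *page)
{
	if (pcp->count == PCP_BATCH)
		return 0;		/* the caller would drain back to the buddy lists */
	pcp->pages[pcp->count++] = page;
	return 1;
}

int main(void)
{
	struct pcp_array_sketch pcp = { 0 };
	struct page *fake = (struct page *)0x1000;	/* placeholder, never dereferenced */

	pcp_free_sketch(&pcp, fake);
	printf("%p\n", (void *)pcp_alloc_sketch(&pcp));
	return 0;
}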

2009-03-20 16:27:28

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, Mar 20, 2009 at 12:07:22PM -0400, Christoph Lameter wrote:
> On Fri, 20 Mar 2009, Mel Gorman wrote:
>
> > good idea one way or the other. Course, this meant a search of the PCP
> > lists or increasing the size of the PCP structure - swings and
> > roundabouts :/
>
> The PCP list structure irks me a bit. Manipulating doubly linked lists
> means touching at least 3 cachelines.

Yeah, and bloats the structure quite a bit. It's what hits the
one-list-per-migratetype the hardest.

> Is it possible to go to a simple
> linked list (one cacheline to be touched)?

I considered it but it breaks the hot/cold allocation/freeing logic and
the search code became weird enough looking fast enough that I dropped
it.

> Or an array of pointers to
> pages instead (one cacheline may contain multiple pointers to pcp pages
> which means multiple pages could be handled with a single cacheline)?
>

An array of pointers is promising but it would bloat the structure quite a
bit too.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-03-20 16:41:47

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, Mar 20, 2009 at 12:04:42PM -0400, Christoph Lameter wrote:
> On Fri, 20 Mar 2009, Mel Gorman wrote:
>
> > hmm, I'm missing something in your reasoning. The contention I saw for
> > zone->lru_lock
> >
> > &zone->lru_lock 37350 [<ffffffff8029d6fe>] ____pagevec_lru_add+0x9c/0x172
> > &zone->lru_lock 55423 [<ffffffff8029d377>] release_pages+0x10a/0x21b
> > &zone->lru_lock 402 [<ffffffff8029d9d9>] activate_page+0x4f/0x147
> > &zone->lru_lock 6 [<ffffffff8029dbbd>] put_page+0x94/0x122
> >
> > So I just assumed it was LRU pages being taken off and freed that was
> > causing the contention. Can SLUB affect that?
>
> No. But it can affect the taking of the zone lock.
>

True, although almost anything will affect the timing of when it's
taken.

> > Maybe you meant zone->lock and SLUB could tune buffers more to avoid
> > that if that lock was hot. That is one alternative but the later patches
> > proposed an alternative whereby high-order and compound pages could be
> > stored on the PCP lists. Compound only really helps SLUB but high-order
> > also helped stacks, signal handlers and the like so it seemed like a
> > good idea one way or the other. Course, this meant a search of the PCP
> > lists or increasing the size of the PCP structure - swings and
> > roundabouts :/
>
> Maybe include those as well? Its good stuff.
>

It wasn't a clear win for this pass though. While conceptually it makes
sense, the increase in size of the PCP structure and the search cost
look nasty although they look better in comparison to taking the zone
lock and then lots of buddy split/merging.

I reckon it'll be high on the list for pass 2 though if pass 1 goes ok.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-03-20 19:46:49

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, 20 Mar 2009, Mel Gorman wrote:

> > Is it possible to go to a simple
> > linked list (one cacheline to be touched)?
>
> I considered it but it breaks the hot/cold allocation/freeing logic and
> the search code became weird enough looking fast enough that I dropped
> it.

Maybe it would be workable if we drop the cold queue stuff (dubious
anyways)?

2009-03-23 11:52:27

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Fri, Mar 20, 2009 at 03:43:23PM -0400, Christoph Lameter wrote:
> On Fri, 20 Mar 2009, Mel Gorman wrote:
>
> > > Is it possible to go to a simple
> > > linked list (one cacheline to be touched)?
> >
> > I considered it but it breaks the hot/cold allocation/freeing logic and
> > the search code became weird enough looking fast enough that I dropped
> > it.
>
> Maybe it would be workable if we drop the cold queue stuff (dubious
> anyways)?
>

This came up again. There was some evidence when it was introduced that
it worked and micro-benchmarks can show it to be of some use. It's
not-obvious-enough that I'd be wary of deleting it.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-03-23 13:34:30

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Mon, 23 Mar 2009, Mel Gorman wrote:

> This came up again. There was some evidence when it was introduced that
> it worked and micro-benchmarks can show it to be of some use. It's
> not-obvious-enough that I'd be wary of deleting it.

Certainly there is some minimal benefit. But maybe that benefit will
vanish if you drop the doubly linked list?

2009-03-23 14:59:34

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 00/25] Cleanup and optimise the page allocator V5

On Mon, Mar 23, 2009 at 09:30:26AM -0400, Christoph Lameter wrote:
> On Mon, 23 Mar 2009, Mel Gorman wrote:
>
> > This came up again. There was some evidence when it was introduced that
> > it worked and micro-benchmarks can show it to be of some use. It's
> > not-obvious-enough that I'd be wary of deleting it.
>
> Certainly there is some minimal benefit. But maybe that benefit will
> vanish if you drop the doubly linked list?
>

Extremely difficult to tell. It only makes a difference if the singly
linked list keeps the structure from crossing a cache-line boundary. If
the structure fits within a cache line with either singly or doubly linked
lists, it probably makes no measurable difference.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-04-21 15:14:18

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 11/25] Calculate the cold parameter for allocation only once

On Fri, Mar 20, 2009 at 11:09:40AM -0400, Christoph Lameter wrote:
>
> Reviewed-by: Christoph Lameter <[email protected]>
>

I apologise, I've added it now. While the patch is currently dropped from the
set, I'll bring it back later for further discussion when it can be
established if it really helps or not.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

by Christoph Lameter

Subject: Re: [PATCH 11/25] Calculate the cold parameter for allocation only once

On Tue, 21 Apr 2009, Mel Gorman wrote:

> On Fri, Mar 20, 2009 at 11:09:40AM -0400, Christoph Lameter wrote:
> >
> > Reviewed-by: Christoph Lameter <[email protected]>
> I apologise, I've added it now. While the patch is currently dropped from the
> set, I'll bring it back later for further discussion when it can be
> established if it really helps or not.

Sooo much self-doubt..... Could you post the not included patches at the
end of your patchsets so that others can help improve those?


2009-04-21 15:47:20

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 11/25] Calculate the cold parameter for allocation only once

On Tue, Apr 21, 2009 at 11:25:34AM -0400, Christoph Lameter wrote:
> On Tue, 21 Apr 2009, Mel Gorman wrote:
>
> > On Fri, Mar 20, 2009 at 11:09:40AM -0400, Christoph Lameter wrote:
> > >
> > > Reviewed-by: Christoph Lameter <[email protected]>
> > I apologise, I've it added now. While the patch is currently dropped from the
> > set, I'll bring it back later for further discussion when it can be
> > established if it really helps or not.
>
> Sooo much self-doubt.....

Not as such. The objective was to get a patchset of uncontroversial and
relatively clear wins. This one is not as clear-cut because I don't have
data on exactly how much it helps right now. It'll be easier to revisit in
isolation than half-way through this set.

> Could you post the not included patches at the
> end of your patchsets so that others can help improve those?
>

When I get this set finalised, pass two will start with a posting of
everything that got thrown out during this pass such as this patch, the
high order in PCP stuff, the gfp zone patch etc.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab