From: Cody P Schafer
To: Andrew Morton
Cc: Gilad Ben-Yossef, Simon Jeons, KOSAKI Motohiro, Mel Gorman, Linux MM, LKML, Cody P Schafer
Subject: [PATCH RESEND v3 00/11] mm: fixup changers of per cpu pageset's ->high and ->batch
Date: Mon, 13 May 2013 12:08:12 -0700
Message-Id: <1368472103-3427-1-git-send-email-cody@linux.vnet.ibm.com>

"Problems" with the current code:
 1. there is a lack of synchronization in setting ->high and ->batch in
    percpu_pagelist_fraction_sysctl_handler()
 2. stop_machine() in zone_pcp_update() is unnecessary.
 3. zone_pcp_update() does not consider the case where
    percpu_pagelist_fraction is non-zero

To fix:
 1. add memory barriers, a safe ->batch value, and an update side mutex
    when updating ->high and ->batch, and use ACCESS_ONCE() for ->batch
    users that expect a stable value.
 2. avoid draining pages in zone_pcp_update(), and rely upon the memory
    barriers added to fix #1
 3. factor out quite a few functions, and then call the appropriate one.

Note that this results in a change to the behavior of zone_pcp_update(),
which is used by memory_hotplug. I'm rather certain that I've discerned (and
preserved) the essential behavior (changing ->high and ->batch), and only
eliminated unneeded actions (draining the per cpu pages), but this may not
be the case.
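
As a rough illustration of the update ordering described in fix #1 above,
here is a userspace C11 model. It is a sketch under stated assumptions, not
the code from this series: struct pcp_model, pcp_update_lock,
pcp_model_update() and pcp_model_pages_to_free() are hypothetical names,
release fences stand in for the kernel's write barriers, and relaxed atomic
loads stand in for ACCESS_ONCE().

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per cpu pageset fields discussed above. */
struct pcp_model {
	atomic_int high;   /* threshold at which pages are drained */
	atomic_int batch;  /* number of pages moved per refill or drain */
};

/* Update side mutex: serializes writers only, readers never take it. */
static pthread_mutex_t pcp_update_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumption: a release fence approximates a write barrier in this model. */
static inline void wmb_model(void)
{
	atomic_thread_fence(memory_order_release);
}

static void pcp_model_update(struct pcp_model *pcp, int high, int batch)
{
	pthread_mutex_lock(&pcp_update_lock);

	/* Step 1: publish a ->batch value that is safe with any ->high. */
	atomic_store_explicit(&pcp->batch, 1, memory_order_relaxed);
	wmb_model();

	/* Step 2: publish the new ->high while ->batch is still safe. */
	atomic_store_explicit(&pcp->high, high, memory_order_relaxed);
	wmb_model();

	/* Step 3: publish the real ->batch last. */
	atomic_store_explicit(&pcp->batch, batch, memory_order_relaxed);

	pthread_mutex_unlock(&pcp_update_lock);
}

/* A reader that needs one stable ->batch value loads it exactly once. */
static int pcp_model_pages_to_free(struct pcp_model *pcp, int count)
{
	int batch = atomic_load_explicit(&pcp->batch, memory_order_relaxed);

	return count < batch ? count : batch;
}
```

The point of the ordering is that a concurrent reader that observes stores
in the order they were issued sees either the old pair, the safe ->batch of
1, or the new pair, so no stop_machine() or draining is needed to change the
values. The reader copies ->batch into a local so that the comparison and
the returned value cannot come from two different loads.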

Further note that the draining of pages that previously took place in
zone_pcp_update() occurred after repeated draining when attempting to offline
a page, and after the offline has "succeeded". It appears that the draining
was added to zone_pcp_update() to avoid refactoring setup_pageset() into 2
functions.

--

Since v2: https://lkml.org/lkml/2013/4/9/718

 - note ACCESS_ONCE() in fix #1 above.
 - consolidate ->batch & ->high update protocol into a single function (Gilad).
 - add missing ACCESS_ONCE() on ->batch

Since v1: https://lkml.org/lkml/2013/4/5/444

 - instead of using on_each_cpu(), use memory barriers (Gilad) and an update side mutex.
 - add "Problem" #3 above, and fix.
 - rename function to match naming style of similar function
 - move unrelated comment

--

Cody P Schafer (11):
  mm/page_alloc: factor out setting of pcp->high and pcp->batch.
  mm/page_alloc: prevent concurrent updaters of pcp ->batch and ->high
  mm/page_alloc: insert memory barriers to allow async update of pcp batch and high
  mm/page_alloc: protect pcp->batch accesses with ACCESS_ONCE
  mm/page_alloc: convert zone_pcp_update() to rely on memory barriers instead of stop_machine()
  mm/page_alloc: when handling percpu_pagelist_fraction, don't unneedly recalulate high
  mm/page_alloc: factor setup_pageset() into pageset_init() and pageset_set_batch()
  mm/page_alloc: relocate comment to be directly above code it refers to.
  mm/page_alloc: factor zone_pageset_init() out of setup_zone_pageset()
  mm/page_alloc: in zone_pcp_update(), uze zone_pageset_init()
  mm/page_alloc: rename setup_pagelist_highmark() to match naming of pageset_set_batch()

 mm/page_alloc.c | 151 +++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 90 insertions(+), 61 deletions(-)

-- 
1.8.2.2