From: Vlastimil Babka
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Michal Hocko,
    Pavel Tatashin, David Hildenbrand, Oscar Salvador, Joonsoo Kim,
    Vlastimil Babka
Subject: [PATCH v3 0/7] disable pcplists during memory offline
Date: Wed, 11 Nov 2020 10:28:05 +0100
Message-Id: <20201111092812.11329-1-vbabka@suse.cz>
X-Mailer: git-send-email 2.29.1
X-Mailing-List: linux-kernel@vger.kernel.org

Changes since v2 [8]:
- add acks/reviews (thanks David and Oscar)
- small wording and style changes
- rebase to next-20201111

Changes since v1 [7]:
- add acks/reviews (thanks David and Michal)
- drop "mm, page_alloc: make per_cpu_pageset accessible only after init" as
  that's orthogonal and needs more consideration
- squash "mm, page_alloc: drain all pcplists during memory offline" into the
  last patch, and move new zone_pcp_* functions into mm/page_alloc.c.
  As such, the new 'force all cpus' param of __drain_all_pages() is never
  exported outside page_alloc.c, so I didn't add a new wrapper function to
  hide the bool
- keep pcp_batch_high_lock a mutex as offline_pages is synchronized anyway,
  as suggested by Michal. Thus we don't need an atomic variable and sync
  around it, and the patch is much smaller. If alloc_contig_range() wants to
  use the new functionality and keep parallelism, we can add that on top.

As per the discussions [1] [2] this is an attempt to implement David's
suggestion that page isolation should disable pcplists to avoid races with
page freeing in progress. This is done without extra checks in fast paths,
as explained in patch 7. The repeated draining done by [2] is then no longer
needed. Previous version (RFC) is at [3].

The RFC tried to hide pcplists disabling/enabling into page isolation, but
it wasn't completely possible, as memory offline does not unisolate.
Michal suggested an explicit API in [4] so that's the current implementation
and it indeed seems nicer.

Once we accept that page isolation users need to do explicit actions around
it depending on the needed guarantees, we can also IMHO accept that the
current pcplist draining can also be done by the callers, which is more
effective. After all, there are only two users of page isolation. So patch 6
does effectively the same thing as Pavel proposed in [5], and patch 7
implements stronger guarantees only for memory offline. If CMA decides to
opt in to the stronger guarantee, it can be added later.

Patches 1-5 are preparatory cleanups for pcplist disabling.

Patchset was briefly tested in QEMU so that memory online/offline works, but
I haven't done a stress test that would prove the race fixed by [2] is
eliminated.
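
To illustrate the "no extra checks in fast paths" point, here is a minimal
userspace sketch. It is not the patch code. It assumes that disabling amounts
to setting each per-cpu list's high to 0 and batch to 1 and then draining
every list, which this cover letter does not spell out. All toy_* names and
both struct layouts are hypothetical, and the mutex is omitted because the
model is single-threaded.

```c
#include <assert.h>

#define TOY_NR_CPUS 2

/* Toy per-cpu page list: it only counts pages, it does not track them. */
struct toy_pcp {
	int count;	/* pages currently cached on this cpu's list */
	int high;	/* flush threshold */
	int batch;	/* max pages moved to the buddy lists per flush */
};

/* Toy zone: caches the normal high/batch so enabling can restore them. */
struct toy_zone {
	struct toy_pcp pcp[TOY_NR_CPUS];
	int buddy_free;		/* pages on the (toy) buddy free lists */
	int cached_high;
	int cached_batch;
};

static void toy_set_high_and_batch(struct toy_zone *z, int high, int batch)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		z->pcp[cpu].high = high;
		z->pcp[cpu].batch = batch;
	}
}

static void toy_zone_init(struct toy_zone *z, int high, int batch)
{
	z->buddy_free = 0;
	z->cached_high = high;
	z->cached_batch = batch;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		z->pcp[cpu].count = 0;
	toy_set_high_and_batch(z, high, batch);
}

/*
 * Free one page on 'cpu'. The code path is identical whether the lists
 * are enabled or disabled: there is no "is disabled?" branch.
 */
static void toy_free_page(struct toy_zone *z, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	struct toy_pcp *p = &z->pcp[cpu];

	p->count++;
	if (p->count >= p->high) {
		int n = p->count < p->batch ? p->count : p->batch;

		p->count -= n;
		z->buddy_free += n;
	}
}

/* Move every cached page on every cpu to the buddy lists. */
static void toy_drain_all(struct toy_zone *z)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		z->buddy_free += z->pcp[cpu].count;
		z->pcp[cpu].count = 0;
	}
}

/* Assumed disable scheme: high = 0, batch = 1, then drain everything. */
static void toy_pcp_disable(struct toy_zone *z)
{
	toy_set_high_and_batch(z, 0, 1);
	toy_drain_all(z);
}

/* Restore the cached values so frees are cached per cpu again. */
static void toy_pcp_enable(struct toy_zone *z)
{
	toy_set_high_and_batch(z, z->cached_high, z->cached_batch);
}
```

In this model, once high is 0 the existing `count >= high` test is always
true, so each freed page is flushed to the buddy lists immediately and nothing
stays on a per-cpu list while isolation is in progress. A caller would bracket
the isolation with toy_pcp_disable() and toy_pcp_enable().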

Note that patch 7 could be avoided if we instead adjusted page freeing as
shown in [6], but I believe the current implementation of disabling pcplists
is not too complex, so I would prefer this instead of adding new checks and
a longer irq-disabled section into page freeing hotpaths.

[1] https://lore.kernel.org/linux-mm/20200901124615.137200-1-pasha.tatashin@soleen.com/
[2] https://lore.kernel.org/linux-mm/20200903140032.380431-1-pasha.tatashin@soleen.com/
[3] https://lore.kernel.org/linux-mm/20200907163628.26495-1-vbabka@suse.cz/
[4] https://lore.kernel.org/linux-mm/20200909113647.GG7348@dhcp22.suse.cz/
[5] https://lore.kernel.org/linux-mm/20200904151448.100489-3-pasha.tatashin@soleen.com/
[6] https://lore.kernel.org/linux-mm/3d3b53db-aeaa-ff24-260b-36427fac9b1c@suse.cz/
[7] https://lore.kernel.org/linux-mm/20200922143712.12048-1-vbabka@suse.cz/
[8] https://lore.kernel.org/linux-mm/20201008114201.18824-1-vbabka@suse.cz/

Vlastimil Babka (7):
  mm, page_alloc: clean up pageset high and batch update
  mm, page_alloc: calculate pageset high and batch once per zone
  mm, page_alloc: remove setup_pageset()
  mm, page_alloc: simplify pageset_update()
  mm, page_alloc: cache pageset high and batch in struct zone
  mm, page_alloc: move draining pcplists to page isolation users
  mm, page_alloc: disable pcplists during memory offline

 include/linux/mmzone.h |   6 ++
 mm/internal.h          |   2 +
 mm/memory_hotplug.c    |  27 +++---
 mm/page_alloc.c        | 195 ++++++++++++++++++++++++-----------------
 mm/page_isolation.c    |  10 +--
 5 files changed, 141 insertions(+), 99 deletions(-)

--
2.29.1