Date: Wed, 17 Feb 2021 10:50:55 +0100
From: Michal Hocko
To: Minchan Kim
Cc: Andrew Morton, linux-mm, LKML, cgoldswo@codeaurora.org,
	linux-fsdevel@vger.kernel.org, willy@infradead.org, david@redhat.com,
	vbabka@suse.cz, viro@zeniv.linux.org.uk, joaodias@google.com
Subject: Re: [RFC 1/2] mm: disable LRU pagevec during the migration temporarily
References: <20210216170348.1513483-1-minchan@kernel.org>

On Wed 17-02-21 09:59:54, Michal Hocko wrote:
> On Tue 16-02-21 09:03:47, Minchan Kim wrote:
[...]
> >  /*
> >   * migrate_prep() needs to be called before we start compiling a list of pages
> >   * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
> > @@ -64,11 +80,27 @@
> >   */
> >  void migrate_prep(void)
> >  {
> > +	unsigned int cpu;
> > +
> > +	spin_lock(&migrate_pending_lock);
> > +	migrate_pending_count++;
> > +	spin_unlock(&migrate_pending_lock);
> 
> I suspect you do not want to add atomic_read inside hot paths, right? Is
> this really something that we have to microoptimize for? atomic_read is
> a simple READ_ONCE on many archs.

Or did you rather want to avoid a read memory barrier to enforce the
ordering?

> > +
> > +	for_each_online_cpu(cpu) {
> > +		struct work_struct *work = &per_cpu(migrate_pending_work, cpu);
> > +
> > +		INIT_WORK(work, read_migrate_pending);
> > +		queue_work_on(cpu, mm_percpu_wq, work);
> > +	}
> > +
> > +	for_each_online_cpu(cpu)
> > +		flush_work(&per_cpu(migrate_pending_work, cpu));
> 
> I also do not follow this scheme. Where is the IPI you are mentioning
> above?

Thinking about it some more, I think you mean the rescheduling IPI here?
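[ Editor's illustration -- a minimal sketch of the draining idiom under
  discussion, with made-up names (sync_all_cpus, sync_work, sync_noop);
  this is not part of the posted patch. The work item does nothing, but
  by the time flush_work() returns for a CPU, that CPU has context
  switched into the kworker, which is the rescheduling event referred
  to above: ]

#include <linux/cpu.h>
#include <linux/percpu-defs.h>
#include <linux/workqueue.h>

static DEFINE_PER_CPU(struct work_struct, sync_work);

static void sync_noop(struct work_struct *work)
{
	/* Nothing to do: merely running here is the synchronization event. */
}

static void sync_all_cpus(void)
{
	unsigned int cpu;

	/* Keep the online CPU mask stable while queueing and flushing. */
	cpus_read_lock();

	for_each_online_cpu(cpu) {
		struct work_struct *work = &per_cpu(sync_work, cpu);

		INIT_WORK(work, sync_noop);
		schedule_work_on(cpu, work);
	}

	for_each_online_cpu(cpu)
		flush_work(&per_cpu(sync_work, cpu));

	cpus_read_unlock();
}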
> > +	/*
> > +	 * From now on, every online cpu will see uptodate
> > +	 * migrate_pending_work.
> > +	 */
> >  	/*
> >  	 * Clear the LRU lists so pages can be isolated.
> > -	 * Note that pages may be moved off the LRU after we have
> > -	 * drained them. Those pages will fail to migrate like other
> > -	 * pages that may be busy.
> >  	 */
> >  	lru_add_drain_all();
> 
> Overall, this looks rather heavyweight for my taste. Have you tried to
> play with a simple atomic counter approach? atomic_read when adding to
> the cache and atomic_inc inside migrate_prep followed by lru_add_drain.

If you really want a strong ordering then it should be sufficient to
simply alter lru_add_drain_all to force draining all CPUs. This will
make sure no new pages are added to the pcp lists and you will also sync
up anything that has accumulated because of a race between atomic_read
and inc:

diff --git a/mm/swap.c b/mm/swap.c
index 2cca7141470c..91600d7bb7a8 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -745,7 +745,7 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
  * Calling this function with cpu hotplug locks held can actually lead
  * to obscure indirect dependencies via WQ context.
  */
-void lru_add_drain_all(void)
+void lru_add_drain_all(bool force_all_cpus)
 {
 	/*
 	 * lru_drain_gen - Global pages generation number
@@ -820,7 +820,8 @@ void lru_add_drain_all(void)
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 
-		if (pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
+		if (force_all_cpus ||
+		    pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
 		    data_race(pagevec_count(&per_cpu(lru_rotate.pvec, cpu))) ||
 		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate_file, cpu)) ||
 		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate, cpu)) ||
-- 
Michal Hocko
SUSE Labs
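[ Editor's illustration -- one way the atomic counter approach suggested
  above might look. All names are hypothetical (migrate_pending,
  migrate_finish, lru_pagevec_allowed), and lru_add_drain_all(true)
  assumes the force_all_cpus variant from the diff; this is a sketch of
  the suggestion, not the posted patch: ]

#include <linux/atomic.h>

/* Non-zero while at least one migration caller wants pagevecs bypassed. */
static atomic_t migrate_pending = ATOMIC_INIT(0);

void migrate_prep(void)
{
	atomic_inc(&migrate_pending);
	/*
	 * Force-drain every CPU. Pages cached in pagevecs before the
	 * increment, or racing with it, are flushed here; anybody caching
	 * afterwards sees migrate_pending != 0 and bypasses the pagevec.
	 */
	lru_add_drain_all(true);
}

void migrate_finish(void)
{
	atomic_dec(&migrate_pending);
}

/*
 * Fast-path check on the pagevec-add side: a single atomic_read, which
 * boils down to READ_ONCE on many architectures.
 */
static bool lru_pagevec_allowed(void)
{
	return !atomic_read(&migrate_pending);
}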