Date: Wed, 5 Dec 2018 19:54:25 -0500
From: Andrea Arcangeli
To: David Rientjes
Cc: Linus Torvalds, mgorman@techsingularity.net, Vlastimil Babka,
    mhocko@kernel.org, ying.huang@intel.com, s.priebe@profihost.ag,
    Linux List Kernel Mailing, alex.williamson@redhat.com, lkp@01.org,
    kirill@shutemov.name, Andrew Morton, zi.yan@cs.rutgers.edu
Subject: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression
Message-ID: <20181206005425.GB21159@redhat.com>

On Wed, Dec 05, 2018 at 04:18:14PM -0800, David Rientjes wrote:
> On Wed, 5 Dec 2018, Andrea Arcangeli wrote:
>
> > __GFP_COMPACT_ONLY gave an hope it could give some middle ground but
> > it shows awful compaction results, it basically destroys compaction
> > effectiveness and we know why (COMPACT_SKIPPED must call reclaim or
> > compaction can't succeed because there's not enough free memory in the
> > node). If somebody used MADV_HUGEPAGE compaction should still work and
> > not fail like that. Compaction would fail to be effective even in the
> > local node where __GFP_THISNODE didn't fail. Worst of all it'd fail
> > even on non-NUMA systems (that would be easy to fix though by making
> > the HPAGE_PMD_ORDER check conditional to NUMA being enabled at
> > runtime).
> >
>
> Note that in addition to COMPACT_SKIPPED that you mention, compaction can
> fail with COMPACT_COMPLETE, meaning the full scan has finished without
> freeing a hugepage, or COMPACT_DEFERRED, meaning that doing another scan
> is unlikely to produce a different result. COMPACT_SKIPPED makes sense to
> do reclaim if it can become accessible to isolate_freepages() and
> hopefully another allocator does not allocate from these newly freed pages
> before compaction can scan the zone again. For COMPACT_COMPLETE and
> COMPACT_DEFERRED, reclaim is unlikely to ever help.

COMPACT_COMPLETE (and COMPACT_PARTIAL_SKIPPED for that matter) seems
to be just a mistake in the max() evaluation in try_to_compact_pages(),
which lets it return COMPACT_COMPLETE and COMPACT_PARTIAL_SKIPPED. I
think it should just return COMPACT_DEFERRED in those two cases, and
that should be enforced for all prio.

There are really only 3 cases that matter for the caller:

1) succeed -> we got the page
2) defer -> we failed (caller won't care about why)
3) skipped -> failed because not enough 4k pages are free -> reclaim
   must be invoked, then compaction can be retried

PARTIAL_SKIPPED/COMPLETE both fall into 2) above, so the caller should
treat them the same way. It doesn't seem very concerning that the
caller may currently carry on as if compaction had succeeded and do a
single spurious reclaim invocation, but it's good to fix this and take
the COMPACT_DEFERRED nopage path in the __GFP_NORETRY case.
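
To make the three cases above concrete, here is a rough userspace
sketch of the folding being suggested. It is not kernel code: the enum
is a simplified stand-in for the kernel's compact_result, and
normalize_compact_result(), next_action() and the ACTION_* names are
hypothetical helpers invented only for illustration.

```c
#include <assert.h>

/*
 * Simplified userspace stand-in for the kernel's enum compact_result.
 * Only the values discussed in this thread are listed; this is an
 * assumption for illustration, not the kernel definition.
 */
enum compact_result {
	COMPACT_SKIPPED,
	COMPACT_DEFERRED,
	COMPACT_COMPLETE,
	COMPACT_PARTIAL_SKIPPED,
	COMPACT_SUCCESS,
};

/* Hypothetical names for what the caller does next. */
enum caller_action {
	ACTION_GOT_PAGE,		/* case 1: succeed */
	ACTION_FAIL,			/* case 2: defer */
	ACTION_RECLAIM_AND_RETRY,	/* case 3: skipped */
};

/*
 * Hypothetical helper: a scan that ran (fully or partially) without
 * producing a page is reported as COMPACT_DEFERRED, so the caller
 * only ever sees success, deferred or skipped.
 */
static enum compact_result normalize_compact_result(enum compact_result rc)
{
	if (rc == COMPACT_COMPLETE || rc == COMPACT_PARTIAL_SKIPPED)
		return COMPACT_DEFERRED;
	return rc;
}

/* Hypothetical caller-side decision based on the normalized result. */
static enum caller_action next_action(enum compact_result rc)
{
	rc = normalize_compact_result(rc);

	/* After normalization only the three cases remain. */
	assert(rc == COMPACT_SUCCESS || rc == COMPACT_DEFERRED ||
	       rc == COMPACT_SKIPPED);

	switch (rc) {
	case COMPACT_SUCCESS:
		return ACTION_GOT_PAGE;
	case COMPACT_SKIPPED:
		return ACTION_RECLAIM_AND_RETRY;
	default:
		return ACTION_FAIL;
	}
}
```

With this folding, next_action(COMPACT_COMPLETE) and
next_action(COMPACT_PARTIAL_SKIPPED) give the same answer as
next_action(COMPACT_DEFERRED), so only COMPACT_SKIPPED leads to reclaim
followed by a compaction retry.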