Date: Mon, 15 Oct 2018 19:19:53 -0400
From: Andrea Arcangeli
To: Andrew Morton
Cc: David Rientjes, Michal Hocko, Mel Gorman, Vlastimil Babka, Andrea Argangeli, Zi Yan, Stefan Priebe - Profihost AG, "Kirill A.
Shutemov", linux-mm@kvack.org, LKML, Stable tree
Subject: Re: [PATCH 1/2] mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
Message-ID: <20181015231953.GC30832@redhat.com>
References: <20181005232155.GA2298@redhat.com> <20181009094825.GC6931@suse.de> <20181009122745.GN8528@dhcp22.suse.cz> <20181009130034.GD6931@suse.de> <20181009142510.GU8528@dhcp22.suse.cz> <20181009230352.GE9307@redhat.com> <20181015154459.e870c30df5c41966ffb4aed8@linux-foundation.org>
In-Reply-To: <20181015154459.e870c30df5c41966ffb4aed8@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Andrew,

On Mon, Oct 15, 2018 at 03:44:59PM -0700, Andrew Morton wrote:
> On Mon, 15 Oct 2018 15:30:17 -0700 (PDT) David Rientjes wrote:
> > Would it be possible to test with my
> > patch[*] that does not try reclaim to address the thrashing issue?
>
> Yes please.

It'd also be great if a testcase reproducing the 40% higher access
latency (with the original one-liner fix) was available.

We don't have a testcase for David's 40% latency increase problem, but
it's likely to happen only when the system is somewhat low on memory
globally. So the measurement must be done when compaction starts
failing globally on all zones, but before the system starts swapping.
The more global fragmentation there is, the larger the window between
"compaction fails because all zones are too fragmented" and "there is
still free PAGE_SIZEd memory available to reclaim without swapping it
out". If I understood correctly, that is precisely the window where
the 40% higher latency should materialize.
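
To check whether a system is inside that window, one rough indicator is
/proc/buddyinfo, where column N of each zone line is the number of free
blocks of order N. What follows is only a minimal sketch of such a
check. It assumes 2MB THP with 4KB base pages (order 9, as on x86-64),
and struct zone_free and parse_buddyinfo_line() are hypothetical names
made up for this illustration, not anything from the kernel or from
this thread.

```c
#include <stdio.h>

#define THP_ORDER 9 /* assumption: 2MB THP with 4KB base pages (x86-64) */

struct zone_free {
	unsigned long base_pages; /* total free memory, in PAGE_SIZE units */
	unsigned long thp_blocks; /* free blocks of order >= THP_ORDER */
};

/*
 * Parse one /proc/buddyinfo line, e.g.
 *   "Node 0, zone   Normal     12      7      3 ..."
 * where column N holds the count of free blocks of order N.
 * Returns 0 on success, -1 if the line does not look like buddyinfo.
 */
static int parse_buddyinfo_line(const char *line, struct zone_free *zf)
{
	int node, off = 0, n, order = 0;
	char zone[32];
	unsigned long count;

	if (sscanf(line, "Node %d, zone %31s%n", &node, zone, &off) != 2)
		return -1;

	zf->base_pages = 0;
	zf->thp_blocks = 0;
	/* Walk the per-order columns, lowest order first. */
	while (sscanf(line + off, "%lu%n", &count, &n) == 1) {
		zf->base_pages += count << order;
		if (order >= THP_ORDER)
			zf->thp_blocks += count;
		order++;
		off += n;
	}
	return order ? 0 : -1;
}
```

A zone with thp_blocks == 0 and base_pages still well above zero is
fragmented in the sense described above. Note that buddyinfo only
counts memory that is already free, so reclaimable pagecache does not
show up in it and this is an approximation at best.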

The workload that shows the badness in the upstream code is fairly
trivial. Mel and Zi reproduced it too and I have two testcases that
can reproduce it, one with device assignment and the other just
memhog. That's a massively larger window than the one where the 40%
higher latency materializes.

When there's 75% or more of the RAM free (not even allocated as easily
reclaimable pagecache) globally, you don't expect to hit heavy
swapping.

The 40% THP allocation latency increase if you use MADV_HUGEPAGE in
such a window, where all remote zones are fully fragmented, is
somewhat less of a concern in my view (plus there's the compact
deferred logic that should mitigate that scenario). Furthermore it is
only a concern for page faults in MADV_HUGEPAGE ranges. If
MADV_HUGEPAGE is set, the userland allocation is long lived, so the
higher allocation latency doesn't risk hitting short lived allocations
that don't set MADV_HUGEPAGE (unless madvise=always, but that's not
the default precisely because not all allocations are long lived).

It'd also be nice if the library that uses MADV_HUGEPAGE was freely
available.
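
As a starting point for such a testcase, here is a minimal sketch that
times the first touch of each 2MB chunk in an anonymous mapping, with
or without MADV_HUGEPAGE. It is not the workload from David's report
and it does not create the low-memory, fragmented condition by itself
(that would have to be set up separately, for example with memhog).
The 2MB huge page size is an assumption and thp_fault_latency_ns() is
a hypothetical name used only for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

#define HPAGE_SIZE (2UL << 20) /* assumption: 2MB THP size, as on x86-64 */

/*
 * Map len bytes of anonymous memory (len should be a multiple of 2MB),
 * optionally mark them MADV_HUGEPAGE, then touch one byte per
 * 2MB-aligned chunk and time each first touch.
 * Returns the average fault latency in nanoseconds, or -1.0 on error.
 */
static double thp_fault_latency_ns(size_t len, int use_madvise)
{
	struct timespec t0, t1;
	size_t map_len = len + HPAGE_SIZE; /* slack so we can align */
	size_t chunks = len / HPAGE_SIZE, i;
	double total = 0.0;
	volatile char *p;
	char *map;

	if (!chunks)
		return -1.0;

	map = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1.0;

	/* Round up to a 2MB boundary so each chunk can be backed by one THP. */
	p = (char *)(((uintptr_t)map + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1));

	/* madvise() may fail (e.g. THP not built in); the timing still runs. */
	if (use_madvise && madvise((void *)p, len, MADV_HUGEPAGE))
		perror("madvise(MADV_HUGEPAGE)");

	for (i = 0; i < chunks; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		p[i * HPAGE_SIZE] = 1; /* first touch triggers the page fault */
		clock_gettime(CLOCK_MONOTONIC, &t1);
		total += (t1.tv_sec - t0.tv_sec) * 1e9 +
			 (t1.tv_nsec - t0.tv_nsec);
	}

	munmap(map, map_len);
	return total / chunks;
}
```

Calling it once with use_madvise set to 1 and once with 0, on the same
system state, gives two averages to compare. Whether the difference
comes anywhere near the reported 40% depends entirely on how
fragmented and how low on memory the system is at that moment.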