Date: Wed, 12 Dec 2018 10:50:51 +0100
From: Michal Hocko
To: David Rientjes
Cc: Andrea Arcangeli, Linus Torvalds, mgorman@techsingularity.net,
	Vlastimil Babka, ying.huang@intel.com, s.priebe@profihost.ag,
	Linux List Kernel Mailing, alex.williamson@redhat.com, lkp@01.org,
	kirill@shutemov.name, Andrew Morton, zi.yan@cs.rutgers.edu
Subject: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression
Message-ID: <20181212095051.GO1286@dhcp22.suse.cz>
References: <20181205204034.GB11899@redhat.com>
	<20181205233632.GE11899@redhat.com>
	<20181210044916.GC24097@redhat.com>
On Tue 11-12-18 16:37:22, David Rientjes wrote:
[...]
> Since it depends on the workload, specifically workloads that fit within
> a single node, I think the reasonable approach would be to have a sane
> default regardless of the use of MADV_HUGEPAGE or thp defrag settings,
> and then optimize for the minority of cases where the workload does not
> fit in a single node. I'm assuming there is no debate about these larger
> workloads being in the minority, although we have single machines where
> this encompasses the totality of their workloads.

Your assumption is wrong, I believe. This is the fundamental disagreement
we are discussing here. You are essentially arguing for node_reclaim
(formerly zone_reclaim) behavior for THP pages, all without any actual
data on a wider variety of workloads. As a matter of _fact_, we know that
node_reclaim behavior is not a suitable default. We made that mistake in
the past and had to revert that default _exactly_ because a wide variety
of workloads suffered from over-reclaim and the performance issues that
constant reclaim brings.

You also haven't explained why you care so much about remote THP while
you do not care about remote base pages (the page allocator falls back
to those as soon as kswapd doesn't keep pace with the allocation rate;
THP, and high-order pages in general, are analogous, with kcompactd
doing proactive compaction). As with base pages, we do not want larger
pages to fall back to a remote node too easily; there is no question
about that, I believe. I can be convinced that larger pages really
require different behavior than base pages, but then you should show
_real_ numbers on a wider variety of workloads to back your claims. So
far I have only heard hand-waving and very vague, quite doubtful numbers
for a non-disclosed benchmark, with no clear indication of how they
relate to real-world workloads.

So color me unconvinced.

-- 
Michal Hocko
SUSE Labs
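
For readers following along, a minimal sketch of the MADV_HUGEPAGE usage
the thread keeps referring to: marking an anonymous mapping as a THP
candidate. The 2 MB length assumes x86-64 PMD-sized huge pages, and
anything beyond basic error handling is omitted.

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define LEN (2UL * 1024 * 1024)	/* one PMD-sized huge page on x86-64 */

int main(void)
{
	void *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Ask the fault path (and khugepaged) to back this range with THP. */
	if (madvise(p, LEN, MADV_HUGEPAGE))
		perror("madvise(MADV_HUGEPAGE)");

	/* First touch faults the memory in; NUMA placement policy decides
	 * which node the (huge) page comes from. */
	*(volatile char *)p = 1;

	munmap(p, LEN);
	return 0;
}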
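
The node_reclaim behavior referenced above is governed by the
vm.zone_reclaim_mode sysctl, whose default was flipped back to 0
precisely because of the over-reclaim regressions described. A quick way
to check the current setting:

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");
	int mode;

	if (!f || fscanf(f, "%d", &mode) != 1) {
		perror("/proc/sys/vm/zone_reclaim_mode");
		return 1;
	}
	fclose(f);

	/* 0: allocate from a remote node rather than reclaim locally. */
	printf("vm.zone_reclaim_mode = %d\n", mode);
	return 0;
}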
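
And an illustrative, kernel-style fragment of the policy question
itself. This is not the mm code under discussion; prefer_local_only is a
hypothetical stand-in for the choice being argued over, namely whether a
THP allocation should carry __GFP_THISNODE:

	/* Illustrative fragment only, not the actual allocation path. */
	gfp_t gfp = GFP_TRANSHUGE;
	struct page *page;

	if (prefer_local_only)
		/* Confine the allocation to the local node; the allocator
		 * may reclaim/compact there rather than fall back, i.e. the
		 * node_reclaim-like behavior for THP argued against above. */
		gfp |= __GFP_THISNODE;

	/* Without __GFP_THISNODE the allocation may fall back to a remote
	 * node, much as base-page allocations do once kswapd falls behind. */
	page = alloc_pages(gfp, HPAGE_PMD_ORDER);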