Date: Wed, 17 Jun 2020 13:53:12 -0700
From: Andrew Morton
To: Nitin Gupta
Cc: Vlastimil Babka, Khalid Aziz, Oleksandr Natalenko, Michal Hocko,
	Mel Gorman, Matthew Wilcox, Mike Kravetz, Joonsoo Kim,
	David Rientjes, Nitin Gupta, linux-kernel, linux-mm, Linux API
Subject: Re: [PATCH v8] mm: Proactive compaction
Message-Id: <20200617135312.4f395479454c55a8d021b023@linux-foundation.org>
In-Reply-To: <20200616204527.19185-1-nigupta@nvidia.com>
References: <20200616204527.19185-1-nigupta@nvidia.com>

On Tue, 16 Jun 2020 13:45:27 -0700 Nitin Gupta wrote:

> For some applications, we need to allocate almost all memory as
> hugepages. However, on a running system, higher-order allocations can
> fail if the memory is fragmented. The Linux kernel currently does
> on-demand compaction as we request more hugepages, but this style of
> compaction incurs very high latency. Experiments with one-time full
> memory compaction (followed by hugepage allocations) show that the
> kernel is able to restore a highly fragmented memory state to a
> fairly compacted memory state within <1 sec for a 32G system. Such
> data suggests that a more proactive compaction can help us allocate a
> large fraction of memory as hugepages while keeping allocation
> latencies low.
>
> ...
>

All looks straightforward to me and easy to disable if it goes wrong.
All the hard-coded magic numbers are a worry, but such is life.

One teeny complaint:

> ...
>
> @@ -2650,12 +2801,34 @@ static int kcompactd(void *p)
>  		unsigned long pflags;
>
>  		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
> -		wait_event_freezable(pgdat->kcompactd_wait,
> -				kcompactd_work_requested(pgdat));
> +		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
> +			kcompactd_work_requested(pgdat),
> +			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
> +
> +			psi_memstall_enter(&pflags);
> +			kcompactd_do_work(pgdat);
> +			psi_memstall_leave(&pflags);
> +			continue;
> +		}
>
> -		psi_memstall_enter(&pflags);
> -		kcompactd_do_work(pgdat);
> -		psi_memstall_leave(&pflags);
> +		/* kcompactd wait timeout */
> +		if (should_proactive_compact_node(pgdat)) {
> +			unsigned int prev_score, score;

Everywhere else, scores have type `int'.  Here they are unsigned.  How
come?

Would it be better to make these unsigned throughout?  I don't think a
score can ever be negative?

> +			if (proactive_defer) {
> +				proactive_defer--;
> +				continue;
> +			}
> +			prev_score = fragmentation_score_node(pgdat);
> +			proactive_compact_node(pgdat);
> +			score = fragmentation_score_node(pgdat);
> +			/*
> +			 * Defer proactive compaction if the fragmentation
> +			 * score did not go down i.e. no progress made.
> +			 */
> +			proactive_defer = score < prev_score ?
> +					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
> +		}
> 	}
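
A note on the backoff behaviour in the hunk above: the defer counter
gives proactive compaction a fixed backoff, not an exponential one.
With COMPACT_MAX_DEFER_SHIFT being 6 and the patch's check interval
(HPAGE_FRAG_CHECK_INTERVAL_MSEC, 500 ms), a pass that fails to lower
the node's fragmentation score makes kcompactd skip the next 64
timeout wakeups, roughly 32 seconds, before trying again. What follows
is a minimal, self-contained userspace sketch of just that counter
pattern; fake_score() and its canned values are invented for
illustration, and the real code operates per node via
fragmentation_score_node() and proactive_compact_node().

#include <stdio.h>

#define COMPACT_MAX_DEFER_SHIFT 6

/*
 * Canned fragmentation scores: the first compaction pass makes
 * progress (90 -> 70), every later pass stalls at 70.  Invented
 * numbers, purely for illustration.
 */
static unsigned int fake_score(void)
{
	static const unsigned int samples[] = { 90, 70, 70, 70 };
	static unsigned int i;

	return samples[i < 3 ? i++ : 3];
}

int main(void)
{
	unsigned int proactive_defer = 0;
	unsigned int prev_score, score;
	int wakeup;

	for (wakeup = 0; wakeup < 140; wakeup++) {
		if (proactive_defer) {
			proactive_defer--;	/* backing off: skip this wakeup */
			continue;
		}
		prev_score = fake_score();
		/* proactive_compact_node(pgdat) would run here */
		score = fake_score();
		/* no progress -> skip the next 1 << 6 = 64 wakeups */
		proactive_defer = score < prev_score ?
				0 : 1 << COMPACT_MAX_DEFER_SHIFT;
		printf("wakeup %3d: score %u -> %u, defer %u\n",
		       wakeup, prev_score, score, proactive_defer);
	}
	return 0;
}

The sketch deliberately keeps the scores unsigned, as in the hunk
under discussion; a fragmentation score is a percentage-like value, so
it never goes negative, which is the point of the question above.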