From: riel@redhat.com
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, mgorman@suse.de, mingo@kernel.org, jhladky@redhat.com, lvenanci@redhat.com
Subject: [PATCH 0/2] numa,sched: improve performance for multi-threaded workloads
Date: Mon, 31 Jul 2017 15:28:45 -0400
Message-Id: <20170731192847.23050-1-riel@redhat.com>

The NUMA balancing code spends way too much CPU time scanning and
faulting when running multi-threaded workloads.

This patch set slows down NUMA PTE scanning when there are lots of
shared faults, and when dealing with large NUMA groups that have a
large fraction of shared faults.

Some results from Jirka's half-week performance run, on a 4-node system:
- NAS benchmarks - improvements in the range of 10-30% (mostly the ft
  and lu subtests)
- SPECjbb2005 single instance mode - improvements in the range of 5-10%
- SPECjvm2008 - performance very similar to before, some small
  improvements for the scimark* subtests
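
To make the scan-rate idea above concrete, here is a minimal toy model in Python. It is not the code from these patches. It only illustrates the stated behavior that a large fraction of shared faults should lengthen the scan period (slower PTE scanning). The function name, the constants, the threshold, and the "speed up when private faults dominate" branch are all assumptions made for illustration, and the NUMA group size handling is not modeled.

```python
# Toy model of the idea in the cover letter; NOT the kernel's code.
# All names and constants below are illustrative assumptions.

PERIOD_MIN_MS = 1_000     # assumed floor for the scan period
PERIOD_MAX_MS = 60_000    # assumed ceiling for the scan period
SLOTS = 10                # granularity used to bucket the shared-fault ratio
SHARED_THRESHOLD = 7      # at or above 7/10 shared faults, slow scanning down


def next_scan_period(period_ms: int, private_faults: int, shared_faults: int) -> int:
    """Return the next scan period given the faults seen in the last window.

    A longer period means the task's PTEs are scanned less often.
    """
    total = private_faults + shared_faults
    if total == 0:
        # No faults recorded, so nothing was learned: back off by doubling.
        return min(PERIOD_MAX_MS, period_ms * 2)

    step = max(1, period_ms // SLOTS)               # one slot's worth of time
    shared_slots = shared_faults * SLOTS // total   # integer in 0..SLOTS

    if shared_slots >= SHARED_THRESHOLD:
        # Shared faults dominate: lengthen the period (scan slower).
        delta = max(1, shared_slots - SHARED_THRESHOLD) * step
    else:
        # Assumption: private faults dominate, so shorten the period
        # in proportion to how far below the threshold the ratio is.
        delta = -(SHARED_THRESHOLD - shared_slots) * step

    # Keep the result inside the assumed bounds.
    return max(PERIOD_MIN_MS, min(PERIOD_MAX_MS, period_ms + delta))
```

With these assumed constants, a window of 10 private and 90 shared faults moves a 10000 ms period to 12000 ms, while 90 private and 10 shared faults moves it to 4000 ms.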