Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp2107187imm; Thu, 7 Jun 2018 05:40:02 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKunUSzStNR+eYPKkgBLgwlbg43FBIPDrNReDyZTJdefp29wZTv725ODIkQFRrcEDTLWVy/ X-Received: by 2002:a62:4b16:: with SMTP id y22-v6mr1672744pfa.214.1528375202174; Thu, 07 Jun 2018 05:40:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1528375202; cv=none; d=google.com; s=arc-20160816; b=Mvb+MgkD0tQIAgjXtoYDGevPF15GfeyFBN2mfajURybdbmuATdaz+qNyXM/MMsW6yY HQpE6PCnTGUDRhroNhWFVwgQghGBwEuROOxaiEQiK3Ob7Thzs+u3rwAOyYSEaS8Apue/ Ve9BlGzddWcKDoNLtAK+zaBRcmPn/qWuUJwrwSfIAhzA9f1UofpL12T2srTC6f0wJeaB L101NOUqlZp2siDH1QvOl+mxYHuz7ikSs1cSjL4+LAiBR97k0GdYjbK+12HXIRlSUvH6 JsKm71nIqTUnefoJuBwGT/4VOAmKZW7x4P92rT1oonYuvnictD8K0ZLDP3AXC6/n1QEf AxZQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=C694HtCCf8id+La+nQjiO2tmris3m1cdqB3ad24hXqs=; b=HWdzuu8uZtjMCJdQaHQdbT47so8fiEZ7fVLzHhbJRNtywqlBKQ4hKGXp/g8yqrGh6O wQOti5HgNcwhEou2mtsXgQ6Mu0+bRQzZhX8BWgckXsI+00iAxUbEyM9EjJahh/RpyraN dbHNGoe4TjlrcJxlTOWI4LuOmZhDDwKkMvCbhmQqtl5a7QXqKIVWEGi0c/Ca8ug+6OG2 /i8pv4ly0auQA1Xys4iFUvlP1tiUIMahalOLSOxfY60ZLF5J7KQKPVEga8BuL35//zBq /DwA4BNhFH/isuxRbvYOf5h6On0dgOy8r7qzSVlcinC9tQikzGO0FFeH3GxtwVlZddhY ++zA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q1-v6si42265343pgc.156.2018.06.07.05.39.47; Thu, 07 Jun 2018 05:40:02 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753577AbeFGMjS (ORCPT + 99 others); Thu, 7 Jun 2018 08:39:18 -0400 Received: from outbound-smtp11.blacknight.com ([46.22.139.106]:43915 "EHLO outbound-smtp11.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753261AbeFGMjR (ORCPT ); Thu, 7 Jun 2018 08:39:17 -0400 Received: from mail.blacknight.com (pemlinmail03.blacknight.ie [81.17.254.16]) by outbound-smtp11.blacknight.com (Postfix) with ESMTPS id D124E1C2D7F for ; Thu, 7 Jun 2018 13:39:15 +0100 (IST) Received: (qmail 30674 invoked from network); 7 Jun 2018 12:39:15 -0000 Received: from unknown (HELO techsingularity.net) (mgorman@techsingularity.net@[37.228.237.73]) by 81.17.254.9 with ESMTPSA (DHE-RSA-AES256-SHA encrypted, authenticated); 7 Jun 2018 12:39:15 -0000 Date: Thu, 7 Jun 2018 13:39:15 +0100 From: Mel Gorman To: Jakub Racek Cc: linux-kernel@vger.kernel.org, "Rafael J. Wysocki" , Len Brown , linux-acpi@vger.kernel.org Subject: Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks Message-ID: <20180607123915.avrqbpp4adgj7ck4@techsingularity.net> References: <20180606122731.GB27707@jra-laptop.brq.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20180606122731.GB27707@jra-laptop.brq.redhat.com> User-Agent: NeoMutt/20170912 (1.9.0) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 06, 2018 at 02:27:32PM +0200, Jakub Racek wrote: > There is a huge performance regression on the 2 and 4 NUMA node systems on > stream benchmark with 4.17 kernel compared to 4.16 kernel. Stream, Linpack > and NAS parallel benchmarks show upto 50% performance drop. > I have not observed this yet but NAS is the only one I'll see and that could be a week or more away before I have data. I'll keep an eye out at least. > When running for example 20 stream processes in parallel, we see the following behavior: > > * all processes are started at NODE #1 > * memory is also allocated on NODE #1 > * roughly half of the processes are moved to the NODE #0 very quickly. * > however, memory is not moved to NODE #0 and stays allocated on NODE #1 > Ok, 20 processes getting rescheduled to another node is not unreasonable from a load-balancing perspective but memory locality is not always taken into account. You also don't state what parallelisation method you used for STREAM and it's relevant because of how tasks end up communicating and what that means for placement. The only automatic NUMA balancing patch I can think of that has a high chance of being a factor is 7347fc87dfe6b7315e74310ee1243dc222c68086 but I cannot see how STREAM would be affected as I severely doubt the processes are communicating heavily (unless openmp and then it's a maybe). It might affect NAS because that does a lot of wakeups via futex that has "interesting" characteristics (either openmp or openmpi). 082f764a2f3f2968afa1a0b04a1ccb1b70633844 might also be a factor but it's doubtful. I don't know about Linpack as I've never characterised it so I don't know how it behaves. There are a few patches that affect utilisation calculation which might affect the load balancer but I can't pinpoint a single likely candidate. Given that STREAM is usually short-lived, is bisection an option? -- Mel Gorman SUSE Labs