Subject: Re: [PATCH v3] sched/core: Don't use dying mm as active_mm of kthreads
From: Rik van Riel
To: Waiman Long, Peter Zijlstra, Ingo Molnar
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton, Phil Auld, Michal Hocko
Date: Wed, 31 Jul 2019 09:48:04 -0400
References: <20190729210728.21634-1-longman@redhat.com> <3e2ff4c9-c51f-8512-5051-5841131f4acb@redhat.com> <8021be4426fdafdce83517194112f43009fb9f6d.camel@surriel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2019-07-30 at 17:01 -0400, Waiman Long wrote:
> On 7/29/19 8:26 PM, Rik van Riel wrote:
> > On Mon, 2019-07-29 at 17:42 -0400, Waiman Long wrote:
> >
> > > What I have found is that a long running process on a mostly idle
> > > system with many CPUs is likely to cycle through a lot of the CPUs
> > > during its lifetime and leave behind its mm in the active_mm of
> > > those CPUs. My 2-socket test system have 96 logical CPUs. After
> > > running the test program for a minute or so, it leaves behind its
> > > mm in about half of the CPUs with a mm_count of 45 after exit. So
> > > the dying mm will stay until all those 45 CPUs get new user tasks
> > > to run.
> > OK. On what kernel are you seeing this?
> >
> > On current upstream, the code in native_flush_tlb_others()
> > will send a TLB flush to every CPU in mm_cpumask() if page
> > table pages have been freed.
> >
> > That should cause the lazy TLB CPUs to switch to init_mm
> > when the exit->zap_page_range path gets to the point where
> > it frees page tables.
> >
> I was using the latest upstream 5.3-rc2 kernel. It may be the case
> that the mm has been switched, but the mm_count field of the
> active_mm of the kthread is not being decremented until a user task
> runs on a CPU.

Is that something we could fix from the TLB flushing code?

When switching to init_mm, drop the refcount on the lazy mm?

That way that overhead is not added to the context switching code.

-- 
All Rights Reversed.
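
To make the refcount question concrete, here is a minimal userspace
sketch in C. It is a toy model, not kernel code: struct toy_mm,
toy_simulate_exit() and TOY_NR_CPUS are hypothetical names made up for
illustration. The branch where the flush leaves the lazy reference in
place models the hypothesis from the thread above, not confirmed kernel
behaviour. It assumes one mm_count reference is held on behalf of the
mm's users, plus one more for each CPU that still has the mm as its
lazy active_mm.

```c
#include <assert.h>

#define TOY_NR_CPUS 96	/* the 96 logical CPUs mentioned in the thread */

/* Toy stand-in for struct mm_struct; only the reference count is modelled. */
struct toy_mm {
	int mm_count;
};

/*
 * Simulate a task whose mm is left behind as the lazy active_mm on
 * `lazy_cpus` CPUs and which then exits. Returns the mm_count that
 * remains after exit; 0 means the mm could be freed.
 *
 * drop_on_flush == 0: the flush leaves the lazy reference in place
 *                     (the behaviour hypothesised in the thread).
 * drop_on_flush != 0: the flush moves the reference to init_mm
 *                     (the idea suggested in the reply).
 */
int toy_simulate_exit(int lazy_cpus, int drop_on_flush)
{
	struct toy_mm init_mm = { .mm_count = 1 };
	struct toy_mm mm = { .mm_count = 1 };	/* held for the mm's users */
	struct toy_mm *active_mm[TOY_NR_CPUS] = { 0 };
	int cpu;

	assert(lazy_cpus >= 0 && lazy_cpus <= TOY_NR_CPUS);

	/* The task migrates; each CPU it leaves keeps a lazy reference. */
	for (cpu = 0; cpu < lazy_cpus; cpu++) {
		mm.mm_count++;
		active_mm[cpu] = &mm;
	}

	/* Exit: page tables are freed and a flush reaches every lazy CPU. */
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (active_mm[cpu] != &mm)
			continue;
		if (drop_on_flush) {
			/* Switch to init_mm and drop the lazy mm right away. */
			init_mm.mm_count++;
			active_mm[cpu] = &init_mm;
			mm.mm_count--;
		}
		/* Otherwise the reference stays until a user task runs here. */
	}

	/* The last user is gone, so the users' reference is dropped. */
	mm.mm_count--;

	return mm.mm_count;
}
```

With 45 lazy CPUs, toy_simulate_exit(45, 0) returns 45, so the dying mm
lingers, while toy_simulate_exit(45, 1) returns 0. That second result is
the effect the suggestion above is after, and the extra work lands in
the flush path, not in the context switch path.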