Date: Thu, 26 Apr 2018 17:01:17 +0200
From: Oleg Nesterov
To: Kirill Tkhai
Cc: akpm@linux-foundation.org, peterz@infradead.org, viro@zeniv.linux.org.uk,
	mingo@kernel.org, paulmck@linux.vnet.ibm.com, keescook@chromium.org,
	riel@redhat.com, mhocko@suse.com, tglx@linutronix.de,
	kirill.shutemov@linux.intel.com, marcos.souza.org@gmail.com,
	hoeun.ryu@gmail.com, pasha.tatashin@oracle.com, gs051095@gmail.com,
	ebiederm@xmission.com, dhowells@redhat.com, rppt@linux.vnet.ibm.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] exit: Move read_unlock() up in mm_update_next_owner()
Message-ID: <20180426150116.GA14818@redhat.com>
References: <152473763015.29458.1131542311542381803.stgit@localhost.localdomain>
	<152474043375.29458.13978538538182642678.stgit@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <152474043375.29458.13978538538182642678.stgit@localhost.localdomain>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/26, Kirill Tkhai wrote:
>
> @@ -464,18 +464,15 @@ void mm_update_next_owner(struct mm_struct *mm)
>  	return;
>  
>  assign_new_owner:
> -	BUG_ON(c == p);
>  	get_task_struct(c);
> +	read_unlock(&tasklist_lock);
> +	BUG_ON(c == p);
> +
>  	/*
>  	 * The task_lock protects c->mm from changing.
>  	 * We always want mm->owner->mm == mm
>  	 */
>  	task_lock(c);
> -	/*
> -	 * Delay read_unlock() till we have the task_lock()
> -	 * to ensure that c does not slip away underneath us
> -	 */
> -	read_unlock(&tasklist_lock);

I think this is correct, but...

Firstly, I agree with Michal, it would be nice to kill mm_update_next_owner()
altogether. If this is not possible I agree, it needs cleanups and we can
change it to avoid tasklist (although your 4/4 looks overcomplicated to me
at first glance).

But in this case I think that whatever we do we should start with something
like the patch below. I wrote it 3 years ago but it still applies.

Oleg.

Subject: [PATCH 1/3] memcg: introduce assign_new_owner()

The code under "assign_new_owner" looks very ugly and suboptimal.

We do not really need get_task_struct/put_task_struct(), we can simply
recheck/change mm->owner under tasklist_lock. And we do not want to restart
from the very beginning if ->mm was changed by the time we take task_lock(),
we can simply continue (if we do not drop tasklist_lock).

Just move this code into the new simple helper, assign_new_owner().

Signed-off-by: Oleg Nesterov
---
 kernel/exit.c |   56 ++++++++++++++++++++++++++------------------------------
 1 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 22fcc05..4d446ab 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -293,6 +293,23 @@ kill_orphaned_pgrp(struct task_struct *tsk, struct task_struct *parent)
 }
 
 #ifdef CONFIG_MEMCG
+static bool assign_new_owner(struct mm_struct *mm, struct task_struct *c)
+{
+	bool ret = false;
+
+	if (c->mm != mm)
+		return ret;
+
+	task_lock(c); /* protects c->mm from changing */
+	if (c->mm == mm) {
+		mm->owner = c;
+		ret = true;
+	}
+	task_unlock(c);
+
+	return ret;
+}
+
 /*
  * A task is exiting.   If it owned this mm, find a new owner for the mm.
  */
@@ -300,7 +317,6 @@ void mm_update_next_owner(struct mm_struct *mm)
 {
 	struct task_struct *c, *g, *p = current;
 
-retry:
 	/*
 	 * If the exiting or execing task is not the owner, it's
 	 * someone else's problem.
@@ -322,16 +338,16 @@ retry:
 	 * Search in the children
 	 */
 	list_for_each_entry(c, &p->children, sibling) {
-		if (c->mm == mm)
-			goto assign_new_owner;
+		if (assign_new_owner(mm, c))
+			goto done;
 	}
 
 	/*
 	 * Search in the siblings
 	 */
 	list_for_each_entry(c, &p->real_parent->children, sibling) {
-		if (c->mm == mm)
-			goto assign_new_owner;
+		if (assign_new_owner(mm, c))
+			goto done;
 	}
 
 	/*
@@ -341,42 +357,22 @@ retry:
 		if (g->flags & PF_KTHREAD)
 			continue;
 		for_each_thread(g, c) {
-			if (c->mm == mm)
-				goto assign_new_owner;
+			if (assign_new_owner(mm, c))
+				goto done;
 			if (c->mm)
 				break;
 		}
 	}
-	read_unlock(&tasklist_lock);
+
 	/*
 	 * We found no owner yet mm_users > 1: this implies that we are
 	 * most likely racing with swapoff (try_to_unuse()) or /proc or
 	 * ptrace or page migration (get_task_mm()).  Mark owner as NULL.
 	 */
 	mm->owner = NULL;
-	return;
-
-assign_new_owner:
-	BUG_ON(c == p);
-	get_task_struct(c);
-	/*
-	 * The task_lock protects c->mm from changing.
-	 * We always want mm->owner->mm == mm
-	 */
-	task_lock(c);
-	/*
-	 * Delay read_unlock() till we have the task_lock()
-	 * to ensure that c does not slip away underneath us
-	 */
+done:
 	read_unlock(&tasklist_lock);
-	if (c->mm != mm) {
-		task_unlock(c);
-		put_task_struct(c);
-		goto retry;
-	}
-	mm->owner = c;
-	task_unlock(c);
-	put_task_struct(c);
+	return;
 }
 #endif /* CONFIG_MEMCG */
 
-- 
1.5.5.1
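
For readers following along outside the kernel tree, the check / lock /
recheck shape of assign_new_owner() can be sketched in plain userspace C.
This is only an illustration under stated assumptions: struct task, struct mm,
task_init(), the spinlock-based task_lock()/task_unlock() and
pick_new_owner() are hypothetical stand-ins, not kernel code, and
tasklist_lock is not modelled at all (the candidate array is assumed to stay
stable for the duration of the walk).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's structures. */
struct mm;

struct task {
	atomic_flag lock;	/* plays the role of the per-task lock */
	struct mm *mm;		/* address space this task currently uses */
};

struct mm {
	struct task *owner;
};

static void task_init(struct task *t, struct mm *mm)
{
	atomic_flag_clear(&t->lock);
	t->mm = mm;
}

/* Toy spinlock standing in for task_lock()/task_unlock(). */
static void task_lock(struct task *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;
}

static void task_unlock(struct task *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/*
 * Cheap unlocked check first, then recheck under the lock before
 * committing.  The unlocked read is only a hint; in a truly concurrent
 * userspace program it would have to be an atomic load.
 */
static bool assign_new_owner(struct mm *mm, struct task *c)
{
	bool ret = false;

	assert(mm && c);

	if (c->mm != mm)
		return ret;

	task_lock(c);
	if (c->mm == mm) {
		mm->owner = c;
		ret = true;
	}
	task_unlock(c);

	return ret;
}

/*
 * Walk the candidates once.  A candidate that fails the recheck is simply
 * skipped; there is no restart from the beginning.  If nobody qualifies,
 * the owner is cleared.
 */
static struct task *pick_new_owner(struct mm *mm, struct task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (assign_new_owner(mm, &tasks[i]))
			return mm->owner;
	}
	mm->owner = NULL;
	return NULL;
}
```

A caller would set up each candidate with task_init() and then call
pick_new_owner() on the array. The point of the shape is that a failed
recheck costs one skipped candidate, not a full retry of the search.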