Date: Fri, 12 Dec 2008 15:56:02 +0000 (GMT)
From: Hugh Dickins
To: Oleg Nesterov
cc: Andrew Morton, Balbir Singh, Jay Lan, Jiri Pirko, Jonathan Lim,
    KOSAKI Motohiro, linux-kernel@vger.kernel.org
Subject: Re: [PATCH, RESEND] introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting
In-Reply-To: <20081212140524.GA29488@redhat.com>
References: <20081212140524.GA29488@redhat.com>

On Fri, 12 Dec 2008, Oleg Nesterov wrote:
> (changes: update the changelog/comments)
>
> xacct_add_tsk() relies on do_exit()->update_hiwater_xxx() and uses
> mm->hiwater_xxx directly, this leads to 2 problems:
>
>       - taskstats_user_cmd() can call fill_pid()->xacct_add_tsk()
>         at any moment before the task exits, so we should check the
>         current values of rss/vm anyway.
>
>       - do_exit()->update_hiwater_xxx() calls are racy. An exiting
>         thread can be preempted right before mm->hiwater_xxx = new_val,
>         and another thread can use A_LOT of memory and exit in between.
>         When the first thread resumes it can be the last thread in the
>         thread group, in that case we report the wrong hiwater_xxx
>         values which do not take A_LOT into account.
>
> Introduce get_mm_hiwater_rss() and get_mm_hiwater_vm() helpers and
> change xacct_add_tsk() to use them. The first helper will also be
> used by rusage->ru_maxrss accounting.
>
> Kill do_exit()->update_hiwater_xxx() calls. Unless we are going to
> decrease rss/vm there is no point to update mm->hiwater_xxx, and
> nobody can look at this mm_struct when exit_mmap() actually unmaps
> the memory.
>
> Signed-off-by: Oleg Nesterov

Acked-by: Hugh Dickins

>
> --- K-28/include/linux/sched.h~HIWATER 2008-12-02 17:12:40.000000000 +0100
> +++ K-28/include/linux/sched.h 2008-12-03 18:17:18.000000000 +0100
> @@ -388,6 +388,9 @@ extern void arch_unmap_area_topdown(stru
>                  (mm)->hiwater_vm = (mm)->total_vm;      \
>  } while (0)
>
> +#define get_mm_hiwater_rss(mm) max((mm)->hiwater_rss, get_mm_rss(mm))
> +#define get_mm_hiwater_vm(mm)  max((mm)->hiwater_vm, (mm)->total_vm)
> +
>  extern void set_dumpable(struct mm_struct *mm, int value);
>  extern int get_dumpable(struct mm_struct *mm);
>
> --- K-28/kernel/tsacct.c~HIWATER 2008-10-10 00:13:53.000000000 +0200
> +++ K-28/kernel/tsacct.c 2008-12-03 18:24:28.000000000 +0100
> @@ -90,8 +90,8 @@ void xacct_add_tsk(struct taskstats *sta
>          mm = get_task_mm(p);
>          if (mm) {
>                  /* adjust to KB unit */
> -                stats->hiwater_rss = mm->hiwater_rss * PAGE_SIZE / KB;
> -                stats->hiwater_vm = mm->hiwater_vm * PAGE_SIZE / KB;
> +                stats->hiwater_rss = get_mm_hiwater_rss(mm) * PAGE_SIZE / KB;
> +                stats->hiwater_vm = get_mm_hiwater_vm(mm) * PAGE_SIZE / KB;
>                  mmput(mm);
>          }
>          stats->read_char = p->ioac.rchar;
> --- K-28/kernel/exit.c~HIWATER 2008-12-02 17:12:40.000000000 +0100
> +++ K-28/kernel/exit.c 2008-12-03 18:21:06.000000000 +0100
> @@ -1048,10 +1048,7 @@ NORET_TYPE void do_exit(long code)
>                                  preempt_count());
>
>          acct_update_integrals(tsk);
> -        if (tsk->mm) {
> -                update_hiwater_rss(tsk->mm);
> -                update_hiwater_vm(tsk->mm);
> -        }
> +
>          group_dead = atomic_dec_and_test(&tsk->signal->live);
>          if (group_dead) {
>                  hrtimer_cancel(&tsk->signal->real_timer);
> --- K-28/mm/mmap.c~HIWATER 2008-12-02 17:12:40.000000000 +0100
> +++ K-28/mm/mmap.c 2008-12-11 09:13:07.000000000 +0100
> @@ -2103,7 +2103,7 @@ void exit_mmap(struct mm_struct *mm)
>          lru_add_drain();
>          flush_cache_mm(mm);
>          tlb = tlb_gather_mmu(mm, 1);
> -        /* Don't update_hiwater_rss(mm) here, do_exit already did */
> +        /* update_hiwater_rss(mm) here? but nobody should be looking */
>          /* Use -1 here to ensure all VMAs in the mm are unmapped */
>          end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
>          vm_unacct_memory(nr_accounted);
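
For readers following along outside the kernel tree, here is a rough userspace sketch of the idea behind the new helpers: the stored high-water mark is only refreshed lazily, so a reader should take the maximum of the stored mark and the current value. Everything named toy_* or TOY_* is hypothetical and exists only for this illustration (including the 4096-byte page size); it is not the kernel's mm_struct or its macros.

```c
#include <assert.h>

/* Assumed values for illustration only; the kernel's PAGE_SIZE varies by arch. */
#define TOY_PAGE_SIZE 4096UL
#define TOY_KB        1024UL

/* Hypothetical stand-in for the few mm_struct fields the patch touches. */
struct toy_mm {
        unsigned long hiwater_rss; /* stored peak RSS in pages, updated lazily */
        unsigned long rss;         /* current RSS in pages */
        unsigned long hiwater_vm;  /* stored peak VM size in pages, updated lazily */
        unsigned long total_vm;    /* current VM size in pages */
};

static inline unsigned long toy_max(unsigned long a, unsigned long b)
{
        return a > b ? a : b;
}

/* Same shape as get_mm_hiwater_rss(): max of stored peak and current value. */
static inline unsigned long toy_hiwater_rss(const struct toy_mm *mm)
{
        return toy_max(mm->hiwater_rss, mm->rss);
}

/* Same shape as get_mm_hiwater_vm(). */
static inline unsigned long toy_hiwater_vm(const struct toy_mm *mm)
{
        return toy_max(mm->hiwater_vm, mm->total_vm);
}

/* Convert a page count to KB, as xacct_add_tsk() does before reporting. */
static inline unsigned long toy_pages_to_kb(unsigned long pages)
{
        return pages * TOY_PAGE_SIZE / TOY_KB;
}

/*
 * Replays the first problem from the changelog: usage grows after the
 * last lazy update, and a reader samples before anything refreshes the
 * stored mark. Returns the RSS peak in KB that the reader would report.
 */
unsigned long toy_demo_report_kb(void)
{
        struct toy_mm mm = {
                .hiwater_rss = 100, .rss = 100,
                .hiwater_vm = 200, .total_vm = 200,
        };

        mm.rss = 5000; /* a thread uses a lot of memory; hiwater_rss is stale */

        /* Reading mm.hiwater_rss directly would still see 100 pages here. */
        assert(mm.hiwater_rss == 100);

        return toy_pages_to_kb(toy_hiwater_rss(&mm));
}
```

The point of reading through a helper instead of the raw field is that the reader no longer depends on someone else having refreshed the stored mark at the right moment, which is why the do_exit() updates can go away.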