Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp1614915ybh; Fri, 13 Mar 2020 04:28:27 -0700 (PDT) X-Google-Smtp-Source: ADFU+vt8iH8/KqsOflHrhCanVT0bGE9FMeNA23PCPS+CNH6PE9bFMkNj/eVsLGsah2FyX2RypjTO X-Received: by 2002:a54:4181:: with SMTP id 1mr6273499oiy.61.1584098907611; Fri, 13 Mar 2020 04:28:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584098907; cv=none; d=google.com; s=arc-20160816; b=sSasTAwn0PSKP6yWaF69nvZzu3OQdiiCAtB5C4/vq0kKPZleUnezIUaIsCIp4791vy FcWEbecZ90kKwz1I4HTUYNUbczGlmtvnhbZXgg4Wy1nC8e0nthWLHxs5MmwFSPgp/X5J vmDOhfVFG5SKP52TvIjTv9pfwAw2G3173/ujJfXvRfBxO89ARLgc86UqTg0MJniuhy2o 1ruxunuxUtdWbqu8KB34xbx1UGJIXKlYsgdj5b+bBFoZy/q0gpDJJAXju1W4MHs0V95B Eyh7b2fpwNCYVvuSkGJZsGaf3hzQ3a/Pyq72nQoIpU30brkeL85EMXxAYldK7pJ/7k/o O7GQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:references:cc:to:subject:from; bh=HyaVjkj4HilWoszM9CFIGMTCgI7g7iTFY9VtXhdKvyM=; b=VMT6B1BhvnHhRWFRtOlFJRPqBb2w1WA6b8CEAu9IqfTyAcX8gmNBdki9yrOJbKGI/w tOMtPeM0sVjqupL1M3YRxK+H7d6MuZJem8QDbHqWLshz52uzYjagf+7iu168VQqngQRJ j74jOk5wj0yLdz73c+yRzbAe1I7HeqgXxNioQkK1hD9rzrC9qbtD150kprp4Xwi2f6HJ ekJLmQhDVckSGtcbvplno68gG48tCihWRVsq4GFbn15ga8UneLq1LDZR7AHmzS8LOgg1 uTqLxZ9zATd2qrIqs1xsU1pr5j1XwM3k0/wRzVhI3+7iD80939Ckn9UeRYCofks4++L2 GcWA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x16si4128970otp.180.2020.03.13.04.28.14; Fri, 13 Mar 2020 04:28:27 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726651AbgCML0C (ORCPT + 99 others); Fri, 13 Mar 2020 07:26:02 -0400 Received: from mx2.suse.de ([195.135.220.15]:41830 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726423AbgCML0C (ORCPT ); Fri, 13 Mar 2020 07:26:02 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 9FFD9AAC7; Fri, 13 Mar 2020 11:25:58 +0000 (UTC) From: Vlastimil Babka Subject: Re: [PATCH v2 2/2] mm: mmap: add trace point of vm_unmapped_area To: Jaewon Kim , willy@infradead.org, walken@google.com, bp@suse.de, akpm@linux-foundation.org Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, jaewon31.kim@gmail.com, Steven Rostedt References: <20200313011420.15995-1-jaewon31.kim@samsung.com> <20200313011420.15995-3-jaewon31.kim@samsung.com> Message-ID: <66137719-aa4d-ac40-f167-3bcb38f092db@suse.cz> Date: Fri, 13 Mar 2020 12:25:57 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0 MIME-Version: 1.0 In-Reply-To: <20200313011420.15995-3-jaewon31.kim@samsung.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3/13/20 2:14 AM, Jaewon Kim wrote: > Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. > Virtual memory space shortage of a task on mmap is reported to userspace > as -ENOMEM. It can be confused as physical memory shortage of overall > system. > > The vm_unmapped_area can be called to by some drivers or other kernel > core system like filesystem. In my platform, GPU driver calls to > vm_unmapped_area and the driver returns -ENOMEM even in GPU side > shortage. It can be hard to distinguish which code layer returns the > -ENOMEM. > > Create mmap trace file and add trace point of vm_unmapped_area. > > i.e.) > 265.214275: vm_unmapped_area: addr=fffffffffffffff4 err=-12 tvm=0xffb08 flags=0x1 len=0x100000 lo=0x8000 hi=0xf7182000 mask=0x0 ofs=0x22 > 265.214276: vm_unmapped_area: addr=fffffffffffffff4 err=-12 tvm=0xffb08 flags=0x0 len=0x100000 lo=0x40000000 hi=0xfffff000 mask=0x0 ofs=0x22 I think it's nicer to print addr as 0 if it was in fact an error. Also "tvm" is cryptic, I would just use total_vm? > 510.827472: vm_unmapped_area: addr=7377060000 err=0 tvm=0x262f flags=0x1 len=0x7f000 lo=0x8000 hi=0x73795f7000 mask=0x0 ofs=0x22 > 510.827668: vm_unmapped_area: addr=73794d1000 err=0 tvm=0x25cc flags=0x1 len=0x1000 lo=0x8000 hi=0x73795f7000 mask=0x0 ofs=0x22 > > Signed-off-by: Jaewon Kim > --- > include/trace/events/mmap.h | 52 +++++++++++++++++++++++++++++++++++++++++++++ > mm/mmap.c | 12 +++++++++-- > 2 files changed, 62 insertions(+), 2 deletions(-) > create mode 100644 include/trace/events/mmap.h > > diff --git a/include/trace/events/mmap.h b/include/trace/events/mmap.h > new file mode 100644 > index 000000000000..a51dd8af6f8c > --- /dev/null > +++ b/include/trace/events/mmap.h > @@ -0,0 +1,52 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#undef TRACE_SYSTEM > +#define TRACE_SYSTEM mmap > + > +#if !defined(_TRACE_MMAP_H) || defined(TRACE_HEADER_MULTI_READ) > +#define _TRACE_MMAP_H > + > +#include > + > +TRACE_EVENT(vm_unmapped_area, > + > + TP_PROTO(unsigned long addr, struct vm_unmapped_area_info *info), > + > + TP_ARGS(addr, info), > + > + TP_STRUCT__entry( > + __field(unsigned long, addr) > + __field(unsigned long, error) That seems wasteful to store both, coming from the same value. > + __field(unsigned long, total_vm) > + __field(unsigned long, flags) > + __field(unsigned long, length) > + __field(unsigned long, low_limit) > + __field(unsigned long, high_limit) > + __field(unsigned long, align_mask) > + __field(unsigned long, align_offset) > + ), > + > + TP_fast_assign( > + __entry->addr = addr; > + if (IS_ERR_VALUE(addr)) > + __entry->error = addr; > + else > + __entry->error = 0; fast_assign should be also as fast as possible, i.e. avoid these decision, just store addr... > + __entry->total_vm = current->mm->total_vm; > + __entry->flags = info->flags; > + __entry->length = info->length; > + __entry->low_limit = info->low_limit; > + __entry->high_limit = info->high_limit; > + __entry->align_mask = info->align_mask; > + __entry->align_offset = info->align_offset; > + ), > + > + TP_printk("addr=%lx err=%ld tvm=0x%lx flags=0x%lx len=0x%lx lo=0x%lx hi=0x%lx mask=0x%lx ofs=0x%lx\n", > + __entry->addr, __entry->error, __entry->total_vm, ... and only here do the decisions if you print addr as zero and err as non-zero, or vice versa. Something like the patch at the end of the e-mail. Except then /sys/kernel/debug/tracing/events/mmap/vm_unmapped_area/format contains: print fmt: "addr=%lx err=%ld tvm=0x%lx flags=0x%lx len=0x%lx lo=0x%lx hi=0x%lx mask=0x%lx ofs=0x%lx ", __builtin_expect(!!((unsigned long)(void *)(REC->addr) >= (unsigned long)-4095), 0) ? 0 : REC->addr, __builtin_expect(!!((unsigned long)(void *)(REC->addr) >= (unsigned long)-4095), 0) ? REC->addr : 0, REC->total_vm, REC->flags, REC->length, REC->low_limit, REC->high_limit, REC->align_mask, REC->align_offset and trace-cmd doesn't like it: sleep-381 [001] 93.434157: vm_unmapped_area: [FAILED TO PARSE] addr=0x7f7f32c47000 total_vm=0x2a flags=0x1 length=0x2d000 low_limit=0x1000 high_limit=0x7f7f32c74000 align_mask=0x0 align_offset=0x0 CC Steven, any idea? Does it not like the __builtin_epect, i.e. unlikely() that's part of IS_ERR_VALUE? Thanks, Vlastimil > + __entry->flags, __entry->length, __entry->low_limit, > + __entry->high_limit, __entry->align_mask, > + __entry->align_offset) > +); > +#endif > + > +/* This part must be outside protection */ > +#include > diff --git a/mm/mmap.c b/mm/mmap.c > index eeaddb76286c..68ccbd7d5256 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -53,6 +53,9 @@ > #include > #include > > +#define CREATE_TRACE_POINTS > +#include > + > #include "internal.h" > > #ifndef arch_mmap_check > @@ -2061,10 +2064,15 @@ unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info) > */ > unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info) > { > + unsigned long addr; > + > if (info->flags & VM_UNMAPPED_AREA_TOPDOWN) > - return unmapped_area_topdown(info); > + addr = unmapped_area_topdown(info); > else > - return unmapped_area(info); > + addr = unmapped_area(info); > + > + trace_vm_unmapped_area(addr, info); > + return addr; > } > > #ifndef arch_get_mmap_end > ------8<------ diff --git a/include/trace/events/mmap.h b/include/trace/events/mmap.h index a51dd8af6f8c..57cccf731721 100644 --- a/include/trace/events/mmap.h +++ b/include/trace/events/mmap.h @@ -15,7 +15,6 @@ TRACE_EVENT(vm_unmapped_area, TP_STRUCT__entry( __field(unsigned long, addr) - __field(unsigned long, error) __field(unsigned long, total_vm) __field(unsigned long, flags) __field(unsigned long, length) @@ -27,10 +26,6 @@ TRACE_EVENT(vm_unmapped_area, TP_fast_assign( __entry->addr = addr; - if (IS_ERR_VALUE(addr)) - __entry->error = addr; - else - __entry->error = 0; __entry->total_vm = current->mm->total_vm; __entry->flags = info->flags; __entry->length = info->length; @@ -41,8 +36,8 @@ TRACE_EVENT(vm_unmapped_area, ), TP_printk("addr=%lx err=%ld tvm=0x%lx flags=0x%lx len=0x%lx lo=0x%lx hi=0x%lx mask=0x%lx ofs=0x%lx\n", - __entry->addr, __entry->error, __entry->total_vm, - __entry->flags, __entry->length, __entry->low_limit, + IS_ERR_VALUE(__entry->addr) ? 0 : __entry->addr, IS_ERR_VALUE(__entry->addr) ? __entry->addr : 0, + __entry->total_vm, __entry->flags, __entry->length, __entry->low_limit, __entry->high_limit, __entry->align_mask, __entry->align_offset) );