Date: Wed, 27 Apr 2022 19:55:22 +0800
From: Rongwei Wang
To: Barry Song <21cnbao@gmail.com>
Cc: sj@kernel.org, Andrew Morton, Linux-MM, LKML, Matthew Wilcox,
    shuah@kernel.org, brendanhiggins@google.com, foersleo@amazon.de,
    sieberf@amazon.com, Shakeel Butt, sjpark@amazon.de, tuhailong@gmail.com,
    Song Jiang, 张诗明(Simon Zhang), 李培锋(wink), xhao@linux.alibaba.com
Subject: Re: DAMON VA regions don't split on an large Android APP
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/27/22 5:22 PM, Barry Song wrote:
> On Wed, Apr 27, 2022 at 7:44 PM Barry Song <21cnbao@gmail.com> wrote:
>>
>> On Wed, Apr 27, 2022 at 6:56 PM Rongwei Wang wrote:
>>>
>>>
>>>
>>> On 4/27/22 7:19 AM, Barry Song wrote:
>>>> Hi SeongJae & Andrew,
>>>> (also Cc-ed main damon developers)
>>>> On an Android phone, I tried to use the DAMON vaddr monitor and found
>>>> that vaddr regions don't split well on large Android Apps though
>>>> everything works well on native Apps.
>>>>
>>>> I have tried the below two cases on an Android phone with 12GB memory
>>>> and a Snapdragon 888 CPU.
>>>> 1. a native program with a small memory working set, as below:
>>>> #include <stdlib.h>
>>>> #include <string.h>
>>>> #include <unistd.h>
>>>>
>>>> #define size (1024*1024*100)    /* 100 MiB buffer */
>>>>
>>>> int main(void)
>>>> {
>>>>         volatile int *p = malloc(size);
>>>>
>>>>         /* touch every page once so the whole buffer is resident */
>>>>         memset((void *)p, 0x55, size);
>>>>
>>>>         while (1) {
>>>>                 int i;
>>>>
>>>>                 /* read the whole buffer (size / 4 ints) */
>>>>                 for (i = 0; i < size / 4; i++)
>>>>                         (void)*(p + i);
>>>>                 usleep(1000);
>>>>
>>>>                 /* read the first quarter again (size / 16 ints) */
>>>>                 for (i = 0; i < size / 16; i++)
>>>>                         (void)*(p + i);
>>>>                 usleep(1000);
>>>>         }
>>>> }
>>>> For this application, the DAMON vaddr monitor works very well.
>>>> I have modified monitor.py in the damo userspace tool a little bit to
>>>> show the raw data obtained from the kernel.
>>>> Regions split decently for this kind of application; typical raw
>>>> data is as below:
>>>>
>>>> monitoring_start: 2.224 s
>>>> monitoring_end: 2.329 s
>>>> monitoring_duration: 104.336 ms
>>>> target_id: 0
>>>> nr_regions: 24
>>>> 005fb37b2000-005fb734a000(  59.594 MiB): 0
>>>> 005fb734a000-005fbaf95000(  60.293 MiB): 0
>>>> 005fbaf95000-005fbec0b000(  60.461 MiB): 0
>>>> 005fbec0b000-005fc2910000(  61.020 MiB): 0
>>>> 005fc2910000-005fc6769000(  62.348 MiB): 0
>>>> 005fc6769000-005fca33f000(  59.836 MiB): 0
>>>> 005fca33f000-005fcdc8b000(  57.297 MiB): 0
>>>> 005fcdc8b000-005fd115a000(  52.809 MiB): 0
>>>> 005fd115a000-005fd45bd000(  52.387 MiB): 0
>>>> 007661c59000-007661ee4000(   2.543 MiB): 2
>>>> 007661ee4000-0076623e4000(   5.000 MiB): 3
>>>> 0076623e4000-007662837000(   4.324 MiB): 2
>>>> 007662837000-0076630f1000(   8.727 MiB): 3
>>>> 0076630f1000-007663494000(   3.637 MiB): 2
>>>> 007663494000-007663753000(   2.746 MiB): 1
>>>> 007663753000-007664251000(  10.992 MiB): 3
>>>> 007664251000-0076666fd000(  36.672 MiB): 2
>>>> 0076666fd000-007666e73000(   7.461 MiB): 1
>>>> 007666e73000-007667c89000(  14.086 MiB): 2
>>>> 007667c89000-007667f97000(   3.055 MiB): 0
>>>> 007667f97000-007668112000(   1.480 MiB): 1
>>>> 007668112000-00766820f000(1012.000 KiB): 0
>>>> 007ff27b7000-007ff27d6000( 124.000 KiB): 0
>>>> 007ff27d6000-007ff27d8000(   8.000 KiB): 8
>>>>
>>>> 2. a large Android app like Asphalt 9
>>>> For this case, regions basically can't split very well, but the
>>>> monitor works on small VMAs:
>>>>
>>>> monitoring_start: 2.220 s
>>>> monitoring_end: 2.318 s
>>>> monitoring_duration: 98.576 ms
>>>> target_id: 0
>>>> nr_regions: 15
>>>> 000012c00000-0001c301e000(   6.754 GiB): 0
>>>> 0001c301e000-000371b6c000(   6.730 GiB): 0
>>>> 000371b6c000-000400000000(   2.223 GiB): 0
>>>> 005c6759d000-005c675a2000(  20.000 KiB): 0
>>>> 005c675a2000-005c675a3000(   4.000 KiB): 3
>>>> 005c675a3000-005c675a7000(  16.000 KiB): 0
>>>> 0072f1e14000-0074928d4000(   6.510 GiB): 0
>>>> 0074928d4000-00763c71f000(   6.655 GiB): 0
>>>> 00763c71f000-0077e863e000(   6.687 GiB): 0
>>>> 0077e863e000-00798e214000(   6.590 GiB): 0
>>>> 00798e214000-007b0e48a000(   6.002 GiB): 0
>>>> 007b0e48a000-007c62f00000(   5.323 GiB): 0
>>>> 007c62f00000-007defb19000(   6.199 GiB): 0
>>>> 007defb19000-007f794ef000(   6.150 GiB): 0
>>>> 007f794ef000-007fe8f53000(   1.745 GiB): 0
>>>>
>>>> As you can see, we have some regions which are very, very big and
>>>> they are losing the chance to be split. But DAMON can still monitor
>>>> memory access for the small VMA areas very well, like:
>>>> 005c675a2000-005c675a3000(   4.000 KiB): 3
>>>
>>> Hi, Barry
>>>
>>> Actually, we also found the same problem in redis with our own
>>> tool[1]. DAMON cannot split the large anon VMA well, and the anon
>>> VMA has 10G~20G of memory. I guess the whole region doesn't have
>>> enough hot areas to be monitored or found by DAMON; for example, the
>>> one or more addresses chosen by DAMON may not be accessed during the
>>> sample period.
>>
>> Hi Rongwei,
>> Thanks for your comments and thanks for sharing your tools.
>>
>> I guess the cause might be:
>> in case a region is very big like 10GiB, we have only 1MiB hot pages
>> in this large region.
>> DAMON will randomly pick one page to sample, but that page has only a
>> 1MiB/10GiB, thus less than 1/10000, chance to hit the hot 1MiB. So we
>> probably need about 10000 sample periods to hit the hot 1MiB in order
>> to split this large region?
>>
>> @SeongJae, please correct me if I am wrong.
>>
>>>
>>> I'm not sure whether setting init_regions can deal with the above
>>> problem, or dynamically choosing one or a limited number of VMAs to
>>> monitor.
>>>
>>
>> I won't set a limited number of VMAs, as this will make DAMON too hard
>> to use; nobody wants to perform such complex operations, especially as
>> an Android app might have more than 8000 VMAs.
>>
>> I agree init_regions might be the right place to enhance the situation.
>>
>>> I'm not sure, just sharing my idea.
>>>
>>> [1] https://github.com/aliyun/data-profile-tools.git
>>
>> I suppose this tool is based on damon? How did you finally resolve the
>> problem that large anon VMAs can't be split?
>> Anyway, I will give your tool a try.
>
> Unfortunately, data-profile-tools.git doesn't build on aarch64 ubuntu
> though autogen.sh runs successfully.
>
> /usr/bin/ld: ./.libs/libdatop.a(disp.o): in function `cons_handler':
> /root/data-profile-tools/src/disp.c:625: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/disp.c:625: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/disp.c:625: undefined reference to `wgetch'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_win_create':
> /root/data-profile-tools/src/reg.c:108: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:108: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:108: undefined reference to `subwin'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_erase':
> /root/data-profile-tools/src/reg.c:161: undefined reference to `werase'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_refresh':
> /root/data-profile-tools/src/reg.c:171: undefined reference to `wrefresh'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_refresh_nout':
> /root/data-profile-tools/src/reg.c:182: undefined reference to `wnoutrefresh'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_update_all':
> /root/data-profile-tools/src/reg.c:191: undefined reference to `doupdate'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_win_destroy':
> /root/data-profile-tools/src/reg.c:200: undefined reference to `delwin'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_line_write':
> /root/data-profile-tools/src/reg.c:226: undefined reference to `mvwprintw'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:230: undefined reference to `wattr_off'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:217: undefined reference to `wattr_on'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_highlight_write':
> /root/data-profile-tools/src/reg.c:245: undefined reference to `wattr_on'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:255: undefined reference to `wattr_off'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:252: undefined reference to `mvwprintw'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:255: undefined reference to `wattr_off'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_curses_fini':
> /root/data-profile-tools/src/reg.c:367: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:367: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:367: undefined reference to `wclear'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:368: undefined reference to `wrefresh'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:369: undefined reference to `endwin'
> /usr/bin/ld: ./.libs/libdatop.a(reg.o): in function `reg_curses_init':
> /root/data-profile-tools/src/reg.c:382: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:381: undefined reference to `initscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:382: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:382: undefined reference to `wrefresh'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:383: undefined reference to `use_default_colors'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:384: undefined reference to `start_color'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:385: undefined reference to `keypad'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:386: undefined reference to `nonl'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:387: undefined reference to `cbreak'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:388: undefined reference to `noecho'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:389: undefined reference to `curs_set'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:401: undefined reference to `stdscr'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:401: undefined reference to `mvwprintw'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:403: undefined reference to `mvwprintw'
> /usr/bin/ld: /root/data-profile-tools/src/reg.c:405: undefined reference to `wrefresh'
> collect2: error: ld returned 1 exit status
> make[1]: *** [Makefile:592: datop] Error 1
> make[1]: Leaving directory '/root/data-profile-tools'
> make: *** [Makefile:438: all] Error 2

Hi, Barry

Thank you for this bug report. The tool does not support Ubuntu yet; so
far we only support CentOS and AnolisOS. I am working on a fix.

All the undefined symbols you reported appear to belong to the curses
library. I am not familiar with Ubuntu, but it looks like these errors
can be fixed by installing the relevant curses library. Anyway, I will
try to fix it next.

Thanks.

>
>>
>>>>
>>>> A typical characteristic of a large Android app is that it has
>>>> thousands of VMAs and a very large virtual address space:
>>>> ~/damo # pmap 2550 | wc -l
>>>> 8522
>>>>
>>>> ~/damo # pmap 2550
>>>> ...
>>>> 0000007992bbe000      4K r----   [ anon ]
>>>> 0000007992bbf000     24K rw---   [ anon ]
>>>> 0000007fe8753000      4K -----   [ anon ]
>>>> 0000007fe8754000   8188K rw---   [ stack ]
>>>>  total         36742112K
>>>>
>>>> Because the whole vma list is too long, I have put the list here for
>>>> you to download:
>>>> wget http://www.linuxep.com/patches/android-app-vmas
>>>>
>>>> I can reproduce this problem on other Apps like youtube as well.
>>>> I suppose we need to boost the algorithm of splitting regions for this
>>>> kind of application.
>>>> Any thoughts?
>>>>
>>
>> Thanks
>> Barry