[Please do not use HTML emails to the mailing list and try to fix your
email client to not break quoting. Fixed for this email]
On Mon 18-03-19 13:28:50, Pankaj Suryawanshi wrote:
> On Mon 18-03-19 12:58:28, Pankaj Suryawanshi wrote:
> > > Hello,
> > >
> > > I am facing an issue of high latency in CMA allocation of large buffers.
> > >
> > > I am frequently allocating/deallocating CMA memory, and the allocation latency is very high.
> > >
> > > Below are the stats for the allocation latency issue.
> > >
> > > (390100 kB), latency 29997 us
> > > (390100 kB), latency 22957 us
> > > (390100 kB), latency 25735 us
> > > (390100 kB), latency 12736 us
> > > (390100 kB), latency 26009 us
> > > (390100 kB), latency 18058 us
> > > (390100 kB), latency 27997 us
> > > (16 kB), latency 560 us
> > > (256 kB), latency 280 us
> > > (4 kB), latency 311 us
> > >
> > > I am using kernel 4.14.65 with Android Pie (9.0).
> > >
> > > Is there any workaround or solution for this (cma_alloc latency) issue?
> >
> > Do you have any more detailed information on where the time is spent?
> > E.g. migration tracepoints?
> >
> > Hello Michal,
>
> I have a system (vanilla kernel) with 2GB of RAM, 1GB reserved for CMA. No swap or zram.
> Sorry, I don't have information on where the time is spent.
> The time is measured across the cma_alloc() call.
> I only have cma_alloc trace information/function graph.
Then please collect that data because it is really hard to judge
anything from the numbers you have provided.
--
Michal Hocko
SUSE Labs
________________________________________
From: Michal Hocko <[email protected]>
Sent: 18 March 2019 19:12
To: Pankaj Suryawanshi
Cc: [email protected]; [email protected]; [email protected]; Kirill Tkhai
Subject: Re: [External] Re: mm/cma.c: High latency for cma allocation
Okay.
> Then please collect that data because it is really hard to judge
> anything from the numbers you have provided.
Any pointers from which I can get these details?
On Mon 18-03-19 14:02:09, Pankaj Suryawanshi wrote:
>> > I have a system (vanilla kernel) with 2GB of RAM, 1GB reserved for CMA. No swap or zram.
>> > Sorry, I don't have information on where the time is spent.
>> > The time is measured across the cma_alloc() call.
>> > I only have cma_alloc trace information/function graph.
>
>> Then please collect that data because it is really hard to judge
>> anything from the numbers you have provided.
>
> Any pointers from which I can get these details?
I would start by enabling the built-in tracepoints for migration, or use
system-wide perf monitoring with call-graph data.
--
Michal Hocko
SUSE Labs
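[For context, the built-in tracepoints and perf workflow referred to above can be enabled roughly as follows. This is a sketch, not verbatim from the thread; it assumes root, tracefs mounted at /sys/kernel/debug/tracing, and a kernel with the migrate/cma trace events compiled in.]

```shell
# Enable the page-migration and CMA tracepoints via tracefs
# (on some systems tracefs is mounted at /sys/kernel/tracing instead).
cd /sys/kernel/debug/tracing
echo 1 > events/migrate/mm_migrate_pages/enable
echo 1 > events/cma/cma_alloc/enable
echo 1 > tracing_on
# ... trigger the CMA allocation, then inspect:
cat trace

# Or: system-wide perf sampling with call graphs while the
# allocation is running, then look at the time distribution.
perf record -a -g -- sleep 10
perf report
```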
________________________________________
From: Michal Hocko <[email protected]>
Sent: 18 March 2019 19:59
To: Pankaj Suryawanshi
Cc: [email protected]; [email protected]; [email protected]; Kirill Tkhai
Subject: Re: [External] Re: mm/cma.c: High latency for cma allocation
> I would start by enabling the built-in tracepoints for migration, or use
> system-wide perf monitoring with call-graph data.
The calling sequence is as below:
cma_alloc() -->
alloc_contig_range() -->
start_isolate_page_range() -->
__alloc_contig_migrate_range() -->
isolate_migratepages_range() -->
reclaim_clean_pages_from_list() -->
shrink_page_list()
There are no built-in tracepoints except cma_alloc.
How can I find out where it is taking time?
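[One way to get per-function timings without adding new tracepoints is the function_graph tracer. A sketch, assuming tracefs is available and the tracer is compiled in; the filter below uses cma_alloc() from the call chain above:]

```shell
cd /sys/kernel/debug/tracing
echo function_graph > current_tracer
# Trace only cma_alloc() and everything it calls; the tracer prints
# the duration of each function in the call graph.
echo cma_alloc > set_graph_function
echo 1 > tracing_on
# ... trigger the allocation, then:
head -50 trace
echo 0 > tracing_on
```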
________________________________________
From: Pankaj Suryawanshi
Sent: 19 March 2019 15:59
To: Michal Hocko
Cc: [email protected]; [email protected]; [email protected]; Kirill Tkhai
Subject: Re: [External] Re: mm/cma.c: High latency for cma allocation
I have measured the latency breakdown for a 385MB allocation:
reclaim       - reclaim_clean_pages_from_list()
migrate       - migrate_pages()
migrateranges - isolate_migratepages_range()
overall       - __alloc_contig_migrate_range()
Note: output is in us.
[ 1151.420923] LATENCY reclaim= 43 migrate=128 migrateranges=23
[ 1151.421209] LATENCY reclaim= 11 migrate=253 migrateranges=14
[ 1151.427856] LATENCY reclaim= 45 migrate=12 migrateranges=12
[ 1151.434485] LATENCY reclaim= 44 migrate=33 migrateranges=12
[ 1151.440975] LATENCY reclaim= 45 migrate=0 migrateranges=11
[ 1151.447513] LATENCY reclaim= 39 migrate=35 migrateranges=11
[ 1151.453919] LATENCY reclaim= 46 migrate=0 migrateranges=12
[ 1151.460474] LATENCY reclaim= 39 migrate=41 migrateranges=11
[ 1151.466947] LATENCY reclaim= 54 migrate=32 migrateranges=17
[ 1151.473464] LATENCY reclaim= 45 migrate=21 migrateranges=12
[ 1151.480016] LATENCY reclaim= 41 migrate=39 migrateranges=12
[ 1151.486551] LATENCY reclaim= 41 migrate=36 migrateranges=12
[ 1151.493199] LATENCY reclaim= 13 migrate=188 migrateranges=12
[ 1151.500034] LATENCY reclaim= 60 migrate=94 migrateranges=13
[ 1151.506686] LATENCY reclaim= 78 migrate=9 migrateranges=12
[ 1151.513313] LATENCY reclaim= 33 migrate=147 migrateranges=12
[ 1151.519839] LATENCY reclaim= 52 migrate=98 migrateranges=12
[ 1151.526556] LATENCY reclaim= 46 migrate=126 migrateranges=12
[ 1151.533254] LATENCY reclaim= 22 migrate=230 migrateranges=12
[ 1151.540145] LATENCY reclaim= 0 migrate=305 migrateranges=13
[ 1151.546997] LATENCY reclaim= 1 migrate=301 migrateranges=13
[ 1151.553686] LATENCY reclaim= 40 migrate=201 migrateranges=12
[ 1151.560395] LATENCY reclaim= 35 migrate=149 migrateranges=12
[ 1151.567076] LATENCY reclaim= 77 migrate=43 migrateranges=16
[ 1151.573836] LATENCY reclaim= 34 migrate=190 migrateranges=12
[ 1151.580510] LATENCY reclaim= 51 migrate=120 migrateranges=12
[ 1151.587240] LATENCY reclaim= 33 migrate=147 migrateranges=13
[ 1151.594036] LATENCY reclaim= 20 migrate=241 migrateranges=13
[ 1151.600749] LATENCY reclaim= 75 migrate=41 migrateranges=13
[ 1151.607402] LATENCY reclaim= 77 migrate=32 migrateranges=12
[ 1151.613956] LATENCY reclaim= 72 migrate=35 migrateranges=12
[ 1151.620642] LATENCY reclaim= 59 migrate=162 migrateranges=12
[ 1151.627181] LATENCY reclaim= 76 migrate=9 migrateranges=11
[ 1151.633795] LATENCY reclaim= 80 migrate=0 migrateranges=12
[ 1151.640278] LATENCY reclaim= 87 migrate=18 migrateranges=12
[ 1151.646758] LATENCY reclaim= 82 migrate=10 migrateranges=11
[ 1151.653307] LATENCY reclaim= 71 migrate=31 migrateranges=12
[ 1151.659911] LATENCY reclaim= 61 migrate=77 migrateranges=12
[ 1151.666514] LATENCY reclaim= 94 migrate=42 migrateranges=15
[ 1151.673089] LATENCY reclaim= 67 migrate=59 migrateranges=12
[ 1151.679655] LATENCY reclaim= 81 migrate=14 migrateranges=12
[ 1151.686253] LATENCY reclaim= 49 migrate=93 migrateranges=12
[ 1151.692815] LATENCY reclaim= 61 migrate=54 migrateranges=12
[ 1151.699438] LATENCY reclaim= 42 migrate=99 migrateranges=10
[ 1151.705881] OVERALL overall=285157
cma_alloc latency = 297385 us
Please let me know if there is any workaround/solution to reduce cma_alloc latency for large buffers.
Can you please fix your email client? The broken quoting is just
irritating.
On Tue 19-03-19 11:45:03, Pankaj Suryawanshi wrote:
[...]
> I have measured the latency breakdown for a 385MB allocation:
>
> reclaim       - reclaim_clean_pages_from_list()
> migrate       - migrate_pages()
> migrateranges - isolate_migratepages_range()
> overall       - __alloc_contig_migrate_range()
>
> Note: output is in us.
>
> [ 1151.420923] LATENCY reclaim= 43 migrate=128 migrateranges=23
> [ 1151.421209] LATENCY reclaim= 11 migrate=253 migrateranges=14
> [ 1151.427856] LATENCY reclaim= 45 migrate=12 migrateranges=12
> [ 1151.434485] LATENCY reclaim= 44 migrate=33 migrateranges=12
> [ 1151.440975] LATENCY reclaim= 45 migrate=0 migrateranges=11
> [ 1151.447513] LATENCY reclaim= 39 migrate=35 migrateranges=11
> [ 1151.453919] LATENCY reclaim= 46 migrate=0 migrateranges=12
> [ 1151.460474] LATENCY reclaim= 39 migrate=41 migrateranges=11
> [ 1151.466947] LATENCY reclaim= 54 migrate=32 migrateranges=17
> [ 1151.473464] LATENCY reclaim= 45 migrate=21 migrateranges=12
> [ 1151.480016] LATENCY reclaim= 41 migrate=39 migrateranges=12
> [ 1151.486551] LATENCY reclaim= 41 migrate=36 migrateranges=12
> [ 1151.493199] LATENCY reclaim= 13 migrate=188 migrateranges=12
> [ 1151.500034] LATENCY reclaim= 60 migrate=94 migrateranges=13
> [ 1151.506686] LATENCY reclaim= 78 migrate=9 migrateranges=12
> [ 1151.513313] LATENCY reclaim= 33 migrate=147 migrateranges=12
> [ 1151.519839] LATENCY reclaim= 52 migrate=98 migrateranges=12
> [ 1151.526556] LATENCY reclaim= 46 migrate=126 migrateranges=12
> [ 1151.533254] LATENCY reclaim= 22 migrate=230 migrateranges=12
> [ 1151.540145] LATENCY reclaim= 0 migrate=305 migrateranges=13
> [ 1151.546997] LATENCY reclaim= 1 migrate=301 migrateranges=13
> [ 1151.553686] LATENCY reclaim= 40 migrate=201 migrateranges=12
> [ 1151.560395] LATENCY reclaim= 35 migrate=149 migrateranges=12
> [ 1151.567076] LATENCY reclaim= 77 migrate=43 migrateranges=16
> [ 1151.573836] LATENCY reclaim= 34 migrate=190 migrateranges=12
> [ 1151.580510] LATENCY reclaim= 51 migrate=120 migrateranges=12
> [ 1151.587240] LATENCY reclaim= 33 migrate=147 migrateranges=13
> [ 1151.594036] LATENCY reclaim= 20 migrate=241 migrateranges=13
> [ 1151.600749] LATENCY reclaim= 75 migrate=41 migrateranges=13
> [ 1151.607402] LATENCY reclaim= 77 migrate=32 migrateranges=12
> [ 1151.613956] LATENCY reclaim= 72 migrate=35 migrateranges=12
> [ 1151.620642] LATENCY reclaim= 59 migrate=162 migrateranges=12
> [ 1151.627181] LATENCY reclaim= 76 migrate=9 migrateranges=11
> [ 1151.633795] LATENCY reclaim= 80 migrate=0 migrateranges=12
> [ 1151.640278] LATENCY reclaim= 87 migrate=18 migrateranges=12
> [ 1151.646758] LATENCY reclaim= 82 migrate=10 migrateranges=11
> [ 1151.653307] LATENCY reclaim= 71 migrate=31 migrateranges=12
> [ 1151.659911] LATENCY reclaim= 61 migrate=77 migrateranges=12
> [ 1151.666514] LATENCY reclaim= 94 migrate=42 migrateranges=15
> [ 1151.673089] LATENCY reclaim= 67 migrate=59 migrateranges=12
> [ 1151.679655] LATENCY reclaim= 81 migrate=14 migrateranges=12
> [ 1151.686253] LATENCY reclaim= 49 migrate=93 migrateranges=12
> [ 1151.692815] LATENCY reclaim= 61 migrate=54 migrateranges=12
> [ 1151.699438] LATENCY reclaim= 42 migrate=99 migrateranges=10
> [ 1151.705881] OVERALL overall=285157
cumulative numbers are
reclaim:2222 migrate:3995 migrateranges:552 sum:6217
So those code paths you were measuring were contributing to the overall
latency only marginally (~2%).
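[For reference, those cumulative numbers can be re-derived from the LATENCY lines with a short awk script. A minimal sketch, run here on just the first five samples of the log above; feeding it the complete log reproduces the totals:]

```shell
# Sum the per-iteration values from the LATENCY dmesg lines.
# Field positions assume the exact format above (note the stray space
# in "reclaim= 43", which puts that value in its own field).
out=$(awk '
/LATENCY/ {
    reclaim += $5                          # field after "reclaim= "
    m = $6; sub(/migrate=/, "", m);        migrate += m
    g = $7; sub(/migrateranges=/, "", g);  ranges += g
}
END { print "reclaim:" reclaim " migrate:" migrate " migrateranges:" ranges }
' <<'EOF'
[ 1151.420923] LATENCY reclaim= 43 migrate=128 migrateranges=23
[ 1151.421209] LATENCY reclaim= 11 migrate=253 migrateranges=14
[ 1151.427856] LATENCY reclaim= 45 migrate=12 migrateranges=12
[ 1151.434485] LATENCY reclaim= 44 migrate=33 migrateranges=12
[ 1151.440975] LATENCY reclaim= 45 migrate=0 migrateranges=11
EOF
)
echo "$out"   # reclaim:188 migrate:426 migrateranges:72
```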
> cma_alloc latency = 297385 us
As I've said earlier, I would just use perf with call-graph support
and look at the time distribution to see where the bottleneck is.
--
Michal Hocko
SUSE Labs