2013-05-13 03:09:29

by PINTU KUMAR

Subject: [Query] Performance degradation with memory compaction (on QC chip-set)

* Sorry, re-sending as plain text.


Dear Mr. Mel Gorman,

I have one question about memory compaction.
Kernel version: kernel-3.4 (ARM)
Chipset: Qualcomm MSM8930 dual-core.

We wanted to enable CONFIG_COMPACTION for our product with kernel-3.4.
But QC commented that enabling compaction on their chipset causes performance degradation in some streaming scenarios (from the beginning).

I wanted to know whether this is always expected.
We used compaction with an Exynos processor and did not observe any performance degradation.


All,
Has anyone observed any performance problem (on any chipset) after enabling compaction?


Please let me know your comments.
It will help us decide whether to enable compaction.


Thank You.
With Regards,
Pintu





>
>
>>________________________________
>> From: Mel Gorman <[email protected]>
>>To: Pintu Agarwal <[email protected]>
>>Sent: Tuesday, 12 July 2011 8:44 AM
>>Subject: Re: How to verify memory compaction on Kernel2.6.36??
>>
>>
>>On Tue, Jul 12, 2011 at 08:26:21AM -0700, Pintu Agarwal wrote:
>>>
>>> Actually I enabled compaction without HUGETLB support. Hope this is fine.
>>>
>>
>>In terms of compaction yes. In terms of your target application, I don't
>>know.
>>
>>> Then I wrote a sample kernel module to allocate physical pages using kmalloc.
>>> (By passing the memory size from sample user space application and passing to this kernel module via ioctl calls)
>>>
>>
>>The allocations will not be accessible to userspace without additional
>>driver support to map the pages in userspace.
>>
>>> Using this application, I request a number of physical pages of the desired order (from the command line of the user app).
>>> At the same time, I check the buddyinfo before and after the allocation.
>>> A sample output of my application is as follows:
>>> ============================================================
>>> /opt/pintu # ./app_pinchar.bin
>>> Node 0, zone   Normal     34      9     13      7     11      6      2      2      3      1     36
>>> Node 0, zone  HighMem     53    194    110     36     21      7      1      2      3      2      6
>>> Page block order: 10
>>> Pages per block:  1024
>>> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
>>> Node    0, zone   Normal, type    Unmovable     32      5      8      5     11      5      2      0      2      0      0
>>> Node    0, zone   Normal, type  Reclaimable      1      2      4      2      0      0      0      1      1      1      0
>>> Node    0, zone   Normal, type      Movable      1      0      1      0      0      1      0      1      0      0     35
>>> Node    0, zone   Normal, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
>>> Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
>>> Node    0, zone  HighMem, type    Unmovable      1      0      2      3      1      0      0      1      2      1      1
>>> Node    0, zone  HighMem, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
>>> Node    0, zone  HighMem, type      Movable     21    194    108     33     20      7      1      1      1      1      4
>>> Node    0, zone  HighMem, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
>>> Node    0, zone  HighMem, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
>>> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
>>> Node 0, zone   Normal           82            4           73            1            0
>>> Node 0, zone  HighMem           14            0           81            1            0
>>> -------------------------------------------------------------------------------------------
>>>
>>> Enter the page order(in power of 2) : 512
>>
>>Page order 512? That's a good trick. I assume you mean order 9 for 512
>>pages.
>>
>>> Enter the number of such block : 200
>>> ERROR : ioctl - PINCHAR_ALLOC - Failed, after block num = 72 !!!
>>> DONE.....
>>>
>>
>>72 corresponds almost exactly to the number of order-9 pages that were
>>free when the application started.
>>
>>> ==========================================================================================
>>> Node 0, zone   Normal    100     84     53     36     33     21      8      0      3      2      0
>>> Node 0, zone  HighMem    844    744    612    357    200     91      8      3      4      1      6
>>>
>>
>>There is almost no free memory in the Normal zone at this stage of
>>the test implying that even perfect compaction of all pages would
>>still not result in a new order-9 page while obeying watermarks.
>>
>>> ============================================================
>>>
>>> Then I want to verify whether compaction is working for all the allocation requests or not.
>>
>>Read /proc/vmstat but I doubt it was used much. Memory was mostly
>>unfragmented when the application started. It is likely that after
>>72 order-9 pages there was not enough free memory to compact further
>>and that is why the allocation failed.
>>
>>> OR, at least how far compaction is helpful in these scenarios.
>>>
>>
>>Compaction would have been helpful in the event the system has been
>>running for some time and was fragmented. This test looks like it
>>happened very close to boot so compaction would not have been required.
>>
>>> Please let me know how compaction can be effective in such cases where order 8,9,10 pages are requested.
>>>
>>
>>Compaction reduces allocation latencies when memory is fragmented for
>>high-order allocations like this. I'm not sure what else you are expecting
>>to hear.
>>
>>--
>>Mel Gorman
>>SUSE Labs
>>
>>
>>
>
>
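
The "72 corresponds almost exactly" remark in the quoted exchange can be checked from the buddyinfo output itself. The following is only a minimal sketch, assuming the module's kmalloc requests are served from the Normal zone and ignoring watermarks; the helper names parse_buddyinfo_line and available_blocks are hypothetical and not part of any kernel tool. It counts how many order-9 blocks can be obtained purely by splitting blocks that are already free, that is, without any help from compaction.

```python
def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (zone name, free-block counts per order)."""
    _, _, rest = line.partition("zone")
    fields = rest.split()
    return fields[0], [int(n) for n in fields[1:]]


def available_blocks(counts, order):
    """Number of order-`order` blocks obtainable by splitting free blocks only.

    A free block of order j >= order splits into 2**(j - order) blocks.
    Smaller free blocks are ignored, because merging them would need
    neighbouring pages to be freed or migrated (which is compaction's job).
    """
    return sum(n << (j - order) for j, n in enumerate(counts) if j >= order)


# Normal-zone line copied from the quoted output, taken before the allocations.
normal = ("Node 0, zone   Normal     34      9     13      7     11"
          "      6      2      2      3      1     36")
zone, counts = parse_buddyinfo_line(normal)

# The user entered 512 pages; 512 == 2**9, so the buddy order is 9.
order = (512).bit_length() - 1

print(zone, order, available_blocks(counts, order))
```

With the quoted numbers this gives 1 free order-9 block plus 36 order-10 blocks split in two, so 73 candidates, which is in line with the failure after block 72 once watermarks are taken into account.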


2013-05-13 08:40:13

by Mel Gorman

Subject: Re: [Query] Performance degradation with memory compaction (on QC chip-set)

On Sun, May 12, 2013 at 08:00:26PM -0700, PINTU KUMAR wrote:
> Dear Mel Gorman,
>
> I have one question about memory compaction.
> Kernel version: kernel-3.4 (ARM)
> Chipset: Qualcomm MSM8930 dual-core.
>
> We wanted to enable CONFIG_COMPACTION for our product with kernel-3.4.
> But QC commented that enabling compaction on their chipset causes performance degradation in some streaming scenarios (from the beginning).
>
> I wanted to know whether this is always expected.
> We used compaction with an Exynos processor and did not observe any performance degradation.
>

I suspect one of their drivers is using high-order allocations and
hitting compaction as a result. Compaction is not guaranteed to cause
overhead but if it's in use then the scanning and copying overhead can
cause problems.

> Please let me know your comments.
> It will be helpful to decide on enabling compaction or not.
>

Depends on workload and drivers.

--
Mel Gorman
SUSE Labs
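
One way to test the suspicion above on a given device is to follow the advice from the quoted 2011 exchange and read /proc/vmstat before and after the streaming scenario. The following is a rough sketch under stated assumptions, not a definitive method: the helper names are hypothetical, the snapshot values are made up purely for illustration, and the exact set of compact_* counters depends on the kernel version, so the code simply keeps every counter whose name starts with "compact_".

```python
def compaction_counters(vmstat_text):
    """Return {name: value} for every vmstat line whose name starts with 'compact_'."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("compact_"):
            counters[parts[0]] = int(parts[1])
    return counters


def counter_delta(before, after):
    """Per-counter increase between two snapshots (a missing counter counts as 0)."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Made-up snapshots for illustration only; on a real device each string
# would be the contents of /proc/vmstat read at that moment.
before_text = "nr_free_pages 1000\ncompact_stall 3\ncompact_fail 1\ncompact_success 2\n"
after_text = "nr_free_pages 900\ncompact_stall 10\ncompact_fail 4\ncompact_success 6\n"

delta = counter_delta(compaction_counters(before_text),
                      compaction_counters(after_text))
for name, increase in sorted(delta.items()):
    print(name, increase)
```

If compact_stall climbs while the streaming scenario runs, that would be consistent with high-order allocations entering direct compaction, although it does not by itself identify which driver is responsible. If the counters stay flat, compaction is unlikely to be the source of the degradation.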