Hi,
Since its very beginning, the implementation of madvise(MADV_DONTNEED)
has been a mere zapping of pages. This is wrong. It was already
discussed on this very list some time ago:
http://marc.theaimsgroup.com/?t=111996860600002&r=1&w=2&n=13
« There is something wrong with the current madvise(MADV_DONTNEED)
implementation. Both the manpage and the source code say that
MADV_DONTNEED means that the application does not care about the data,
so it might be thrown away by the kernel. But that's not what POSIX
says:
http://www.opengroup.org/onlinepubs/009695399/functions/posix_madvise.html
It says that "The posix_madvise() function shall have no effect on the
semantics of access to memory in the specified range". I.e. the data
that was recorded shall be saved! »
The attached testcase was written during that discussion to show the
problem clearly: after a call to madvise(MADV_DONTNEED), the data in
the range is not preserved. This can be very problematic for
applications that rely on the POSIX behavior...
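
For readers without the attachment, here is a minimal sketch of such a
check. It is not the attached testcase itself: the function name
data_survives_dontneed() is made up for this illustration, and it assumes
Linux and a private anonymous mapping.

```c
#define _DEFAULT_SOURCE         /* exposes madvise() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the attached testcase.
 * Returns 1 if data written to a private anonymous mapping is still
 * there after madvise(MADV_DONTNEED), 0 if it was discarded,
 * and -1 if a system call failed.
 */
int data_survives_dontneed(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Dirty the page, and check that the write is visible beforehand. */
    memset(p, 'x', len);
    assert(p[0] == 'x');

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* With POSIX "advice only" semantics, both bytes would still be 'x'. */
    int survived = (p[0] == 'x' && p[len - 1] == 'x');

    munmap(p, len);
    return survived;
}
```

With the zapping behavior described above, the function returns 0: the
next access to the page faults in a fresh zero-filled page. An
implementation with advice-only semantics would return 1.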
MADV_DONTNEED should really be fixed to have performance-only semantics:
for instance, walk the range, zap only the clean pages, and mark the
dirty pages as least recently used, so that they are considered good
candidates for eviction.
Regards,
Samuel