Hi Matthew,
I've just been looking at bdev_write_page().
You can read about why here:
http://marc.info/?t=142984068300001&r=1&w=2
The thread ends with a "git bisect" which points the finger at you.
The comment above bdev_write_page() says:
* On entry, the page should be locked and not currently under writeback.
* On exit, if the write started successfully, the page will be unlocked and
* under writeback. If the write failed already (eg the driver failed to
* queue the page to the device), the page will still be locked. If the
* caller is a ->writepage implementation, it will need to unlock the page.
So the page is unlocked on success.
In __mpage_writepage() I find
    if (!bdev_write_page(bdev, blocks[0] << (blkbits - 9),
                         page, wbc)) {
        clean_buffers(page, first_unmapped);
So if bdev_write_page() succeeds, i.e. if it returns 0, then
clean_buffers() is called. Remember that at this point the page is unlocked.
clean_buffers() may call
    try_to_free_buffers(page);
(without first locking the page, so it is still unlocked).
try_to_free_buffers() starts:
    BUG_ON(!PageLocked(page));
Oops.
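To make the sequence easier to follow, here is a minimal userspace sketch in C
of the path described above. It is only a toy model, not kernel code: the
struct toy_page type and every toy_* function are hypothetical stand-ins that
track just the "locked" and "writeback" flags, and the BUG_ON() is modelled as
a boolean return value instead of a crash.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two flags discussed. */
struct toy_page {
	bool locked;
	bool writeback;
};

/*
 * Models the documented contract of bdev_write_page(): on entry the page
 * is locked and not under writeback.  On success (0) it is unlocked and
 * under writeback; on failure it is still locked.
 */
static int toy_bdev_write_page(struct toy_page *page, bool driver_ok)
{
	assert(page->locked && !page->writeback);

	if (!driver_ok)
		return -1;		/* page stays locked for the caller */

	page->writeback = true;
	page->locked = false;		/* unlocked on success */
	return 0;
}

/*
 * Models the check at the top of try_to_free_buffers().  Returns false
 * where the real code would hit BUG_ON(!PageLocked(page)).
 */
static bool toy_try_to_free_buffers(const struct toy_page *page)
{
	return page->locked;
}

/*
 * Models the success path in __mpage_writepage(), assuming clean_buffers()
 * goes on to call try_to_free_buffers().  Returns true if the lock check
 * would pass, false if it would BUG.
 */
static bool toy_mpage_writepage_success_path(void)
{
	struct toy_page page = { .locked = true, .writeback = false };

	if (!toy_bdev_write_page(&page, true))
		return toy_try_to_free_buffers(&page);

	return true;
}
```

In this model toy_mpage_writepage_success_path() returns false, i.e. the lock
check fails precisely because the write succeeded and unlocked the page.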
Can you propose a fix for Charles, who can trigger this bug and nicely
bisected it for us - thanks Charles!!!
Also, while looking at the code, I notice that brd_rw_page() unconditionally
calls page_endio() and, in the WRITE case, page_endio() unconditionally calls
end_page_writeback(), which has
    if (!test_clear_page_writeback(page))
        BUG();
and so cannot tolerate being called twice in a row.
So if brd_rw_page() ever returned an error (which seems possible though not
likely), end_page_writeback() would be called once by page_endio() and once
in the error path of bdev_write_page(), and the BUG above would be triggered.
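The double call can be sketched the same way. Again this is a hypothetical
userspace model in C rather than the real code: struct wb_page and the wb_*
functions are invented for illustration, the BUG() is replaced by a counter,
and the model assumes that bdev_write_page() sets the writeback flag before
handing the page to the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the writeback flag. */
struct wb_page {
	bool writeback;
};

/* Counts how often the modelled BUG() would have fired. */
static int wb_bug_count;

/* Models test_clear_page_writeback(): clear the flag, return old value. */
static bool wb_test_clear_page_writeback(struct wb_page *page)
{
	bool was_set = page->writeback;

	page->writeback = false;
	return was_set;
}

/* Models end_page_writeback(): a second call in a row is fatal. */
static void wb_end_page_writeback(struct wb_page *page)
{
	if (!wb_test_clear_page_writeback(page))
		wb_bug_count++;		/* the kernel would BUG() here */
}

/*
 * Models brd_rw_page() in the WRITE case: it always completes the page
 * (as page_endio() would), whatever error code it then returns.
 */
static int wb_brd_rw_page(struct wb_page *page, int err)
{
	wb_end_page_writeback(page);
	return err;
}

/* Models bdev_write_page(): its error path ends writeback again. */
static int wb_bdev_write_page(struct wb_page *page, int driver_err)
{
	int result;

	assert(!page->writeback);	/* not under writeback on entry */
	page->writeback = true;		/* assumption: set before the driver call */

	result = wb_brd_rw_page(page, driver_err);
	if (result)
		wb_end_page_writeback(page);	/* second call on error */

	return result;
}

/* Runs one modelled write and reports how many BUG()s it would hit. */
static int wb_count_bugs(int driver_err)
{
	struct wb_page page = { .writeback = false };

	wb_bug_count = 0;
	wb_bdev_write_page(&page, driver_err);
	return wb_bug_count;
}
```

Here wb_count_bugs(0) gives 0, while any non-zero error such as
wb_count_bugs(-5) gives 1, matching the "once by page_endio() and once in the
error path" sequence.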
I'll leave that for you to sort out too :-)
Thanks,
NeilBrown
On 05/17/2015 12:02 AM, NeilBrown wrote:
>
>
> Hi Matthew,
> I've just been looking at bdev_write_page().
> You can read about why here:
>
> http://marc.info/?t=142984068300001&r=1&w=2
....
>
> Can you propose a fix for Charles, who can trigger this bug and nicely
> bisected it for us - thanks Charles!!!
>
This problem still occurs with 4.1.0-rc5 -- three stack traces attached.
Charles Bertsch