I've been experimenting with a swap block device driver which under
certain conditions generates write failures for pages. I'd presumed the
kernel would recover from a write error by restoring the copy it still
has in memory. In an ideal world it would then mark that page of the
swap device as bad, although I'm fairly sure this doesn't happen. I'm not
sure whether the in-memory page is restored, but if it is, I'm
having trouble identifying the code.

When I tested this, the kernel didn't appear to recover: processes on
the system, presumably the ones using swap, just disappear when write
failures occur.
end_swap_bio_write() simply does:

	if (!uptodate)
		SetPageError(page);

I know the uptodate flag is being cleared in the error cases. I'm having
trouble working out which code is supposed to react to the error flag
being set on a swap page (any pointers appreciated!). I noticed it's also
used for the read case, which is unrecoverable.
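
A toy user-space model of the situation described above may help frame the question. It is not kernel code: the struct, the `toy_` function names and the reclaim rule are all assumptions made for illustration. It only shows that if the page is cleaned before the write is submitted and the completion handler does nothing but set an error flag, the page still looks reclaimable after a failed write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a few page flags; not the kernel's struct page. */
struct toy_page {
	bool dirty;     /* models PG_dirty */
	bool error;     /* models PG_error */
	bool writeback; /* models PG_writeback */
};

/* Assumed swap-out start: the page is cleaned before I/O is submitted. */
static void toy_start_swap_write(struct toy_page *page)
{
	assert(page->dirty); /* only dirty pages are written out in this model */
	page->dirty = false;
	page->writeback = true;
}

/* Models the quoted completion handler: it only flags the error. */
static void toy_end_swap_write(struct toy_page *page, bool uptodate)
{
	if (!uptodate)
		page->error = true;
	page->writeback = false;
}

/* Assumed reclaim rule: clean and not under writeback means freeable. */
static bool toy_can_reclaim(const struct toy_page *page)
{
	return !page->dirty && !page->writeback;
}
```

In this model a page that goes through toy_start_swap_write() and then toy_end_swap_write() with uptodate == false ends up with the error flag set but is still reported as reclaimable, which would match processes losing their data.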
Should this code instead be marking the page as dirty and the section
of the swap device as bad? Does it already do that, or is that not
possible for some reason?
Any comments and/or pointers to documentation on this would be
appreciated.
Thanks,
Richard
On Tue, 29 Aug 2006 21:48:34 +0100
Richard Purdie <[email protected]> wrote:
> end_swap_bio_write() simply does:
>
> 	if (!uptodate)
> 		SetPageError(page);
>
> I know the uptodate flag is being cleared in the error cases. I'm having
> trouble working out which code is supposed to react to the error flag
> being set on a swap page (any pointers appreciated!). I noticed it's also
> used for the read case, which is unrecoverable.
>
> Should this code instead be marking the page as dirty and the section
> of the swap device as bad? Does it already do that, or is that not
> possible for some reason?
>
> Any comments and/or pointers to documentation on this would be
> appreciated.
>
swap-write-failure-fixup.patch is now merged in the -mm kernel.
==
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/mm-swap-write-failure-fixup.patch
==
With it, an error message is printed and the page is marked dirty again.
Thanks,
- Kame
On Thu, 2006-08-31 at 10:58 +0900, KAMEZAWA Hiroyuki wrote:
> swap-write-failure-fixup.patch is now merged in the -mm kernel.
> ==
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/mm-swap-write-failure-fixup.patch
> ==
> With it, an error message is printed and the page is marked dirty again.
+	if (!uptodate) {
 		SetPageError(page);
+		/*
+		 * We failed to write the page out to swap-space.
+		 * Re-dirty the page in order to avoid it being reclaimed.
+		 * Also print a dire warning that things will go BAD (tm)
+		 * very quickly.
+		 *
+		 * Also clear PG_reclaim to avoid rotate_reclaimable_page()
+		 */
+		set_page_dirty(page);
+		printk(KERN_ALERT "Write-error on swap-device (%d:%d)\n",
+		       imajor(bio->bi_bdev->bd_inode),
+		       iminor(bio->bi_bdev->bd_inode));
+		ClearPageReclaim(page);
+	}

I'm not 100% convinced this will help: as long as SetPageError() is still
called, the processes involved still end up being killed off. Removing the
SetPageError() call gives much more stable results in my testing. I was
wondering how to stop it repeatedly trying to write to the same
swap file sector. ClearPageReclaim() doesn't appear to help much, as
rotate_reclaimable_page() already checks whether a page is dirty.
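
The repeated-write concern can be sketched with a small user-space model. Everything here is hypothetical (the `toy_` names, the single permanently bad slot, the idea of a "reclaim pass"); it only illustrates that re-dirtying a page while leaving it bound to the same swap slot makes every later write-out attempt fail in the same way.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page: just the flags and swap slot this model needs. */
struct toy_page {
	bool dirty;
	bool error;
	unsigned long slot; /* swap slot the page is bound to */
};

/* Hypothetical device: every write to bad_slot fails, all others succeed. */
static bool toy_write_slot(unsigned long slot, unsigned long bad_slot)
{
	return slot != bad_slot;
}

/* Models the patched completion path: flag the error and re-dirty. */
static void toy_end_swap_write_patched(struct toy_page *page, bool uptodate)
{
	if (!uptodate) {
		page->error = true;
		page->dirty = true;
	}
}

/* Run up to `passes` write-out attempts; return how many writes failed. */
static int toy_reclaim_passes(struct toy_page *page, unsigned long bad_slot,
			      int passes)
{
	int failures = 0;

	for (int i = 0; i < passes && page->dirty; i++) {
		page->dirty = false; /* cleaned before the write is issued */
		bool ok = toy_write_slot(page->slot, bad_slot);
		toy_end_swap_write_patched(page, ok);
		if (!ok)
			failures++;
	}
	return failures;
}
```

With the page bound to the bad slot, every pass fails and the page stays dirty; with any other slot the first write succeeds and the loop stops.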
Ideally, we should remap the page to a new swap sector so we can mark
the existing one as bad. The easiest way to do that might be to have the
page move out of the PageSwapCache although I've not worked out how to
do that yet.
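
As a rough illustration of the remapping idea, here is a user-space sketch with a tiny slot map. It is an assumption-laden toy, not how the kernel's swap allocator works: the `toy_` names, the three slot states and the first-fit allocation are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-slot state for a toy swap area. */
enum toy_slot_state { TOY_SLOT_FREE, TOY_SLOT_USED, TOY_SLOT_BAD };

#define TOY_NR_SLOTS 8
#define TOY_NO_SLOT ((size_t)-1)

struct toy_swap_map {
	enum toy_slot_state state[TOY_NR_SLOTS];
};

/* First-fit allocation: return the lowest free slot, or TOY_NO_SLOT. */
static size_t toy_alloc_slot(struct toy_swap_map *map)
{
	for (size_t i = 0; i < TOY_NR_SLOTS; i++) {
		if (map->state[i] == TOY_SLOT_FREE) {
			map->state[i] = TOY_SLOT_USED;
			return i;
		}
	}
	return TOY_NO_SLOT;
}

/*
 * After a failed write to old_slot: retire that slot permanently and
 * hand the page a different one, so the retry does not hit the same sector.
 */
static size_t toy_remap_after_failure(struct toy_swap_map *map, size_t old_slot)
{
	assert(old_slot < TOY_NR_SLOTS);
	map->state[old_slot] = TOY_SLOT_BAD;
	return toy_alloc_slot(map);
}
```

Because a slot marked bad is never free again, later allocations skip it, which is the property wanted from marking the existing sector as bad.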
Richard