2005-01-06 13:47:59

by Lukas Hejtmanek

Subject: 2.6.10-bk8 swapoff after resume

Hello,

I've tried 2.6.10-bk8 suspend/resume. After resume I usually do swapoff -a to
load all the pages from swap to memory. Unfortunately with the latest version
swapoff does not work. It seems to cycle in an endless loop reading data from
disk.

According to sysrq show regs:

Pid: 2401, comm: swapoff
EIP: 0060:[<c01493d8>] CPU: 0
EIP is at unuse_process+0x41/0xc2
EFLAGS: 00200246 Not tainted (2.6.10-bk8)
EAX: 00000000 EBX: d790df90 ECX: b7218000 EDX: 17902001
ESI: d7805d00 EDI: d7805d2c EBP: c1315960 DS: 007b ES: 007b
CR0: 8005003b CR2: b6e1d000 CR3: 18ad0000 CR4: 00000690
[<c01496df>] try_to_unuse+0x238/0x5cc
[<c015d7bd>] getname+0x75/0xbd
[<c0149f88>] sys_swapoff+0x184/0x3bc
[<c0102f87>] syscall_call+0x7/0xb

--
Lukáš Hejtmánek


2005-01-06 21:18:20

by Martin Josefsson

Subject: Swapoff infinite loops on 2.6.10-bk (was: 2.6.10-bk8 swapoff after resume)

On Thu, 2005-01-06 at 14:47 +0100, Lukas Hejtmanek wrote:
> Hello,

Hi

> I've tried 2.6.10-bk8 suspend/resume. After resume I usually do swapoff -a to
> load all the pages from swap to memory. Unfortunately with the latest version
> swapoff does not work. It seems to cycle in an endless loop reading data from
> disk.

I second that, after resume my machine does exactly the same.
It swaps in most of the data, but it leaves ~1700kB on the swapdevice
that it doesn't manage to swap in, and apparently reads this over and
over again.

But it probably doesn't have anything to do with swsusp, I can reproduce
it without ever having suspended, just fill up the memory so the machine
swaps and then the same thing happens.

Apparently -bk kernels from late December work fine, so it's a
recent regression.
Needs investigating.

--
/Martin



2005-01-06 22:02:54

by Hugh Dickins

Subject: Re: Swapoff infinite loops on 2.6.10-bk (was: 2.6.10-bk8 swapoff after resume)

On Thu, 6 Jan 2005, Martin Josefsson wrote:
> On Thu, 2005-01-06 at 14:47 +0100, Lukas Hejtmanek wrote:
>
> > I've tried 2.6.10-bk8 suspend/resume. After resume I usually do swapoff -a to
> > load all the pages from swap to memory. Unfortunately with the latest version
> > swapoff does not work. It seems to cycle in an endless loop reading data from
> > disk.
>
> I second that, after resume my machine does exactly the same.
> It swaps in most of the data, but it leaves ~1700kB on the swapdevice
> that it doesn't manage to swap in, and apparently reads this over and
> over again.
>
> But it probably doesn't have anything to do with swsusp, I can reproduce
> it without ever having suspended, just fill up the memory so the machine
> swaps and then the same thing happens.
>
> Apparently -bk kernels from late December work fine, so it's a
> recent regression.
> Needs investigating.

Curious. I regularly check that swapoff is working (though not suspend
and resume), and have not noticed this. You fill memory so the machine
swaps, let the memory hog exit, then swapoff hangs indefinitely (not just
taking a long time)? With no suspend+resume since booting at all?

How much memory? How much swap? What .config? I'll try it tomorrow.

Thanks,
Hugh

2005-01-07 06:55:03

by Martin Josefsson

Subject: Re: Swapoff infinite loops on 2.6.10-bk (was: 2.6.10-bk8 swapoff after resume)

On Thu, 2005-01-06 at 22:00 +0000, Hugh Dickins wrote:

> Curious. I regularly check that swapoff is working (though not suspend
> and resume), and have not noticed this. You fill memory so the machine
> swaps, let the memory hog exit, then swapoff hangs indefinitely (not just
> taking a long time)? With no suspend+resume since booting at all?

No suspend or resume involved, just a normal boot.

Steps to reproduce:

1. fill memory so it swaps
2. stop memory hog
3. swapoff -a
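
A memory hog for step 1 can be very small. The sketch below is not the program used in this report (that was not posted); `hog_alloc` and its parameters are hypothetical names chosen for illustration. It assumes a POSIX system and simply writes one byte to every page of a heap allocation so that each page is really backed by memory.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate `bytes` of heap memory and write to
 * every page so the kernel has to back each one with a real page.
 * Returns the buffer (the caller frees it) or NULL if malloc fails.
 * If pages_touched is non-NULL it receives the number of pages written.
 */
static unsigned char *hog_alloc(size_t bytes, size_t *pages_touched)
{
	long ps = sysconf(_SC_PAGESIZE);
	size_t page = ps > 0 ? (size_t)ps : 4096;	/* fall back to 4 kB */
	unsigned char *buf = malloc(bytes);
	size_t off, n = 0;

	if (!buf)
		return NULL;
	for (off = 0; off < bytes; off += page) {
		buf[off] = (unsigned char)(off / page);	/* dirty the page */
		n++;
	}
	if (pages_touched)
		*pages_touched = n;
	return buf;
}
```

To follow the steps, call it from a small `main` with a size larger than the free RAM reported by `free`, keep the process alive until the Swap line shows usage, then stop it and run `swapoff -a`.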

I'm pretty sure it's an infinite loop, I left it like that while
shaving. It produces the same sound over and over again as the head
seeks back in order to try again and again...

> How much memory? How much swap? What .config? I'll try it tomorrow.

768MB ram, 1.2GB swap, .config attached

Before memory hog:

# free
             total       used       free     shared    buffers     cached
Mem:        775664     110868     664796          0       4696      60420
-/+ buffers/cache:      45752     729912
Swap:      1226224          0    1226224

After memory hog:

# free
             total       used       free     shared    buffers     cached
Mem:        775664      36180     739484          0        140       7224
-/+ buffers/cache:      28816     746848
Swap:      1226224      27984    1198240

After swapoff:

# free
             total       used       free     shared    buffers     cached
Mem:        775664      53972     721692          0        268      11816
-/+ buffers/cache:      41888     733776
Swap:      1226224       2912    1223312

Running swapoff again produces the same behaviour and the amount of swap
used does not decrease.

I hope you can reproduce it.

--
/Martin


Attachments:
2.6-config-pingu (40.48 kB)

2005-01-08 16:01:40

by Hugh Dickins

Subject: Re: Swapoff infinite loops on 2.6.10-bk (was: 2.6.10-bk8 swapoff after resume)

On Fri, 7 Jan 2005, Martin Josefsson wrote:
>
> No suspend or resume involved, just a normal boot.
>
> Steps to reproduce:
>
> 1. fill memory so it swaps
> 2. stop memory hog
> 3. swapoff -a
>
> I'm pretty sure it's an infinite loop, I left it like that while
> shaving. It produces the same sound over and over again as the head
> seeks back in order to try again and again...

You're right, and yes, I could then reproduce it. Looks like I'd only
been testing on 3levels (HIGHMEM64G), and this only happens on 2levels.

Patch below, please verify it fixes your problems. And please, could
someone else check I haven't screwed up swapoff on 4levels (x86_64)?
From the likeness of the code at all levels I'd expect it to be fine,
but there's nothing like a real test - thanks...

The 4level mods have caused 2level swapoff to miss entries and hang.
There's probably a one-line fix for that, but the error is really caused
by previous awkwardness - each mask applied on two levels, an "address"
that's an offset plus an "offset" that's an address. Simplify the four
levels to behave in the same address/next/end way and the bug vanishes.

Signed-off-by: Hugh Dickins <[email protected]>

--- 2.6.10-bk11/mm/swapfile.c 2005-01-07 16:15:12.000000000 +0000
+++ linux/mm/swapfile.c 2005-01-07 20:58:38.933209800 +0000
@@ -442,12 +442,11 @@ unuse_pte(struct vm_area_struct *vma, un
}

/* vma->vm_mm->page_table_lock is held */
-static unsigned long unuse_pmd(struct vm_area_struct * vma, pmd_t *dir,
- unsigned long address, unsigned long size, unsigned long offset,
+static unsigned long unuse_pmd(struct vm_area_struct *vma, pmd_t *dir,
+ unsigned long address, unsigned long end,
swp_entry_t entry, struct page *page)
{
- pte_t * pte;
- unsigned long end;
+ pte_t *pte;
pte_t swp_pte = swp_entry_to_pte(entry);

if (pmd_none(*dir))
@@ -458,18 +457,13 @@ static unsigned long unuse_pmd(struct vm
return 0;
}
pte = pte_offset_map(dir, address);
- offset += address & PMD_MASK;
- address &= ~PMD_MASK;
- end = address + size;
- if (end > PMD_SIZE)
- end = PMD_SIZE;
do {
/*
* swapoff spends a _lot_ of time in this loop!
* Test inline before going to call unuse_pte.
*/
if (unlikely(pte_same(*pte, swp_pte))) {
- unuse_pte(vma, offset + address, pte, entry, page);
+ unuse_pte(vma, address, pte, entry, page);
pte_unmap(pte);

/*
@@ -479,22 +473,22 @@ static unsigned long unuse_pmd(struct vm
activate_page(page);

/* add 1 since address may be 0 */
- return 1 + offset + address;
+ return 1 + address;
}
address += PAGE_SIZE;
pte++;
- } while (address && (address < end));
+ } while (address < end);
pte_unmap(pte - 1);
return 0;
}

/* vma->vm_mm->page_table_lock is held */
-static unsigned long unuse_pud(struct vm_area_struct * vma, pud_t *pud,
- unsigned long address, unsigned long size, unsigned long offset,
+static unsigned long unuse_pud(struct vm_area_struct *vma, pud_t *pud,
+ unsigned long address, unsigned long end,
swp_entry_t entry, struct page *page)
{
- pmd_t * pmd;
- unsigned long end;
+ pmd_t *pmd;
+ unsigned long next;
unsigned long foundaddr;

if (pud_none(*pud))
@@ -505,33 +499,27 @@ static unsigned long unuse_pud(struct vm
return 0;
}
pmd = pmd_offset(pud, address);
- offset += address & PUD_MASK;
- address &= ~PUD_MASK;
- end = address + size;
- if (end > PUD_SIZE)
- end = PUD_SIZE;
- if (address >= end)
- BUG();
do {
- foundaddr = unuse_pmd(vma, pmd, address, end - address,
- offset, entry, page);
+ next = (address + PMD_SIZE) & PMD_MASK;
+ if (next > end || !next)
+ next = end;
+ foundaddr = unuse_pmd(vma, pmd, address, next, entry, page);
if (foundaddr)
return foundaddr;
- address = (address + PMD_SIZE) & PMD_MASK;
+ address = next;
pmd++;
- } while (address && (address < end));
+ } while (address < end);
return 0;
}

/* vma->vm_mm->page_table_lock is held */
-static unsigned long unuse_pgd(struct vm_area_struct * vma, pgd_t *pgd,
- unsigned long address, unsigned long size,
+static unsigned long unuse_pgd(struct vm_area_struct *vma, pgd_t *pgd,
+ unsigned long address, unsigned long end,
swp_entry_t entry, struct page *page)
{
- pud_t * pud;
- unsigned long offset;
+ pud_t *pud;
+ unsigned long next;
unsigned long foundaddr;
- unsigned long end;

if (pgd_none(*pgd))
return 0;
@@ -541,54 +529,48 @@ static unsigned long unuse_pgd(struct vm
return 0;
}
pud = pud_offset(pgd, address);
- offset = address & PGDIR_MASK;
- address &= ~PGDIR_MASK;
- end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
- BUG_ON (address >= end);
do {
- foundaddr = unuse_pud(vma, pud, address, end - address,
- offset, entry, page);
+ next = (address + PUD_SIZE) & PUD_MASK;
+ if (next > end || !next)
+ next = end;
+ foundaddr = unuse_pud(vma, pud, address, next, entry, page);
if (foundaddr)
return foundaddr;
- address = (address + PUD_SIZE) & PUD_MASK;
+ address = next;
pud++;
- } while (address && (address < end));
+ } while (address < end);
return 0;
}

/* vma->vm_mm->page_table_lock is held */
-static unsigned long unuse_vma(struct vm_area_struct * vma,
+static unsigned long unuse_vma(struct vm_area_struct *vma,
swp_entry_t entry, struct page *page)
{
pgd_t *pgd;
- unsigned long start, end, next;
+ unsigned long address, next, end;
unsigned long foundaddr;
- int i;

if (page->mapping) {
- start = page_address_in_vma(page, vma);
- if (start == -EFAULT)
+ address = page_address_in_vma(page, vma);
+ if (address == -EFAULT)
return 0;
else
- end = start + PAGE_SIZE;
+ end = address + PAGE_SIZE;
} else {
- start = vma->vm_start;
+ address = vma->vm_start;
end = vma->vm_end;
}
- pgd = pgd_offset(vma->vm_mm, start);
- for (i = pgd_index(start); i <= pgd_index(end-1); i++) {
- next = (start + PGDIR_SIZE) & PGDIR_MASK;
- if (next > end || next <= start)
+ pgd = pgd_offset(vma->vm_mm, address);
+ do {
+ next = (address + PGDIR_SIZE) & PGDIR_MASK;
+ if (next > end || !next)
next = end;
- foundaddr = unuse_pgd(vma, pgd, start, next - start, entry, page);
+ foundaddr = unuse_pgd(vma, pgd, address, next, entry, page);
if (foundaddr)
return foundaddr;
- start = next;
- i++;
+ address = next;
pgd++;
- }
+ } while (address < end);
return 0;
}
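
The address/next/end convention named in the patch description can be tried out in user space. The sketch below is not kernel code. `TOP_SHIFT`, `walk_next_end` and `walk_old_index` are hypothetical names, and the value 22 assumes the 2-level i386 layout in which one top-level entry maps 4 MB. `walk_old_index` copies only the loop control of the removed `unuse_vma` lines, where `i` is incremented both in the `for` header and in the loop body. Whether that double increment is the "one-line fix" alluded to above is an inference from the diff, not something the thread states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 2-level i386 layout: one top-level entry maps 4 MB. */
#define TOP_SHIFT 22
#define TOP_SIZE  ((uint32_t)1 << TOP_SHIFT)
#define TOP_MASK  (~(TOP_SIZE - 1))

static uint32_t top_index(uint32_t addr)
{
	return addr >> TOP_SHIFT;
}

/*
 * New style: walk [address, end) one top-level entry at a time.
 * Returns how many entries were visited.
 */
static unsigned walk_next_end(uint32_t address, uint32_t end)
{
	unsigned visited = 0;
	uint32_t next;

	assert(address < end);
	do {
		next = (address + TOP_SIZE) & TOP_MASK;
		if (next > end || !next)	/* !next: wrapped past the top */
			next = end;
		visited++;	/* stand-in for descending into [address, next) */
		address = next;
	} while (address < end);
	return visited;
}

/*
 * Old style loop control, as in the removed lines: i is bumped in the
 * for header and again in the body, so the index runs out early.
 */
static unsigned walk_old_index(uint32_t start, uint32_t end)
{
	unsigned visited = 0;
	uint32_t next, i;

	for (i = top_index(start); i <= top_index(end - 1); i++) {
		next = (start + TOP_SIZE) & TOP_MASK;
		if (next > end || next <= start)
			next = end;
		visited++;
		start = next;
		i++;
	}
	return visited;
}
```

For a 32 MB range starting at 0x08000000, `walk_next_end` visits all eight entries while `walk_old_index` visits only four. A range that fits inside a single top-level entry is handled identically by both, which would be consistent with the problem going unnoticed on 3-level configurations where one top-level entry maps far more address space.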


2005-01-08 16:23:41

by Martin Josefsson

Subject: Re: Swapoff infinite loops on 2.6.10-bk (was: 2.6.10-bk8 swapoff after resume)

On Sat, 2005-01-08 at 16:00 +0000, Hugh Dickins wrote:

> You're right, and yes, I could then reproduce it. Looks like I'd only
> been testing on 3levels (HIGHMEM64G), and this only happens on 2levels.
>
> Patch below, please verify it fixes your problems. And please, could
> someone else check I haven't screwed up swapoff on 4levels (x86_64)?
> From the likeness of the code at all levels I'd expect it to be fine,
> but there's nothing like a real test - thanks...

The patch fixes the problem completely here.
swapoff after running the memory hog works as expected,
and swapoff after suspend to disk and resume also works fine.

Thanks for tracking this down and fixing it.

--
/Martin



2005-01-08 19:58:32

by Andrew Morton

Subject: Re: Swapoff infinite loops on 2.6.10-bk (was: 2.6.10-bk8 swapoff after resume)

Hugh Dickins <[email protected]> wrote:
>
> > I'm pretty sure it's an infinite loop, I left it like that while
> > shaving. It produces the same sound over and over again as the head
> > seeks back in order to try again and again...
>
> You're right, and yes, I could then reproduce it. Looks like I'd only
> been testing on 3levels (HIGHMEM64G), and this only happens on 2levels.
>
> Patch below, please verify it fixes your problems. And please, could
> someone else check I haven't screwed up swapoff on 4levels (x86_64)?

Thanks. I'll do the x86_64 testing.