From: Huang Ying <[email protected]>
Mike reported the following warning messages
get_swap_device: Bad swap file entry 1400000000000001
This is produced by
- total_swapcache_pages()
- get_swap_device()
Here get_swap_device() is used to check whether the swap device is
valid and, if so, to prevent it from being swapped off. But
get_swap_device() may print a warning like the one above for some
invalid swap devices. Fix this by calling swp_swap_info() before
get_swap_device() to filter out the swap devices that would trigger
the warning.
Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()")
Signed-off-by: "Huang, Ying" <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Andrea Parri <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Hugh Dickins <[email protected]>
---
mm/swap_state.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/mm/swap_state.c b/mm/swap_state.c
index b84c58b572ca..62da25b7f2ed 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -76,8 +76,13 @@ unsigned long total_swapcache_pages(void)
 	struct swap_info_struct *si;
 
 	for (i = 0; i < MAX_SWAPFILES; i++) {
+		swp_entry_t entry = swp_entry(i, 1);
+
+		/* Avoid get_swap_device() to warn for bad swap entry */
+		if (!swp_swap_info(entry))
+			continue;
 		/* Prevent swapoff to free swapper_spaces */
-		si = get_swap_device(swp_entry(i, 1));
+		si = get_swap_device(entry);
 		if (!si)
 			continue;
 		nr = nr_swapper_spaces[i];
--
2.20.1
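For readers following along outside the kernel tree, the filter-then-get
pattern in the patch above can be modelled in user space. This is only a
rough sketch: every mock_* name, the table size, and the struct layout are
hypothetical stand-ins and not the kernel's definitions. It assumes, as the
commit message describes, that the get step warns when a slot was never set
up, while the plain lookup stays silent.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table size; the kernel's MAX_SWAPFILES differs. */
#define MOCK_MAX_SWAPFILES 8

/* Hypothetical stand-in for struct swap_info_struct. */
struct mock_swap_info {
	int valid;               /* models a "device is usable" flag */
	unsigned long nr_cached; /* models the device's swap cache page count */
};

static struct mock_swap_info *mock_devices[MOCK_MAX_SWAPFILES];
static int mock_warnings;

/* Models swp_swap_info(): silent lookup, NULL for a slot never set up. */
static struct mock_swap_info *mock_swp_swap_info(int type)
{
	if (type < 0 || type >= MOCK_MAX_SWAPFILES)
		return NULL;
	return mock_devices[type];
}

/* Models get_swap_device(): complains when the slot does not exist. */
static struct mock_swap_info *mock_get_swap_device(int type)
{
	struct mock_swap_info *si = mock_swp_swap_info(type);

	if (!si) {
		mock_warnings++;
		fprintf(stderr, "mock_get_swap_device: bad swap type %d\n", type);
		return NULL;
	}
	if (!si->valid)
		return NULL; /* present but not usable: skipped without noise */
	return si;
}

/* Models the loop in total_swapcache_pages(), with or without the filter. */
static unsigned long mock_total_pages(int filter_first)
{
	unsigned long total = 0;

	for (int i = 0; i < MOCK_MAX_SWAPFILES; i++) {
		struct mock_swap_info *si;

		/* The fix: skip empty slots before the noisy call. */
		if (filter_first && !mock_swp_swap_info(i))
			continue;
		si = mock_get_swap_device(i);
		if (!si)
			continue;
		total += si->nr_cached;
	}
	return total;
}
```

Calling mock_total_pages(0) and mock_total_pages(1) on the same table should
give the same sum; only the number of warnings differs, which is the point
of the patch.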
On Fri 31-05-19 10:41:02, Huang, Ying wrote:
> From: Huang Ying <[email protected]>
>
> Mike reported the following warning messages
>
> get_swap_device: Bad swap file entry 1400000000000001
>
> This is produced by
>
> - total_swapcache_pages()
> - get_swap_device()
>
> Where get_swap_device() is used to check whether the swap device is
> valid and prevent it from being swapoff if so. But get_swap_device()
> may produce warning message as above for some invalid swap devices.
> This is fixed via calling swp_swap_info() before get_swap_device() to
> filter out the swap devices that may cause warning messages.
>
> Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()")
I suspect this is referring to an mmotm patch, right? There doesn't seem
to be any sha like this in Linus' tree AFAICS. If that is the case then
please note that mmotm patches showing up in linux-next do not have a
stable sha1 and therefore you shouldn't reference them in the commit
message. Instead please refer to the specific mmotm patch file so that
Andrew knows it should be folded into it.
Thanks!
--
Michal Hocko
SUSE Labs
On 5/30/19 7:41 PM, Huang, Ying wrote:
> From: Huang Ying <[email protected]>
>
> Mike reported the following warning messages
>
> get_swap_device: Bad swap file entry 1400000000000001
>
> This is produced by
>
> - total_swapcache_pages()
> - get_swap_device()
>
> Where get_swap_device() is used to check whether the swap device is
> valid and prevent it from being swapoff if so. But get_swap_device()
> may produce warning message as above for some invalid swap devices.
> This is fixed via calling swp_swap_info() before get_swap_device() to
> filter out the swap devices that may cause warning messages.
>
> Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()")
> Signed-off-by: "Huang, Ying" <[email protected]>
Thank you, this eliminates the messages for me:
Tested-by: Mike Kravetz <[email protected]>
--
Mike Kravetz
On Fri, May 31, 2019 at 10:00 AM Mike Kravetz <[email protected]> wrote:
>
> On 5/30/19 7:41 PM, Huang, Ying wrote:
> > From: Huang Ying <[email protected]>
> >
> > Mike reported the following warning messages
> >
> > get_swap_device: Bad swap file entry 1400000000000001
> >
> > This is produced by
> >
> > - total_swapcache_pages()
> > - get_swap_device()
> >
> > Where get_swap_device() is used to check whether the swap device is
> > valid and prevent it from being swapoff if so. But get_swap_device()
> > may produce warning message as above for some invalid swap devices.
> > This is fixed via calling swp_swap_info() before get_swap_device() to
> > filter out the swap devices that may cause warning messages.
> >
> > Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()")
> > Signed-off-by: "Huang, Ying" <[email protected]>
>
> Thank you, this eliminates the messages for me:
>
> Tested-by: Mike Kravetz <[email protected]>
>
> --
> Mike Kravetz
Hi,
Did you know about the panic reported here:
https://marc.info/?t=155930773000003&r=1&w=2
"Kernel panic - not syncing: stack-protector: Kernel stack is
corrupted in: write_irq_affinity.isra"
This panic is reported on PowerPC and x86.
In the case of x86, we see a lot of "get_swap_device: Bad swap file entry"
errors before the panic:
...
[ 24.404693] get_swap_device: Bad swap file entry 5800000000000001
[ 24.408702] get_swap_device: Bad swap file entry 5c00000000000001
[ 24.412510] get_swap_device: Bad swap file entry 6000000000000001
[ 24.416519] get_swap_device: Bad swap file entry 6400000000000001
[ 24.420217] get_swap_device: Bad swap file entry 6800000000000001
[ 24.423921] get_swap_device: Bad swap file entry 6c00000000000001
[ 24.427685] get_swap_device: Bad swap file entry 7000000000000001
[ 24.760678] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: write_irq_affinity.isra.7+0xe5/0xf0
[ 24.760975] CPU: 25 PID: 1773 Comm: irqbalance Not tainted 5.2.0-rc2-2fefea438dac #1
[ 24.760975] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
[ 24.760975] Call Trace:
[ 24.760975] dump_stack+0x46/0x5b
[ 24.760975] panic+0xf8/0x2d2
[ 24.760975] ? write_irq_affinity.isra.7+0xe5/0xf0
[ 24.760975] __stack_chk_fail+0x15/0x20
[ 24.760975] write_irq_affinity.isra.7+0xe5/0xf0
[ 24.760975] proc_reg_write+0x40/0x60
[ 24.760975] vfs_write+0xb3/0x1a0
[ 24.760975] ? _cond_resched+0x16/0x40
[ 24.760975] ksys_write+0x5c/0xe0
[ 24.760975] do_syscall_64+0x4f/0x120
[ 24.760975] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.760975] RIP: 0033:0x7f93bcdde187
[ 24.760975] Code: c3 66 90 41 54 55 49 89 d4 53 48 89 f5 89 fb 48 83 ec 10 e8 6b 05 02 00 4c 89 e2 41 89 c0 48 89 ee 89 df b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 a4 05 02 00 48
[ 24.760975] RSP: 002b:00007ffc4600d900 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 24.760975] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f93bcdde187
[ 24.760975] RDX: 0000000000000008 RSI: 00005595ad515540 RDI: 0000000000000006
[ 24.760975] RBP: 00005595ad515540 R08: 0000000000000000 R09: 00005595ab381820
[ 24.760975] R10: 0000000000000008 R11: 0000000000000293 R12: 0000000000000008
[ 24.760975] R13: 0000000000000008 R14: 00007f93bd0b62a0 R15: 00007f93bd0b5760
[ 24.760975] Kernel Offset: 0x3a000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 24.760975] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: write_irq_affinity.isra.7+0xe5/0xf0 ]---
Thanks,
-- Dexuan
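As an aside on reading the log above: the "Bad swap file entry" values step
by 0x0400000000000000 from one line to the next, which is what one would
expect if each value packs a swap type into the bits above bit 58 and an
offset of 1 into the low bits. The sketch below decodes them under that
assumption. The shift of 58 and all sketch_* names are assumptions made for
illustration; the real layout depends on the architecture and kernel
configuration.

```c
#include <stdint.h>

/* Assumption: swap type lives in the bits above bit 58 (64-bit layout). */
#define SKETCH_SWP_TYPE_SHIFT 58
#define SKETCH_SWP_OFFSET_MASK ((UINT64_C(1) << SKETCH_SWP_TYPE_SHIFT) - 1)

/* Pack a (type, offset) pair the way the logged values appear to be built. */
static uint64_t sketch_swp_entry(unsigned int type, uint64_t offset)
{
	return ((uint64_t)type << SKETCH_SWP_TYPE_SHIFT) |
	       (offset & SKETCH_SWP_OFFSET_MASK);
}

/* Extract the swap type (the device index) from a logged value. */
static unsigned int sketch_swp_type(uint64_t val)
{
	return (unsigned int)(val >> SKETCH_SWP_TYPE_SHIFT);
}

/* Extract the offset within the device from a logged value. */
static uint64_t sketch_swp_offset(uint64_t val)
{
	return val & SKETCH_SWP_OFFSET_MASK;
}
```

Under that assumption, 1400000000000001 decodes to type 5, offset 1, and the
x86 lines above run from type 22 (5800000000000001) through type 28
(7000000000000001). That matches a loop calling get_swap_device() on
swp_entry(i, 1) for consecutive values of i.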
On Fri, 2019-05-31 at 11:27 -0700, Dexuan-Linux Cui wrote:
> Hi,
> Did you know about the panic reported here:
> https://marc.info/?t=155930773000003&r=1&w=2
>
> "Kernel panic - not syncing: stack-protector: Kernel stack is
> corrupted in: write_irq_affinity.isra"
>
> This panic is reported on PowerPC and x86.
>
> In the case of x86, we see a lot of "get_swap_device: Bad swap file entry"
> errors before the panic:
>
> ...
> [ 24.404693] get_swap_device: Bad swap file entry 5800000000000001
> [ 24.408702] get_swap_device: Bad swap file entry 5c00000000000001
> [ 24.412510] get_swap_device: Bad swap file entry 6000000000000001
> [ 24.416519] get_swap_device: Bad swap file entry 6400000000000001
> [ 24.420217] get_swap_device: Bad swap file entry 6800000000000001
> [ 24.423921] get_swap_device: Bad swap file entry 6c00000000000001
> [ 24.427685] get_swap_device: Bad swap file entry 7000000000000001
> [ 24.760678] Kernel panic - not syncing: stack-protector: Kernel
> stack is corrupted in: write_irq_affinity.isra.7+0xe5/0xf0
> [ 24.760975] CPU: 25 PID: 1773 Comm: irqbalance Not tainted
> 5.2.0-rc2-2fefea438dac #1
> [ 24.760975] Hardware name: Microsoft Corporation Virtual
> Machine/Virtual Machine, BIOS 090007 06/02/2017
> [ 24.760975] Call Trace:
> [ 24.760975] dump_stack+0x46/0x5b
> [ 24.760975] panic+0xf8/0x2d2
> [ 24.760975] ? write_irq_affinity.isra.7+0xe5/0xf0
> [ 24.760975] __stack_chk_fail+0x15/0x20
> [ 24.760975] write_irq_affinity.isra.7+0xe5/0xf0
> [ 24.760975] proc_reg_write+0x40/0x60
> [ 24.760975] vfs_write+0xb3/0x1a0
> [ 24.760975] ? _cond_resched+0x16/0x40
> [ 24.760975] ksys_write+0x5c/0xe0
> [ 24.760975] do_syscall_64+0x4f/0x120
> [ 24.760975] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 24.760975] RIP: 0033:0x7f93bcdde187
> [ 24.760975] Code: c3 66 90 41 54 55 49 89 d4 53 48 89 f5 89 fb 48
> 83 ec 10 e8 6b 05 02 00 4c 89 e2 41 89 c0 48 89 ee 89 df b8 01 00 00
> 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 a4 05 02
> 00 48
> [ 24.760975] RSP: 002b:00007ffc4600d900 EFLAGS: 00000293 ORIG_RAX:
> 0000000000000001
> [ 24.760975] RAX: ffffffffffffffda RBX: 0000000000000006 RCX:
> 00007f93bcdde187
> [ 24.760975] RDX: 0000000000000008 RSI: 00005595ad515540 RDI:
> 0000000000000006
> [ 24.760975] RBP: 00005595ad515540 R08: 0000000000000000 R09:
> 00005595ab381820
> [ 24.760975] R10: 0000000000000008 R11: 0000000000000293 R12:
> 0000000000000008
> [ 24.760975] R13: 0000000000000008 R14: 00007f93bd0b62a0 R15:
> 00007f93bd0b5760
> [ 24.760975] Kernel Offset: 0x3a000000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 24.760975] ---[ end Kernel panic - not syncing: stack-protector:
> Kernel stack is corrupted in: write_irq_affinity.isra.7+0xe5/0xf0 ]---
Looks familiar,
https://lore.kernel.org/lkml/[email protected]/
I suppose Andrew might be better off reverting the whole series first, until
Yury comes up with the right fix, so that other people who are testing
linux-next don't need to waste time on the same problem.
Michal Hocko <[email protected]> writes:
> On Fri 31-05-19 10:41:02, Huang, Ying wrote:
>> From: Huang Ying <[email protected]>
>>
>> Mike reported the following warning messages
>>
>> get_swap_device: Bad swap file entry 1400000000000001
>>
>> This is produced by
>>
>> - total_swapcache_pages()
>> - get_swap_device()
>>
>> Where get_swap_device() is used to check whether the swap device is
>> valid and prevent it from being swapoff if so. But get_swap_device()
>> may produce warning message as above for some invalid swap devices.
>> This is fixed via calling swp_swap_info() before get_swap_device() to
>> filter out the swap devices that may cause warning messages.
>>
>> Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()")
>
> I suspect this is referring to a mmotm patch right?
Yes.
> There doesn't seem
> to be any sha like this in Linus' tree AFAICS. If that is the case then
> please note that mmotm patch showing up in linux-next do not have a
> stable sha1 and therefore you shouldn't reference them in the commit
> message. Instead please refer to the specific mmotm patch file so that
> Andrew knows it should be folded in to it.
Thanks for the reminder! I will be more careful in the future. It seems
that Andrew has identified the right patch to fold this into.
Best Regards,
Huang, Ying
(Resend as LKML didn't take outlook settings.)
> On Fri, 2019-05-31 at 11:27 -0700, Dexuan-Linux Cui wrote:
> > Hi,
> > Did you know about the panic reported here:
> > https://marc.info/?t=155930773000003&r=1&w=2
> >
> > "Kernel panic - not syncing: stack-protector: Kernel stack is
> > corrupted in: write_irq_affinity.isra"
> >
> > This panic is reported on PowerPC and x86.
> >
> > In the case of x86, we see a lot of "get_swap_device: Bad swap file entry"
> > errors before the panic:
> >
> > ...
> > [ 24.404693] get_swap_device: Bad swap file entry 5800000000000001
> > [ 24.408702] get_swap_device: Bad swap file entry 5c00000000000001
> > [ 24.412510] get_swap_device: Bad swap file entry 6000000000000001
> > [ 24.416519] get_swap_device: Bad swap file entry 6400000000000001
> > [ 24.420217] get_swap_device: Bad swap file entry 6800000000000001
> > [ 24.423921] get_swap_device: Bad swap file entry 6c00000000000001
[..]
I don't have a panic, but I observe many lines like this.
> Looks familiar,
>
> https://lore.kernel.org/lkml/[email protected]/
>
> I suppose Andrew might be better of reverting the whole series first before Yury
> came up with a right fix, so that other people who is testing linux-next don't
> need to waste time for the same problem.
I didn't observe any problems with this series on top of 5.1, and there's a
fix for swap that eliminates the problem on top of current next for me:
https://lkml.org/lkml/2019/5/30/1630
Could you please test my series with the patch above?
Thanks,
Yury