2022-09-25 12:03:25

by syzbot

[permalink] [raw]
Subject: [syzbot] WARNING in __change_page_attr_set_clr

Hello,

syzbot found the following issue on:

HEAD commit: 483fed3b5dc8 Add linux-next specific files for 20220921
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=13450b0f080000
kernel config: https://syzkaller.appspot.com/x/.config?x=849cb9f70f15b1ba
dashboard link: https://syzkaller.appspot.com/bug?extid=cdcd5043ce8155d92ab1
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15e2a1b0880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=154e7d08880000

Downloadable assets:
disk image: https://storage.googleapis.com/1cb3f4618323/disk-483fed3b.raw.xz
vmlinux: https://storage.googleapis.com/cc02cb30b495/vmlinux-483fed3b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

------------[ cut here ]------------
CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: 0xffffffffa0401000 - 0xffffffffa0401fff PFN 7d8d5
WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 verify_rwx arch/x86/mm/pat/set_memory.c:600 [inline]
WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr arch/x86/mm/pat/set_memory.c:1569 [inline]
WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr_set_clr+0x1f40/0x2020 arch/x86/mm/pat/set_memory.c:1691
Modules linked in:
CPU: 0 PID: 3607 Comm: syz-executor178 Not tainted 6.0.0-rc6-next-20220921-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/16/2022
RIP: 0010:verify_rwx arch/x86/mm/pat/set_memory.c:600 [inline]
RIP: 0010:__change_page_attr arch/x86/mm/pat/set_memory.c:1569 [inline]
RIP: 0010:__change_page_attr_set_clr+0x1f40/0x2020 arch/x86/mm/pat/set_memory.c:1691
Code: 8b 44 24 50 4d 89 f1 4c 89 e2 4c 89 ee 48 c7 c7 80 0c ea 89 c6 05 1f 3b 94 0c 01 4c 8d 80 ff 0f 00 00 48 89 c1 e8 fd 62 10 08 <0f> 0b e9 8a fc ff ff e8 f4 a1 91 00 e9 14 f8 ff ff 48 8b 7c 24 08
RSP: 0018:ffffc90003c9ebf8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 800000007d8d5163 RCX: 0000000000000000
RDX: ffff8880217c57c0 RSI: ffffffff81620348 RDI: fffff52000793d71
RBP: 0000000000000001 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 7566657220415043 R12: 0000000000000163
R13: 8000000000000163 R14: 000000000007d8d5 R15: 0000000000000000
FS: 0000555556be0300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000045b630 CR3: 0000000073ec9000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
change_page_attr_set_clr+0x333/0x500 arch/x86/mm/pat/set_memory.c:1784
change_page_attr_clear arch/x86/mm/pat/set_memory.c:1821 [inline]
set_memory_x+0xb2/0x110 arch/x86/mm/pat/set_memory.c:1999
bpf_jit_alloc_exec_page+0x69/0x80 kernel/bpf/trampoline.c:131
bpf_dispatcher_change_prog+0x303/0x8f0 kernel/bpf/dispatcher.c:143
dev_xdp_install+0x198/0x2b0 net/core/dev.c:9134
dev_xdp_attach+0xa30/0x12a0 net/core/dev.c:9274
dev_change_xdp_fd+0x246/0x300 net/core/dev.c:9520
do_setlink+0x31e3/0x3bb0 net/core/rtnetlink.c:3002
rtnl_group_changelink net/core/rtnetlink.c:3303 [inline]
__rtnl_newlink+0xb96/0x17e0 net/core/rtnetlink.c:3557
rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3594
rtnetlink_rcv_msg+0x43a/0xca0 net/core/rtnetlink.c:6091
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2540
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x712/0x8c0 net/socket.c:2482
___sys_sendmsg+0x110/0x1b0 net/socket.c:2536
__sys_sendmsg+0xf3/0x1c0 net/socket.c:2565
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe786b1ce59
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffecfcfcf28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe786b1ce59
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 00007fe786ae1000 R08: 0000000000000008 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 00007fe786ae1090
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2022-09-25 17:11:51

by Dave Hansen

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr

On 9/25/22 04:18, syzbot wrote:
> ------------[ cut here ]------------
> CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: 0xffffffffa0401000 - 0xffffffffa0401fff PFN 7d8d5
> WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 verify_rwx arch/x86/mm/pat/set_memory.c:600 [inline]
> WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr arch/x86/mm/pat/set_memory.c:1569 [inline]
> WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr_set_clr+0x1f40/0x2020 arch/x86/mm/pat/set_memory.c:1691
> Modules linked in:

Yay, one of these that isn't due to wonky 32-bit kernels!

This one looks to be naughty intentionally:

> void *bpf_jit_alloc_exec_page(void)
> {
...
> /* Keep image as writeable. The alternative is to keep flipping ro/rw
> * every time new program is attached or detached.
> */
> set_memory_x((long)image, 1);
> return image;
> }

For STRICT_KERNEL_RWX kernels, I think we would really rather that this
code *did* flip ro/rw every time a new BPF program is attached or detached.

2022-09-25 22:11:34

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr

On Sun, Sep 25, 2022 at 9:44 AM Dave Hansen <[email protected]> wrote:
>
> On 9/25/22 04:18, syzbot wrote:
> > ------------[ cut here ]------------
> > CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: 0xffffffffa0401000 - 0xffffffffa0401fff PFN 7d8d5
> > WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 verify_rwx arch/x86/mm/pat/set_memory.c:600 [inline]
> > WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr arch/x86/mm/pat/set_memory.c:1569 [inline]
> > WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr_set_clr+0x1f40/0x2020 arch/x86/mm/pat/set_memory.c:1691
> > Modules linked in:
>
> Yay, one of these that isn't due to wonky 32-bit kernels!
>
> This one looks to be naughty intentionally:
>
> > void *bpf_jit_alloc_exec_page(void)
> > {
> ...
> > /* Keep image as writeable. The alternative is to keep flipping ro/rw
> > * every time new program is attached or detached.
> > */
> > set_memory_x((long)image, 1);
> > return image;
> > }
>
> For STRICT_KERNEL_RWX kernels, I think we would really rather that this
> code *did* flip ro/rw every time a new BPF program is attached or detached.

Steven Rostedt noticed that comment around the middle of August
and told you and Peter about it.
Then Peter added a WARN_ONCE in commit
https://lore.kernel.org/all/[email protected]/
to explicitly trigger that known issue.
Sure enough the fedora fails to boot on linux-next since then,
because systemd is loading bpf programs that use bpf trampoline.
The boot issue was was reported 3 days ago:
https://lore.kernel.org/bpf/[email protected]/T/#u
Now we're trying to urgently address it with:
https://lore.kernel.org/bpf/[email protected]/

So instead of pinging us with your w^x concern you've decided
to fail hard in -next to force the issue and
now acting like this is something surprising to you?!

This is Code of Conduct "worthy" behavior demonstrated
by a newly elected member of the Technical Advisory Board.
Please consider resigning.
A TAB member should be better than this.

As far as this w^x issue...
we've communicated back in May 2022 (sorry I cannot find the link
to that discussion) that bpf_prog_pack is targeting to be used
by everything bpf. Currently by bpf progs only.
bpf trampoline and bpf dispatcher were next on the list.
But then folks expressed the desire to generalize bpf_prog_pack for
everything: bpf, modules and all other trampolines.
We've posted multiple revisions and kept pinging for feedback.
The last one on Aug 24:
https://lkml.org/lkml/2022/8/24/857
If/when the generic vmalloc_exec lands we can finally close
the issue for modules, bpf progs, and trampolines of all kinds.
Make them fast by using large pages and w^x compliant.

And, sorry, "flip ro/rw every time" is not a good idea from
security pov.
There is a much better solution that stalled on the code review.
In the meantime we'll land a quick fix to re-enable boot in -next
in the coming days.

2022-09-25 23:28:46

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr

On Sun, Sep 25, 2022 at 02:55:46PM -0700, Alexei Starovoitov wrote:
>
> So instead of pinging us with your w^x concern you've decided
> to fail hard in -next to force the issue and
> now acting like this is something surprising to you?!
>
> This is Code of Conduct "worthy" behavior demonstrated
> by a newly elected member of the Technical Advisory Board.

I must be missing something. Why/what do you think this is
specifically a Code of Conduct violation?

- Ted

2022-09-25 23:29:37

by Randy Dunlap

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr



On 9/25/22 14:55, Alexei Starovoitov wrote:
> On Sun, Sep 25, 2022 at 9:44 AM Dave Hansen <[email protected]> wrote:
>>
>> On 9/25/22 04:18, syzbot wrote:
>>> ------------[ cut here ]------------
>>> CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: 0xffffffffa0401000 - 0xffffffffa0401fff PFN 7d8d5
>>> WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 verify_rwx arch/x86/mm/pat/set_memory.c:600 [inline]
>>> WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr arch/x86/mm/pat/set_memory.c:1569 [inline]
>>> WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr_set_clr+0x1f40/0x2020 arch/x86/mm/pat/set_memory.c:1691
>>> Modules linked in:
>>
>> Yay, one of these that isn't due to wonky 32-bit kernels!
>>
>> This one looks to be naughty intentionally:
>>
>>> void *bpf_jit_alloc_exec_page(void)
>>> {
>> ...
>>> /* Keep image as writeable. The alternative is to keep flipping ro/rw
>>> * every time new program is attached or detached.
>>> */
>>> set_memory_x((long)image, 1);
>>> return image;
>>> }
>>
>> For STRICT_KERNEL_RWX kernels, I think we would really rather that this
>> code *did* flip ro/rw every time a new BPF program is attached or detached.
>
> Steven Rostedt noticed that comment around the middle of August
> and told you and Peter about it.
> Then Peter added a WARN_ONCE in commit
> https://lore.kernel.org/all/[email protected]/
> to explicitly trigger that known issue.
> Sure enough the fedora fails to boot on linux-next since then,
> because systemd is loading bpf programs that use bpf trampoline.
> The boot issue was was reported 3 days ago:
> https://lore.kernel.org/bpf/[email protected]/T/#u
> Now we're trying to urgently address it with:
> https://lore.kernel.org/bpf/[email protected]/
>
> So instead of pinging us with your w^x concern you've decided
> to fail hard in -next to force the issue and
> now acting like this is something surprising to you?!
>
> This is Code of Conduct "worthy" behavior demonstrated
> by a newly elected member of the Technical Advisory Board.
> Please consider resigning.
> A TAB member should be better than this.

If it is (and I don't see it), just file a complaint.
Don't try to be the enforcer.

--
~Randy

2022-09-26 00:35:00

by Dave Hansen

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr

On 9/25/22 14:55, Alexei Starovoitov wrote:
> And, sorry, "flip ro/rw every time" is not a good idea from
> security pov.
> There is a much better solution that stalled on the code review.
> In the meantime we'll land a quick fix to re-enable boot in -next
> in the coming days.

Peter, I remember an earlier version of your patch having some various
enforcement modes. Since the strict enforcement has actually broken a
few things, should we resurrect the nicer soft detection mode? Or,
maybe make the soft one the only mode for now?

Alexei, the "quick fix" looks sane to me at first glance. Is there
something in there that's not viable long-term?

Also, the intention here was not to force any issues. I thought the
earlier discussion resulted in a bpf fix and applying Peter's patch was
intended to catch _future_ issues. I should have double-checked before
applying it. My apologies.

2022-09-26 09:17:59

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr

On Sun, Sep 25, 2022 at 02:55:46PM -0700, Alexei Starovoitov wrote:

> Steven Rostedt noticed that comment around the middle of August
> and told you and Peter about it.

He did indeed; and I was thinking he'd told you about it too so you
could fix, what is a very juicy security issue, ASAP.

> Then Peter added a WARN_ONCE in commit
> https://lore.kernel.org/all/[email protected]/
> to explicitly trigger that known issue.

Well, I had sincerely hoped you'd fixed it by now. You just don't let
things like that slide. Note how I didn't post that mostly trivial patch
in mid August. Giving you ample time to fix up.

> Now we're trying to urgently address it with:
> https://lore.kernel.org/bpf/[email protected]/

Glad to see it being fixed. Thanks!

2022-09-26 09:19:51

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [syzbot] WARNING in __change_page_attr_set_clr

On Sun, Sep 25, 2022 at 05:10:19PM -0700, Dave Hansen wrote:

> Peter, I remember an earlier version of your patch having some various
> enforcement modes. Since the strict enforcement has actually broken a
> few things, should we resurrect the nicer soft detection mode? Or,
> maybe make the soft one the only mode for now?

Well, I think we'll have to disable the whole thing for i386, but I'm
sincerely hoping this is the only one we'll hit on x86_64 -- I did spend
some effort fixing W^X issues a while back.

2022-09-26 15:45:05

by Steven Rostedt

[permalink] [raw]
Subject: Re: [Tech-board] [syzbot] WARNING in __change_page_attr_set_clr

On Sun, 25 Sep 2022 14:55:46 -0700
Alexei Starovoitov <[email protected]> wrote:

> On Sun, Sep 25, 2022 at 9:44 AM Dave Hansen <[email protected]> wrote:
> >
> > On 9/25/22 04:18, syzbot wrote:
> > > ------------[ cut here ]------------
> > > CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: 0xffffffffa0401000 - 0xffffffffa0401fff PFN 7d8d5
> > > WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 verify_rwx arch/x86/mm/pat/set_memory.c:600 [inline]
> > > WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr arch/x86/mm/pat/set_memory.c:1569 [inline]
> > > WARNING: CPU: 0 PID: 3607 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr_set_clr+0x1f40/0x2020 arch/x86/mm/pat/set_memory.c:1691
> > > Modules linked in:
> >
> > Yay, one of these that isn't due to wonky 32-bit kernels!
> >
> > This one looks to be naughty intentionally:
> >
> > > void *bpf_jit_alloc_exec_page(void)
> > > {
> > ...
> > > /* Keep image as writeable. The alternative is to keep flipping ro/rw
> > > * every time new program is attached or detached.
> > > */
> > > set_memory_x((long)image, 1);
> > > return image;
> > > }
> >
> > For STRICT_KERNEL_RWX kernels, I think we would really rather that this
> > code *did* flip ro/rw every time a new BPF program is attached or detached.
>
> Steven Rostedt noticed that comment around the middle of August
> and told you and Peter about it.
> Then Peter added a WARN_ONCE in commit
> https://lore.kernel.org/all/[email protected]/
> to explicitly trigger that known issue.
> Sure enough the fedora fails to boot on linux-next since then,
> because systemd is loading bpf programs that use bpf trampoline.
> The boot issue was was reported 3 days ago:
> https://lore.kernel.org/bpf/[email protected]/T/#u
> Now we're trying to urgently address it with:
> https://lore.kernel.org/bpf/[email protected]/
>
> So instead of pinging us with your w^x concern you've decided
> to fail hard in -next to force the issue and
> now acting like this is something surprising to you?!
>
> This is Code of Conduct "worthy" behavior demonstrated

Here's the link to the Code of Conduct:

https://www.kernel.org/doc/html/latest/process/code-of-conduct.html

Which states:

Examples of behavior that contributes to creating a positive environment include:

- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
- Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

- The use of sexualized language or imagery and unwelcome sexual attention or advances
- Trolling, insulting/derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others’ private information, such as a physical or electronic address, without explicit permission
- Other conduct which could reasonably be considered inappropriate in a professional setting

I do not see how Dave's response is a violation of any of the above.


> by a newly elected member of the Technical Advisory Board.
> Please consider resigning.

Asking someone to resign is a personal attack, and that can be construed as
a violation of the Code of Conduct.

> A TAB member should be better than this.
>

Let's please keep this on a technical level, as there appears to be a fix
we all can agree on, and let's move forward on that.

Thanks,

-- Steve