Subject: af_packet: use after free in prb_retire_rx_blk_timer_expired

Hi all,

I seem to be hitting this use-after-free on a -next kernel using trinity:

[ 531.036054] BUG: KASAN: use-after-free in prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:688) [ 531.036961] Read of size 8 at addr ffff88038c1fb0e8 by task swapper/1/0 [ 531.037727] [ 531.037928] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc5-next-20170407-dirty #24
[ 531.038862] Call Trace:
[ 531.039163] <IRQ>
[ 531.039447] dump_stack (lib/dump_stack.c:54)
[ 531.041612] print_address_description (mm/kasan/report.c:253)
[ 531.042809] kasan_report (mm/kasan/report.c:352 mm/kasan/report.c:408)
[ 531.043263] __asan_report_load8_noabort (mm/kasan/report.c:429)
[ 531.043829] prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:688)
[ 531.048298] call_timer_fn.isra.15 (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1246)
[ 531.048805] __run_timers (./include/linux/spinlock.h:324 kernel/time/timer.c:1308 kernel/time/timer.c:1601)
[ 531.055404] run_timer_softirq (kernel/time/timer.c:1614)
[ 531.055883] __do_softirq (./arch/x86/include/asm/preempt.h:22 kernel/softirq.c:286)
[ 531.057507] irq_exit (kernel/softirq.c:364 kernel/softirq.c:405)
[ 531.057893] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:965)
[ 531.058446] apic_timer_interrupt (arch/x86/entry/entry_64.S:704)
[ 531.058951] RIP: 0010:native_safe_halt (??:?)
[ 531.059718] RSP: 0018:ffff88039aa8fe88 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[ 531.060593] RAX: 0000000000080000 RBX: ffff88039aa68fc0 RCX: 0000000000000000
[ 531.061411] RDX: 1ffff1007354d1f8 RSI: 0000000000000000 RDI: 0000000000000000
[ 531.062203] RBP: ffff88039aa8fe88 R08: ffff880376251bc0 R09: 0000000000000001
[ 531.063001] R10: ffff88038e0f7838 R11: 0000000000000001 R12: ffff88039aa68fc0
[ 531.064007] R13: ffffffff83df0028 R14: 0000000000000000 R15: ffff88039aa68fc0
[ 531.064811] </IRQ>
[ 531.065886] default_idle (./arch/x86/include/asm/paravirt.h:98 arch/x86/kernel/process.c:341)
[ 531.066284] arch_cpu_idle (arch/x86/kernel/process.c:333)
[ 531.066692] default_idle_call (kernel/sched/idle.c:101)
[ 531.067151] do_idle (kernel/sched/idle.c:156 kernel/sched/idle.c:245)
[ 531.067537] cpu_startup_entry (kernel/sched/idle.c:350 (discriminator 1))
[ 531.067992] start_secondary (arch/x86/kernel/smpboot.c:276)
[ 531.068444] secondary_startup_64 (arch/x86/kernel/verify_cpu.S:37)
[ 531.068924] [ 531.069109] Allocated by task 18982: [ 531.069522] save_stack_trace (arch/x86/kernel/stacktrace.c:60) [ 531.069965] save_stack (mm/kasan/kasan.c:493 mm/kasan/kasan.c:514)
[ 531.070347] kasan_kmalloc (mm/kasan/kasan.c:525 mm/kasan/kasan.c:617)
[ 531.070757] __kmalloc (mm/slub.c:3747)
[ 531.071153] packet_set_ring (net/packet/af_packet.c:4130 net/packet/af_packet.c:4218)
[ 531.072024] packet_setsockopt (net/packet/af_packet.c:3617)
[ 531.072525] SyS_setsockopt (net/socket.c:1797 net/socket.c:1777)
[ 531.072968] do_syscall_64 (arch/x86/entry/common.c:284)
[ 531.073405] return_from_SYSCALL_64 (arch/x86/entry/entry_64.S:249)
[ 531.073893]
[ 531.074076] Freed by task 7019:
[ 531.074443] save_stack_trace (arch/x86/kernel/stacktrace.c:60)
[ 531.074882] save_stack (mm/kasan/kasan.c:493 mm/kasan/kasan.c:514)
[ 531.075275] kasan_slab_free (mm/kasan/kasan.c:525 mm/kasan/kasan.c:590)
[ 531.075705] kfree (mm/slub.c:2966 mm/slub.c:3882)
[ 531.076052] free_pg_vec (net/packet/af_packet.c:4096)
[ 531.076448] packet_set_ring (net/packet/af_packet.c:4298)
[ 531.076922] packet_setsockopt (net/packet/af_packet.c:3617)
[ 531.077406] SyS_setsockopt (net/socket.c:1797 net/socket.c:1777)
[ 531.077848] do_syscall_64 (arch/x86/entry/common.c:284)
[ 531.078285] return_from_SYSCALL_64 (arch/x86/entry/entry_64.S:249)
[ 531.078773]
[ 531.078956] The buggy address belongs to the object at ffff88038c1fb0e8
[ 531.078956] which belongs to the cache kmalloc-8 of size 8
[ 531.080341] The buggy address is located 0 bytes inside of
[ 531.080341] 8-byte region [ffff88038c1fb0e8, ffff88038c1fb0f0)
[ 531.081600] The buggy address belongs to the page:
[ 531.082150] page:ffffea000e307e80 count:1 mapcount:0 mapping: (null) index:0xffff88038c1fbd90 compound_mapcount: 0
[ 531.083613] flags: 0x2fffc0000008100(slab|head)
[ 531.084139] raw: 02fffc0000008100 0000000000000000 ffff88038c1fbd90 0000000100160015
[ 531.085010] raw: ffffea000e417ea0 ffffea000e421520 ffff88039c4103c0 0000000000000000
[ 531.085875] page dumped because: kasan: bad access detected
[ 531.086504]
[ 531.086686] Memory state around the buggy address:
[ 531.087242] ffff88038c1faf80: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 531.088054] ffff88038c1fb000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 531.088873] >ffff88038c1fb080: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fc fc
[ 531.089679] ^
[ 531.090425] ffff88038c1fb100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 531.091433] ffff88038c1fb180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 531.092240] ==================================================================
[ 531.093054] Disabling lock debugging due to kernel taint
[ 533.819741] ODEBUG: free active (active state 0) object type: timer_list hint: prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:679)
[ 533.822564] ------------[ cut here ]------------
[ 533.823119] WARNING: CPU: 7 PID: 1226 at lib/debugobjects.c:289 debug_print_object (lib/debugobjects.c:286)
[ 533.824111] Modules linked in:
[ 533.824471] CPU: 7 PID: 1226 Comm: trinity-main Tainted: G B 4.11.0-rc5-next-20170407-dirty #24
[ 533.825558] task: ffff880395cedd40 task.stack: ffff880395e90000
[ 533.826235] RIP: 0010:debug_print_object (??:?)
[ 533.826788] RSP: 0018:ffff880395e974d0 EFLAGS: 00010082
[ 533.827375] RAX: 000000000000006c RBX: 0000000000000003 RCX: 0000000000000000
[ 533.828171] RDX: 000000000000006c RSI: 1ffff10072bd2e39 RDI: ffffed0072bd2e90
[ 533.828963] RBP: ffff880395e974f8 R08: 203a47554245444f R09: 65657266203a4755
[ 533.829779] R10: ffffed0072bd2ec9 R11: 0000000000001638 R12: ffffffff83459660
[ 533.830576] R13: ffffffff82fd2b20 R14: 0000000000000000 R15: dffffc0000000000
[ 533.831395] FS: 00007fec989f4700(0000) GS:ffff88039cbc0000(0000) knlGS:0000000000000000
[ 533.832296] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 533.832941] CR2: 0000000000000008 CR3: 0000000395ea2000 CR4: 00000000000406a0
[ 533.833736] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 533.834523] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 533.835351] Call Trace:
[ 533.835642] debug_check_no_obj_freed (lib/debugobjects.c:744 lib/debugobjects.c:772)
[ 533.840679] kfree (mm/slub.c:1357 mm/slub.c:1379 mm/slub.c:2961 mm/slub.c:3882)
[ 533.841025] __sk_destruct (net/core/sock.c:1458 net/core/sock.c:1536)
[ 533.845132] sk_destruct (net/core/sock.c:1545)
[ 533.845527] __sk_free (net/core/sock.c:1553)
[ 533.845919] sk_free (net/core/sock.c:1564)
[ 533.846274] packet_release (net/packet/af_packet.c:2941)
[ 533.850968] sock_release (net/socket.c:598)
[ 533.851813] sock_close (net/socket.c:1074)
[ 533.852195] __fput (fs/file_table.c:210)
[ 533.853779] ____fput (fs/file_table.c:246)
[ 533.854143] task_work_run (kernel/task_work.c:118 (discriminator 1))
[ 533.855516] exit_to_usermode_loop (./include/linux/tracehook.h:193 arch/x86/entry/common.c:161)
[ 533.856803] do_syscall_64 (./arch/x86/include/asm/current.h:14 arch/x86/entry/common.c:208 arch/x86/entry/common.c:263 arch/x86/entry/common.c:289)
[ 533.860762] entry_SYSCALL64_slow_path (arch/x86/entry/entry_64.S:249)
[ 533.861294] RIP: 0033:0x7fec982f9d10
[ 533.861703] RSP: 002b:00007ffffc92d5a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 533.862536] RAX: 0000000000000000 RBX: 0000000002cb2cf0 RCX: 00007fec982f9d10
[ 533.863349] RDX: 000000000000000d RSI: 0000000000000002 RDI: 0000000000000179
[ 533.864149] RBP: 0000000000000179 R08: 0000000000000008 R09: 00007fec989f4700
[ 533.864930] R10: 00007ffffc92d5b0 R11: 0000000000000246 R12: 0000000000000000
[ 533.865729] R13: 00007fec989ef1a0 R14: 0000000000000000 R15: 0000000000000000 [ 533.866521] Code: 0d 48 89 75 d8 e8 20 01 8b ff 48 8b 75 d8 48 8b 14 dd 40 8f 51 83 4d 89 e9 4d 89 e0 44 89 f1 48 c7 c7 e0 85 51 83 e8 d3 29 75 ff <0f> ff 83 05 2a 1e 16 02 01 48 83 c4 08 5b 41 5c 41 5d 41 5e 5d
All code
========
0: 0d 48 89 75 d8 or $0xd8758948,%eax
5: e8 20 01 8b ff callq 0xffffffffff8b012a
a: 48 8b 75 d8 mov -0x28(%rbp),%rsi
e: 48 8b 14 dd 40 8f 51 mov -0x7cae70c0(,%rbx,8),%rdx
15: 83
16: 4d 89 e9 mov %r13,%r9
19: 4d 89 e0 mov %r12,%r8
1c: 44 89 f1 mov %r14d,%ecx
1f: 48 c7 c7 e0 85 51 83 mov $0xffffffff835185e0,%rdi
26: e8 d3 29 75 ff callq 0xffffffffff7529fe
2b:* 0f ff (bad) <-- trapping instruction
2d: 83 05 2a 1e 16 02 01 addl $0x1,0x2161e2a(%rip) # 0x2161e5e
34: 48 83 c4 08 add $0x8,%rsp
38: 5b pop %rbx
39: 41 5c pop %r12
3b: 41 5d pop %r13
3d: 41 5e pop %r14
3f: 5d pop %rbp
...

Code starting with the faulting instruction
===========================================
0: 0f ff (bad)
2: 83 05 2a 1e 16 02 01 addl $0x1,0x2161e2a(%rip) # 0x2161e33
9: 48 83 c4 08 add $0x8,%rsp
d: 5b pop %rbx
e: 41 5c pop %r12
10: 41 5d pop %r13
12: 41 5e pop %r14
14: 5d pop %rbp
...
[ 533.868922] ---[ end trace eb76f4e0fb42fae2 ]---
--

Thanks,
Sasha


2017-04-10 19:23:16

by Dave Jones

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

On Mon, Apr 10, 2017 at 07:03:30PM +0000, [email protected] wrote:
> Hi all,
>
> I seem to be hitting this use-after-free on a -next kernel using trinity:
>
> [ 531.036054] BUG: KASAN: use-after-free in prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:688) [ 531.036961] Read of size 8 at addr ffff88038c1fb0e8 by task swapper/1/0 [ 531.037727] [ 531.037928] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc5-next-20170407-dirty #24

Funny, I was just going over my old pending bugs, and found this one
from January that looks like what happens with the same bug, but without kasan..

context: PID: 0 TASK: ffff881ff2fa5100 CPU: 5 COMMAND: "swapper/5"
panic: general protection fault: 0000 [#1]
netversion: 2.2-1 (Feb 2014)
Backtrace:
#0 [ffff881fffaa3c00] machine_kexec at ffffffff81044af8
#1 [ffff881fffaa3c60] __crash_kexec at ffffffff810ec755
#2 [ffff881fffaa3d28] crash_kexec at ffffffff810ec81f
#3 [ffff881fffaa3d40] oops_end at ffffffff8101e348
#4 [ffff881fffaa3d68] die at ffffffff8101e76b
#5 [ffff881fffaa3d98] do_general_protection at ffffffff8101be76
#6 [ffff881fffaa3dc0] general_protection at ffffffff817fe5a2
[exception RIP: prb_retire_rx_blk_timer_expired+65]
RIP: ffffffff817e6e41 RSP: ffff881fffaa3e78 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff881fd7075800 RCX: 0000000000000000
RDX: ffff883ff0a16bb0 RSI: 0074636361757063 RDI: ffff881fd70758bc
RBP: ffff881fffaa3e88 R8: 0000000000000001 R9: 0000000000000005
R10: 0000000000000000 R11: 0000000000000000 R12: ffff881fd7075b78
R13: 0000000000000100 R14: ffffffff817e6e00 R15: ffff881fd7075800
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff881fffaa3e90] call_timer_fn at ffffffff810cec35
#8 [ffff881fffaa3ec8] run_timer_softirq at ffffffff810cf01c
#9 [ffff881fffaa3f28] __softirqentry_text_start at ffffffff817ff05c
#10 [ffff881fffaa3f88] irq_exit at ffffffff8107d5fc
#11 [ffff881fffaa3f98] smp_apic_timer_interrupt at ffffffff817feea2
#12 [ffff881fffaa3fb0] apic_timer_interrupt at ffffffff817fd56f
--- <IRQ stack> ---
#13 [ffff881ff2fbfdd0] apic_timer_interrupt at ffffffff817fd56f
RIP: 0000000000000018 RSP: 0000000000000000 RFLAGS: ffffffff81ebbb60
RAX: ffffe8e0002a0400 RBX: 00000067b502e95f RCX: 0000000000000006
RDX: 000000000000002e RSI: 0000000000000034 RDI: 0000000000000001
RBP: ffffffff81150540 R8: ffff881ff2fbfee0 R9: 0000000000000001
R10: 0000000000000005 R11: ffffffff81ebbb60 R12: ffff881ff2fbfe48
R13: ffff881ff2fa5110 R14: 0000000000000000 R15: ffff881ff2fa5100
ORIG_RAX: ffff881fffab5340 CS: 20c49ba5e353f7cf SS: ffffffffffffff10
WARNING: possibly bogus exception frame
Dmesg:
Code: 00 00 48 8b 93 10 03 00 00 80 bb 21 03 00 00 00 44 0f b6 83 20 03 00 00 0f b7 c8 48 8b 34 ca 75 57 <44> 8b 5e 0c 45 85 db 74 1d 8b 93 68 03 00 00 85 d2 74 13 f3 90

RIP
[<ffffffff817e6e41>] prb_retire_rx_blk_timer_expired+0x41/0x120
RSP <ffff881fffaa3e78>
------------[ cut here ]------------

2017-04-11 23:23:23

by Willem de Bruijn

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

On Mon, Apr 10, 2017 at 3:23 PM, Dave Jones <[email protected]> wrote:
> On Mon, Apr 10, 2017 at 07:03:30PM +0000, [email protected] wrote:
> > Hi all,
> >
> > I seem to be hitting this use-after-free on a -next kernel using trinity:
> >
> > [ 531.036054] BUG: KASAN: use-after-free in prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:688)

The retire_blk_timer is called after the pg_vec struct for this ring
was freed. This should not happen. packet_set_ring stops the timer
with del_timer_sync when tearing down the ring before freeing that
struct:

if (closing && (po->tp_version > TPACKET_V2)) {
/* Because we don't support block-based V3 on tx-ring */
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, rb_queue);
}

if (pg_vec)
free_pg_vec(pg_vec, order, req->tp_block_nr);

This is a similar race to the use-after-free fixed by 84ac7260236a
("packet: fix race condition in packet_set_ring"). The previous race
was triggered by a call to setsockopt PACKET_VERSION changing
tp_version while the ring is active. It is not immediately obvious
what is the cause now. I suppose trinity does not give a trace of such
system calls on this file descriptor? That would be helpful.

The bug report shows both a timer firing after the packet_set_ring
call that freed the pg_vec, and later a CONFIG_DEBUG_OBJECTS_FREE
warning that the timer is still active when the socket is closed on
release of the last file descriptor.

2017-07-22 10:05:13

by liujian (CE)

[permalink] [raw]
Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

I also hit this issue with trinity test:

The call trace:
[exception RIP: prb_retire_rx_blk_timer_expired+70]
RIP: ffffffff81633be6 RSP: ffff8801bec03dc0 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8801b49d0948 RCX: 0000000000000000
RDX: ffff8801b31057a0 RSI: a56b6b6b6b6b6b6b RDI: ffff8801b49d09ec
RBP: ffff8801bec03dd8 R8: 0000000000000001 R9: ffffffff83e1bf80
R10: 0000000000000002 R11: 0000000000000005 R12: ffff8801b49d09ec
R13: 0000000000000100 R14: ffffffff81633ba0 R15: ffff8801b49d0948
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff8801bec03de0] call_timer_fn at ffffffff8108cb76
#8 [ffff8801bec03e18] run_timer_softirq at ffffffff8108f87c
#9 [ffff8801bec03e90] __do_softirq at ffffffff8108629f
#10 [ffff8801bec03f00] call_softirq at ffffffff8166a01c
#11 [ffff8801bec03f18] do_softirq at ffffffff810172ad
#12 [ffff8801bec03f30] irq_exit at ffffffff81086655
#13 [ffff8801bec03f48] msa_irq_exit at ffffffff810b1ab3
#14 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aeae
#15 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff816692dd
--- <IRQ stack> ---

And from vmcore, I can see the pointer GET_CURR_PBLOCK_DESC_FROM_CORE(pkc); is a56b6b6b6b6b6b6b


struct packet_ring_buffer rx_ring = {
pg_vec = 0x0,
head = 0x0,
frames_per_block = 0x400,
frame_size = 0x0,
frame_max = 0xffffffff,
pg_vec_order = 0x0,
pg_vec_pages = 0x0,
pg_vec_len = 0x0,
pending_refcnt = 0x0,
prb_bdqc = {
pkbdq = 0xffff8801b31057a0,
feature_req_word = 0x1,
hdrlen = 0x44,
reset_pending_on_curr_blk = 0x1,
delete_blk_timer = 0x0,
kactive_blk_num = 0x0,
blk_sizeof_priv = 0x0,
last_kactive_blk_num = 0x0,
pkblk_start = 0xffff8800a7000000 struct: page excluded: kernel virtual address: ffff8800a7000000 type: "gdb_readmem_callback"
struct: page excluded: kernel virtual address: ffff8800a7000000 type: "gdb_readmem_callback"
<Address 0xffff8800a7000000 out of bounds>,
pkblk_end = 0xffff8800a7200000 "\002",
kblk_size = 0x200000,
max_frame_len = 0x1fffd0,
knum_blocks = 0x1,
knxt_seq_num = 0x2,
prev = 0xffff8800a7000030 struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
<Address 0xffff8800a7000030 out of bounds>,
nxt_offset = 0xffff8800a7000030 struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
<Address 0xffff8800a7000030 out of bounds>,
skb = 0x0,
blk_fill_in_prog = {
counter = 0x0

crash> struct pgv 0xffff8801b31057a0
struct pgv {
buffer = 0xa56b6b6b6b6b6b6b <Address 0xa56b6b6b6b6b6b6b out of bounds>
}


Best Regards,
liujian


> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Willem de Bruijn
> Sent: Wednesday, April 12, 2017 7:23 AM
> To: Dave Jones; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired
>
> On Mon, Apr 10, 2017 at 3:23 PM, Dave Jones <[email protected]>
> wrote:
> > On Mon, Apr 10, 2017 at 07:03:30PM +0000, [email protected]
> wrote:
> > > Hi all,
> > >
> > > I seem to be hitting this use-after-free on a -next kernel using trinity:
> > >
> > > [ 531.036054] BUG: KASAN: use-after-free in
> > prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:688)
>
> The retire_blk_timer is called after the pg_vec struct for this ring was freed.
> This should not happen. packet_set_ring stops the timer with del_timer_sync
> when tearing down the ring before freeing that
> struct:
>
> if (closing && (po->tp_version > TPACKET_V2)) {
> /* Because we don't support block-based V3 on tx-ring */
> if (!tx_ring)
> prb_shutdown_retire_blk_timer(po, rb_queue);
> }
>
> if (pg_vec)
> free_pg_vec(pg_vec, order, req->tp_block_nr);
>
> This is a similar race to the use-after-free fixed by 84ac7260236a
> ("packet: fix race condition in packet_set_ring"). The previous race was
> triggered by a call to setsockopt PACKET_VERSION changing tp_version while
> the ring is active. It is not immediately obvious what is the cause now. I
> suppose trinity does not give a trace of such system calls on this file descriptor?
> That would be helpful.
>
> The bug report shows both a timer firing after the packet_set_ring call that
> freed the pg_vec, and later a CONFIG_DEBUG_OBJECTS_FREE warning that
> the timer is still active when the socket is closed on release of the last file
> descriptor.

2017-07-22 19:02:25

by Cong Wang

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

Hello,

On Sat, Jul 22, 2017 at 2:55 AM, liujian (CE) <[email protected]> wrote:
> I also hit this issue with trinity test:
>
> The call trace:
> [exception RIP: prb_retire_rx_blk_timer_expired+70]
> RIP: ffffffff81633be6 RSP: ffff8801bec03dc0 RFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8801b49d0948 RCX: 0000000000000000
> RDX: ffff8801b31057a0 RSI: a56b6b6b6b6b6b6b RDI: ffff8801b49d09ec
> RBP: ffff8801bec03dd8 R8: 0000000000000001 R9: ffffffff83e1bf80
> R10: 0000000000000002 R11: 0000000000000005 R12: ffff8801b49d09ec
> R13: 0000000000000100 R14: ffffffff81633ba0 R15: ffff8801b49d0948
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ffff8801bec03de0] call_timer_fn at ffffffff8108cb76
> #8 [ffff8801bec03e18] run_timer_softirq at ffffffff8108f87c
> #9 [ffff8801bec03e90] __do_softirq at ffffffff8108629f
> #10 [ffff8801bec03f00] call_softirq at ffffffff8166a01c
> #11 [ffff8801bec03f18] do_softirq at ffffffff810172ad
> #12 [ffff8801bec03f30] irq_exit at ffffffff81086655
> #13 [ffff8801bec03f48] msa_irq_exit at ffffffff810b1ab3
> #14 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aeae
> #15 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff816692dd
> --- <IRQ stack> ---
>
> And from vmcore, I can see the pointer GET_CURR_PBLOCK_DESC_FROM_CORE(pkc); is a56b6b6b6b6b6b6b
>

Does the following quick fix help?


diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 008bb34ee324..09ec1640e5f7 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4264,6 +4264,7 @@ static int packet_set_ring(struct sock *sk,
union tpacket_req_u *req_u,
/* Block transmit is not supported yet */
if (!tx_ring) {
init_prb_bdqc(po, rb, pg_vec, req_u);
+ pg_vec = NULL;
} else {
struct tpacket_req3 *req3 = &req_u->req3;

2017-07-23 03:44:33

by Ding Tianhong

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired



On 2017/7/23 3:02, Cong Wang wrote:
> Hello,
>
> On Sat, Jul 22, 2017 at 2:55 AM, liujian (CE) <[email protected]> wrote:
>> I also hit this issue with trinity test:
>>
>> The call trace:
>> [exception RIP: prb_retire_rx_blk_timer_expired+70]
>> RIP: ffffffff81633be6 RSP: ffff8801bec03dc0 RFLAGS: 00010246
>> RAX: 0000000000000000 RBX: ffff8801b49d0948 RCX: 0000000000000000
>> RDX: ffff8801b31057a0 RSI: a56b6b6b6b6b6b6b RDI: ffff8801b49d09ec
>> RBP: ffff8801bec03dd8 R8: 0000000000000001 R9: ffffffff83e1bf80
>> R10: 0000000000000002 R11: 0000000000000005 R12: ffff8801b49d09ec
>> R13: 0000000000000100 R14: ffffffff81633ba0 R15: ffff8801b49d0948
>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>> #7 [ffff8801bec03de0] call_timer_fn at ffffffff8108cb76
>> #8 [ffff8801bec03e18] run_timer_softirq at ffffffff8108f87c
>> #9 [ffff8801bec03e90] __do_softirq at ffffffff8108629f
>> #10 [ffff8801bec03f00] call_softirq at ffffffff8166a01c
>> #11 [ffff8801bec03f18] do_softirq at ffffffff810172ad
>> #12 [ffff8801bec03f30] irq_exit at ffffffff81086655
>> #13 [ffff8801bec03f48] msa_irq_exit at ffffffff810b1ab3
>> #14 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aeae
>> #15 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff816692dd
>> --- <IRQ stack> ---
>>
>> And from vmcore, I can see the pointer GET_CURR_PBLOCK_DESC_FROM_CORE(pkc); is a56b6b6b6b6b6b6b
>>
>
> Does the following quick fix help?
>
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 008bb34ee324..09ec1640e5f7 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4264,6 +4264,7 @@ static int packet_set_ring(struct sock *sk,
> union tpacket_req_u *req_u,
> /* Block transmit is not supported yet */
> if (!tx_ring) {
> init_prb_bdqc(po, rb, pg_vec, req_u);
> + pg_vec = NULL;
> } else {
> struct tpacket_req3 *req3 = &req_u->req3;
>

Hi, Cong:

Thanks for your quirk solution, but I still has some doubts about it,
it looks like fix the problem in the packet_setsockopt->packet_set_ring processing,
but when in packet_release processing, it may could not release the
real pg_vec for the TPACKET_V3 ring, and then cause the mem leak,
maybe I miss something here, nice to hear from your feedback. :)

what about fix it this way:
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4335,9 +4335,13 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
/* Because we don't support block-based V3 on tx-ring */
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, rb_queue);
+
+ if (pg_vec)
+ free_pg_vec(pg_vec, order, req->tp_block_nr);
+
}

- if (pg_vec)
+ if (pg_vec && (po->tp_version < TPACKET_V3))
free_pg_vec(pg_vec, order, req->tp_block_nr);
out:
release_sock(sk);


Regards
Ding

> .
>

2017-07-23 05:59:32

by Cong Wang

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong <[email protected]> wrote:
> Hi, Cong:
>
> Thanks for your quirk solution, but I still has some doubts about it,
> it looks like fix the problem in the packet_setsockopt->packet_set_ring processing,
> but when in packet_release processing, it may could not release the
> real pg_vec for the TPACKET_V3 ring, and then cause the mem leak,
> maybe I miss something here, nice to hear from your feedback. :)

Yes you miss that packet_release() has memset()'s so we won't hit
that path. :)

However, I missed the swap() in this messy function, actually I
believe the bug is that we modify tpacket_kbdq_core inside rx_ring
in non-closing case without actually stopping its timer. I feel
more confident with the following patch:


diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 008bb34ee324..267b181fef15 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk,
union tpacket_req_u *req_u,
case TPACKET_V3:
/* Block transmit is not supported yet */
if (!tx_ring) {
+ prb_shutdown_retire_blk_timer(po, rb_queue);
init_prb_bdqc(po, rb, pg_vec, req_u);
} else {
struct tpacket_req3 *req3 = &req_u->req3;

2017-07-23 08:25:36

by liujian (CE)

[permalink] [raw]
Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

Hi Wang Cong,

With this patch , the system was crashed when setsockopt.

The call trace as below:

crash> bt
PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main"
#0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b
#1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82
#2 [ffff8801bec03e10] panic at ffffffff81650058
#3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533
#4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2
#5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450
#6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457
#7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0
#8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d
--- <IRQ stack> ---
#9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d
[exception RIP: lock_timer_base+77]
RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX: 0000000000000001
RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8
RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9: ffffea0002680000
R10: 000000000000003c R11: ffff8801b301fb2e R12: ffff8800afcc0000
R13: ffff8800afcc0000 R14: 0000000000000000 R15: ffffffff83d1a340
ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
#10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f
#11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252
#12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60
#13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760
#14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860
#15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call)
RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff
RDX: 0000000000000005 RSI: 0000000000000107 RDI: 0000000000000180
RBP: 0000000000000180 R8: 000000000000001c R9: 00007fcc78dc7160
R10: 0000000001fd6ba0 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000011 R14: 0000000001fd6b60 R15: 0000000001fd6b70
ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b


Best Regards,
liujian


> -----Original Message-----
> From: Cong Wang [mailto:[email protected]]
> Sent: Sunday, July 23, 2017 1:59 PM
> To: Dingtianhong
> Cc: liujian (CE); Willem de Bruijn; Dave Jones; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired
>
> On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong <[email protected]>
> wrote:
> > Hi, Cong:
> >
> > Thanks for your quirk solution, but I still has some doubts about it,
> > it looks like fix the problem in the
> > packet_setsockopt->packet_set_ring processing, but when in
> > packet_release processing, it may could not release the real pg_vec
> > for the TPACKET_V3 ring, and then cause the mem leak, maybe I miss
> > something here, nice to hear from your feedback. :)
>
> Yes you miss that packet_release() has memset()'s so we won't hit that path. :)
>
> However, I missed the swap() in this messy function, actually I believe the bug
> is that we modify tpacket_kbdq_core inside rx_ring in non-closing case without
> actually stopping its timer. I feel more confident with the following patch:
>
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index
> 008bb34ee324..267b181fef15 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, union
> tpacket_req_u *req_u,
> case TPACKET_V3:
> /* Block transmit is not supported yet */
> if (!tx_ring) {
> + prb_shutdown_retire_blk_timer(po,
> + rb_queue);
> init_prb_bdqc(po, rb, pg_vec, req_u);
> } else {
> struct tpacket_req3 *req3 =
> &req_u->req3;

2017-07-23 09:59:40

by liujian (CE)

[permalink] [raw]
Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

Hi,

Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to TPACKET_V1 ?


Best Regards,
liujian


> -----Original Message-----
> From: liujian (CE)
> Sent: Sunday, July 23, 2017 4:21 PM
> To: 'Cong Wang'; Dingtianhong
> Cc: Willem de Bruijn; Dave Jones; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired
>
> Hi Wang Cong,
>
> With this patch , the system was crashed when setsockopt.
>
> The call trace as below:
>
> crash> bt
> PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main"
> #0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b
> #1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82
> #2 [ffff8801bec03e10] panic at ffffffff81650058
> #3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533
> #4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2
> #5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450
> #6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457
> #7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0
> #8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d
> --- <IRQ stack> ---
> #9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d
> [exception RIP: lock_timer_base+77]
> RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX:
> 0000000000000001
> RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8
> RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9: ffffea0002680000
> R10: 000000000000003c R11: ffff8801b301fb2e R12: ffff8800afcc0000
> R13: ffff8800afcc0000 R14: 0000000000000000 R15: ffffffff83d1a340
> ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
> #10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f
> #11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252
> #12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60
> #13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760
> #14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860
> #15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call)
> RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202
> RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff
> RDX: 0000000000000005 RSI: 0000000000000107 RDI:
> 0000000000000180
> RBP: 0000000000000180 R8: 000000000000001c R9:
> 00007fcc78dc7160
> R10: 0000000001fd6ba0 R11: 0000000000000202 R12:
> 0000000000000000
> R13: 0000000000000011 R14: 0000000001fd6b60 R15:
> 0000000001fd6b70
> ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b
>
>
> Best Regards,
> liujian
>
>
> > -----Original Message-----
> > From: Cong Wang [mailto:[email protected]]
> > Sent: Sunday, July 23, 2017 1:59 PM
> > To: Dingtianhong
> > Cc: liujian (CE); Willem de Bruijn; Dave Jones;
> > [email protected]; [email protected];
> [email protected];
> > [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: Re: af_packet: use after free in
> > prb_retire_rx_blk_timer_expired
> >
> > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong
> > <[email protected]>
> > wrote:
> > > Hi, Cong:
> > >
> > > Thanks for your quirk solution, but I still has some doubts about
> > > it, it looks like fix the problem in the
> > > packet_setsockopt->packet_set_ring processing, but when in
> > > packet_release processing, it may could not release the real pg_vec
> > > for the TPACKET_V3 ring, and then cause the mem leak, maybe I miss
> > > something here, nice to hear from your feedback. :)
> >
> > Yes you miss that packet_release() has memset()'s so we won't hit that
> > path. :)
> >
> > However, I missed the swap() in this messy function, actually I
> > believe the bug is that we modify tpacket_kbdq_core inside rx_ring in
> > non-closing case without actually stopping its timer. I feel more confident
> with the following patch:
> >
> >
> > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index
> > 008bb34ee324..267b181fef15 100644
> > --- a/net/packet/af_packet.c
> > +++ b/net/packet/af_packet.c
> > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk,
> > union tpacket_req_u *req_u,
> > case TPACKET_V3:
> > /* Block transmit is not supported yet */
> > if (!tx_ring) {
> > + prb_shutdown_retire_blk_timer(po,
> > + rb_queue);
> > init_prb_bdqc(po, rb, pg_vec, req_u);
> > } else {
> > struct tpacket_req3 *req3 =
> > &req_u->req3;

2017-07-23 13:04:29

by liujian (CE)

[permalink] [raw]
Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

Hi

I find it caused by below steps:
1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
2. set tp_block_nr to 0
Then pg_vec was freed, and we did not delete the timer?

Best Regards,
liujian


> -----Original Message-----
> From: liujian (CE)
> Sent: Sunday, July 23, 2017 5:47 PM
> To: liujian (CE); 'Cong Wang'; Dingtianhong
> Cc: 'Willem de Bruijn'; 'Dave Jones'; '[email protected]';
> '[email protected]'; '[email protected]'; '[email protected]';
> '[email protected]'; '[email protected]';
> '[email protected]'
> Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired
>
> Hi,
>
> Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to
> TPACKET_V1 ?
>
>
> Best Regards,
> liujian
>
>
> > -----Original Message-----
> > From: liujian (CE)
> > Sent: Sunday, July 23, 2017 4:21 PM
> > To: 'Cong Wang'; Dingtianhong
> > Cc: Willem de Bruijn; Dave Jones; [email protected];
> > [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]
> > Subject: RE: af_packet: use after free in
> > prb_retire_rx_blk_timer_expired
> >
> > Hi Wang Cong,
> >
> > With this patch , the system was crashed when setsockopt.
> >
> > The call trace as below:
> >
> > crash> bt
> > PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main"
> > #0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b
> > #1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82
> > #2 [ffff8801bec03e10] panic at ffffffff81650058
> > #3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533
> > #4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2
> > #5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450
> > #6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457
> > #7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0
> > #8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d
> > --- <IRQ stack> ---
> > #9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d
> > [exception RIP: lock_timer_base+77]
> > RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246
> > RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX:
> > 0000000000000001
> > RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8
> > RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9:
> ffffea0002680000
> > R10: 000000000000003c R11: ffff8801b301fb2e R12:
> ffff8800afcc0000
> > R13: ffff8800afcc0000 R14: 0000000000000000 R15:
> ffffffff83d1a340
> > ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
> > #10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f
> > #11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252
> > #12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60
> > #13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760
> > #14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860
> > #15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call)
> > RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202
> > RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff
> > RDX: 0000000000000005 RSI: 0000000000000107 RDI:
> > 0000000000000180
> > RBP: 0000000000000180 R8: 000000000000001c R9:
> > 00007fcc78dc7160
> > R10: 0000000001fd6ba0 R11: 0000000000000202 R12:
> > 0000000000000000
> > R13: 0000000000000011 R14: 0000000001fd6b60 R15:
> > 0000000001fd6b70
> > ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b
> >
> >
> > Best Regards,
> > liujian
> >
> >
> > > -----Original Message-----
> > > From: Cong Wang [mailto:[email protected]]
> > > Sent: Sunday, July 23, 2017 1:59 PM
> > > To: Dingtianhong
> > > Cc: liujian (CE); Willem de Bruijn; Dave Jones;
> > > [email protected]; [email protected];
> > [email protected];
> > > [email protected]; [email protected]; [email protected];
> > > [email protected]
> > > Subject: Re: af_packet: use after free in
> > > prb_retire_rx_blk_timer_expired
> > >
> > > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong
> > > <[email protected]>
> > > wrote:
> > > > Hi, Cong:
> > > >
> > > > Thanks for your quirk solution, but I still has some doubts about
> > > > it, it looks like fix the problem in the
> > > > packet_setsockopt->packet_set_ring processing, but when in
> > > > packet_release processing, it may could not release the real
> > > > pg_vec for the TPACKET_V3 ring, and then cause the mem leak, maybe
> > > > I miss something here, nice to hear from your feedback. :)
> > >
> > > Yes you miss that packet_release() has memset()'s so we won't hit
> > > that path. :)
> > >
> > > However, I missed the swap() in this messy function, actually I
> > > believe the bug is that we modify tpacket_kbdq_core inside rx_ring
> > > in non-closing case without actually stopping its timer. I feel more
> > > confident
> > with the following patch:
> > >
> > >
> > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index
> > > 008bb34ee324..267b181fef15 100644
> > > --- a/net/packet/af_packet.c
> > > +++ b/net/packet/af_packet.c
> > > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk,
> > > union tpacket_req_u *req_u,
> > > case TPACKET_V3:
> > > /* Block transmit is not supported yet */
> > > if (!tx_ring) {
> > > + prb_shutdown_retire_blk_timer(po,
> > > + rb_queue);
> > > init_prb_bdqc(po, rb, pg_vec,
> req_u);
> > > } else {
> > > struct tpacket_req3 *req3 =
> > > &req_u->req3;

2017-07-23 17:03:56

by Cong Wang

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE) <[email protected]> wrote:
> Hi
>
> I find it caused by below steps:
> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
> 2. set tp_block_nr to 0
> Then pg_vec was freed, and we did not delete the timer?

Thanks for testing!

Ah, I overlook the initialization case in my previous patch.

How about the following one? Does it cover all the cases?


diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 008bb34ee324..0615c2a950fa 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
union tpacket_req_u *req_u,
register_prot_hook(sk);
}
spin_unlock(&po->bind_lock);
- if (closing && (po->tp_version > TPACKET_V2)) {
+ if (pg_vec && (po->tp_version > TPACKET_V2)) {
/* Because we don't support block-based V3 on tx-ring */
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, rb_queue);

2017-07-24 01:14:29

by Ding Tianhong

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired



On 2017/7/24 1:03, Cong Wang wrote:
> On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE) <[email protected]> wrote:
>> Hi
>>
>> I find it caused by below steps:
>> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
>> 2. set tp_block_nr to 0
>> Then pg_vec was freed, and we did not delete the timer?
>
> Thanks for testing!
>
> Ah, I overlook the initialization case in my previous patch.
>
> How about the following one? Does it cover all the cases?
>
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 008bb34ee324..0615c2a950fa 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
> union tpacket_req_u *req_u,
> register_prot_hook(sk);
> }
> spin_unlock(&po->bind_lock);
> - if (closing && (po->tp_version > TPACKET_V2)) {
> + if (pg_vec && (po->tp_version > TPACKET_V2)) {
> /* Because we don't support block-based V3 on tx-ring */
> if (!tx_ring)
> prb_shutdown_retire_blk_timer(po, rb_queue);
>
> .

Hi, Cong:

It looks like could not cover the case: req->tp_block_nr = 2 -> reg->tp_block_nr = 1 .

what about this way:
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4331,13 +4331,17 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
register_prot_hook(sk);
}
spin_unlock(&po->bind_lock);
- if (closing && (po->tp_version > TPACKET_V2)) {
+ if ((closing || (pg_vec && !reg->tp_block_nr))&& (po->tp_version > TPACKET_V2)) {
/* Because we don't support block-based V3 on tx-ring */
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, rb_queue);


>

2017-07-24 01:30:32

by Ding Tianhong

[permalink] [raw]
Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired



On 2017/7/24 9:09, Ding Tianhong wrote:
>
>
> On 2017/7/24 1:03, Cong Wang wrote:
>> On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE) <[email protected]> wrote:
>>> Hi
>>>
>>> I find it caused by below steps:
>>> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
>>> 2. set tp_block_nr to 0
>>> Then pg_vec was freed, and we did not delete the timer?
>>
>> Thanks for testing!
>>
>> Ah, I overlook the initialization case in my previous patch.
>>
>> How about the following one? Does it cover all the cases?
>>
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 008bb34ee324..0615c2a950fa 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
>> union tpacket_req_u *req_u,
>> register_prot_hook(sk);
>> }
>> spin_unlock(&po->bind_lock);
>> - if (closing && (po->tp_version > TPACKET_V2)) {
>> + if (pg_vec && (po->tp_version > TPACKET_V2)) {
>> /* Because we don't support block-based V3 on tx-ring */
>> if (!tx_ring)
>> prb_shutdown_retire_blk_timer(po, rb_queue);
>>
>> .
>
> Hi, Cong:
>
> It looks like could not cover the case: req->tp_block_nr = 2 -> reg->tp_block_nr = 1 .
>

Oh, looks like this case would never happen, so I think your solution is ok.

> what about this way:
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4331,13 +4331,17 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
> register_prot_hook(sk);
> }
> spin_unlock(&po->bind_lock);
> - if (closing && (po->tp_version > TPACKET_V2)) {
> + if ((closing || (pg_vec && !reg->tp_block_nr))&& (po->tp_version > TPACKET_V2)) {
> /* Because we don't support block-based V3 on tx-ring */
> if (!tx_ring)
> prb_shutdown_retire_blk_timer(po, rb_queue);
>
>

>>
>
>
> .
>

2017-07-24 10:40:23

by liujian (CE)

[permalink] [raw]
Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

Hi Wang cong,

After apply the patch, I did not hit the issue again.
Thank you~


Best Regards,
liujian

> -----Original Message-----
> From: Dingtianhong
> Sent: Monday, July 24, 2017 9:29 AM
> To: Cong Wang; liujian (CE)
> Cc: Willem de Bruijn; Dave Jones; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired
>
>
>
> On 2017/7/24 9:09, Ding Tianhong wrote:
> >
> >
> > On 2017/7/24 1:03, Cong Wang wrote:
> >> On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE) <[email protected]> wrote:
> >>> Hi
> >>>
> >>> I find it caused by below steps:
> >>> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1 2. set
> >>> tp_block_nr to 0 Then pg_vec was freed, and we did not delete the
> >>> timer?
> >>
> >> Thanks for testing!
> >>
> >> Ah, I overlook the initialization case in my previous patch.
> >>
> >> How about the following one? Does it cover all the cases?
> >>
> >>
> >> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index
> >> 008bb34ee324..0615c2a950fa 100644
> >> --- a/net/packet/af_packet.c
> >> +++ b/net/packet/af_packet.c
> >> @@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
> >> union tpacket_req_u *req_u,
> >> register_prot_hook(sk);
> >> }
> >> spin_unlock(&po->bind_lock);
> >> - if (closing && (po->tp_version > TPACKET_V2)) {
> >> + if (pg_vec && (po->tp_version > TPACKET_V2)) {
> >> /* Because we don't support block-based V3 on tx-ring
> */
> >> if (!tx_ring)
> >> prb_shutdown_retire_blk_timer(po,
> rb_queue);
> >>
> >> .
> >
> > Hi, Cong:
> >
> > It looks like could not cover the case: req->tp_block_nr = 2 ->
> reg->tp_block_nr = 1 .
> >
>
> Oh, looks like this case would never happen, so I think your solution is ok.
>
> > what about this way:
> > --- a/net/packet/af_packet.c
> > +++ b/net/packet/af_packet.c
> > @@ -4331,13 +4331,17 @@ static int packet_set_ring(struct sock *sk, union
> tpacket_req_u *req_u,
> > register_prot_hook(sk);
> > }
> > spin_unlock(&po->bind_lock);
> > - if (closing && (po->tp_version > TPACKET_V2)) {
> > + if ((closing || (pg_vec && !reg->tp_block_nr))&&
> > + (po->tp_version > TPACKET_V2)) {
> > /* Because we don't support block-based V3 on tx-ring */
> > if (!tx_ring)
> > prb_shutdown_retire_blk_timer(po,
> rb_queue);
> >
> >
>
> >>
> >
> >
> > .
> >