Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp1245252imm; Wed, 18 Jul 2018 20:37:13 -0700 (PDT) X-Google-Smtp-Source: AAOMgpcuYC9UprSpWraPruU57iIBoS0ybeam+L7U/zU/h4v/fgk+CPXnBrFgkxcbS+o2jo2/zI48 X-Received: by 2002:a63:f804:: with SMTP id n4-v6mr8423866pgh.106.1531971433097; Wed, 18 Jul 2018 20:37:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1531971433; cv=none; d=google.com; s=arc-20160816; b=V4WEz1tgTIUs232KWjthZzLqFq9EqgHXmRmpyX/7eUTN25LtVNN6iDjmwlN1vivTYQ B1l2QqCCj7+dR/XFDgAQdeuLA3QXKvQlHjy+08MBNINPThTd0LxH3IziY1wqYhCMHjjM P6qwQzFMdyaR8SDJhEVWci9CtLr7EEkP05TcMm9qqT/whn7TvWRpm7gwUoLfFJ+MCk/d xBHuPK0rPMT4q5pK0sLXOJwzvOkhkNdMFfIyyqNrPt8QVW54FPF+cQUlZPj/bkMZmEw+ PZaNYwQ8qsff66fSxtCAHgx4KXHhtApODjWgQJkr4uNLQoIlOEhRFvDwDyrpDsWwID1r oWKA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=Gy0W0cVJQBlea5zQol3cWIu+tJjgWA4WTw1zjLhWk24=; b=KV2Fuqg5mij/mIe0ViOxfrRM/5sbZYozZOMR3hx6sx4MiLoe5O/I4l1Wncw5JSGyAh ij4t2KqJW8ezz3B1SAbH3N86s0hBTZCn+q3gv5Gg8p7wZr+fKga2hcMqWT67Iq6jVYoi n3gm273LvtBANwP/5DHMxwKk0qbTZE7qij9B7qwbwuCI1WPLoSk/paxB4v9VrhYOtKkq xA3WokDBumr7zclopVyIkLhmQ3FQDX4rLZKUnGBx5/7+u+Z615Rvc7senwIlXHc6eAfO YHzVZVrMDxvE/kToxoUC9fXYWo77Lx53m/n4K3pMKPLajFzORid7ygbtnmEZt2vUO+KM /ACw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 85-v6si4871709pfm.264.2018.07.18.20.36.58; Wed, 18 Jul 2018 20:37:13 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726704AbeGSERX (ORCPT + 99 others); Thu, 19 Jul 2018 00:17:23 -0400 Received: from nautica.notk.org ([91.121.71.147]:51463 "EHLO nautica.notk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726143AbeGSERX (ORCPT ); Thu, 19 Jul 2018 00:17:23 -0400 Received: by nautica.notk.org (Postfix, from userid 1001) id 73FC5C009; Thu, 19 Jul 2018 05:36:23 +0200 (CEST) Date: Thu, 19 Jul 2018 05:36:08 +0200 From: Dominique Martinet To: piaojun Cc: "akpm@linux-foundation.org" , Eric Van Hensbergen , Ron Minnich , Latchesar Ionkov , Linux Kernel Mailing List , v9fs-developer@lists.sourceforge.net Subject: Re: [PATCH] net/9p/trans_virtio.c: replace mutex_lock with spin_lock to protect 'virtio_chan_list' Message-ID: <20180719033608.GA29510@nautica> References: <5B4EF511.7090104@huawei.com> <20180718095420.GA28377@nautica> <5B4FF6DB.1040101@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <5B4FF6DB.1040101@huawei.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org piaojun wrote on Thu, Jul 19, 2018: > > piaojun wrote on Wed, Jul 18, 2018: > > That's not a fast path operation, I don't mind changing things but I'd > > like to understand why - these functions are only ever called at unmount > > time or when something happens on the virtio bus (probe will happen on > > probing on the pci bus and I'm not too sure on remove but probably pci > > removal i.e. basically never?) > > > > I don't see why this wouldn't work, but I won't take this without a > > (good?) reason. > > > virtio_9p_lock is responsable for protecting virtio_chan_list which has 3 > operation: > > 1. Add a virtio chan to virtio_chan_list. This will happen when we insmod > 9pnet_virtio.ko: > p9_virtio_probe > --list_add_tail(&chan->chan_list, &virtio_chan_list); > > 2. Remove a virtio chan. This will happen when remnod 9pnet_virtio.ko: > p9_virtio_remove > --list_del(&chan->chan_list); > > 3. Find a unused virtio chan when mount 9p: > mount > --p9_virtio_create > --list_for_each_entry(chan, &virtio_chan_list, chan_list) > > Multi mount process will compete for virtio_9p_lock when finding unused > virtio chan, in which case mutex lock will cause process sleep and wake > up. I think this a waste of CPU time. So we could use spin lock to avoid > this. Well, sure, that's theory; but how is that in practice? I actually took the time to run some tests, setting up 20 virtio mount points in qemu, and running this command with and without your patch: # time sh -c 'for i in {1..20}; do sh -c "for j in {1..100}; do mount -t 9p d$i d.$i; umount d.$i; done" & done; wait' This is quick & dirty but basically, mounts and unmounts 100 times in a loop all 20 mount points in parallel to stress that lock. I get these times 5 times (one run per column), without patch: real 0m19.357s 0m19.626s 0m19.904s 0m19.926s 0m21.321s user 0m6.795s 0m6.874s 0m6.807s 0m6.768s 0m6.892s sys 0m29.936s 0m31.196s 0m31.702s 0m31.914s 0m30.791s With patch: real 0m19.439s 0m19.849s 0m19.683s 0m19.600s 0m20.689s user 0m6.948s 0m6.582s 0m6.706s 0m6.598s 0m6.876s sys 0m29.364s 0m30.898s 0m30.695s 0m31.311s 0m33.391s I honestly can't say I'm convinced with a difference either way, the variations look more like noise than anything to me. More to the point, while these tests ran my dmesg buffer was filled with errors like: FS-Cache: Duplicate cookie detected FS-Cache: O-cookie c=0000000000368cdb [p=00000000548b03c2 fl=222 nc=0 na=1] FS-Cache: O-cookie d=000000004cebd15f n=00000000029a0b83 FS-Cache: O-key=[10] '34323935303838343536' FS-Cache: N-cookie c=00000000d4089478 [p=00000000548b03c2 fl=2 nc=0 na=1] FS-Cache: N-cookie d=000000004cebd15f n=00000000959d4d37 FS-Cache: N-key=[10] '34323935303838343536' or (output mangled a bit) ================================================================== BUG: KASAN: use-after-free in p9_client_cb+0x14d/0x160 [9pnet] Read of size 8 at addr ffff88003522a088 by task systemd-udevd/492 CPU: 1 PID: 492 Comm: systemd-udevd Tainted: G O 4.18.0-rc5+ #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 0> Call Trace: dump_stack+0x7b/0xad print_address_description+0x6a/0x209 ? p9_client_cb+0x14d/0x160 [9pnet] kasan_report.cold.7+0x242/0x2fe __asan_report_load8_noabort+0x19/0x20 p9_client_cb+0x14d/0x160 [9pnet] req_done+0x22f/0x280 [9pnet_virtio] ? p9_mount_tag_show+0x120/0x120 [9pnet_virtio] vring_interrupt+0x108/0x1b0 [virtio_ring] ? vring_map_single.constprop.23+0x350/0x350 [virtio_ring] __handle_irq_event_percpu+0xec/0x460 handle_irq_event_percpu+0x71/0x140 ? __handle_irq_event_percpu+0x460/0x460 ? apic_ack_irq+0xa3/0xe0 handle_irq_event+0xb9/0x14a handle_edge_irq+0x1ea/0x7a0 ? kasan_check_read+0x11/0x20 handle_irq+0x48/0x60 do_IRQ+0x67/0x140 common_interrupt+0xf/0xf RIP: 0010:finish_task_switch+0x10e/0x630 Code: e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 6d 04 00 00 41 c7 45 38 00 00 00 00 4c 89 e7 ff 14 25 28 f5 66 8e fb 66 0f > RSP: 0018:ffff8800633e7a60 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd4 RAX: 0000000000000001 RBX: ffff880036632000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006caaac00 RBP: ffff8800633e7aa0 R08: ffffed000cea15cd R09: ffffed000cea15cc R10: ffffed000cea15cc R11: ffff88006750ae63 R12: ffff88006caaac00 R13: ffff88006558b000 R14: 0000000000000000 R15: ffff880036632000 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 __schedule+0x733/0x1c10 ? __bpf_prog_run64+0xd0/0xd0 ? firmware_map_remove+0x174/0x174 schedule+0x7a/0x1a0 schedule_hrtimeout_range_clock+0x306/0x3b0 ? kasan_check_write+0x14/0x20 ? hrtimer_nanosleep_restart+0x290/0x290 ? ep_busy_loop_end+0x110/0x110 schedule_hrtimeout_range+0x13/0x20 ep_poll+0x7a7/0xb50 ? __ia32_sys_epoll_ctl+0x1170/0x1170 ? __fget_light+0x59/0x1f0 ? __audit_syscall_entry+0x347/0x980 ? __audit_free+0x8a0/0x8a0 34 ? wake_up_q+0x100/0x100 39 ? kasan_check_read+0x11/0x20 3230373130' FS-Cache: O-key=[10] '34323934393230373131' FS-Cache: N-cookie c=00000000fa69c1f9 [p=00000000887326c4 fl=2 nc=0 na=1] FS-Cache: N-cookie d=00000000a8f143d1 n=00000000446f741a FS-Cache: N-key=[10] '34323934393230373131' ? __fget_light+0x59/0x1f0 do_epoll_wait+0x129/0x160 __x64_sys_epoll_wait+0x97/0xf0 do_syscall_64+0xa5/0x260 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f9099a22317 Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 d1 46 2c 00 41 89 ca 8b 00 85 c0 75 10 b8 e8 00 > RSP: 002b:00007ffff67e1f28 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8 RAX: ffffffffffffffda RBX: 0000558182d9e390 RCX: 00007f9099a22317 RDX: 000000000000000b RSI: 00007ffff67e1f30 RDI: 000000000000000b RBP: 00007ffff67e20b0 R08: 0000000006c65ded R09: 00007ffff67e1f30 R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000000001 R13: 00007ffff67e1f30 R14: ffffffffffffffff R15: 0000558182d7a4c0 Allocated by task 6390: save_stack+0x43/0xd0 kasan_kmalloc+0xc4/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc+0xe2/0x5e0 p9_client_prepare_req+0xa4/0x670 [9pnet] p9_client_rpc+0x133/0xd20 [9pnet] p9_client_getattr_dotl+0x102/0x910 [9pnet] v9fs_mount+0x5a6/0x7c0 [9p] mount_fs+0x89/0x2ad vfs_kern_mount.part.32+0x5d/0x390 do_mount+0x379/0x2bb0 ksys_mount+0xbf/0xe0 __x64_sys_mount+0xbe/0x150 do_syscall_64+0xa5/0x260 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 6390: save_stack+0x43/0xd0 __kasan_slab_free+0x118/0x170 kasan_slab_free+0xe/0x10 kmem_cache_free+0x49/0x160 p9_free_req+0x106/0x140 [9pnet] p9_client_getattr_dotl+0x590/0x910 [9pnet] v9fs_mount+0x5a6/0x7c0 [9p] mount_fs+0x89/0x2ad vfs_kern_mount.part.32+0x5d/0x390 do_mount+0x379/0x2bb0 ksys_mount+0xbf/0xe0 __x64_sys_mount+0xbe/0x150 do_syscall_64+0xa5/0x260 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff88003522a068 which belongs to the cache p9_req_t of size 72 The buggy address is located 32 bytes inside of 72-byte region [ffff88003522a068, ffff88003522a0b0) The buggy address belongs to the page: page:ffffea0000d48a80 count:1 mapcount:0 mapping:ffff880064562580 index:0x0 flags: 0xffffc000000100(slab) raw: 00ffffc000000100 ffff880035e36618 ffffea00019fa888 ffff880064562580 raw: 0000000000000000 ffff88003522a000 0000000100000027 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880035229f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88003522a000: fb fb fb fb fb fb fb fb fb fc fc fc fc fb fb fb >ffff88003522a080: fb fb fb fb fb fb fc fc fc fc fb fb fb fb fb fb ^ ffff88003522a100: fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb fb ffff88003522a180: fc fc fc fc fb fb fb fb fb fb fb fb fb fc fc fc ================================================================== so if you're concerned about parallel mountings, I think there are others, more important, bugs to fix rather than replacing a hardly-used mutex by a spin-lock... You've done the work now so it's not like I can't take the patch, but it really feels pointless to me unless you can show me there is actual improvement. -- Dominique