Message-ID: <63ff741f-b781-37da-3f47-15d1a8ff76a4@huawei.com>
Date: Mon, 10 Jul 2023 09:37:52 +0800
Subject: Re: [PATCH] ring-buffer: Fix deadloop issue on reading trace_pipe
To: Steven Rostedt
CC: , ,
References: <20230708225144.3785600-1-zhengyejian1@huawei.com>
 <20230709154523.32dc89c2@rorschach.local.home>
From: Zheng Yejian
In-Reply-To: <20230709154523.32dc89c2@rorschach.local.home>

On 2023/7/10 03:45, Steven Rostedt wrote:
> On Sun, 9 Jul 2023 06:51:44 +0800
> Zheng Yejian wrote:
>
>> Soft lockup occurs when reading file 'trace_pipe':
>>
>> watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
>> [...]
>> RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
>> RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
>> RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
>> RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
>> RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
>> R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
>> R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
>> [...]
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>   __find_next_entry+0x1a8/0x4b0
>>   ? peek_next_entry+0x250/0x250
>>   ? down_write+0xa5/0x120
>>   ? down_write_killable+0x130/0x130
>>   trace_find_next_entry_inc+0x3b/0x1d0
>>   tracing_read_pipe+0x423/0xae0
>>   ? tracing_splice_read_pipe+0xcb0/0xcb0
>>   vfs_read+0x16b/0x490
>>   ksys_read+0x105/0x210
>>   ? __ia32_sys_pwrite64+0x200/0x200
>>   ? switch_fpu_return+0x108/0x220
>>   do_syscall_64+0x33/0x40
>>   entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>
>> Through the vmcore, I found that in tracing_read_pipe(),
>> ring_buffer_empty_cpu() reports a non-empty buffer, yet nothing can
>> be read because "rb_num_of_entries() == 0" is always true. Since the
>> user buffer is never filled, the procedure loops forever; see the
>> following code path:
>>
>> tracing_read_pipe() {
>>   ... ...
>> waitagain:
>>   tracing_wait_pipe()            // 1. find non-empty buffer here
>>   trace_find_next_entry_inc()    // 2. loop here trying to find an entry
>>     __find_next_entry()
>>       ring_buffer_empty_cpu();   // 3. find non-empty buffer
>>       peek_next_entry()          // 4. but peek always returns NULL
>>         ring_buffer_peek()
>>           rb_buffer_peek()
>>             rb_get_reader_page()
>>             // 5. because "rb_num_of_entries() == 0" is always true
>>             //    here, NULL is returned
>>   // 6. the user buffer has not been filled, so goto 'waitagain',
>>   //    which eventually leads to a deadloop in the kernel!!!
>> }
>>
>> Analysis shows that when the ring buffer is reset, the 'entries'
>> counters of its pages are not all cleared (see rb_reset_cpu()). Then,
>> when the ring buffer is shrunk, removed pages that still carry stale
>> 'entries' counts have those counts added to 'cpu_buffer->overrun'
>> (see rb_remove_pages()). This corrupts the 'overrun' count and
>> eventually causes the deadloop.
>>
>> To fix it, clear every page in rb_reset_cpu().
>
> Nice catch and analysis!
>

Thanks! Steve, IIUC, the following tags should be added?
Cc: stable@vger.kernel.org
Fixes: 48fdc72f23ad ("ring-buffer: Fix accounting of entries when removing pages")
Fixes: 83f40318dab0 ("ring-buffer: Make removal of ring buffer pages atomic")

---

Thanks,
Zheng Yejian

>>
>> Signed-off-by: Zheng Yejian
>> ---
>>  kernel/trace/ring_buffer.c | 24 +++++++++++++++---------
>>  1 file changed, 15 insertions(+), 9 deletions(-)
>>
>> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> index 834b361a4a66..14d8001140c8 100644
>> --- a/kernel/trace/ring_buffer.c
>> +++ b/kernel/trace/ring_buffer.c
>> @@ -5242,28 +5242,34 @@ unsigned long ring_buffer_size(struct trace_buffer *buffer, int cpu)
>>  }
>>  EXPORT_SYMBOL_GPL(ring_buffer_size);
>>
>> +static void rb_clear_buffer_page(struct buffer_page *page)
>> +{
>> +	local_set(&page->write, 0);
>> +	local_set(&page->entries, 0);
>> +	rb_init_page(page->page);
>> +	page->read = 0;
>> +}
>> +
>>  static void
>>  rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
>>  {
>> +	struct buffer_page *page;
>> +
>>  	rb_head_page_deactivate(cpu_buffer);
>>
>>  	cpu_buffer->head_page
>>  		= list_entry(cpu_buffer->pages, struct buffer_page, list);
>> -	local_set(&cpu_buffer->head_page->write, 0);
>> -	local_set(&cpu_buffer->head_page->entries, 0);
>> -	local_set(&cpu_buffer->head_page->page->commit, 0);
>> -
>> -	cpu_buffer->head_page->read = 0;
>> +	rb_clear_buffer_page(cpu_buffer->head_page);
>> +	list_for_each_entry(page, cpu_buffer->pages, list) {
>> +		rb_clear_buffer_page(page);
>> +	}
>>
>>  	cpu_buffer->tail_page = cpu_buffer->head_page;
>>  	cpu_buffer->commit_page = cpu_buffer->head_page;
>>
>>  	INIT_LIST_HEAD(&cpu_buffer->reader_page->list);
>>  	INIT_LIST_HEAD(&cpu_buffer->new_pages);
>> -	local_set(&cpu_buffer->reader_page->write, 0);
>> -	local_set(&cpu_buffer->reader_page->entries, 0);
>> -	local_set(&cpu_buffer->reader_page->page->commit, 0);
>> -	cpu_buffer->reader_page->read = 0;
>> +	rb_clear_buffer_page(cpu_buffer->reader_page);
>
> Looks good.
> I'll apply it shortly and start running it through my tests.
>
> Thanks!

Thanks!

>
> -- Steve
>
>>
>>  	local_set(&cpu_buffer->entries_bytes, 0);
>>  	local_set(&cpu_buffer->overrun, 0);
>
>