Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp1101943ybt; Wed, 1 Jul 2020 19:05:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyP2K00aWl75Dr+janw7rfCPWtaUj6vD140KbYCqa+1y0SmiGg7q+mRqqXQVJaYkk7FfliQ X-Received: by 2002:a17:906:d9c4:: with SMTP id qk4mr26919311ejb.100.1593655556312; Wed, 01 Jul 2020 19:05:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593655556; cv=none; d=google.com; s=arc-20160816; b=BdHzSW+jgPSWFm0vzM05cvfNmE4RklQivW18JEI5Yo/AWHgUP18Blym1hW2S99Q56f OIVsMrFYG5k659nJ3OH5ge391lrCFx/Wf2XMnoq/MP7et4KvSjj0bs9bSA2MON1R+elZ 9yVcDbLQfQ+5tDj+/VjWbMHHjbOB4bNZJo02j6OE9crhxQSQ35rtngPqFoY+oPMyGOsL vcJdWR++k9AaQU90PPZ8kPZoxzUjsrFXbBRJENjSdD+6uNR3h1hdS1CtMzR7k/JHvMsp pbznQvmAsScJAsJVe25Q03Ug++o4tTqMk9IowUhtjgNBU/r+t7WBYO/QXH9LmfY5mohX hY0Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=tEKhDk/H3ehZqtNmDP8tuZdIMfx2HPPtLimASiHb6/U=; b=XA86bYaZdcxdfRVwPYtztFa1+mKiBeO/YmhMeJ5R7b9apX4cmQmJgrRo5zsJtes1O6 woszNvKImY9g0oGHyRqpmNiDWa7Xc4PkFeO59tFNj6nwWwxjkvqXVayXY/gS7fPz+Ai0 Ut60zutEA47WEBxAhH3t74WkgJ2Jxc/rWX8KFfgN4RrvU2kG/InILaKustwbOXJ873gE RBgdBbx8jKdNVAH8C6795cpPSjKyW+iOmu7iofNDLcT61MDxh0kLjL6IdGO9BqtYTkLx oPjr00agIOMrEmFILgRUcSzOI/jsEJ77Feno/H3BNcVCLVLBNcrgcrm26L0o85Fa/ELX swoQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l7si4621498edv.475.2020.07.01.19.05.32; Wed, 01 Jul 2020 19:05:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728132AbgGBCDT (ORCPT + 99 others); Wed, 1 Jul 2020 22:03:19 -0400 Received: from mail.kernel.org ([198.145.29.99]:45768 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728120AbgGBCDT (ORCPT ); Wed, 1 Jul 2020 22:03:19 -0400 Received: from oasis.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 355F2207E8; Thu, 2 Jul 2020 02:03:18 +0000 (UTC) Date: Wed, 1 Jul 2020 22:03:16 -0400 From: Steven Rostedt To: rananta@codeaurora.org Cc: mingo@redhat.com, psodagud@codeaurora.org, linux-kernel@vger.kernel.org Subject: Re: Tracing: rb_head_page_deactivate() caught in an infinite loop Message-ID: <20200701220316.1baf0a50@oasis.local.home> In-Reply-To: <6f84f449a6951d47e6cbab40bf898a1f@codeaurora.org> References: <6f84f449a6951d47e6cbab40bf898a1f@codeaurora.org> X-Mailer: Claws Mail 3.17.3 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 01 Jul 2020 10:07:06 -0700 rananta@codeaurora.org wrote: > Hi Steven and Mingo, > Hi Raghavendra, > While trying to adjust the buffer size (echo > > /sys/kernel/debug/tracing/buffer_size_kb), we see that the kernel gets > caught up in an infinite loop > while traversing the "cpu_buffer->pages" list in > rb_head_page_deactivate(). > > Looks like the last node of the list could be uninitialized, thus > leading to infinite traversal. From the data that we captured: > 000|rb_head_page_deactivate(inline) > | cpu_buffer = 0xFFFFFF8000671600 = > kernel_size_le_lo32+0xFFFFFF652F6EE600 -> ( > ... > | pages = 0xFFFFFF80A909D980 = > kernel_size_le_lo32+0xFFFFFF65D811A980 -> ( > | next = 0xFFFFFF80A909D200 = > kernel_size_le_lo32+0xFFFFFF65D811A200 -> ( > | next = 0xFFFFFF80A909D580 = > kernel_size_le_lo32+0xFFFFFF65D811A580 -> ( > | next = 0xFFFFFF8138D1CD00 = > kernel_size_le_lo32+0xFFFFFF6667D99D00 -> ( > | next = 0xFFFFFF80006716F0 = > kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> ( > | next = 0xFFFFFF80006716F0 = > kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> ( > | next = 0xFFFFFF80006716F0 = > kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> ( > | next = 0xFFFFFF80006716F0 = > kernel_size_le_lo32+0xFFFFFF652F6EE6F0, > > Wanted to check with you if there's any scenario that could lead us into > this state. > > Test details: > -- Arch: arm64 > -- Kernel version 5.4.30; running on Andriod > -- Test case: Running the following set of commands across reboot will > lead us to the scenario > > atrace --async_start -z -c -b 120000 sched audio irq idle freq > < Run any workload here > > atrace --async_dump -z -c -b 1200000 sched audio irq idle freq > > mytrace.trace > atrace --async_stop > /dev/null > echo 150000 > /sys/kernel/debug/tracing/buffer_size_kb > echo 200000 > /sys/kernel/debug/tracing/buffer_size_kb > reboot > > Repeating the above lines across reboots would reproduce the issue. > The "atrace" or "echo" would just get stuck while resizing the buffer > size. What do you mean repeat across reboots? If it doesn't happen it wont ever happen, but if you reboot it may have it happen again? > I'll try to reproduce the issue without atrace as well, but wondering > what could be the reason for leading us to this state. I haven't used arm lately, and I'm unfamiliar with atrace. So I don't really know what is going on. If you can reproduce this with just a shell script accessing the ftrace files, that would be much more useful. Thanks, -- Steve