Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp794554ybt; Wed, 1 Jul 2020 10:08:07 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxJ+S4kRo5QhNSY0eiQzgChF52O5h/JZzqkyIWofPSvHeQXBo72+8khRUUQR3vg7KPh13Fg X-Received: by 2002:a50:ee01:: with SMTP id g1mr22894148eds.264.1593623287684; Wed, 01 Jul 2020 10:08:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593623287; cv=none; d=google.com; s=arc-20160816; b=BBbrCo0ixSD9KSOlTCySNPKOVTyHf5yTI5kKxZDn56FKqdZyYuE437oKBS1ntSaG4o jPYV82lVHwCzF1HcyVLj19vb9QOpV1OdSOJ7rNw/EIgs7pETb+8tveoZMV2iivusaGqf CZt6MNIr9HC1j8sxGCQ388TqODOAS5Vo0zGg6As6tama4Tcw89JSuwYd6SJwLolKj7NL pbTYFUTXTVtqKouQXIH90wYiUNAhid+f6tvTNrqyDCqgSj2kr1e+P8X3S0aYjf/Gm76S 5pdl/54nHCPrcRFgVSZ0cfx3Xz/7MRZ5ZrMP1/mUGxRlSM6Tx3GoFshbqa95EfdqKC0K fGWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:message-id:subject:cc:to:from :date:content-transfer-encoding:mime-version:dkim-signature; bh=5EVbzSJM3gmkik/8EL/wKnXz1B2KelAXzUt6dm0FFHY=; b=hc0sVz2UqKaBsvVD5ms0q6Zn/hD/Mw80CpVP7CtiIEuFbDTIHvrxuLIXqPw5jFE0py /HYyZHJFWFPEgbsikFmYrNmtYGxKAO28lus4l1Oviye+v/Grc+SL7nXBqyeNlQNfEjzT jfe07Xdk29/9QfHxIbm7f1bJO78ru2waKwJ3gaM/aP7faoxrhFh+KHT5J4w3oELCXWBy zCbMPFGDlMrxolxrT7ZXp6KKnp0mLxwC5AQRVOgCeuDPPtmb4TTxbKEUcEliiicyGk8X UCzj4hJvnPClcp4k90dzeVEJBiTuVeDoipRinHhgvuj+Z/XKJ+0yDV2IL12jXZaAYUSt rk1g== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=ViclCkFw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id t2si3985377eje.689.2020.07.01.10.07.43; Wed, 01 Jul 2020 10:08:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=ViclCkFw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732605AbgGARHK (ORCPT + 99 others); Wed, 1 Jul 2020 13:07:10 -0400 Received: from m43-7.mailgun.net ([69.72.43.7]:13821 "EHLO m43-7.mailgun.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728103AbgGARHJ (ORCPT ); Wed, 1 Jul 2020 13:07:09 -0400 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1593623229; h=Message-ID: Subject: Cc: To: From: Date: Content-Transfer-Encoding: Content-Type: MIME-Version: Sender; bh=5EVbzSJM3gmkik/8EL/wKnXz1B2KelAXzUt6dm0FFHY=; b=ViclCkFwGYHyi+utnX20vjXmy0gPOVcGYYxWgsZiL4QLAOBdu7E78cuGbXb4Ef4SVCCAy/3W /69zp1yy7klFvOJ6k5crsSOxPa06Iho3ANWo6pZvbtSspxJ6c1w/7Cor43QnA0t1Lkdulaor p5lyJrXxe7pNn9pz1EpUQQJ3rxk= X-Mailgun-Sending-Ip: 69.72.43.7 X-Mailgun-Sid: WyI0MWYwYSIsICJsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by smtp-out-n01.prod.us-west-2.postgun.com with SMTP id 5efcc2bb6f2ee827da9e87a1 (version=TLS1.2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); Wed, 01 Jul 2020 17:07:07 GMT Received: by smtp.codeaurora.org (Postfix, from userid 1001) id 8A7E9C433CA; Wed, 1 Jul 2020 17:07:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-caf-mail-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=2.0 tests=ALL_TRUSTED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.codeaurora.org (localhost.localdomain [127.0.0.1]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: rananta) by smtp.codeaurora.org (Postfix) with ESMTPSA id 4B792C433C6; Wed, 1 Jul 2020 17:07:06 +0000 (UTC) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Wed, 01 Jul 2020 10:07:06 -0700 From: rananta@codeaurora.org To: rostedt@goodmis.org, mingo@redhat.com Cc: psodagud@codeaurora.org, linux-kernel@vger.kernel.org Subject: Tracing: rb_head_page_deactivate() caught in an infinite loop Message-ID: <6f84f449a6951d47e6cbab40bf898a1f@codeaurora.org> X-Sender: rananta@codeaurora.org User-Agent: Roundcube Webmail/1.3.9 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Steven and Mingo, While trying to adjust the buffer size (echo > /sys/kernel/debug/tracing/buffer_size_kb), we see that the kernel gets caught up in an infinite loop while traversing the "cpu_buffer->pages" list in rb_head_page_deactivate(). Looks like the last node of the list could be uninitialized, thus leading to infinite traversal. From the data that we captured: 000|rb_head_page_deactivate(inline) | cpu_buffer = 0xFFFFFF8000671600 = kernel_size_le_lo32+0xFFFFFF652F6EE600 -> ( ... | pages = 0xFFFFFF80A909D980 = kernel_size_le_lo32+0xFFFFFF65D811A980 -> ( | next = 0xFFFFFF80A909D200 = kernel_size_le_lo32+0xFFFFFF65D811A200 -> ( | next = 0xFFFFFF80A909D580 = kernel_size_le_lo32+0xFFFFFF65D811A580 -> ( | next = 0xFFFFFF8138D1CD00 = kernel_size_le_lo32+0xFFFFFF6667D99D00 -> ( | next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> ( | next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> ( | next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> ( | next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0, Wanted to check with you if there's any scenario that could lead us into this state. Test details: -- Arch: arm64 -- Kernel version 5.4.30; running on Andriod -- Test case: Running the following set of commands across reboot will lead us to the scenario atrace --async_start -z -c -b 120000 sched audio irq idle freq < Run any workload here > atrace --async_dump -z -c -b 1200000 sched audio irq idle freq > mytrace.trace atrace --async_stop > /dev/null echo 150000 > /sys/kernel/debug/tracing/buffer_size_kb echo 200000 > /sys/kernel/debug/tracing/buffer_size_kb reboot Repeating the above lines across reboots would reproduce the issue. The "atrace" or "echo" would just get stuck while resizing the buffer size. I'll try to reproduce the issue without atrace as well, but wondering what could be the reason for leading us to this state. Thank you. Raghavendra