Received: by 2002:a05:6a10:6744:0:0:0:0 with SMTP id w4csp1834890pxu; Fri, 9 Oct 2020 00:29:43 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxd/flDvwQOYeVYd4qj2TbCkXSOZx1651lMh69Jd1HVdQg2OivpqWvVg8mIf0gVggMMVfel X-Received: by 2002:a05:6402:a4f:: with SMTP id bt15mr13398134edb.345.1602228583478; Fri, 09 Oct 2020 00:29:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1602228583; cv=none; d=google.com; s=arc-20160816; b=BQttGaOOwvyLzoo3VRH2n8BSEwmwftRUK/932EVmUINiRid4fYmW6w+LTwkTvLeYwL 4+d+DLL3Rmykj9K7fBrkQkyGZQ1JBP+KdORZGL1nrH0afSUe7ZOVQnK7SGKDO7gBZJ8v Z1Yf1k+VZNMnT0xycQQYT6Q9jbFDGizqq0XG9WtD+wNJ0TTVSg3y5yY5dJ0bgUNP4yus xObMdvrtG/fschDN19ONUOd7VzMI9LhGY9bL1lOzUdIDoi0rVYZ7Q7IA1xirVJyIOfwb +Ta/qUB9Rjj8ecrgD2M4xYzKh/c+7OV/kgEDY7lX1QtrOXqrJdX2QpmfjZ8flPuSTTHu 31Mw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=nX8LaYlgFI8pSxcFor1JWVzHslHCgNEFBImUZwGXD/s=; b=KITzUBNkKeugUNgtAz5cnLncukPniBpkHzsdWkq1cUbCX5F5gXfNcaNxvIaPXiQoHF WcUVw5lLuZH5GDa2ChCg6RfNjjujfskd0+/eGKoCg7xQi1sDX7Ve1XFuFTFTS+lETmpL x/MjujBmUbIAcTU749p94lMbewreoSDN1M9WxXsyEkt3+tnayqYHsiIqtLhjXSYeqlJK 3R1qyPX448xQUQmTzFTqwZ09FpJhbKBUyZSK0RiPuI8dOUGA9OzdytFt1zqcjrjv3Vwl ba2vWFgk1PPSgoHeTeDFt3hAWXQGSgiTSqpIpksebPjNvRLDJJ98Wv8ustdi0Fd2zwZL KPLQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=qjC4QvUa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id bh20si5256498ejb.294.2020.10.09.00.29.20; Fri, 09 Oct 2020 00:29:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=qjC4QvUa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731939AbgJIH0d (ORCPT + 99 others); Fri, 9 Oct 2020 03:26:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731864AbgJIH0d (ORCPT ); Fri, 9 Oct 2020 03:26:33 -0400 Received: from mail-yb1-xb43.google.com (mail-yb1-xb43.google.com [IPv6:2607:f8b0:4864:20::b43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A10EC0613D2 for ; Fri, 9 Oct 2020 00:26:33 -0700 (PDT) Received: by mail-yb1-xb43.google.com with SMTP id c3so6555439ybl.0 for ; Fri, 09 Oct 2020 00:26:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=nX8LaYlgFI8pSxcFor1JWVzHslHCgNEFBImUZwGXD/s=; b=qjC4QvUagjX3m085IM+aCTiCw1n6/oUCOqJ1ODyb10/u6ToyEmSnR7d2n3CVBaQ+eJ yrIhLwzfo/Cc4FgQ2iJaw4TYvJUWdMgL6MQ1wEyTFg3NN/k1nmL1AOGmoYwlwNSgib3h 1GeQlhVSXESLJM6T0ia8+spUD+6B5L036kwFu5+KUxp+Rp4MVq3+cQPeHLlmpKMUwrah zb4ukC6A3TtIfSoVY6shi9Q63IjaiEfMOEU8dMRTDlSf0M+xEo6vbyDlAlRH9mTJnKG2 i0BdF7VCjGII/3c/uEGvHdfOOs3vQrOy2DGq9oJ4PiEMYv7CaUXDi5ju7RP8LED0lmkn 3H5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=nX8LaYlgFI8pSxcFor1JWVzHslHCgNEFBImUZwGXD/s=; b=VMqvT9FzboYKZ3h2RFAGrOrXAMAV4NWvYNd5WiX+DgsccHHwFZ6DldorUhnoI03FcG onm73GozfbKLXeijHhGKl8hTD9SqQnE74db0SIEldEKV1Hu6Sk/PC4GfHRm7z3UCIb39 OpucGlbw1OzvleBZBgJhssPjRnMfG4MFVL8YOwipO8FZzi/BSayw7bJTwJYrbt/cRdWn 3wihIzk1EwHirb0/GJL1Jqdy+ryWoKROI4xamZX6XX1Wo2PWtbNUhLbNjp2Rod3HDutF 29sNKMbjBTkcoAkTn81LDIZ3YeHXYuXJLNz7ifJTy0sSi/merq1PUOHCFHvFexUWFlf9 Rzwg== X-Gm-Message-State: AOAM532xnQoMzQt4RQM2Y1HTKk9g34R8He67i8ynPU//j8A0ldySRQNb 3oKW8v20IXzZaDYWv2tgkQxamUGSdxXytkUWxuw= X-Received: by 2002:a25:d74f:: with SMTP id o76mr16988395ybg.55.1602228392258; Fri, 09 Oct 2020 00:26:32 -0700 (PDT) MIME-Version: 1.0 References: <7a7af0e893e3ceb0d8d1f055bffca1d47025d7ba.1597645588.git.zhaoqianli@xiaomi.com> <87a6x6f21t.fsf@nanos.tec.linutronix.de> In-Reply-To: <87a6x6f21t.fsf@nanos.tec.linutronix.de> From: qianli zhao Date: Fri, 9 Oct 2020 15:26:21 +0800 Message-ID: Subject: Re: [PATCH v5] kthread: Add debugobject support To: Thomas Gleixner Cc: axboe@kernel.dk, akpm@linux-foundation.org, Felix.Kuehling@amd.com, Stephen Boyd , John Stultz , ben.dooks@codethink.co.uk, bfields@redhat.com, cl@rock-chips.com, linux-kernel@vger.kernel.org, Qianli Zhao Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi,Thomas Thanks for your reply On Thu, 1 Oct 2020 at 22:34, Thomas Gleixner wrote: > > On Mon, Aug 17 2020 at 14:37, Qianli Zhao wrote: > > From: Qianli Zhao > > > > Add debugobject support to track the life time of kthread_work > > which is used to detect reinitialization/free active object problems > > Add kthread_init_work_onstack()/kthread_init_delayed_work_onstack() for > > kthread onstack support > > > > If we reinitialize a kthread_work that has been activated, > > this will cause delayed_work_list/work_list corruption. > > enable this config,there is an chance to fixup these errors > > or WARNING the wrong use of kthread_work > > > > [30858.395766] list_del corruption. next->prev should be ffffffe388ebbf88, but was ffffffe388ebb588 > > [30858.395788] WARNING: CPU: 2 PID: 387 at kernel/msm-4.19/lib/list_debug.c:56 __list_del_entry_valid+0xc8/0xd0 > > ... > > [30858.395906] Call trace: > > [30858.395909] __list_del_entry_valid+0xc8/0xd0 > > [30858.395912] __kthread_cancel_work_sync+0x98/0x138 > > [30858.395915] kthread_cancel_delayed_work_sync+0x10/0x20 > > [30858.395917] sde_encoder_resource_control+0xe8/0x12c0 > > [30858.395920] sde_encoder_prepare_for_kickoff+0x5dc/0x2008 > > [30858.395923] sde_crtc_commit_kickoff+0x280/0x890 > > [30858.395925] sde_kms_commit+0x16c/0x278 > > [30858.395928] complete_commit+0x3c4/0x760 > > [30858.395931] _msm_drm_commit_work_cb+0xec/0x1e0 > > [30858.395933] kthread_worker_fn+0xf8/0x190 > > [30858.395935] kthread+0x118/0x128 > > [30858.395938] ret_from_fork+0x10/0x18 > > > > crash> struct kthread_worker.delayed_work_list 0xffffffe3893925f0 > > [ffffffe389392620] delayed_work_list = { > > next = 0xffffffe388ebbf88, > > prev = 0xffffffe388ebb588 > > } > > crash> list 0xffffffe388ebbf88 > > ffffffe388ebbf88 > > This changelog is confusing at best. Something like this perhaps? > > kthread_work is not covered by debug objects, but the same problems as with > regular work objects apply. > > Some of the issues like reinitialization of an active kthread_work are hard > to debug because the problem manifests itself later in a completely > different context. > > Add debugobject support along with the necessary fixup functions to make > debugging of these problems less tedious. > Will follow your tips to improve. > > +static void stub_kthread_work(struct kthread_work *unuse) > > unused? > > > +{ > > + WARN_ON(1); > > +} When the kthread_work is not initialized, kwork_fixup_assert_init() will call kthread_init_work() to fixup this kthread_work,and kthread_init_work() needs a function as a parameter,so we have to quote an empty function(refer to stub_timer()). > > void kthread_flush_work(struct kthread_work *work) > > { > > struct kthread_flush_work fwork = { > > - KTHREAD_WORK_INIT(fwork.work, kthread_flush_work_fn), > > - COMPLETION_INITIALIZER_ONSTACK(fwork.done), > > + .done = COMPLETION_INITIALIZER_ONSTACK(fwork.done), > > Eew. Why is the completion initialized seperately instead of being > initialized as part of kthread_init_work_onstack() ? > I just try to keep the same as before,how about change as below? completion initialized as part of kthread_init_work_onstack() @@ -1319,10 +1318,9 @@ bool kthread_cancel_delayed_work_sync(struct kthread_delayed_work *dwork) */ void kthread_flush_worker(struct kthread_worker *worker) { - struct kthread_flush_work fwork = { - .done = COMPLETION_INITIALIZER_ONSTACK(fwork.done), - }; + struct kthread_flush_work fwork; + fwork.done = COMPLETION_INITIALIZER_ONSTACK(fwork.done); kthread_init_work_onstack(&fwork.work, kthread_flush_work_fn); > > }; > > struct kthread_worker *worker; > > bool noop = false; > > > > + debug_kwork_assert_init(work); > > worker = work->worker; > > if (!worker) > > return; > > > > + kthread_init_work_onstack(&fwork.work, kthread_flush_work_fn); > > > > @@ -1194,12 +1319,13 @@ EXPORT_SYMBOL_GPL(kthread_cancel_delayed_work_sync); > > void kthread_flush_worker(struct kthread_worker *worker) > > { > > struct kthread_flush_work fwork = { > > - KTHREAD_WORK_INIT(fwork.work, kthread_flush_work_fn), > > - COMPLETION_INITIALIZER_ONSTACK(fwork.done), > > + .done = COMPLETION_INITIALIZER_ONSTACK(fwork.done), > > }; > > Ditto. > > > + kthread_init_work_onstack(&fwork.work, kthread_flush_work_fn); > > kthread_queue_work(worker, &fwork.work); > > wait_for_completion(&fwork.done); > > + destroy_kwork_on_stack(&fwork.work); > > Thanks, > > tglx