Message-ID: <486d6d25-77e6-5fe4-4110-7256c20ba742@quicinc.com>
Date: Wed, 6 Dec 2023 16:09:56 +0530
Subject: Re: [PATCH v4] PM / devfreq: Synchronize device_monitor_[start/stop]
References: <1700860318-4025-1-git-send-email-quic_mojha@quicinc.com>
From: Mukesh Ojha
In-Reply-To: <1700860318-4025-1-git-send-email-quic_mojha@quicinc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Gentle reminder..

-Mukesh

On 11/25/2023 2:41 AM, Mukesh Ojha wrote:
> Frequently switching the governor in a loop can corrupt the
> timer list: the same timer ends up being cancelled from two
> places, first from cancel_delayed_work_sync() and then from
> expire_timers(), as can be seen in the traces [1].
>
> while true
> do
> 	echo "simple_ondemand" > /sys/class/devfreq/1d84000.ufshc/governor
> 	echo "performance" > /sys/class/devfreq/1d84000.ufshc/governor
> done
>
> This looks to be an issue in the devfreq driver:
> devfreq_monitor_[start/stop] need to be synchronized so that
> the delayed work does not get corrupted while it is being
> queued, running, or cancelled.
>
> Let's use a polling flag and the devfreq lock to keep the timer
> instance from being queued twice and the work data from being
> corrupted.
>
> [1]
> ...
> ..
> <idle>-0 [003] 9436.209662: timer_cancel timer=0xffffff80444f0428
> <idle>-0 [003] 9436.209664: timer_expire_entry timer=0xffffff80444f0428 now=0x10022da1c function=__typeid__ZTSFvP10timer_listE_global_addr baseclk=0x10022da1c
> <idle>-0 [003] 9436.209718: timer_expire_exit timer=0xffffff80444f0428
> kworker/u16:6-14217 [003] 9436.209863: timer_start timer=0xffffff80444f0428 function=__typeid__ZTSFvP10timer_listE_global_addr expires=0x10022da2b now=0x10022da1c flags=182452227
> vendor.xxxyyy.ha-1593 [004] 9436.209888: timer_cancel timer=0xffffff80444f0428
> vendor.xxxyyy.ha-1593 [004] 9436.216390: timer_init timer=0xffffff80444f0428
> vendor.xxxyyy.ha-1593 [004] 9436.216392: timer_start timer=0xffffff80444f0428 function=__typeid__ZTSFvP10timer_listE_global_addr expires=0x10022da2c now=0x10022da1d flags=186646532
> vendor.xxxyyy.ha-1593 [005] 9436.220992: timer_cancel timer=0xffffff80444f0428
> xxxyyyTraceManag-7795 [004] 9436.261641: timer_cancel timer=0xffffff80444f0428
>
> [2]
>
> [ 9436.261653][ C4] Unable to handle kernel paging request at virtual address dead00000000012a
> [ 9436.261664][ C4] Mem abort info:
> [ 9436.261666][ C4] ESR = 0x96000044
> [ 9436.261669][ C4] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 9436.261671][ C4] SET = 0, FnV = 0
> [ 9436.261673][ C4] EA = 0, S1PTW = 0
> [ 9436.261675][ C4] Data abort info:
> [ 9436.261677][ C4] ISV = 0, ISS = 0x00000044
> [ 9436.261680][ C4] CM = 0, WnR = 1
> [ 9436.261682][ C4] [dead00000000012a] address between user and kernel address ranges
> [ 9436.261685][ C4] Internal error: Oops: 96000044 [#1] PREEMPT SMP
> [ 9436.261701][ C4] Skip md ftrace buffer dump for: 0x3a982d0
> ...
>
> [ 9436.262138][ C4] CPU: 4 PID: 7795 Comm: TraceManag Tainted: G S W O 5.10.149-android12-9-o-g17f915d29d0c #1
> [ 9436.262141][ C4] Hardware name: Qualcomm Technologies, Inc. (DT)
> [ 9436.262144][ C4] pstate: 22400085 (nzCv daIf +PAN -UAO +TCO BTYPE=--)
> [ 9436.262161][ C4] pc : expire_timers+0x9c/0x438
> [ 9436.262164][ C4] lr : expire_timers+0x2a4/0x438
> [ 9436.262168][ C4] sp : ffffffc010023dd0
> [ 9436.262171][ C4] x29: ffffffc010023df0 x28: ffffffd0636fdc18
> [ 9436.262178][ C4] x27: ffffffd063569dd0 x26: ffffffd063536008
> [ 9436.262182][ C4] x25: 0000000000000001 x24: ffffff88f7c69280
> [ 9436.262185][ C4] x23: 00000000000000e0 x22: dead000000000122
> [ 9436.262188][ C4] x21: 000000010022da29 x20: ffffff8af72b4e80
> [ 9436.262191][ C4] x19: ffffffc010023e50 x18: ffffffc010025038
> [ 9436.262195][ C4] x17: 0000000000000240 x16: 0000000000000201
> [ 9436.262199][ C4] x15: ffffffffffffffff x14: ffffff889f3c3100
> [ 9436.262203][ C4] x13: ffffff889f3c3100 x12: 00000000049f56b8
> [ 9436.262207][ C4] x11: 00000000049f56b8 x10: 00000000ffffffff
> [ 9436.262212][ C4] x9 : ffffffc010023e50 x8 : dead000000000122
> [ 9436.262216][ C4] x7 : ffffffffffffffff x6 : ffffffc0100239d8
> [ 9436.262220][ C4] x5 : 0000000000000000 x4 : 0000000000000101
> [ 9436.262223][ C4] x3 : 0000000000000080 x2 : ffffff889edc155c
> [ 9436.262227][ C4] x1 : ffffff8001005200 x0 : ffffff80444f0428
> [ 9436.262232][ C4] Call trace:
> [ 9436.262236][ C4] expire_timers+0x9c/0x438
> [ 9436.262240][ C4] __run_timers+0x1f0/0x330
> [ 9436.262245][ C4] run_timer_softirq+0x28/0x58
> [ 9436.262255][ C4] efi_header_end+0x168/0x5ec
> [ 9436.262265][ C4] __irq_exit_rcu+0x108/0x124
> [ 9436.262274][ C4] __handle_domain_irq+0x118/0x1e4
> [ 9436.262282][ C4] gic_handle_irq.30369+0x6c/0x2bc
> [ 9436.262286][ C4] el0_irq_naked+0x60/0x6c
>
> Reported-by: Joyyoung Huang
> Signed-off-by: Mukesh Ojha
> ---
> Huang,
>
> Would be looking for your tested-by..
>
> -Mukesh
>
>
> Changes in v4: https://lore.kernel.org/lkml/1700238027-20518-1-git-send-email-quic_mojha@quicinc.com/
>  - Mistakenly put cancel work under devfreq lock which could result in deadlock
>    reported by [Joyyoung Huang]
>    https://lore.kernel.org/lkml/KL1PR02MB8141D1A307457AF69EBB6AFBA3B8A@KL1PR02MB8141.apcprd02.prod.outlook.com/
>
> Changes in v3: https://lore.kernel.org/lkml/1700235522-31105-1-git-send-email-quic_mojha@quicinc.com/
>  - Remove the unexpected 'twice' from the subject.
>
> Changes in v2: https://lore.kernel.org/lkml/1699957648-31299-1-git-send-email-quic_mojha@quicinc.com/
>  - Changed subject.
>  - Added lock to avoid work data corruption due to
>    parallel calls to devfreq_monitor_start while work
>    is queued in flight.
>  - Added lock to cover the same as above case while the
>    work is being cancelled.
>  - Added Reported-by for similar issue reported at
>    https://lore.kernel.org/lkml/SEYPR02MB565398175FA093AC3E63EE7BA3B0A@SEYPR02MB5653.apcprd02.prod.outlook.com/
>
>  drivers/devfreq/devfreq.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index b3a68d5833bd..cb1c24721a37 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -461,10 +461,14 @@ static void devfreq_monitor(struct work_struct *work)
>  	if (err)
>  		dev_err(&devfreq->dev, "dvfs failed with (%d) error\n", err);
>  
> +	if (devfreq->stop_polling)
> +		goto out;
> +
>  	queue_delayed_work(devfreq_wq, &devfreq->work,
>  				msecs_to_jiffies(devfreq->profile->polling_ms));
> -	mutex_unlock(&devfreq->lock);
>  
> +out:
> +	mutex_unlock(&devfreq->lock);
>  	trace_devfreq_monitor(devfreq);
>  }
>  
> @@ -483,6 +487,10 @@ void devfreq_monitor_start(struct devfreq *devfreq)
>  	if (IS_SUPPORTED_FLAG(devfreq->governor->flags, IRQ_DRIVEN))
>  		return;
>  
> +	mutex_lock(&devfreq->lock);
> +	if (delayed_work_pending(&devfreq->work))
> +		goto out;
> +
>  	switch (devfreq->profile->timer) {
>  	case DEVFREQ_TIMER_DEFERRABLE:
>  		INIT_DEFERRABLE_WORK(&devfreq->work, devfreq_monitor);
> @@ -491,12 +499,16 @@ void devfreq_monitor_start(struct devfreq *devfreq)
>  		INIT_DELAYED_WORK(&devfreq->work, devfreq_monitor);
>  		break;
>  	default:
> -		return;
> +		goto out;
>  	}
>  
>  	if (devfreq->profile->polling_ms)
>  		queue_delayed_work(devfreq_wq, &devfreq->work,
>  			msecs_to_jiffies(devfreq->profile->polling_ms));
> +
> +out:
> +	devfreq->stop_polling = false;
> +	mutex_unlock(&devfreq->lock);
>  }
>  EXPORT_SYMBOL(devfreq_monitor_start);
>  
> @@ -513,6 +525,14 @@ void devfreq_monitor_stop(struct devfreq *devfreq)
>  	if (IS_SUPPORTED_FLAG(devfreq->governor->flags, IRQ_DRIVEN))
>  		return;
>  
> +	mutex_lock(&devfreq->lock);
> +	if (devfreq->stop_polling) {
> +		mutex_unlock(&devfreq->lock);
> +		return;
> +	}
> +
> +	devfreq->stop_polling = true;
> +	mutex_unlock(&devfreq->lock);
>  	cancel_delayed_work_sync(&devfreq->work);
>  }
>  EXPORT_SYMBOL(devfreq_monitor_stop);
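
For anyone following the thread, the synchronization in the quoted patch can be modeled in user space roughly as below. This is only a single-threaded sketch under stated assumptions, not the kernel code: struct monitor, monitor_init(), monitor_start(), monitor_work() and monitor_stop() are hypothetical names, work_pending stands in for delayed_work_pending(), a pthread mutex stands in for devfreq->lock, and the synchronous cancel is reduced to clearing a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the patch; none of this is kernel API. */
struct monitor {
	pthread_mutex_t lock;	/* stands in for devfreq->lock */
	bool stop_polling;	/* set by stop, cleared by start */
	bool work_pending;	/* stands in for delayed_work_pending() */
	int queue_count;	/* number of times the work was queued */
};

static void monitor_init(struct monitor *m)
{
	assert(m != NULL);
	pthread_mutex_init(&m->lock, NULL);
	m->stop_polling = false;
	m->work_pending = false;
	m->queue_count = 0;
}

/* Queue the work only if it is not already pending. */
static void monitor_start(struct monitor *m)
{
	pthread_mutex_lock(&m->lock);
	if (!m->work_pending) {
		m->work_pending = true;
		m->queue_count++;
	}
	m->stop_polling = false;
	pthread_mutex_unlock(&m->lock);
}

/* Work body: re-arm itself unless a stop was requested. */
static void monitor_work(struct monitor *m)
{
	pthread_mutex_lock(&m->lock);
	m->work_pending = false;	/* the work is running now */
	if (!m->stop_polling) {
		m->work_pending = true;
		m->queue_count++;
	}
	pthread_mutex_unlock(&m->lock);
}

/* Returns true only for the caller that actually performs the cancel. */
static bool monitor_stop(struct monitor *m)
{
	pthread_mutex_lock(&m->lock);
	if (m->stop_polling) {
		pthread_mutex_unlock(&m->lock);
		return false;
	}
	m->stop_polling = true;
	pthread_mutex_unlock(&m->lock);

	/*
	 * The cancel happens after dropping the lock, as in v4: the work
	 * body takes the same lock, so a synchronous cancel under it could
	 * deadlock. Here the cancel is modeled by clearing work_pending.
	 */
	pthread_mutex_lock(&m->lock);
	m->work_pending = false;
	pthread_mutex_unlock(&m->lock);
	return true;
}
```

In this model, calling monitor_start() twice queues the work once, a second monitor_stop() returns without cancelling again, and a work body that runs after a stop does not re-arm itself. Those are the double queue and double cancel cases the patch is meant to close.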