Subject: Re: [PATCH] acpi_pm: Reduce PMTMR counter read contention
From: Zhenzhong Duan
Organization: Oracle
Reply-To: zhenzhong.duan@oracle.com
To: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Daniel Lezcano,
    Waiman Long, Srinivas Eeda, kin.cho@oracle.com
Date: Mon, 18 Feb 2019 11:48:37 +0800
Message-ID: <8ffea578-5eb6-f479-7bd4-668df84d930f@oracle.com>
References: <1548141807-25825-1-git-send-email-zhenzhong.duan@oracle.com>
    <019e583c-7bcb-c234-200c-fcdb6c49fbb0@oracle.com>
    <853e8cf6-aba9-0200-8e39-e362848399ba@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019/2/11 5:08, Thomas Gleixner wrote:
> On Sat, 2 Feb 2019, Zhenzhong Duan wrote:
>> On 2019/1/31 22:26, Thomas Gleixner wrote:
>
>>>>> I'm not against the change per se, but I really want to understand
>>>>> why we need all the complexity for something which should never be
>>>>> used in a real world deployment.
>>>>>
>>>> Hmm, it's a strong word of "never be used". Customers may happen to
>>>> use nohpet(sanity test?) and report bug to us. Sometimes they does
>>>> report a bug that reproduce with their customed config. There may
>>>> also be BIOS setting HPET disabled.
>>>
>>> And because the customer MAY do completely nonsensical things (and there
>>> are a lot more than the HPET) the kernel has to handle all of them?
>>
>> Ok, then. I don't have more suggestion to convince you.
>
> You give up too fast :)

Ah, because I thought of a simple fix.

>
> The point is, that we really want proper justifications for changes like
> this. Some 'may, could and more handwaving' simply does not cut it.
>
> So if you can just describe a realistic scenario, which does not involve
> thoughtless flipping of BIOS options, then this becomes way more palatable.

I indeed don't see a realistic scenario in a production environment that
needs nohpet. My only justification is that, since nohpet exists as a
kernel parameter, we should fix the softlockup on large machines for
enterprise use.

>
>> I just think of a simple fix as below. I think it will work for both hpet
>> and pmtmr. We will test it when the env is available.
>
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -1353,6 +1353,7 @@ static int change_clocksource(void *data)
>>
>>      write_seqcount_end(&tk_core.seq);
>>      raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
>> +    tick_clock_notify();
>>
>>      return 0;
>>  }
>> @@ -1371,7 +1372,6 @@ int timekeeping_notify(struct clocksource *clock)
>>      if (tk->tkr_mono.clock == clock)
>>          return 0;
>>      stop_machine(change_clocksource, clock, NULL);
>> -    tick_clock_notify();
>>      return tk->tkr_mono.clock == clock ? 0 : -1;
>>  }
>
> This won't resolve the concurrency issues of HPET or PMTIMER in any
> way.

I just got a chance to test it, and Kin confirmed that it fixes the
softlockup at the bootup stage for both PMTMR (with nohpet) and HPET
(without nohpet, with the previous hpet commit reverted).

My understanding is that at the bootup stage the tick device is first
initialized in periodic mode and then switched to one-shot mode. In
periodic mode the clock event interrupt is triggered every 1ms (HZ=1000);
contention on HPET or PMTIMER exceeds 1ms and delays the clock interrupt.
The CPUs then keep processing interrupts one after another without a
break, tick_clock_notify() has no chance to be called, and we never
switch to one-shot mode.

In one-shot mode the contention is still there, but the next event is
always programmed with a future value. We may miss some ticks, but the
timer code is smart enough to pick up those missed ticks.

By moving tick_clock_notify() into stop_machine(), the kernel switches to
one-shot mode early, before the contention accumulates and locks up the
system.

> Instead it breaks the careful orchestrated mechanism of clocksource
> change.

Sorry, I don't see how it breaks that. tick_clock_notify() is a simple
function that sets a bit in a per-CPU variable. Could you explain a bit?

Thanks
Zhenzhong
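
The periodic versus one-shot reasoning above can be sketched with a
small user-space toy model. It is not kernel code: the names
tick_mode and free_time_us() are hypothetical, and the model assumes a
single CPU whose tick handler costs a fixed handler_cost_us because of
the contended clocksource read. It only shows the arithmetic of the
argument: once the handler costs more than one period, a periodic tick
is always overdue, while a one-shot tick programmed at the next future
period boundary still leaves gaps between interrupts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; this is a toy model, not kernel code. */
enum tick_mode { TICK_PERIODIC, TICK_ONESHOT };

/*
 * Return how many microseconds, out of duration_us, one CPU spends
 * outside the tick handler when each handler run costs handler_cost_us.
 */
uint64_t free_time_us(enum tick_mode mode, uint64_t period_us,
                      uint64_t handler_cost_us, uint64_t duration_us)
{
    uint64_t now = 0;
    uint64_t next_event = period_us;
    uint64_t free_us = 0;

    assert(period_us > 0);

    while (now < duration_us) {
        if (next_event > now) {
            /* The CPU is free until the next tick fires (or time runs out). */
            uint64_t wake = next_event < duration_us ? next_event : duration_us;

            free_us += wake - now;
            now = wake;
            if (now >= duration_us)
                break;
        }

        /* The tick handler runs; the contended counter read makes it slow. */
        now += handler_cost_us;

        if (mode == TICK_PERIODIC) {
            /* Every period raises its own interrupt, even if already overdue. */
            next_event += period_us;
        } else {
            /* Missed periods are skipped; program the next boundary after now. */
            next_event = (now / period_us + 1) * period_us;
        }
    }

    return free_us;
}
```

Calling free_time_us(TICK_PERIODIC, 1000, 1500, 1000000) models a 1ms
tick with a 1.5ms handler: after the first period the CPU never leaves
the handler. The same parameters with TICK_ONESHOT leave time between
interrupts, and with a handler cheaper than one period the two modes
behave the same.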