Dear Mika, dear Rafael,
Commit c2bf1fc2 (PCI: Add missing link delays required by the PCIe spec) [1]
increases the resume time from ACPI S3 on a desktop system Dell OptiPlex 5040
by one second. It looks like this is expected from the commit message, but
breaks existing systems with boot time requirements. I measured this with the
help of the pm-graph script `sleepgraph.py` [2].
0000:00:01.0 resume_noirq (1134.715 ms @ 5039.412578 to 5040.547293)
…
pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)
$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1)
00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] [1002:6610] (rev 81)
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
02:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
So, it’s about the internal Intel graphics device, which is not used on this
system, as there is an external AMD graphics device plugged in.
As far as I understand it, it’s a bug in the firmware, that a one second delay
is specified?
Anyway, there is such firmware out there, so I’d like to avoid the time
increases.
As a first step, the commit should be extended to print a warning (maybe if
`initcall_debug` is specified), when the delay is higher than let’s say 50(?)
ms. Also better documentation how to debug these delays would be appreciated.
If there is no easy solution, it’d be great if the commit could be reverted for
now, and a better solution be discussed for the next release.
Kind regards,
Paul
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5
[2]: https://github.com/intel/pm-graph
+Nicholas and Matthias
On Tue, Aug 06, 2019 at 11:20:37AM +0200, Paul Menzel wrote:
> Dear Mika, dear Rafael,
>
>
> Commit c2bf1fc2 (PCI: Add missing link delays required by the PCIe spec) [1]
> increases the resume time from ACPI S3 on a desktop system Dell OptiPlex 5040
> by one second. It looks like this is expected from the commit message, but
> breaks existing systems with boot time requirements. I measured this with the
> help of the pm-graph script `sleepgraph.py` [2].
>
> 0000:00:01.0 resume_noirq (1134.715 ms @ 5039.412578 to 5040.547293)
> …
> pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)
>
> $ lspci -nn
> 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
> 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
> 00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
> 00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
> 00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
> 00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
> 00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1)
> 00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
> 00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
> 00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
> 00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
> 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] [1002:6610] (rev 81)
> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
> 02:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
>
> So, it’s about the internal Intel graphics device, which is not used on this
> system, as there is an external AMD graphics device plugged in.
>
> As far as I understand it, it’s a bug in the firmware, that a one second delay
> is specified?
>
> Anyway, there is such firmware out there, so I’d like to avoid the time
> increases.
>
> As a first step, the commit should be extended to print a warning (maybe if
> `initcall_debug` is specified), when the delay is higher than let’s say 50(?)
> ms. Also better documentation how to debug these delays would be appreciated.
>
> If there is no easy solution, it’d be great if the commit could be reverted for
> now, and a better solution be discussed for the next release.
There is also kernel bugzilla entry about another regression this causes
here:
https://bugzilla.kernel.org/show_bug.cgi?id=204413
I agree we should revert c2bf1fc2 now. I'll try to come up alternative
solution to these missing delays that hopefully does not break existing
setups.
Rafael, Bjorn,
Can you revert the commit or do you want me to send a revert patch?
Thanks and sorry about the breakage.
Dear Mika,
Thank you for your quick reply.
On 06.08.19 11:36, Mika Westerberg wrote:
> +Nicholas and Matthias
>
> On Tue, Aug 06, 2019 at 11:20:37AM +0200, Paul Menzel wrote:
>> Commit c2bf1fc2 (PCI: Add missing link delays required by the PCIe spec) [1]
>> increases the resume time from ACPI S3 on a desktop system Dell OptiPlex 5040
>> by one second. It looks like this is expected from the commit message, but
>> breaks existing systems with boot time requirements. I measured this with the
>> help of the pm-graph script `sleepgraph.py` [2].
>>
>> 0000:00:01.0 resume_noirq (1134.715 ms @ 5039.412578 to 5040.547293)
>> …
>> pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)
By the way, here is the trace excerpt with the interesting comments.
```
5040.547284 | 1) kworker-3594 | 1132852 us | } /* schedule_timeout */
5040.547284 | 1) kworker-3594 | 1132853 us | } /* msleep */
5040.547284 | 1) kworker-3594 | 1132853 us | } /* pcie_wait_for_link_delay */
5040.547284 | 1) kworker-3594 | 1132856 us | } /* wait_for_downstream_link */
5040.547285 | 1) kworker-3594 | | device_for_each_child() {
5040.547285 | 1) kworker-3594 | 0.185 us | _raw_spin_lock_irqsave();
5040.547286 | 1) kworker-3594 | 0.136 us | _raw_spin_unlock_irqrestore();
5040.547286 | 1) kworker-3594 | 0.190 us | pm_iter();
5040.547286 | 1) kworker-3594 | 0.129 us | _raw_spin_lock_irqsave();
5040.547287 | 1) kworker-3594 | 0.134 us | _raw_spin_unlock_irqrestore();
5040.547287 | 1) kworker-3594 | 0.194 us | pm_iter();
5040.547287 | 1) kworker-3594 | 0.134 us | _raw_spin_lock_irqsave();
5040.547288 | 1) kworker-3594 | 0.133 us | _raw_spin_unlock_irqrestore();
5040.547288 | 1) kworker-3594 | 0.187 us | pm_iter();
5040.547288 | 1) kworker-3594 | 0.135 us | _raw_spin_lock_irqsave();
5040.547289 | 1) kworker-3594 | 0.135 us | _raw_spin_unlock_irqrestore();
5040.547289 | 1) kworker-3594 | 0.271 us | pm_iter();
5040.547289 | 1) kworker-3594 | 0.132 us | _raw_spin_lock_irqsave();
5040.547290 | 1) kworker-3594 | 0.137 us | _raw_spin_unlock_irqrestore();
5040.547290 | 1) kworker-3594 | 5.036 us | } /* device_for_each_child */
5040.547290 | 1) kworker-3594 | 1132862 us | } /* pcie_port_device_resume_noirq */
5040.547290 | 1) kworker-3594 | 1134709 us | } /* pci_pm_resume_noirq */
```
>> $ lspci -nn
>> 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
>> 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
>> 00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
>> 00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
>> 00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
>> 00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
>> 00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1)
>> 00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
>> 00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
>> 00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
>> 00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
>> 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
>> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] [1002:6610] (rev 81)
>> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
>> 02:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
>>
>> So, it’s about the internal Intel graphics device, which is not used on this
>> system, as there is an external AMD graphics device plugged in.
>>
>> As far as I understand it, it’s a bug in the firmware, that a one second delay
>> is specified?
How can I read out the delay from the system as done in?
```
static int get_downstream_delay(struct pci_bus *bus)
{
struct pci_dev *pdev;
int min_delay = 100;
int max_delay = 0;
list_for_each_entry(pdev, &bus->devices, bus_list) {
if (!pdev->imm_ready)
min_delay = 0;
else if (pdev->d3cold_delay < min_delay)
min_delay = pdev->d3cold_delay;
if (pdev->d3cold_delay > max_delay)
max_delay = pdev->d3cold_delay;
}
return max(min_delay, max_delay);
}
```
>> Anyway, there is such firmware out there, so I’d like to avoid the time
>> increases.
>>
>> As a first step, the commit should be extended to print a warning (maybe if
>> `initcall_debug` is specified), when the delay is higher than let’s say 50(?)
>> ms. Also better documentation how to debug these delays would be appreciated.
As your commit message says the standard demands a delay of at least 100 ms, 50 ms
is of course too short, and maybe 150 ms or so should be used as the threshold.
>> If there is no easy solution, it’d be great if the commit could be reverted for
>> now, and a better solution be discussed for the next release.
>
> There is also kernel bugzilla entry about another regression this causes
> here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=204413
>
> I agree we should revert c2bf1fc2 now. I'll try to come up alternative
> solution to these missing delays that hopefully does not break existing
> setups.
>
> Rafael, Bjorn,
>
> Can you revert the commit or do you want me to send a revert patch?
>
> Thanks and sorry about the breakage.
No worries.
Kind regards,
Paul
PS:
```
$ sudo lspci -vvv -s 00:00.0
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers (rev 07)
Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: [e0] Vendor Specific Information: Len=10 <?>
Kernel driver in use: skl_uncore
$ sudo lspci -vvv -s 00:01
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 120
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000e000-0000efff [size=4K]
Memory behind bridge: f7e00000-f7efffff [size=1M]
Prefetchable memory behind bridge: 00000000e0000000-00000000efffffff [size=256M]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [88] Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
Capabilities: [80] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee04004 Data: 4021
Capabilities: [a0] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 256 bytes, MaxReadReq 128 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <8us
ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt+
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #2, PowerLimit 75.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet+ LinkState-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Via WAKE# ARIFwd-
AtomicOpsCap: Routing- 32bit+ 64bit+ 128bitCAS+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Via WAKE# ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [140 v1] Root Complex Link
Desc: PortNumber=02 ComponentID=01 EltType=Config
Link0: Desc: TargetPort=00 TargetComponent=01 AssocRCRB- LinkType=MemMapped LinkValid+
Addr: 00000000fed19000
Capabilities: [d94 v1] #19
Kernel driver in use: pcieport
```
On Tue, Aug 06, 2019 at 11:57:26AM +0200, Paul Menzel wrote:
> Dear Mika,
>
>
> Thank you for your quick reply.
>
>
> On 06.08.19 11:36, Mika Westerberg wrote:
> > +Nicholas and Matthias
> >
> > On Tue, Aug 06, 2019 at 11:20:37AM +0200, Paul Menzel wrote:
>
> >> Commit c2bf1fc2 (PCI: Add missing link delays required by the PCIe spec) [1]
> >> increases the resume time from ACPI S3 on a desktop system Dell OptiPlex 5040
> >> by one second. It looks like this is expected from the commit message, but
> >> breaks existing systems with boot time requirements. I measured this with the
> >> help of the pm-graph script `sleepgraph.py` [2].
> >>
> >> 0000:00:01.0 resume_noirq (1134.715 ms @ 5039.412578 to 5040.547293)
> >> …
> >> pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)
>
> By the way, here is the trace excerpt with the interesting comments.
>
> ```
> 5040.547284 | 1) kworker-3594 | 1132852 us | } /* schedule_timeout */
> 5040.547284 | 1) kworker-3594 | 1132853 us | } /* msleep */
> 5040.547284 | 1) kworker-3594 | 1132853 us | } /* pcie_wait_for_link_delay */
> 5040.547284 | 1) kworker-3594 | 1132856 us | } /* wait_for_downstream_link */
> 5040.547285 | 1) kworker-3594 | | device_for_each_child() {
> 5040.547285 | 1) kworker-3594 | 0.185 us | _raw_spin_lock_irqsave();
> 5040.547286 | 1) kworker-3594 | 0.136 us | _raw_spin_unlock_irqrestore();
> 5040.547286 | 1) kworker-3594 | 0.190 us | pm_iter();
> 5040.547286 | 1) kworker-3594 | 0.129 us | _raw_spin_lock_irqsave();
> 5040.547287 | 1) kworker-3594 | 0.134 us | _raw_spin_unlock_irqrestore();
> 5040.547287 | 1) kworker-3594 | 0.194 us | pm_iter();
> 5040.547287 | 1) kworker-3594 | 0.134 us | _raw_spin_lock_irqsave();
> 5040.547288 | 1) kworker-3594 | 0.133 us | _raw_spin_unlock_irqrestore();
> 5040.547288 | 1) kworker-3594 | 0.187 us | pm_iter();
> 5040.547288 | 1) kworker-3594 | 0.135 us | _raw_spin_lock_irqsave();
> 5040.547289 | 1) kworker-3594 | 0.135 us | _raw_spin_unlock_irqrestore();
> 5040.547289 | 1) kworker-3594 | 0.271 us | pm_iter();
> 5040.547289 | 1) kworker-3594 | 0.132 us | _raw_spin_lock_irqsave();
> 5040.547290 | 1) kworker-3594 | 0.137 us | _raw_spin_unlock_irqrestore();
> 5040.547290 | 1) kworker-3594 | 5.036 us | } /* device_for_each_child */
> 5040.547290 | 1) kworker-3594 | 1132862 us | } /* pcie_port_device_resume_noirq */
> 5040.547290 | 1) kworker-3594 | 1134709 us | } /* pci_pm_resume_noirq */
> ```
>
> >> $ lspci -nn
> >> 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
> >> 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
> >> 00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
> >> 00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
> >> 00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
> >> 00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
> >> 00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1)
> >> 00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
> >> 00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
> >> 00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
> >> 00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
> >> 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
> >> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] [1002:6610] (rev 81)
> >> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
> >> 02:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
> >>
> >> So, it’s about the internal Intel graphics device, which is not used on this
> >> system, as there is an external AMD graphics device plugged in.
> >>
> >> As far as I understand it, it’s a bug in the firmware, that a one second delay
> >> is specified?
>
> How can I read out the delay from the system as done in?
The delay is not system wide so it depends on the device. Typically it
is 100ms but there is a way to shorten it using ACPI _DSM.
> ```
> static int get_downstream_delay(struct pci_bus *bus)
> {
> struct pci_dev *pdev;
> int min_delay = 100;
> int max_delay = 0;
>
> list_for_each_entry(pdev, &bus->devices, bus_list) {
> if (!pdev->imm_ready)
> min_delay = 0;
> else if (pdev->d3cold_delay < min_delay)
> min_delay = pdev->d3cold_delay;
> if (pdev->d3cold_delay > max_delay)
> max_delay = pdev->d3cold_delay;
> }
>
> return max(min_delay, max_delay);
> }
> ```
>
> >> Anyway, there is such firmware out there, so I’d like to avoid the time
> >> increases.
> >>
> >> As a first step, the commit should be extended to print a warning (maybe if
> >> `initcall_debug` is specified), when the delay is higher than let’s say 50(?)
> >> ms. Also better documentation how to debug these delays would be appreciated.
>
> As your commit message says the standard demands a delay of at least 100 ms, 50 ms
> is of course too short, and maybe 150 ms or so should be used as the threshold.
>
> >> If there is no easy solution, it’d be great if the commit could be reverted for
> >> now, and a better solution be discussed for the next release.
> >
> > There is also kernel bugzilla entry about another regression this causes
> > here:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=204413
> >
> > I agree we should revert c2bf1fc2 now. I'll try to come up alternative
> > solution to these missing delays that hopefully does not break existing
> > setups.
> >
> > Rafael, Bjorn,
> >
> > Can you revert the commit or do you want me to send a revert patch?
> >
> > Thanks and sorry about the breakage.
>
> No worries.
Thanks for the lspci output. This explains the 1 second delay:
> LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <8us
> ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
The port does not support active link reporting. Can you try the below
patch?
Nicholas, can you also try it? I think it should solve your issue as
well.
Thanks!
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 308c3e0c4a34..bb8c753013d0 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -434,7 +434,8 @@ static void wait_for_downstream_link(struct pci_dev *pdev)
* need to wait 100ms. For higher speeds (gen3) we need to wait
* first for the data link layer to become active.
*/
- if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT)
+ if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT ||
+ !pdev->link_active_reporting)
msleep(delay);
else
pcie_wait_for_link_delay(pdev, true, delay);
On Tue, Aug 06, 2019 at 04:02:42PM +0200, Paul Menzel wrote:
> >> How can I read out the delay from the system as done in?
> >
> > The delay is not system wide so it depends on the device. Typically it
> > is 100ms but there is a way to shorten it using ACPI _DSM.
>
> Yeah, I know. I was wondering if it can easily be read out for a device
> under `/sys`, other debug level or, for example, with `lspci` or
> `acpidump`.
AFAIK there is no easy way at least without patching the kernel. You can
log pdev->d3cold_delay for each device in pci_pm_init() for example.
> >> ```
> >> static int get_downstream_delay(struct pci_bus *bus)
> >> {
> >> struct pci_dev *pdev;
> >> int min_delay = 100;
> >> int max_delay = 0;
> >>
> >> list_for_each_entry(pdev, &bus->devices, bus_list) {
> >> if (!pdev->imm_ready)
> >> min_delay = 0;
> >> else if (pdev->d3cold_delay < min_delay)
> >> min_delay = pdev->d3cold_delay;
> >> if (pdev->d3cold_delay > max_delay)
> >> max_delay = pdev->d3cold_delay;
> >> }
> >>
> >> return max(min_delay, max_delay);
> >> }
> >> ```
> >>
> >>>> Anyway, there is such firmware out there, so I’d like to avoid the time
> >>>> increases.
> >>>>
> >>>> As a first step, the commit should be extended to print a warning (maybe if
> >>>> `initcall_debug` is specified), when the delay is higher than let’s say 50(?)
> >>>> ms. Also better documentation how to debug these delays would be appreciated.
> >>
> >> As your commit message says the standard demands a delay of at least 100 ms, 50 ms
> >> is of course too short, and maybe 150 ms or so should be used as the threshold.
> >>
> >>>> If there is no easy solution, it’d be great if the commit could be reverted for
> >>>> now, and a better solution be discussed for the next release.
> >>>
> >>> There is also kernel bugzilla entry about another regression this causes
> >>> here:
> >>>
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=204413
> >>>
> >>> I agree we should revert c2bf1fc2 now. I'll try to come up alternative
> >>> solution to these missing delays that hopefully does not break existing
> >>> setups.
> >>>
> >>> Rafael, Bjorn,
> >>>
> >>> Can you revert the commit or do you want me to send a revert patch?
> >>>
> >>> Thanks and sorry about the breakage.
> >>
> >> No worries.
> >
> > Thanks for the lspci output. This explains the 1 second delay:
> >
> >> LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <8us
> >> ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
> >
> > The port does not support active link reporting. Can you try the below
> > patch?
> >
> > Nicholas, can you also try it? I think it should solve your issue as
> > well.
> >
> > Thanks!
> >
> > diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> > index 308c3e0c4a34..bb8c753013d0 100644
> > --- a/drivers/pci/pcie/portdrv_core.c
> > +++ b/drivers/pci/pcie/portdrv_core.c
> > @@ -434,7 +434,8 @@ static void wait_for_downstream_link(struct pci_dev *pdev)
> > * need to wait 100ms. For higher speeds (gen3) we need to wait
> > * first for the data link layer to become active.
> > */
> > - if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT)
> > + if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT ||
> > + !pdev->link_active_reporting)
> > msleep(delay);
> > else
> > pcie_wait_for_link_delay(pdev, true, delay);
>
> I can confirm, that the delay is reduced now.
>
> pcieport @ 0000:00:01.0 {pcieport} async_device (Total Suspend: 12.118 ms Total Resume: 105.604 ms)
Awesome, thanks.
> How can I enable the verbose log messages you got in your commit
> message like below?
>
> pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
It did not come from any log message, I just wrote it like that for the
commit log.
Dear Mika,
On 06.08.19 13:31, Mika Westerberg wrote:
> On Tue, Aug 06, 2019 at 11:57:26AM +0200, Paul Menzel wrote:
>> On 06.08.19 11:36, Mika Westerberg wrote:
>>> +Nicholas and Matthias
>>>
>>> On Tue, Aug 06, 2019 at 11:20:37AM +0200, Paul Menzel wrote:
>>
>>>> Commit c2bf1fc2 (PCI: Add missing link delays required by the PCIe spec) [1]
>>>> increases the resume time from ACPI S3 on a desktop system Dell OptiPlex 5040
>>>> by one second. It looks like this is expected from the commit message, but
>>>> breaks existing systems with boot time requirements. I measured this with the
>>>> help of the pm-graph script `sleepgraph.py` [2].
>>>>
>>>> 0000:00:01.0 resume_noirq (1134.715 ms @ 5039.412578 to 5040.547293)
>>>> …
>>>> pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)
>>
>> By the way, here is the trace excerpt with the interesting comments.
>>
>> ```
>> 5040.547284 | 1) kworker-3594 | 1132852 us | } /* schedule_timeout */
>> 5040.547284 | 1) kworker-3594 | 1132853 us | } /* msleep */
>> 5040.547284 | 1) kworker-3594 | 1132853 us | } /* pcie_wait_for_link_delay */
>> 5040.547284 | 1) kworker-3594 | 1132856 us | } /* wait_for_downstream_link */
>> 5040.547285 | 1) kworker-3594 | | device_for_each_child() {
>> 5040.547285 | 1) kworker-3594 | 0.185 us | _raw_spin_lock_irqsave();
>> 5040.547286 | 1) kworker-3594 | 0.136 us | _raw_spin_unlock_irqrestore();
>> 5040.547286 | 1) kworker-3594 | 0.190 us | pm_iter();
>> 5040.547286 | 1) kworker-3594 | 0.129 us | _raw_spin_lock_irqsave();
>> 5040.547287 | 1) kworker-3594 | 0.134 us | _raw_spin_unlock_irqrestore();
>> 5040.547287 | 1) kworker-3594 | 0.194 us | pm_iter();
>> 5040.547287 | 1) kworker-3594 | 0.134 us | _raw_spin_lock_irqsave();
>> 5040.547288 | 1) kworker-3594 | 0.133 us | _raw_spin_unlock_irqrestore();
>> 5040.547288 | 1) kworker-3594 | 0.187 us | pm_iter();
>> 5040.547288 | 1) kworker-3594 | 0.135 us | _raw_spin_lock_irqsave();
>> 5040.547289 | 1) kworker-3594 | 0.135 us | _raw_spin_unlock_irqrestore();
>> 5040.547289 | 1) kworker-3594 | 0.271 us | pm_iter();
>> 5040.547289 | 1) kworker-3594 | 0.132 us | _raw_spin_lock_irqsave();
>> 5040.547290 | 1) kworker-3594 | 0.137 us | _raw_spin_unlock_irqrestore();
>> 5040.547290 | 1) kworker-3594 | 5.036 us | } /* device_for_each_child */
>> 5040.547290 | 1) kworker-3594 | 1132862 us | } /* pcie_port_device_resume_noirq */
>> 5040.547290 | 1) kworker-3594 | 1134709 us | } /* pci_pm_resume_noirq */
>> ```
>>
>>>> $ lspci -nn
>>>> 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
>>>> 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
>>>> 00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
>>>> 00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
>>>> 00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
>>>> 00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
>>>> 00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1)
>>>> 00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
>>>> 00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
>>>> 00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
>>>> 00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
>>>> 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
>>>> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] [1002:6610] (rev 81)
>>>> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
>>>> 02:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
>>>>
>>>> So, it’s about the internal Intel graphics device, which is not used on this
>>>> system, as there is an external AMD graphics device plugged in.
>>>>
>>>> As far as I understand it, it’s a bug in the firmware, that a one second delay
>>>> is specified?
>>
>> How can I read out the delay from the system as done in?
>
> The delay is not system wide so it depends on the device. Typically it
> is 100ms but there is a way to shorten it using ACPI _DSM.
Yeah, I know. I was wondering if it can easily be read out for a device
under `/sys`, other debug level or, for example, with `lspci` or
`acpidump`.
>> ```
>> static int get_downstream_delay(struct pci_bus *bus)
>> {
>> struct pci_dev *pdev;
>> int min_delay = 100;
>> int max_delay = 0;
>>
>> list_for_each_entry(pdev, &bus->devices, bus_list) {
>> if (!pdev->imm_ready)
>> min_delay = 0;
>> else if (pdev->d3cold_delay < min_delay)
>> min_delay = pdev->d3cold_delay;
>> if (pdev->d3cold_delay > max_delay)
>> max_delay = pdev->d3cold_delay;
>> }
>>
>> return max(min_delay, max_delay);
>> }
>> ```
>>
>>>> Anyway, there is such firmware out there, so I’d like to avoid the time
>>>> increases.
>>>>
>>>> As a first step, the commit should be extended to print a warning (maybe if
>>>> `initcall_debug` is specified), when the delay is higher than let’s say 50(?)
>>>> ms. Also better documentation how to debug these delays would be appreciated.
>>
>> As your commit message says the standard demands a delay of at least 100 ms, 50 ms
>> is of course too short, and maybe 150 ms or so should be used as the threshold.
>>
>>>> If there is no easy solution, it’d be great if the commit could be reverted for
>>>> now, and a better solution be discussed for the next release.
>>>
>>> There is also kernel bugzilla entry about another regression this causes
>>> here:
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=204413
>>>
>>> I agree we should revert c2bf1fc2 now. I'll try to come up alternative
>>> solution to these missing delays that hopefully does not break existing
>>> setups.
>>>
>>> Rafael, Bjorn,
>>>
>>> Can you revert the commit or do you want me to send a revert patch?
>>>
>>> Thanks and sorry about the breakage.
>>
>> No worries.
>
> Thanks for the lspci output. This explains the 1 second delay:
>
>> LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <8us
>> ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
>
> The port does not support active link reporting. Can you try the below
> patch?
>
> Nicholas, can you also try it? I think it should solve your issue as
> well.
>
> Thanks!
>
> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index 308c3e0c4a34..bb8c753013d0 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -434,7 +434,8 @@ static void wait_for_downstream_link(struct pci_dev *pdev)
> * need to wait 100ms. For higher speeds (gen3) we need to wait
> * first for the data link layer to become active.
> */
> - if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT)
> + if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT ||
> + !pdev->link_active_reporting)
> msleep(delay);
> else
> pcie_wait_for_link_delay(pdev, true, delay);
I can confirm, that the delay is reduced now.
pcieport @ 0000:00:01.0 {pcieport} async_device (Total Suspend: 12.118 ms Total Resume: 105.604 ms)
How can I enable the verbose log messages you got in your commit
message like below?
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
Kind regards,
Paul