2020-08-06 17:10:31

by Raul Rangel

[permalink] [raw]
Subject: PS/2 + i8042 intermixing commands

I'm debugging a resume issue on one of our devices using the v5.4
kernel. The device has a PS/2 atkbd and a PS/2 touchpad. It looks like
PS/2 commands are getting intermingled with i8042 commands. This
results in our keyboard controller thinking it got some invalid data.
This usually happens 1 out of 1500 resumes.

Here is the log from my keyboard controller on a good resume.

I've added some comments to make it easier to follow:
> <-- i8042_controller_resume
> [KB recv cmd: 0xaa] # CMD #1
>
> [KB recv cmd: 0x60] # CMD #2
> [ACPI kblight 0]
> [KB recv data: 0x70]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x70]
> [KB set CTR_RAM(0x00)=0x70 (old:0x70)]
>
> <- i8042_enable_aux_port
> [KB recv cmd: 0x60] # CMD #3
> [KB recv data: 0x52]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x52]
> [KB set CTR_RAM(0x00)=0x52 (old:0x70)]
> [AUX IRQ enable]
> [AUX enabled]
>
> <- i8042_enable_kbd_port
> [KB recv cmd: 0x60] # CMD #4
> [KB recv data: 0x43]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x43]
> [KB set CTR_RAM(0x00)=0x43 (old:0x52)]
> [KB IRQ enable]
> [KB enable]
>
> <- atkbd_reconnect->atkbd_probe
> [KB recv data: 0xf2] <- ATKBD_CMD_GETID # CMD #5
> <- atkbd_deactivate
> [KB recv data: 0xf5] <- ATKBD_CMD_RESET_DIS # CMD #6
> [KS disable]
> [KB Clear Buffer]
>
> <- atkbd_reconnect->atkbd_set_leds
> [KB recv data: 0xed] # CMD #7
> [KB recv data: 0x00]
> [KB eaten by STATE_SETLEDS]
>
> <- atkbd_reconnect->atkbd_set_repeat_rate
> [KB recv data: 0xf3] # CMD #8
> [KB recv data: 0x00]
> [KB eaten by STATE_SETREP: 0x00]
>
> <- atkbd_reconnect->atkbd_activate
> [KB recv data: 0xf4] <- ATKBD_CMD_ENABLE # CMD #9
> [KS enable]
> [KB Clear Buffer]
>
> <- atkbd_event_work->atkbd_set_leds
> [KB recv data: 0xed] # CMD #10
> [KB recv data: 0x00]
> [KB eaten by STATE_SETLEDS]
>
> <- atkbd_event_work->atkbd_set_repeat_rate
> [KB recv data: 0xf3] # CMD #11
> [KB recv data: 0x00]
> [KB eaten by STATE_SETREP: 0x00]
>
> <-- trackpoint_reconnect->trackpoint_start_protocol
> [KB recv cmd: 0xd4] <- I8042_CMD_AUX_SEND # CMD #12
> [KB recv data: 0xe1] <- TP_READ_ID
> [STATE_SEND_TO_MOUSE: 0xe1]
>
> <- Is this i8042_port_close?
> [KB recv cmd: 0x60] # CMD #13
> [KB recv data: 0x41]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x41]
> [KB set CTR_RAM(0x00)=0x41 (old:0x43)]
> [AUX IRQ disable]
> [KB recv cmd: 0x60] # CMD #14
> [KB recv data: 0x43]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x43]
> [KB set CTR_RAM(0x00)=0x43 (old:0x41)]
> [AUX IRQ enable]
>
> [KB recv cmd: 0xd4] # CMD #15
> [KB recv data: 0xf2]
> [STATE_SEND_TO_MOUSE: 0xf2]


Here is the log on the bad resume:

> <-- i8042_controller_resume
> [KB recv cmd: 0xaa] # CMD #1
>
> [KB recv cmd: 0x60] # CMD #2
> [ACPI kblight 0]
> [KB recv data: 0x70]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x70]
> [KB set CTR_RAM(0x00)=0x70 (old:0x70)]
>
> <- i8042_enable_aux_port
> [KB recv cmd: 0x60] # CMD #3
> [KB recv data: 0x52]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x52]
> [KB set CTR_RAM(0x00)=0x52 (old:0x70)]
> [AUX IRQ enable]
> [AUX enabled]
>
> <- i8042_enable_kbd_port
> [KB recv cmd: 0x60] # CMD #4
> [KB recv data: 0x43]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x43]
> [KB set CTR_RAM(0x00)=0x43 (old:0x52)]
> [KB IRQ enable]
> [KB enable]
>
> <- atkbd_reconnect->atkbd_probe
> [KB recv data: 0xf2] <- ATKBD_CMD_GETID # CMD #5
> <- atkbd_deactivate
> [KB recv data: 0xf5] <- ATKBD_CMD_RESET_DIS # CMD #6
> [KS disable]
> [KB Clear Buffer]
>
> <- atkbd_reconnect->atkbd_set_leds
> [KB recv data: 0xed] # CMD #7
> [KB recv data: 0x00]
> [KB eaten by STATE_SETLEDS]
>
> <- atkbd_reconnect->atkbd_set_repeat_rate
> [KB recv data: 0xf3] # CMD #8
> [KB recv data: 0x00]
> [KB eaten by STATE_SETREP: 0x00]
>
> <- atkbd_reconnect->atkbd_activate
> [KB recv data: 0xf4] <- ATKBD_CMD_ENABLE # CMD #9
> [KS enable]
> [KB Clear Buffer]
>
> <-- trackpoint_reconnect->trackpoint_start_protocol
> [KB recv cmd: 0xd4] <- I8042_CMD_AUX_SEND # CMD #12
> [KB recv data: 0xe1] <- TP_READ_ID
> [STATE_SEND_TO_MOUSE: 0xe1]
>
> <- Is this i8042_port_close?
> [KB recv cmd: 0x60] # CMD #13
> [KB recv data: 0x41]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x41]
> [KB set CTR_RAM(0x00)=0x41 (old:0x43)]
> [AUX IRQ disable]
>
> <- atkbd_event_work->atkbd_set_leds
> [KB recv data: 0xed] # CMD 10
> <- Wait, where is the data?
>
> <- Continuation of i8042_port_close?
> [KB recv cmd: 0x60] # CMD #14
> [KB recv data: 0x43]
> [KB eaten by STATE_WRITE_CMD_BYTE: 0x43]
> [KB set CTR_RAM(0x00)=0x43 (old:0x41)]
> [AUX IRQ enable]
>
> <- Here is the data!
> [KB recv data: 0x00] # CMD 10 (data)
> [KB Unsupported i8042 data 0x00]
> [KB recv data: 0x00] <- Did the host retry?
> [KB Unsupported i8042 data 0x00]
>
> <- atkbd_event_work->atkbd_set_repeat_rate
> [KB recv data: 0xf3] # CMD #11
> [KB recv data: 0x00]
> [KB eaten by STATE_SETREP: 0x00]
>
> [KB recv cmd: 0xd4] # CMD #15
> [KB recv data: 0xf2]
> [STATE_SEND_TO_MOUSE: 0xf2]

As you can see CMD #10 starts between #13 and #14, and then completes
after #14. Is this expected behavior?

I'm not quite sure if #13 and #14 are coming from i8042_port_close. I
don't have a function trace available, but it seems to fit.

I found this comment:
/*
* Writers to AUX and KBD ports as well as users issuing i8042_command
* directly should acquire i8042_mutex (by means of calling
* i8042_lock_chip() and i8042_unlock_ship() helpers) to ensure that
* they do not disturb each other (unfortunately in many i8042
* implementations write to one of the ports will immediately abort
* command that is being processed by another port).
*/
static DEFINE_MUTEX(i8042_mutex);

Does that not mean that i8042_port_close, i8042_enable_kbd_port,
i8042_enable_aux_port, + any other function that calls i8042_command
should be taking the lock before calling i8042_command?

Imagine this scenario:
1) atkbd_set_leds->ps2_command: locks i8042_mutex
2) __ps2_command->ps2_do_sendbyte->i8042_kbd_write
2a) locks i8042_lock
2b) Sends the kbd command
2c) unlocks i8042_lock
3) i8042_port_close/i8042_enable_aux_port/etc gets called and then
calls i8042_command
3a) locks i8042_lock
3b) sends 8042 CMD
3c) sends 8042 DATA
3d) unlocks i8042_lock
3e) i8042_command returns
4) ps2_do_sendbyte continues sending the kb param via i8042_kbd_write
4a) locks i8042_lock
4b) Sends the kbd data <- Original data got stomped on because of the
8042 CMD in between
4c) unlocks i8042_lock
4c) ps2_command unlocks i8042_mutex

Is this an invalid sequence or is our keyboard controller at fault
because it doesn't support intermingled commands?

Thanks,
Raul

p.s, sorry for sending twice. I forgot to set the original as plain text :)


2020-08-13 01:23:45

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: PS/2 + i8042 intermixing commands

On Thu, Aug 06, 2020 at 09:28:41AM -0600, Raul Rangel wrote:
> >
> > <- atkbd_event_work->atkbd_set_leds
> > [KB recv data: 0xed] # CMD 10
> > <- Wait, where is the data?
> >
> > <- Continuation of i8042_port_close?
> > [KB recv cmd: 0x60] # CMD #14
> > [KB recv data: 0x43]
> > [KB eaten by STATE_WRITE_CMD_BYTE: 0x43]
> > [KB set CTR_RAM(0x00)=0x43 (old:0x41)]
> > [AUX IRQ enable]
> >
> > <- Here is the data!
> > [KB recv data: 0x00] # CMD 10 (data)
> > [KB Unsupported i8042 data 0x00]
> > [KB recv data: 0x00] <- Did the host retry?
> > [KB Unsupported i8042 data 0x00]
> >
> > <- atkbd_event_work->atkbd_set_repeat_rate
> > [KB recv data: 0xf3] # CMD #11
> > [KB recv data: 0x00]
> > [KB eaten by STATE_SETREP: 0x00]
> >
> > [KB recv cmd: 0xd4] # CMD #15
> > [KB recv data: 0xf2]
> > [STATE_SEND_TO_MOUSE: 0xf2]
>
> As you can see CMD #10 starts between #13 and #14, and then completes
> after #14. Is this expected behavior?
>
> I'm not quite sure if #13 and #14 are coming from i8042_port_close. I
> don't have a function trace available, but it seems to fit.
>
> I found this comment:
> /*
> * Writers to AUX and KBD ports as well as users issuing i8042_command
> * directly should acquire i8042_mutex (by means of calling
> * i8042_lock_chip() and i8042_unlock_ship() helpers) to ensure that
> * they do not disturb each other (unfortunately in many i8042
> * implementations write to one of the ports will immediately abort
> * command that is being processed by another port).
> */
> static DEFINE_MUTEX(i8042_mutex);
>
> Does that not mean that i8042_port_close, i8042_enable_kbd_port,
> i8042_enable_aux_port, + any other function that calls i8042_command
> should be taking the lock before calling i8042_command?

Yeah, I think this is right. When I added the mutex it was because 2
deices were clashing with each other and I did not consider that closing
port in the i8042 driver itself may also disturb in-flight command.

Thanks.

--
Dmitry

2020-08-14 20:12:27

by Raul Rangel

[permalink] [raw]
Subject: Re: PS/2 + i8042 intermixing commands

Thanks for clarifying Dmitry. I'll get a patch pushed up next week.

On Wed, Aug 12, 2020 at 7:21 PM Dmitry Torokhov
<[email protected]> wrote:
>
> On Thu, Aug 06, 2020 at 09:28:41AM -0600, Raul Rangel wrote:
> > >
> > > <- atkbd_event_work->atkbd_set_leds
> > > [KB recv data: 0xed] # CMD 10
> > > <- Wait, where is the data?
> > >
> > > <- Continuation of i8042_port_close?
> > > [KB recv cmd: 0x60] # CMD #14
> > > [KB recv data: 0x43]
> > > [KB eaten by STATE_WRITE_CMD_BYTE: 0x43]
> > > [KB set CTR_RAM(0x00)=0x43 (old:0x41)]
> > > [AUX IRQ enable]
> > >
> > > <- Here is the data!
> > > [KB recv data: 0x00] # CMD 10 (data)
> > > [KB Unsupported i8042 data 0x00]
> > > [KB recv data: 0x00] <- Did the host retry?
> > > [KB Unsupported i8042 data 0x00]
> > >
> > > <- atkbd_event_work->atkbd_set_repeat_rate
> > > [KB recv data: 0xf3] # CMD #11
> > > [KB recv data: 0x00]
> > > [KB eaten by STATE_SETREP: 0x00]
> > >
> > > [KB recv cmd: 0xd4] # CMD #15
> > > [KB recv data: 0xf2]
> > > [STATE_SEND_TO_MOUSE: 0xf2]
> >
> > As you can see CMD #10 starts between #13 and #14, and then completes
> > after #14. Is this expected behavior?
> >
> > I'm not quite sure if #13 and #14 are coming from i8042_port_close. I
> > don't have a function trace available, but it seems to fit.
> >
> > I found this comment:
> > /*
> > * Writers to AUX and KBD ports as well as users issuing i8042_command
> > * directly should acquire i8042_mutex (by means of calling
> > * i8042_lock_chip() and i8042_unlock_ship() helpers) to ensure that
> > * they do not disturb each other (unfortunately in many i8042
> > * implementations write to one of the ports will immediately abort
> > * command that is being processed by another port).
> > */
> > static DEFINE_MUTEX(i8042_mutex);
> >
> > Does that not mean that i8042_port_close, i8042_enable_kbd_port,
> > i8042_enable_aux_port, + any other function that calls i8042_command
> > should be taking the lock before calling i8042_command?
>
> Yeah, I think this is right. When I added the mutex it was because 2
> deices were clashing with each other and I did not consider that closing
> port in the i8042 driver itself may also disturb in-flight command.
>
> Thanks.
>
> --
> Dmitry