2009-05-01 09:42:36

by Alan Jenkins

[permalink] [raw]
Subject: Re: rfkill rewrite

On 4/18/09, Johannes Berg <[email protected]> wrote:
> On Sat, 2009-04-18 at 10:43 +0100, Alan Jenkins wrote:
>
>> >> When I looked at the code earlier, I saw no obvious replacement for
>> >> rfkill_set_default(). So I tried disabling the wireless and rebooting
>> >> to see what happened. It didn't like that :-).
>> >>
>> >
>> > Ok that wasn't too hard -- try this on top if you get a chance:
>> >
>>
>> Great, that fixes the crash.
>>
>>
>> 1) I think we need to add a resume method to eeepc-laptop.
>>
>> Without this, funny things happen when I hibernate, disable wireless in
>> the BIOS, and resume:
>>
>> ath5k phy0: failed to wake up the MAC chip
>>
>> It's an really stupid thing to do, but it can happen. It's bad from a
>> UI point of view. E.g. in network-manager, you can see a "wlan0"
>> device, but it doesn't work.
>>
>> The EEE rfkill is unusual in that it hotplugs the PCI device, making
>> eeepc-laptop something like a custom pci hotplug driver. With your
>> rewrite, eeepc-laptop doesn't notice the state change on resume.
>> Previously, the rfkill-core would restore the pre-hibernation state,
>> which would sort everything out. I don't think anything else does this,
>> so we can just add a resume method to eeepc-laptop. The resume method
>> would re-check the state and do the PCI hotplug dance if necessary.
>>
>> If you agree, I can do the patch for this and send it to you.
>
> Sounds good to me, yeah.
>
> I could make the rfkill core do that at resume, but I'm not really sure
> it's what we want -- there are too many cases imho:
> * hard rfkill might have changed
> * soft rfkill might still be ok in hw
> * soft rfkill might need reconfiguring
> etc. I think generally it's saner to let the driver sort it out -- it
> can always ask for the current state by using set_hw_state() or so.

I just realized or remembered something non-obvious. This means _all_
rfkill drivers need a resume handler. They don't at the moment, so
this would need to be fixed and documented. In which case, it's
probably simpler for the core to do it.

You need this to handle hibernation. If the rfkill state persists
across hibernation (which mine does, even in S5), you can always cause
the state to change, by pressing the wireless toggle key _while the
hibernation image is being written to disk_. At that stage, all
devices are active, so that s2disk can interact with the user and
write the image wherever it chooses. The kernel will not "remember"
the state change on resume, since it happened after the kernel image
was snapshotted.

If the rfkill state does _not_ persist over hibernation, then clearly
the state can change on resume and you will again need a resume
handler,

On my EeePC, this is just more dramatic because it can de-power a PCI
device without notifying the driver. But the new rfkill design will
require all devices to have a resume method, because there is no
get_state() callback. Otherwise, reading of
/sys/class/rfkill/rfkill*/state after resume from hibernation may
return an incorrect result. I don't think we should allow that to
happen.

Regards
Alan


2009-05-05 14:12:17

by Johannes Berg

[permalink] [raw]
Subject: Re: rfkill rewrite

> >> I just realized or remembered something non-obvious. This means _all_
> >> rfkill drivers need a resume handler. They don't at the moment, so
> >> this would need to be fixed and documented. In which case, it's
> >> probably simpler for the core to do it.
> >
> > Only those that have a hard-rfkill line, which I think isn't all.
>
> My scenario quoted below affects soft-rfkill only.

Ah, you're thinking of that.

> > But yes. However, how would the core handle it?
>
> I was thinking of calling set_block() on resume to restore the
> pre-suspend state. That's what the old rfkill does, and why
> eeepc-laptop got away without a resume handler.

That makes sense.

> > So we would need to add a query_resume() method like query that is
> > called at resume time, and needs to call rfkill_set{,_hw,_sw}_state as
> > appopriate. Thoughts? We can enforce that much easier by making it a
> > required method.
>
> That would certainly follow your existing model. (BTW the kerneldoc
> for "query" is out of sync in v8; it says "return true for blocked"
> even though the prototype returns void :-).

Yeah, I noticed already and fixed it yesterday but didn't post v9 yet
(will also amend v9 with the outcome of this discussion)

> It's debatable as to what behaviour is more user-friendly for my nasty
> corner-case. query_resume() would allow the radio to always come back
> in the same state as it was in the moment the laptop went to sleep, if
> the hardware supports it. But the _global_ rfkill state will then be
> out of sync, and that means the next time you press the wireless
> toggle key it does nothing, which is disconcerting. Plus, it means
> the corner-case behaviour varies depending on whether the soft-rfkill
> line state survives over hibernation - which may even vary on the same
> machine (S5 versus S4? whether you remove the laptop's battery? weird
> firmware bugs?).
>
> All I'm trying to do is selfishly preserve eeepc-laptop across this
> re-write, for which purpose it seems easier to have set_block() called
> on resume :-). I don't know if that makes rfkill more awkward in
> wireless drivers (as opposed to platform drivers), or drivers which
> have a hardware rfkill line.

No, should be fine. I can add that and remove the resume code from
eeepc.

johannes


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2009-05-05 14:04:55

by Alan Jenkins

[permalink] [raw]
Subject: Re: rfkill rewrite

On 5/5/09, Johannes Berg <[email protected]> wrote:
> Hi,
>
> Sorry for the long wait.
>
>> > I could make the rfkill core do that at resume, but I'm not really sure
>> > it's what we want -- there are too many cases imho:
>> > * hard rfkill might have changed
>> > * soft rfkill might still be ok in hw
>> > * soft rfkill might need reconfiguring
>> > etc. I think generally it's saner to let the driver sort it out -- it
>> > can always ask for the current state by using set_hw_state() or so.
>>
>> I just realized or remembered something non-obvious. This means _all_
>> rfkill drivers need a resume handler. They don't at the moment, so
>> this would need to be fixed and documented. In which case, it's
>> probably simpler for the core to do it.
>
> Only those that have a hard-rfkill line, which I think isn't all.

My scenario quoted below affects soft-rfkill only.

> But yes. However, how would the core handle it?

I was thinking of calling set_block() on resume to restore the
pre-suspend state. That's what the old rfkill does, and why
eeepc-laptop got away without a resume handler.

>> You need this to handle hibernation. If the rfkill state persists
>> across hibernation (which mine does, even in S5), you can always cause
>> the state to change, by pressing the wireless toggle key _while the
>> hibernation image is being written to disk_. At that stage, all
>> devices are active, so that s2disk can interact with the user and
>> write the image wherever it chooses. The kernel will not "remember"
>> the state change on resume, since it happened after the kernel image
>> was snapshotted.
>
> Good scenario, yes, it's definitely possible.
>
>> If the rfkill state does _not_ persist over hibernation, then clearly
>> the state can change on resume and you will again need a resume
>> handler,
>>
>> On my EeePC, this is just more dramatic because it can de-power a PCI
>> device without notifying the driver. But the new rfkill design will
>> require all devices to have a resume method, because there is no
>> get_state() callback. Otherwise, reading of
>> /sys/class/rfkill/rfkill*/state after resume from hibernation may
>> return an incorrect result. I don't think we should allow that to
>> happen.
>
> So we would need to add a query_resume() method like query that is
> called at resume time, and needs to call rfkill_set{,_hw,_sw}_state as
> appopriate. Thoughts? We can enforce that much easier by making it a
> required method.

That would certainly follow your existing model. (BTW the kerneldoc
for "query" is out of sync in v8; it says "return true for blocked"
even though the prototype returns void :-).

It's debatable as to what behaviour is more user-friendly for my nasty
corner-case. query_resume() would allow the radio to always come back
in the same state as it was in the moment the laptop went to sleep, if
the hardware supports it. But the _global_ rfkill state will then be
out of sync, and that means the next time you press the wireless
toggle key it does nothing, which is disconcerting. Plus, it means
the corner-case behaviour varies depending on whether the soft-rfkill
line state survives over hibernation - which may even vary on the same
machine (S5 versus S4? whether you remove the laptop's battery? weird
firmware bugs?).

All I'm trying to do is selfishly preserve eeepc-laptop across this
re-write, for which purpose it seems easier to have set_block() called
on resume :-). I don't know if that makes rfkill more awkward in
wireless drivers (as opposed to platform drivers), or drivers which
have a hardware rfkill line.

Alan

2009-05-05 08:15:40

by Johannes Berg

[permalink] [raw]
Subject: Re: rfkill rewrite

Hi,

Sorry for the long wait.

> > I could make the rfkill core do that at resume, but I'm not really sure
> > it's what we want -- there are too many cases imho:
> > * hard rfkill might have changed
> > * soft rfkill might still be ok in hw
> > * soft rfkill might need reconfiguring
> > etc. I think generally it's saner to let the driver sort it out -- it
> > can always ask for the current state by using set_hw_state() or so.
>
> I just realized or remembered something non-obvious. This means _all_
> rfkill drivers need a resume handler. They don't at the moment, so
> this would need to be fixed and documented. In which case, it's
> probably simpler for the core to do it.

Only those that have a hard-rfkill line, which I think isn't all. But
yes. However, how would the core handle it?

> You need this to handle hibernation. If the rfkill state persists
> across hibernation (which mine does, even in S5), you can always cause
> the state to change, by pressing the wireless toggle key _while the
> hibernation image is being written to disk_. At that stage, all
> devices are active, so that s2disk can interact with the user and
> write the image wherever it chooses. The kernel will not "remember"
> the state change on resume, since it happened after the kernel image
> was snapshotted.

Good scenario, yes, it's definitely possible.

> If the rfkill state does _not_ persist over hibernation, then clearly
> the state can change on resume and you will again need a resume
> handler,
>
> On my EeePC, this is just more dramatic because it can de-power a PCI
> device without notifying the driver. But the new rfkill design will
> require all devices to have a resume method, because there is no
> get_state() callback. Otherwise, reading of
> /sys/class/rfkill/rfkill*/state after resume from hibernation may
> return an incorrect result. I don't think we should allow that to
> happen.

So we would need to add a query_resume() method like query that is
called at resume time, and needs to call rfkill_set{,_hw,_sw}_state as
appopriate. Thoughts? We can enforce that much easier by making it a
required method.

johannes


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part