Felix Fietkau <[email protected]> writes:
> Depending on the source, a hardware calculated hash may not provide the
> same level of collision resistance.
This seems like it would have performance implications?
Also, this can potentially discard information from tunnels that
preserve the hash before encapsulation (we added support for this to
Wireguard which had some nice effects on queueing of encapsulated
traffic).
Could you elaborate a bit on the kind of bad hardware hashes you were
seeing?
-Toke
On 2020-12-17 12:54, Toke Høiland-Jørgensen wrote:
> Felix Fietkau <[email protected]> writes:
>
>> Depending on the source, a hardware calculated hash may not provide the
>> same level of collision resistance.
>
> This seems like it would have performance implications?
>
> Also, this can potentially discard information from tunnels that
> preserve the hash before encapsulation (we added support for this to
> Wireguard which had some nice effects on queueing of encapsulated
> traffic).
If the hash was calculated in software using the flow dissector, it will
be preserved, even if it went through a few virtual interfaces.
The only hashes discarded are hardware generated ones.
- Felix
Felix Fietkau <[email protected]> writes:
> On 2020-12-17 12:54, Toke Høiland-Jørgensen wrote:
>> Felix Fietkau <[email protected]> writes:
>>
>>> Depending on the source, a hardware calculated hash may not provide the
>>> same level of collision resistance.
>>
>> This seems like it would have performance implications?
>>
>> Also, this can potentially discard information from tunnels that
>> preserve the hash before encapsulation (we added support for this to
>> Wireguard which had some nice effects on queueing of encapsulated
>> traffic).
> If the hash was calculated in software using the flow dissector, it will
> be preserved, even if it went through a few virtual interfaces.
> The only hashes discarded are hardware generated ones.
Yeah, but I was thinking something like:
Packet comes in with HW hash -> gets encapsulated (preserving the hash)
-> gets to mac80211 which discards the HW hash. So now you're replacing
a (possibly bad-quality) HW hash with a software hash of the *outer*
encapsulation header...
-Toke
On 2020-12-17 14:01, Toke Høiland-Jørgensen wrote:
> Felix Fietkau <[email protected]> writes:
>
>> On 2020-12-17 12:54, Toke Høiland-Jørgensen wrote:
>>> Felix Fietkau <[email protected]> writes:
>>>
>>>> Depending on the source, a hardware calculated hash may not provide the
>>>> same level of collision resistance.
>>>
>>> This seems like it would have performance implications?
>>>
>>> Also, this can potentially discard information from tunnels that
>>> preserve the hash before encapsulation (we added support for this to
>>> Wireguard which had some nice effects on queueing of encapsulated
>>> traffic).
>> If the hash was calculated in software using the flow dissector, it will
>> be preserved, even if it went through a few virtual interfaces.
>> The only hashes discarded are hardware generated ones.
>
> Yeah, but I was thinking something like:
>
> Packet comes in with HW hash -> gets encapsulated (preserving the hash)
> -> gets to mac80211 which discards the HW hash. So now you're replacing
> a (possibly bad-quality) HW hash with a software hash of the *outer*
> encapsulation header...
If this becomes a problem, I think we should add a similar patch to
wireguard, which already calls skb_get_hash before encapsulating.
Other regular tunnels should already get a proper hash, since the flow
dissector will take care of it.
The reason I did this patch is because I have a patch to set the hw flow
hash in the skb on mtk_eth_soc, which does help GRO, but leads to
collisions on mac80211 fq.
- Felix
Felix Fietkau <[email protected]> writes:
> On 2020-12-17 14:01, Toke Høiland-Jørgensen wrote:
>> Felix Fietkau <[email protected]> writes:
>>
>>> On 2020-12-17 12:54, Toke Høiland-Jørgensen wrote:
>>>> Felix Fietkau <[email protected]> writes:
>>>>
>>>>> Depending on the source, a hardware calculated hash may not provide the
>>>>> same level of collision resistance.
>>>>
>>>> This seems like it would have performance implications?
>>>>
>>>> Also, this can potentially discard information from tunnels that
>>>> preserve the hash before encapsulation (we added support for this to
>>>> Wireguard which had some nice effects on queueing of encapsulated
>>>> traffic).
>>> If the hash was calculated in software using the flow dissector, it will
>>> be preserved, even if it went through a few virtual interfaces.
>>> The only hashes discarded are hardware generated ones.
>>
>> Yeah, but I was thinking something like:
>>
>> Packet comes in with HW hash -> gets encapsulated (preserving the hash)
>> -> gets to mac80211 which discards the HW hash. So now you're replacing
>> a (possibly bad-quality) HW hash with a software hash of the *outer*
>> encapsulation header...
> If this becomes a problem, I think we should add a similar patch to
> wireguard, which already calls skb_get_hash before encapsulating.
> Other regular tunnels should already get a proper hash, since the flow
> dissector will take care of it.
But then we'd need to go around adding this to all the places that uses
the hash just to work around a particular piece of broken(ish) hardware.
And we're hard-coding a behaviour in mac80211 that means we'll *always*
recompute the hash, even for hardware that's not similarly broken.
> The reason I did this patch is because I have a patch to set the hw flow
> hash in the skb on mtk_eth_soc, which does help GRO, but leads to
> collisions on mac80211 fq.
So wouldn't the right thing to do here be to put a flag into the RX
device that makes the stack clear the hash after using it for GRO?
-Toke
On 2020-12-17 18:26, Toke Høiland-Jørgensen wrote:
> Felix Fietkau <[email protected]> writes:
>> If this becomes a problem, I think we should add a similar patch to
>> wireguard, which already calls skb_get_hash before encapsulating.
>> Other regular tunnels should already get a proper hash, since the flow
>> dissector will take care of it.
>
> But then we'd need to go around adding this to all the places that uses
> the hash just to work around a particular piece of broken(ish) hardware.
> And we're hard-coding a behaviour in mac80211 that means we'll *always*
> recompute the hash, even for hardware that's not similarly broken.
>
>> The reason I did this patch is because I have a patch to set the hw flow
>> hash in the skb on mtk_eth_soc, which does help GRO, but leads to
>> collisions on mac80211 fq.
>
> So wouldn't the right thing to do here be to put a flag into the RX
> device that makes the stack clear the hash after using it for GRO?
I don't think the hardware is broken, I think fq is simply making
assumptions about the hash that aren't met by the hw.
The documentation in include/linux/skbuff.h mentions these requirements
for the skb hash:
* 1) Two packets in different flows have different hash values
* 2) Two packets in the same flow should have the same hash value
FWIW, I think the 'should' from 2) probably belongs to 1), otherwise it
makes no sense. Two packets of the flow must return the same hash,
otherwise the hash is broken. I'm assuming this is a typo.
In addition to those properties, fq needs the hash to be
cryptographically secure, so that it can use reciprocal_scale to sort
flows into buckets without allowing an attacker to craft collisions.
That's also the reason why it used to use skb_get_hash_perturb with a
random perturbation until we got software hashes based on siphash.
I think it's safe to assume that most hardware out there will not
provide collision resistant hashes, so in my opinion fq cannot rely on a
hardware hash. We don't need to go around and change all places that use
the hash, just those that assume a collision resistant one.
- Felix
Felix Fietkau <[email protected]> writes:
> On 2020-12-17 18:26, Toke Høiland-Jørgensen wrote:
>> Felix Fietkau <[email protected]> writes:
>>> If this becomes a problem, I think we should add a similar patch to
>>> wireguard, which already calls skb_get_hash before encapsulating.
>>> Other regular tunnels should already get a proper hash, since the flow
>>> dissector will take care of it.
>>
>> But then we'd need to go around adding this to all the places that uses
>> the hash just to work around a particular piece of broken(ish) hardware.
>> And we're hard-coding a behaviour in mac80211 that means we'll *always*
>> recompute the hash, even for hardware that's not similarly broken.
>>
>>> The reason I did this patch is because I have a patch to set the hw flow
>>> hash in the skb on mtk_eth_soc, which does help GRO, but leads to
>>> collisions on mac80211 fq.
>>
>> So wouldn't the right thing to do here be to put a flag into the RX
>> device that makes the stack clear the hash after using it for GRO?
> I don't think the hardware is broken, I think fq is simply making
> assumptions about the hash that aren't met by the hw.
>
> The documentation in include/linux/skbuff.h mentions these requirements
> for the skb hash:
> * 1) Two packets in different flows have different hash values
> * 2) Two packets in the same flow should have the same hash value
>
> FWIW, I think the 'should' from 2) probably belongs to 1), otherwise it
> makes no sense. Two packets of the flow must return the same hash,
> otherwise the hash is broken. I'm assuming this is a typo.
There's some text further down indicating this is deliberate:
* A driver may indicate a hash level which is less specific than the
* actual layer the hash was computed on. For instance, a hash computed
* at L4 may be considered an L3 hash. This should only be done if the
* driver can't unambiguously determine that the HW computed the hash at
* the higher layer. Note that the "should" in the second property above
* permits this.
So the way I'm reading that whole section, either the intent is that
both properties should be fulfilled, or that the first one (being
collision-free) is more important...
> In addition to those properties, fq needs the hash to be
> cryptographically secure, so that it can use reciprocal_scale to sort
> flows into buckets without allowing an attacker to craft collisions.
> That's also the reason why it used to use skb_get_hash_perturb with a
> random perturbation until we got software hashes based on siphash.
>
> I think it's safe to assume that most hardware out there will not
> provide collision resistant hashes, so in my opinion fq cannot rely on a
> hardware hash. We don't need to go around and change all places that use
> the hash, just those that assume a collision resistant one.
I did a quick grep-based survey of uses of skb_get_hash() outside
drivers - this is what I found (with my interpretations of what they're
used for):
net/core/dev.c : skb_tx_hash() - selecting TX queue w/reciprocal scale
net/core/dev.c : RX flow steering, flow limiting
net/core/dev.c : GRO
net/core/filter.c : BPF helper
include/net/ip_tunnels.h : flowi4_multipath_hash - so multipath selection?
net/ipv{4,6}/route.c : multipath hashing (if l4)
net/ipv6/seg6_iptunnel : building flow labels
net/mac80211/tx.c : FQ
net/mptcp/syncookies : storing cookies (XOR w/net_hash_mix())
net/netfilter/nft_hash.c : symhash input (seems to be load balancing)
net/openvswitch : flow hashing and actions
net/packet/af_packet.c : PACKET_FANOUT_HASH
net/sched/sch_*.c : flow hashing for queueing
Apart from GRO it's not obvious to me that a trivially
attacker-controlled hash is safe in any of those uses?
-Toke
On 2020-12-18 13:41, Toke Høiland-Jørgensen wrote:
> Felix Fietkau <[email protected]> writes:
>
>> On 2020-12-17 18:26, Toke Høiland-Jørgensen wrote:
>>> Felix Fietkau <[email protected]> writes:
>>>> If this becomes a problem, I think we should add a similar patch to
>>>> wireguard, which already calls skb_get_hash before encapsulating.
>>>> Other regular tunnels should already get a proper hash, since the flow
>>>> dissector will take care of it.
>>>
>>> But then we'd need to go around adding this to all the places that uses
>>> the hash just to work around a particular piece of broken(ish) hardware.
>>> And we're hard-coding a behaviour in mac80211 that means we'll *always*
>>> recompute the hash, even for hardware that's not similarly broken.
>>>
>>>> The reason I did this patch is because I have a patch to set the hw flow
>>>> hash in the skb on mtk_eth_soc, which does help GRO, but leads to
>>>> collisions on mac80211 fq.
>>>
>>> So wouldn't the right thing to do here be to put a flag into the RX
>>> device that makes the stack clear the hash after using it for GRO?
>> I don't think the hardware is broken, I think fq is simply making
>> assumptions about the hash that aren't met by the hw.
>>
>> The documentation in include/linux/skbuff.h mentions these requirements
>> for the skb hash:
>> * 1) Two packets in different flows have different hash values
>> * 2) Two packets in the same flow should have the same hash value
>>
>> FWIW, I think the 'should' from 2) probably belongs to 1), otherwise it
>> makes no sense. Two packets of the flow must return the same hash,
>> otherwise the hash is broken. I'm assuming this is a typo.
>
> There's some text further down indicating this is deliberate:
>
> * A driver may indicate a hash level which is less specific than the
> * actual layer the hash was computed on. For instance, a hash computed
> * at L4 may be considered an L3 hash. This should only be done if the
> * driver can't unambiguously determine that the HW computed the hash at
> * the higher layer. Note that the "should" in the second property above
> * permits this.
>
> So the way I'm reading that whole section, either the intent is that
> both properties should be fulfilled, or that the first one (being
> collision-free) is more important...
A hash - by definition - cannot be collision free.
But that's beside the point. On my hw, the hash itself seems collision
free for the flows that I'm pushing, but the result of the
reciprocal_scale isn't.
I took another look and figured out the reason for that:
The hw delivers a 14 bit hash. reciprocal_scale assumes that the values
are distributed across the full 32 bit range. So in this case, the lower
bits are pretty much ignored and the result of the reciprocal_scale is 0
or close to 0, which is what's causing the collisions in fq.
Maybe the assumption that the hash should be distributed across the full
32 bit range should be documented somewhere :)
>> In addition to those properties, fq needs the hash to be
>> cryptographically secure, so that it can use reciprocal_scale to sort
>> flows into buckets without allowing an attacker to craft collisions.
>> That's also the reason why it used to use skb_get_hash_perturb with a
>> random perturbation until we got software hashes based on siphash.
>>
>> I think it's safe to assume that most hardware out there will not
>> provide collision resistant hashes, so in my opinion fq cannot rely on a
>> hardware hash. We don't need to go around and change all places that use
>> the hash, just those that assume a collision resistant one.
>
> I did a quick grep-based survey of uses of skb_get_hash() outside
> drivers - this is what I found (with my interpretations of what they're
> used for):
>
> net/core/dev.c : skb_tx_hash() - selecting TX queue w/reciprocal scale
> net/core/dev.c : RX flow steering, flow limiting
> net/core/dev.c : GRO
> net/core/filter.c : BPF helper
> include/net/ip_tunnels.h : flowi4_multipath_hash - so multipath selection?
> net/ipv{4,6}/route.c : multipath hashing (if l4)
> net/ipv6/seg6_iptunnel : building flow labels
> net/mac80211/tx.c : FQ
> net/mptcp/syncookies : storing cookies (XOR w/net_hash_mix())
> net/netfilter/nft_hash.c : symhash input (seems to be load balancing)
> net/openvswitch : flow hashing and actions
> net/packet/af_packet.c : PACKET_FANOUT_HASH
> net/sched/sch_*.c : flow hashing for queueing
>
> Apart from GRO it's not obvious to me that a trivially
> attacker-controlled hash is safe in any of those uses?
I looked at some of those uses you mentioned here.
Most of them fit into 2 categories:
1. Sort into power-of-2 buckets and use hash & (size-1), effectively
using the lower bits only.
2. Use reciprocal_scale - effectively using the higher bits only.
For the hash that my hw is reporting, type 1 is working and type 2 is
broken.
So it seems to me that the solution would involve running a simple hash
on the 14 bit values to get the bits distributed to the full 32 bit
range without adding too much bias.
I will do this in the driver and drop this patch.
Thanks for looking into this,
- Felix
Felix Fietkau <[email protected]> writes:
> On 2020-12-18 13:41, Toke Høiland-Jørgensen wrote:
>> Felix Fietkau <[email protected]> writes:
>>
>>> On 2020-12-17 18:26, Toke Høiland-Jørgensen wrote:
>>>> Felix Fietkau <[email protected]> writes:
>>>>> If this becomes a problem, I think we should add a similar patch to
>>>>> wireguard, which already calls skb_get_hash before encapsulating.
>>>>> Other regular tunnels should already get a proper hash, since the flow
>>>>> dissector will take care of it.
>>>>
>>>> But then we'd need to go around adding this to all the places that uses
>>>> the hash just to work around a particular piece of broken(ish) hardware.
>>>> And we're hard-coding a behaviour in mac80211 that means we'll *always*
>>>> recompute the hash, even for hardware that's not similarly broken.
>>>>
>>>>> The reason I did this patch is because I have a patch to set the hw flow
>>>>> hash in the skb on mtk_eth_soc, which does help GRO, but leads to
>>>>> collisions on mac80211 fq.
>>>>
>>>> So wouldn't the right thing to do here be to put a flag into the RX
>>>> device that makes the stack clear the hash after using it for GRO?
>>> I don't think the hardware is broken, I think fq is simply making
>>> assumptions about the hash that aren't met by the hw.
>>>
>>> The documentation in include/linux/skbuff.h mentions these requirements
>>> for the skb hash:
>>> * 1) Two packets in different flows have different hash values
>>> * 2) Two packets in the same flow should have the same hash value
>>>
>>> FWIW, I think the 'should' from 2) probably belongs to 1), otherwise it
>>> makes no sense. Two packets of the flow must return the same hash,
>>> otherwise the hash is broken. I'm assuming this is a typo.
>>
>> There's some text further down indicating this is deliberate:
>>
>> * A driver may indicate a hash level which is less specific than the
>> * actual layer the hash was computed on. For instance, a hash computed
>> * at L4 may be considered an L3 hash. This should only be done if the
>> * driver can't unambiguously determine that the HW computed the hash at
>> * the higher layer. Note that the "should" in the second property above
>> * permits this.
>>
>> So the way I'm reading that whole section, either the intent is that
>> both properties should be fulfilled, or that the first one (being
>> collision-free) is more important...
> A hash - by definition - cannot be collision free.
> But that's beside the point. On my hw, the hash itself seems collision
> free for the flows that I'm pushing, but the result of the
> reciprocal_scale isn't.
> I took another look and figured out the reason for that:
> The hw delivers a 14 bit hash. reciprocal_scale assumes that the values
> are distributed across the full 32 bit range. So in this case, the lower
> bits are pretty much ignored and the result of the reciprocal_scale is 0
> or close to 0, which is what's causing the collisions in fq.
Ah, right, that makes sense!
> Maybe the assumption that the hash should be distributed across the full
> 32 bit range should be documented somewhere :)
Yeah, I agree. Maybe just updating that comment in skbuff.h? Do you want
to fold such an update into your series? Otherwise I can send a patch
once net-next opens...
>>> In addition to those properties, fq needs the hash to be
>>> cryptographically secure, so that it can use reciprocal_scale to sort
>>> flows into buckets without allowing an attacker to craft collisions.
>>> That's also the reason why it used to use skb_get_hash_perturb with a
>>> random perturbation until we got software hashes based on siphash.
>>>
>>> I think it's safe to assume that most hardware out there will not
>>> provide collision resistant hashes, so in my opinion fq cannot rely on a
>>> hardware hash. We don't need to go around and change all places that use
>>> the hash, just those that assume a collision resistant one.
>>
>> I did a quick grep-based survey of uses of skb_get_hash() outside
>> drivers - this is what I found (with my interpretations of what they're
>> used for):
>>
>> net/core/dev.c : skb_tx_hash() - selecting TX queue w/reciprocal scale
>> net/core/dev.c : RX flow steering, flow limiting
>> net/core/dev.c : GRO
>> net/core/filter.c : BPF helper
>> include/net/ip_tunnels.h : flowi4_multipath_hash - so multipath selection?
>> net/ipv{4,6}/route.c : multipath hashing (if l4)
>> net/ipv6/seg6_iptunnel : building flow labels
>> net/mac80211/tx.c : FQ
>> net/mptcp/syncookies : storing cookies (XOR w/net_hash_mix())
>> net/netfilter/nft_hash.c : symhash input (seems to be load balancing)
>> net/openvswitch : flow hashing and actions
>> net/packet/af_packet.c : PACKET_FANOUT_HASH
>> net/sched/sch_*.c : flow hashing for queueing
>>
>> Apart from GRO it's not obvious to me that a trivially
>> attacker-controlled hash is safe in any of those uses?
> I looked at some of those uses you mentioned here.
> Most of them fit into 2 categories:
> 1. Sort into power-of-2 buckets and use hash & (size-1), effectively
> using the lower bits only.
> 2. Use reciprocal_scale - effectively using the higher bits only.
> For the hash that my hw is reporting, type 1 is working and type 2 is
> broken.
>
> So it seems to me that the solution would involve running a simple hash
> on the 14 bit values to get the bits distributed to the full 32 bit
> range without adding too much bias.
> I will do this in the driver and drop this patch.
Yes, this seems like a reasonable solution; great!
> Thanks for looking into this,
You're welcome :)
-Toke