> And if you go that naive route, just fix everything in the driver, then
> you simply end up with something terribly inefficient because all those
> corner case checks end up in the fast path and eating up code space.
>
One thing I forgot to mention here that should especially interest you:
you add a lot of complexity to the *driver* that needs to verified and
maintained (by the kernel community?!). Some of these workarounds I had to
implement are really quite a convoluted mess and it's starting to take up
a significant portion of the driver's code base as well ... just to support
some fringe cases that are not even relevant to the hardware's main use
cases (as "we" as the "hardware vendor" see it) at all.
Note that I actually *have* implemented all these workarounds and my
driver is fully passing all fuzzing tests etc. I'm just not happy with it.
Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Inside Secure
http://www.insidesecure.com
On Mon, 27 May 2019 at 14:41, Pascal Van Leeuwen
<[email protected]> wrote:
>
> > And if you go that naive route, just fix everything in the driver, then
> > you simply end up with something terribly inefficient because all those
> > corner case checks end up in the fast path and eating up code space.
> >
> One thing I forgot to mention here that should especially interest you:
> you add a lot of complexity to the *driver* that needs to verified and
> maintained (by the kernel community?!). Some of these workarounds I had to
> implement are really quite a convoluted mess and it's starting to take up
> a significant portion of the driver's code base as well ... just to support
> some fringe cases that are not even relevant to the hardware's main use
> cases (as "we" as the "hardware vendor" see it) at all.
>
> Note that I actually *have* implemented all these workarounds and my
> driver is fully passing all fuzzing tests etc. I'm just not happy with it.
>
Good, glad to hear that. I would test it myself if my MacchiatoBin
hadn't self combusted recently (for the second time!) but there are
some others that volunteered IIUC?
I think nobody is happy that we are adding code like that to the
kernel, but it seems we will have to agree to disagree on the
alternatives, since teaching the upper layers about zero length inputs
being special cases is simply not acceptable (and it would result in
those workarounds to be added to generic code where it would be
affecting everyone instead of only the users of your IP)
But I fully understand that dealing with this case in hardware is not
feasible either, and so this approach is what we will have to live
with.
> > One thing I forgot to mention here that should especially interest you:
> > you add a lot of complexity to the *driver* that needs to verified and
> > maintained (by the kernel community?!). Some of these workarounds I had to
> > implement are really quite a convoluted mess and it's starting to take up
> > a significant portion of the driver's code base as well ... just to support
> > some fringe cases that are not even relevant to the hardware's main use
> > cases (as "we" as the "hardware vendor" see it) at all.
> >
> > Note that I actually *have* implemented all these workarounds and my
> > driver is fully passing all fuzzing tests etc. I'm just not happy with it.
> >
>
> Good, glad to hear that. I would test it myself if my MacchiatoBin
> hadn't self combusted recently (for the second time!) but there are
> some others that volunteered IIUC?
>
Some people did volunteer about a month ago but I haven't heard from
any of them since ... which means my fixes won't make it into any kernel
trees any day soon.
> I think nobody is happy that we are adding code like that to the
> kernel, but it seems we will have to agree to disagree on the
> alternatives, since teaching the upper layers about zero length inputs
> being special cases is simply not acceptable (and it would result in
> those workarounds to be added to generic code where it would be
> affecting everyone instead of only the users of your IP)
>
You keep missing my point though ... I was not suggesting teaching
upper layers about zero lengths or any other hardware limitations.
My point is that the overal majory of the "upper layers" are known not
to need zero lengths or any of these other corner cases and our driver/
hardware doesn't really care about supporting the ones that do, because
those "upper layers" would not benefit from our driver/hardware anyway.
You know, all I *really* care about at this point is *just* supporting
the kernel IPsec stack. The rest is just baggage for me anyway.
It's all about preventing the "upper layers" from selecting our driver.
Which can be arranged very easily. Just set cra_priority to 0 which
requires it to be selected explicitly, moving the responsibility to
whomever configures the machine.
> But I fully understand that dealing with this case in hardware is not
> feasible either, and so this approach is what we will have to live
> with.
>
We don't *have* to do anything ... this thing is not set in stone as far
as I know. We could actually come up with a real compromise, which this
is not. It just requires some flexibility of mind and some good will ...
Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines
http://www.insidesecure.com
On Mon, 27 May 2019 at 17:16, Pascal Van Leeuwen
<[email protected]> wrote:
>
> > > One thing I forgot to mention here that should especially interest you:
> > > you add a lot of complexity to the *driver* that needs to verified and
> > > maintained (by the kernel community?!). Some of these workarounds I had to
> > > implement are really quite a convoluted mess and it's starting to take up
> > > a significant portion of the driver's code base as well ... just to support
> > > some fringe cases that are not even relevant to the hardware's main use
> > > cases (as "we" as the "hardware vendor" see it) at all.
> > >
> > > Note that I actually *have* implemented all these workarounds and my
> > > driver is fully passing all fuzzing tests etc. I'm just not happy with it.
> > >
> >
> > Good, glad to hear that. I would test it myself if my MacchiatoBin
> > hadn't self combusted recently (for the second time!) but there are
> > some others that volunteered IIUC?
> >
> Some people did volunteer about a month ago but I haven't heard from
> any of them since ... which means my fixes won't make it into any kernel
> trees any day soon.
>
OK. Can you share your git tree again? I will try to ping some people.
> > I think nobody is happy that we are adding code like that to the
> > kernel, but it seems we will have to agree to disagree on the
> > alternatives, since teaching the upper layers about zero length inputs
> > being special cases is simply not acceptable (and it would result in
> > those workarounds to be added to generic code where it would be
> > affecting everyone instead of only the users of your IP)
> >
> You keep missing my point though ... I was not suggesting teaching
> upper layers about zero lengths or any other hardware limitations.
> My point is that the overal majory of the "upper layers" are known not
> to need zero lengths or any of these other corner cases and our driver/
> hardware doesn't really care about supporting the ones that do, because
> those "upper layers" would not benefit from our driver/hardware anyway.
>
Yes, but 'are not known' today is not enough. Once we open that door,
it becomes our responsibility as kernel maintainers to ensure that
this remains the case.
So i understand perfectly well that current in-kernel users may never
issue crypto requests that exercise this code path. But the nice thing
today is that we don't have to care about this at all, since zero
length vectors are simply supported as well. Once we start
distinguishing the two, we have to start policing this at *some* level
rather than just pretend the issue isn't there and get bitten by it
down the road.
So no, i am not missing your point. But I think we disagree on your
conclusion that this permits is to optimize this case away and simply
don't reason about it at all.
> You know, all I *really* care about at this point is *just* supporting
> the kernel IPsec stack. The rest is just baggage for me anyway.
>
I see.
> It's all about preventing the "upper layers" from selecting our driver.
> Which can be arranged very easily. Just set cra_priority to 0 which
> requires it to be selected explicitly, moving the responsibility to
> whomever configures the machine.
>
So which algorithms that IPsec actually uses on your hardware does
this issue apply to? Does the hardware implement the complete AEAD
transform? Or does it rely on software to wire the MAC and skcipher
pieces together?
> > But I fully understand that dealing with this case in hardware is not
> > feasible either, and so this approach is what we will have to live
> > with.
> >
> We don't *have* to do anything ... this thing is not set in stone as far
> as I know. We could actually come up with a real compromise, which this
> is not. It just requires some flexibility of mind and some good will ...
>
The Latin term escapes me, but surely, complaining about the other's
unwillingness to compromise rather than give actual convincing
arguments is one of the well known rhetorical fallacies? :-)
> > Some people did volunteer about a month ago but I haven't heard from
> > any of them since ... which means my fixes won't make it into any kernel
> > trees any day soon.
> >
>
> OK. Can you share your git tree again? I will try to ping some people.
>
Well, I just got a response, but in case anyone else is interested:
https://github.com/pvanleeuwen/linux.git, branch "is_driver_armada_fix"
> > You keep missing my point though ... I was not suggesting teaching
> > upper layers about zero lengths or any other hardware limitations.
> > My point is that the overal majory of the "upper layers" are known not
> > to need zero lengths or any of these other corner cases and our driver/
> > hardware doesn't really care about supporting the ones that do, because
> > those "upper layers" would not benefit from our driver/hardware anyway.
> >
>
> Yes, but 'are not known' today is not enough. Once we open that door,
> it becomes our responsibility as kernel maintainers to ensure that
> this remains the case.
>
No, the only responsibility it might add is the responsibility to
validate that driver X indeed still works with "upper layer" Y, and if
not, to update the drivers' README to remove that assertion.
But honestly how DO the kernel maintainers validate that driver X is
not broken, considering they may not have access to the actual hardware?
I don't really get the impression anyone is actively verifying the
Inside Secure driver is still working for every kernel release ...
So I think it's a pretty moot point - you're not doing that anyway.
> So i understand perfectly well that current in-kernel users may never
> issue crypto requests that exercise this code path. But the nice thing
> today is that we don't have to care about this at all, since zero
> length vectors are simply supported as well. Once we start
> distinguishing the two, we have to start policing this at *some* level
> rather than just pretend the issue isn't there and get bitten by it
> down the road.
>
I'm not pretending the issue isn't there. I just *know* I don't care
about the issue for my particular driver/hw at all because it's
irrelevant for the "upper layer"s I'm actually specifically targetting.
> So no, i am not missing your point. But I think we disagree on your
> conclusion that this permits is to optimize this case away and simply
> don't reason about it at all.
>
I never concluded to optimize the case away across the whole API.
Or not reasoning about it at all for that matter. Just for cutting
drivers some slack, under the explicit condition that they're not
ever selected as default implementation.
> > You know, all I *really* care about at this point is *just* supporting
> > the kernel IPsec stack. The rest is just baggage for me anyway.
> >
>
> I see.
>
> > It's all about preventing the "upper layers" from selecting our driver.
> > Which can be arranged very easily. Just set cra_priority to 0 which
> > requires it to be selected explicitly, moving the responsibility to
> > whomever configures the machine.
> >
>
> So which algorithms that IPsec actually uses on your hardware does
> this issue apply to? Does the hardware implement the complete AEAD
> transform? Or does it rely on software to wire the MAC and skcipher
> pieces together?
>
Actually, it CAN do the full ESP transform. Plus the IP header processing.
But since that cannot be accelerated yet, I'm happy just to get some AEAD
acceleration going. Perhaps with IV generation pulled in as well.
> > > But I fully understand that dealing with this case in hardware is not
> > > feasible either, and so this approach is what we will have to live
> > > with.
> > >
> > We don't *have* to do anything ... this thing is not set in stone as far
> > as I know. We could actually come up with a real compromise, which this
> > is not. It just requires some flexibility of mind and some good will ...
> >
>
> The Latin term escapes me, but surely, complaining about the other's
> unwillingness to compromise rather than give actual convincing
> arguments is one of the well known rhetorical fallacies? :-)
>
I don't know, I'm just an engineer, not a rhetorical mastermind :-)
But you were the one claiming to compromise and I'm just making the
observation that I don't see a whole lot of compromise: it's either
implement every little corner of the algorithms or bust, it seems.
Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Inside Secure
http://www.insidesecure.com