2021-04-06 13:24:40

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 08:23:28AM +0300, Leon Romanovsky wrote:
> The same proposal (enable unconditionally) was raised during
> submission preparations and we decided to follow same pattern
> as other verbs objects which receive flag parameter.

A flags argument can be added when it actually is needed. Using it
to pass an argument enabled by all ULPs just gets us back to the bad
old days of complete crap APIs someone drew up on a whiteboard.

I think we need to:

a) document the semantics
b) sort out any technical concerns
c) just enable the damn thing

instead of requiring some form of cargo culting.


2021-04-06 13:27:31

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 07:27:17AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 06, 2021 at 08:23:28AM +0300, Leon Romanovsky wrote:
> > The same proposal (enable unconditionally) was raised during
> > submission preparations and we decided to follow same pattern
> > as other verbs objects which receive flag parameter.
>
> A flags argument can be added when it actually is needed. Using it
> to pass an argument enabled by all ULPs just gets us back to the bad
> old days of complete crap APIs someone drew up on a whiteboard.

Let's wait till Jason wakes up, before jumping to conclusions.
It was his request to update all ULPs.

>
> I think we need to:
>
> a) document the semantics
> b) sort out any technical concerns
> c) just enable the damn thing

Sure

>
> instead of requiring some form of cargo culting.

2021-04-06 21:11:55

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 08:58:54AM +0300, Leon Romanovsky wrote:
> On Tue, Apr 06, 2021 at 07:27:17AM +0200, Christoph Hellwig wrote:
> > On Tue, Apr 06, 2021 at 08:23:28AM +0300, Leon Romanovsky wrote:
> > > The same proposal (enable unconditionally) was raised during
> > > submission preparations and we decided to follow same pattern
> > > as other verbs objects which receive flag parameter.
> >
> > A flags argument can be added when it actually is needed. Using it
> > to pass an argument enabled by all ULPs just gets us back to the bad
> > old days of complete crap APIs someone drew up on a whiteboard.
>
> Let's wait till Jason wakes up, before jumping to conclusions.
> It was his request to update all ULPs.

We are stuck in a bad spot here

Only the kernel can universally support ACCESS_RELAXED_ORDERING
because all the ULPs are required to work sanely already, but
userspace does not have this limitation and we know of enough popular
verbs users that will break if we unconditionally switch on
ACCESS_RELAXED_ORDERING.

Further, we have the problem with the FMR interface that technically
lets the caller control the entire access_flags, including
ACCESS_RELAXED_ORDERING.

So we broadly have two choice
1) Diverge the kernel and user interfaces and make the RDMA drivers
special case all the kernel stuff to force
ACCESS_RELAXED_ORDERING when they are building MRs and processing
FMR WQE's
2) Keep the two interfaces the same and push the
ACCESS_RELAXED_ORDERING to a couple of places in the ULPs so the
drivers see a consistent API

They are both poor choices, but I think #2 has a greater chance of
everyone doing their parts correctly.

Jason

2021-04-06 21:38:10

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 09:13:12AM -0300, Jason Gunthorpe wrote:
> So we broadly have two choice
> 1) Diverge the kernel and user interfaces and make the RDMA drivers
> special case all the kernel stuff to force
> ACCESS_RELAXED_ORDERING when they are building MRs and processing
> FMR WQE's
> 2) Keep the two interfaces the same and push the
> ACCESS_RELAXED_ORDERING to a couple of places in the ULPs so the
> drivers see a consistent API
>
> They are both poor choices, but I think #2 has a greater chance of
> everyone doing their parts correctly.

No, 1 is the only sensible choice. The userspace verbs interface is
a mess and should not inflict pain on the kernel in any way. We've moved
away from a lot of the idiotic "Verbs API" concepts with things like
how we handle the global lkey, the new CQ API and the RDMA R/W
abstraction and that massively helped the kernel ecosystem.

2021-04-06 22:57:33

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 02:30:34PM +0200, Christoph Hellwig wrote:
> On Tue, Apr 06, 2021 at 09:13:12AM -0300, Jason Gunthorpe wrote:
> > So we broadly have two choice
> > 1) Diverge the kernel and user interfaces and make the RDMA drivers
> > special case all the kernel stuff to force
> > ACCESS_RELAXED_ORDERING when they are building MRs and processing
> > FMR WQE's
> > 2) Keep the two interfaces the same and push the
> > ACCESS_RELAXED_ORDERING to a couple of places in the ULPs so the
> > drivers see a consistent API
> >
> > They are both poor choices, but I think #2 has a greater chance of
> > everyone doing their parts correctly.
>
> No, 1 is the only sensible choice. The userspace verbs interface is
> a mess and should not inflict pain on the kernel in any way. We've moved
> away from a lot of the idiotic "Verbs API" concepts with things like
> how we handle the global lkey, the new CQ API and the RDMA R/W
> abstraction and that massively helped the kernel ecosystem.

It might be idiodic, but I have to keep the uverbs thing working
too.

There is a lot of assumption baked in to all the drivers that
user/kernel is the same thing, we'd have to go in and break this.

Essentially #2 ends up as deleting IB_ACCESS_RELAXED_ORDERING kernel
side and instead doing some IB_ACCESS_DISABLE_RO in kernel,
translating uverbs IBV_ACCESS_* to this then finding and inverting all
the driver logic and also finding and unblocking all the places that
enforce valid access flags in the drivers. It is complicated enough

Maybe Avihai will give it a try

Jason

2021-04-07 04:56:08

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 11:04:37AM -0300, Jason Gunthorpe wrote:
> It might be idiodic, but I have to keep the uverbs thing working
> too.
>
> There is a lot of assumption baked in to all the drivers that
> user/kernel is the same thing, we'd have to go in and break this.
>
> Essentially #2 ends up as deleting IB_ACCESS_RELAXED_ORDERING kernel
> side and instead doing some IB_ACCESS_DISABLE_RO in kernel,
> translating uverbs IBV_ACCESS_* to this then finding and inverting all
> the driver logic and also finding and unblocking all the places that
> enforce valid access flags in the drivers. It is complicated enough

Inverting the polarity of a flag at the uapi boundary is pretty
trivial and we already do it all over the kernel.

Do we actually ever need the strict ordering semantics in the kernel?

2021-04-07 05:35:17

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 04:15:52PM +0200, Christoph Hellwig wrote:
> On Tue, Apr 06, 2021 at 11:04:37AM -0300, Jason Gunthorpe wrote:
> > It might be idiodic, but I have to keep the uverbs thing working
> > too.
> >
> > There is a lot of assumption baked in to all the drivers that
> > user/kernel is the same thing, we'd have to go in and break this.
> >
> > Essentially #2 ends up as deleting IB_ACCESS_RELAXED_ORDERING kernel
> > side and instead doing some IB_ACCESS_DISABLE_RO in kernel,
> > translating uverbs IBV_ACCESS_* to this then finding and inverting all
> > the driver logic and also finding and unblocking all the places that
> > enforce valid access flags in the drivers. It is complicated enough
>
> Inverting the polarity of a flag at the uapi boundary is pretty
> trivial and we already do it all over the kernel.

Yes, but the complexity is how the drivers are constructed they are
designed to reject flags they don't know about..

Hum, it looks like someone has already been in here and we now have a
IB_ACCESS_OPTIONAL concept.

Something like this would be the starting point:

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index bed4cfe50554f7..fcb107df0eefc6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1440,9 +1440,11 @@ enum ib_access_flags {
IB_ZERO_BASED = IB_UVERBS_ACCESS_ZERO_BASED,
IB_ACCESS_ON_DEMAND = IB_UVERBS_ACCESS_ON_DEMAND,
IB_ACCESS_HUGETLB = IB_UVERBS_ACCESS_HUGETLB,
- IB_ACCESS_RELAXED_ORDERING = IB_UVERBS_ACCESS_RELAXED_ORDERING,

IB_ACCESS_OPTIONAL = IB_UVERBS_ACCESS_OPTIONAL_RANGE,
+ _IB_ACCESS_RESERVED1 = IB_UVERBS_ACCESS_RELAXED_ORDERING,
+ IB_ACCESS_DISABLE_RELAXED_ORDERING,
+
IB_ACCESS_SUPPORTED =
((IB_ACCESS_HUGETLB << 1) - 1) | IB_ACCESS_OPTIONAL,
};

However I see only EFA actually uses IB_ACCESS_OPTIONAL, so the lead
up would be to audit all the drivers to process optional access_flags
properly. Maybe this was done, but I don't see much evidence of it..

Sigh. It is a big mess cleaning adventure in drivers really.

> Do we actually ever need the strict ordering semantics in the kernel?

No, only for uverbs.

Jason

2021-04-07 05:35:35

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 11:40:39AM -0300, Jason Gunthorpe wrote:
> Yes, but the complexity is how the drivers are constructed they are
> designed to reject flags they don't know about..
>
> Hum, it looks like someone has already been in here and we now have a
> IB_ACCESS_OPTIONAL concept.
>
> Something like this would be the starting point:
>
> [...]
>
> However I see only EFA actually uses IB_ACCESS_OPTIONAL, so the lead
> up would be to audit all the drivers to process optional access_flags
> properly. Maybe this was done, but I don't see much evidence of it..
>
> Sigh. It is a big mess cleaning adventure in drivers really.

Yes. When passing flags to drivers we need to have a good pattern
in place for distinguishing between mandatory and optional flags.
That is something everyone gets wrong at first (yourself included)
and which then later needs painful untangling. So I think we need
to do that anyway.

> > Do we actually ever need the strict ordering semantics in the kernel?
>
> No, only for uverbs.

Which is a pretty clear indicator that we should avoid all this ULP
churn. Note that the polarity of the the flag passed to the HCA
driver doesn't really matter either - we should just avoid having to
deal with it in the kernel ULP API.

2021-04-07 05:35:56

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 04:54:57PM +0200, Christoph Hellwig wrote:
> That is something everyone gets wrong at first (yourself included)

The yourself here should have been a yours truly or "myself", sorry.

2021-04-07 21:52:27

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH rdma-next 01/10] RDMA: Add access flags to ib_alloc_mr() and ib_mr_pool_init()

On Tue, Apr 06, 2021 at 04:54:57PM +0200, Christoph Hellwig wrote:
> On Tue, Apr 06, 2021 at 11:40:39AM -0300, Jason Gunthorpe wrote:
> > Yes, but the complexity is how the drivers are constructed they are
> > designed to reject flags they don't know about..
> >
> > Hum, it looks like someone has already been in here and we now have a
> > IB_ACCESS_OPTIONAL concept.
> >
> > Something like this would be the starting point:
> >
> > [...]
> >
> > However I see only EFA actually uses IB_ACCESS_OPTIONAL, so the lead
> > up would be to audit all the drivers to process optional access_flags
> > properly. Maybe this was done, but I don't see much evidence of it..
> >
> > Sigh. It is a big mess cleaning adventure in drivers really.
>
> Yes. When passing flags to drivers we need to have a good pattern
> in place for distinguishing between mandatory and optional flags.
> That is something everyone gets wrong at first (yourself included)
> and which then later needs painful untangling. So I think we need
> to do that anyway.

It looks like we did actually do this right here, when the
RELAXED_ORDERING was originally added it was done as an optional flag
and a range of bits were set aside in the uAPI for future optional
flags.

This should be quite easy to do then

Jason