2018-12-04 00:48:33

by Stephen Rothwell

Subject: linux-next: build failure after merge of the rdma tree

Hi all,

After merging the rdma tree, today's linux-next build (x86_64
allmodconfig) failed like this:

ERROR: "mlx5_get_send_wqe" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!

Caused by commit

34f4c9554d8b ("IB/mlx5: Use fragmented QP's buffer for in-kernel users")

mlx5_get_send_wqe() is still used in drivers/infiniband/hw/mlx5/cq.c
and declared in drivers/infiniband/hw/mlx5/mlx5_ib.h ...

I have used the version of the rdma tree from next-20181203 for today.



--
Cheers,
Stephen Rothwell



2018-12-04 01:54:15

by Jason Gunthorpe

Subject: Re: linux-next: build failure after merge of the rdma tree

On Tue, Dec 04, 2018 at 11:47:31AM +1100, Stephen Rothwell wrote:
> Hi all,
>
> After merging the rdma tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
>
> ERROR: "mlx5_get_send_wqe" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
>
> Caused by commit
>
> 34f4c9554d8b ("IB/mlx5: Use fragmented QP's buffer for in-kernel users")
>
> mlx5_get_send_wqe() is still used in drivers/infiniband/hw/mlx5/cq.c
> and declared in drivers/infiniband/hw/mlx5/mlx5_ib.h ...
>
> I have used the version of the rdma tree from next-20181203 for today.

Huh. So apparently every compiler that tested this patch (0-day, mine,
the submitters) optimized this call away because is_atomic_response()
always returns 0: meaning mlx5_get_atomic_laddr is never callable and
can be deleted entirely, including the call to mlx5_get_send_wqe.

Not sure what compiler setup will hit this, but it is clearly wrong
code..

Guy/Leon, please send a fixup.. Maybe just delete all this
handle_atomics stuff?

Thanks,
Jason
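
The pattern Jason describes can be reduced to a small userspace sketch. Every name below is a hypothetical stand-in, not the driver's code: a plain `static` predicate that always returns 0 guards the only call to a helper. In the failing build the helper had a declaration but no definition left, so the module only linked when the compiler proved the branch dead and dropped the call. Here the helper is defined so that the sketch compiles and links with any optimization setting.

```c
#include <stdint.h>

/* Counts helper calls, only so that the sketch can be checked. */
static int wqe_lookups;

/*
 * Hypothetical stand-in for mlx5_get_send_wqe(). It is defined here so the
 * sketch always links; in the thread, the real function was still declared
 * and called but no longer defined.
 */
static int sketch_get_send_wqe(uint16_t idx)
{
	wqe_lookups++;
	return idx;
}

/* Plain 'static' predicate that always returns 0, as in the thread. */
static int sketch_is_atomic_response(uint16_t idx)
{
	(void)idx;
	return 0;
}

/*
 * The guarded call. Whether the call below survives into the object file
 * depends on the compiler inlining the predicate and removing the dead
 * branch; the C language does not promise that.
 */
static void sketch_handle_atomics(uint16_t idx)
{
	if (sketch_is_atomic_response(idx))
		(void)sketch_get_send_wqe(idx);
}
```

At run time the helper is never reached, which is why the problem stayed hidden until a build configuration kept the call in the object file.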

2018-12-04 09:44:52

by Leon Romanovsky

Subject: Re: linux-next: build failure after merge of the rdma tree

On Tue, Dec 04, 2018 at 01:52:54AM +0000, Jason Gunthorpe wrote:
> On Tue, Dec 04, 2018 at 11:47:31AM +1100, Stephen Rothwell wrote:
> > Hi all,
> >
> > After merging the rdma tree, today's linux-next build (x86_64
> > allmodconfig) failed like this:
> >
> > ERROR: "mlx5_get_send_wqe" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
> >
> > Caused by commit
> >
> > 34f4c9554d8b ("IB/mlx5: Use fragmented QP's buffer for in-kernel users")
> >
> > mlx5_get_send_wqe() is still used in drivers/infiniband/hw/mlx5/cq.c
> > and declared in drivers/infiniband/hw/mlx5/mlx5_ib.h ...
> >
> > I have used the version of the rdma tree from next-20181203 for today.
>
> Huh. So apparently every compiler that tested this patch (0-day, mine,
> the submitters) optimized this call away because is_atomic_response()
> always returns 0: meaning mlx5_get_atomic_laddr is never callable and
> can be deleted entirely, including the call to mlx5_get_send_wqe.
>
> Not sure what compiler setup will hit this, but it is clearly wrong
> code..
>
> Guy/Leon, please send a fixup.. Maybe just delete all this
> handle_atomics stuff?

I deleted it and will wait for the regression results before sending the patch.

Thanks

>
> Thanks,
> Jason



2018-12-05 12:27:40

by Guy Levi(SW)

Subject: RE: linux-next: build failure after merge of the rdma tree

>
> Huh. So apparently every compiler that tested this patch (0-day, mine,
> the submitters) optimized this call away because is_atomic_response()
> always returns 0: meaning mlx5_get_atomic_laddr is never callable and
> can be deleted entirely, including the call to mlx5_get_send_wqe.
>
> Not sure what compiler setup will hit this, but it is clearly wrong
> code..

Flag -O0?


>
> Guy/Leon, please send a fixup.. Maybe just delete all this
> handle_atomics stuff?
>
> Thanks,
> Jason

2018-12-05 22:59:24

by Stephen Rothwell

Subject: Re: linux-next: build failure after merge of the rdma tree

Hi Guy,

On Wed, 5 Dec 2018 12:25:57 +0000 "Guy Levi(SW)" <[email protected]> wrote:
>
> >
> > Huh. So apparently every compiler that tested this patch (0-day, mine,
> > the submitters) optimized this call away because is_atomic_response()
> > always returns 0: meaning mlx5_get_atomic_laddr is never callable and
> > can be deleted entirely, including the call to mlx5_get_send_wqe.
> >
> > Not sure what compiler setup will hit this, but it is clearly wrong
> > code..
>
> Flag -O0?

No, but the kbuild tree contains a change that allows gcc's
auto-inlining to be turned off, and the CONFIG option guarding it gets
turned on for allmodconfig builds, among others.

Masahiro, should CONFIG_NO_AUTO_INLINE maybe be off unless
explicitly enabled (like CONFIG_DEBUG_INFO and others)?

--
Cheers,
Stephen Rothwell



2018-12-07 02:42:12

by Stephen Rothwell

Subject: Re: linux-next: build failure after merge of the rdma tree

Hi all,

On Tue, 4 Dec 2018 09:42:59 +0000 Leon Romanovsky <[email protected]> wrote:
>
> On Tue, Dec 04, 2018 at 01:52:54AM +0000, Jason Gunthorpe wrote:
> > On Tue, Dec 04, 2018 at 11:47:31AM +1100, Stephen Rothwell wrote:
> > >
> > > After merging the rdma tree, today's linux-next build (x86_64
> > > allmodconfig) failed like this:
> > >
> > > ERROR: "mlx5_get_send_wqe" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
> > >
> > > Caused by commit
> > >
> > > 34f4c9554d8b ("IB/mlx5: Use fragmented QP's buffer for in-kernel users")
> > >
> > > mlx5_get_send_wqe() is still used in drivers/infiniband/hw/mlx5/cq.c
> > > and declared in drivers/infiniband/hw/mlx5/mlx5_ib.h ...
> > >
> > > I have used the version of the rdma tree from next-20181203 for today.
> >
> > Huh. So apparently every compiler that tested this patch (0-day, mine,
> > the submitters) optimized this call away because is_atomic_response()
> > always returns 0: meaning mlx5_get_atomic_laddr is never callable and
> > can be deleted entirely, including the call to mlx5_get_send_wqe.
> >
> > Not sure what compiler setup will hit this, but it is clearly wrong
> > code..
> >
> > Guy/Leon, please send a fixup.. Maybe just delete all this
> > handle_atomics stuff?
>
> I deleted it and will wait for the regression results before sending the patch.

I am still getting this error.

--
Cheers,
Stephen Rothwell



2018-12-10 00:47:38

by Changbin Du

Subject: Re: linux-next: build failure after merge of the rdma tree

On Thu, Dec 06, 2018 at 09:58:15AM +1100, Stephen Rothwell wrote:
> Hi Guy,
>
> On Wed, 5 Dec 2018 12:25:57 +0000 "Guy Levi(SW)" <[email protected]> wrote:
> >
> > >
> > > Huh. So apparently every compiler that tested this patch (0-day, mine,
> > > the submitters) optimized this call away because is_atomic_response()
> > > always returns 0: meaning mlx5_get_atomic_laddr is never callable and
> > > can be deleted entirely, including the call to mlx5_get_send_wqe.
> > >
> > > Not sure what compiler setup will hit this, but it is clearly wrong
> > > code..
> >
> > Flag -O0?
>
> No, but the kbuild tree contains a change that allows turning off of
> gcc's autoinlining and the CONFIG option guarding that gets turned on
> for allmodconfig builds among others.
>
Let me turn it off so that the allmodconfig build passes. It is purely
a debug option.

> Masahiro, should CONFIG_NO_AUTO_INLINE maybe need to be off unless
> explicitly enabled (like CONFIG_DEBUG_INFO and others)?
>
> --
> Cheers,
> Stephen Rothwell



--
Thanks,
Changbin Du

2018-12-10 16:51:28

by Masahiro Yamada

Subject: Re: linux-next: build failure after merge of the rdma tree

On Thu, Dec 6, 2018 at 7:59 AM Stephen Rothwell <[email protected]> wrote:
>
> Hi Guy,
>
> On Wed, 5 Dec 2018 12:25:57 +0000 "Guy Levi(SW)" <[email protected]> wrote:
> >
> > >
> > > Huh. So apparently every compiler that tested this patch (0-day, mine,
> > > the submitters) optimized this call away because is_atomic_response()
> > > always returns 0: meaning mlx5_get_atomic_laddr is never callable and
> > > can be deleted entirely, including the call to mlx5_get_send_wqe.
> > >
> > > Not sure what compiler setup will hit this, but it is clearly wrong
> > > code..
> >
> > Flag -O0?
>
> No, but the kbuild tree contains a change that allows turning off of
> gcc's autoinlining and the CONFIG option guarding that gets turned on
> for allmodconfig builds among others.
>
> Masahiro, should CONFIG_NO_AUTO_INLINE maybe need to be off unless
> explicitly enabled (like CONFIG_DEBUG_INFO and others)?


No.

If CONFIG_NO_AUTO_INLINE is turned off for compile-testing,
people will not even notice the breakage,
and the code will get broken here and there.

You will then not be able to enable it
when you really want to use it.





In this case, the reason is obvious.

If you expect the compiler to optimize the code out,
you must use 'static inline' instead of 'static'.


static int is_atomic_response(struct mlx5_ib_qp *qp, uint16_t idx)
{
	/* TBD: waiting decision
	 */
	return 0;
}
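
A minimal sketch of the 'static inline' suggestion, using hypothetical demo_* names rather than the driver's own. In plain C outside the kernel, `inline` is only a hint to the compiler, so this sketch also adds the GCC/Clang `always_inline` attribute; that attribute is an assumption of the example and is not mentioned in the thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5_ib_qp. */
struct demo_qp {
	int unused;
};

/* Counts helper calls, only so that the sketch can be checked. */
static int demo_calls;

/* Hypothetical stand-in for the helper that must not be reached. */
static int demo_get_send_wqe(struct demo_qp *qp, uint16_t idx)
{
	(void)qp;
	demo_calls++;
	return idx;
}

/*
 * Marked inline so the constant result is visible at the call site.
 * always_inline is a GCC/Clang extension, added here as an assumption
 * because plain 'inline' does not force inlining in standard C.
 */
static inline __attribute__((always_inline))
int demo_is_atomic_response(struct demo_qp *qp, uint16_t idx)
{
	(void)qp;
	(void)idx;
	return 0;
}

/* With the predicate inlined, the condition below is a literal 0. */
static void demo_handle(struct demo_qp *qp, uint16_t idx)
{
	if (demo_is_atomic_response(qp, idx))
		(void)demo_get_send_wqe(qp, idx);
}
```

Forcing the inline makes the dead branch visible to the compiler, but removing the unused predicate and its caller, as Jason suggested, avoids relying on dead-code elimination at all.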





--
Best Regards
Masahiro Yamada