From: Zhaoyang Huang <[email protected]>
Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio
is a open macro to set bio->ioprio directly.It is confused for the developer
who run across kernel panic[1] but can find nothing in previous kernel log.
Add a pr_err here to dump the information.
[1]
Here is the kernel panic I run across which caused by a out of bounds check
introduced by CONFIG_FOTIFY_SOURCE.
[exception_serialno]:
[exception_kernel_version]:
[exception_reboot_reason]: kernel_crash
[exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception
[exception_time]: 1970-01-01_08-00-23
[exception_file_info]: not-bugon
[exception_task_id]: 409
[exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2]
[exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110
[exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc
[<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc
[<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc
[<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc
[<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc
[<ffffffc0810f0eb4>] panic+0x194/0x37cgc
[<ffffffc0800a638c>] die+0x300/0x310gc
[<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc
[<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc
[<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc
[<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc
[<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc
[<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc
[<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound
here caused by the value of class transferred from ioprio
[<ffffffc080707f28>] elv_merge+0x248/0x270gc
[<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc
[<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc
[<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc
[<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc
[<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc
[<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc
[<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc
[<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc
[<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc
[<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc
[<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc
[<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38
[<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc
[<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc
[<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc
[<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc
[<ffffffc080167c20>] kthread+0x110/0x1b8gc
Signed-off-by: Zhaoyang Huang <[email protected]>
---
include/uapi/linux/ioprio.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h
index bee2bdb0eedb..73c420a0df72 100644
--- a/include/uapi/linux/ioprio.h
+++ b/include/uapi/linux/ioprio.h
@@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel,
{
if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) ||
IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) ||
- IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS))
+ IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) {
+ pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n",
+ __func__, prioclass, priolevel, priohint);
return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT;
+ }
return (prioclass << IOPRIO_CLASS_SHIFT) |
(priohint << IOPRIO_HINT_SHIFT) | priolevel;
--
2.25.1
On Wed, Jan 31, 2024 at 08:14:01PM +0800, zhaoyang.huang wrote:
> From: Zhaoyang Huang <[email protected]>
>
> Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio
> is a open macro to set bio->ioprio directly.It is confused for the developer
> who run across kernel panic[1] but can find nothing in previous kernel log.
> Add a pr_err here to dump the information.
>
> [1]
> Here is the kernel panic I run across which caused by a out of bounds check
> introduced by CONFIG_FOTIFY_SOURCE.
>
> [exception_serialno]:
> [exception_kernel_version]:
> [exception_reboot_reason]: kernel_crash
> [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception
> [exception_time]: 1970-01-01_08-00-23
> [exception_file_info]: not-bugon
> [exception_task_id]: 409
> [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2]
> [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110
> [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc
> [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc
> [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc
> [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc
> [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc
> [<ffffffc0810f0eb4>] panic+0x194/0x37cgc
> [<ffffffc0800a638c>] die+0x300/0x310gc
> [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc
> [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc
> [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc
> [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc
> [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc
> [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc
> [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound
> here caused by the value of class transferred from ioprio
> [<ffffffc080707f28>] elv_merge+0x248/0x270gc
> [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc
> [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc
> [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc
> [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc
> [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc
> [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc
> [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc
> [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc
> [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc
> [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc
> [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc
> [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38
> [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc
> [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc
> [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc
> [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc
> [<ffffffc080167c20>] kthread+0x110/0x1b8gc
>
> Signed-off-by: Zhaoyang Huang <[email protected]>
> ---
> include/uapi/linux/ioprio.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h
> index bee2bdb0eedb..73c420a0df72 100644
> --- a/include/uapi/linux/ioprio.h
> +++ b/include/uapi/linux/ioprio.h
> @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel,
> {
> if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) ||
> IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) ||
> - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS))
> + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) {
> + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n",
> + __func__, prioclass, priolevel, priohint);
> return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT;
> + }
>
> return (prioclass << IOPRIO_CLASS_SHIFT) |
> (priohint << IOPRIO_HINT_SHIFT) | priolevel;
> --
> 2.25.1
>
Adding linux-block to CC.
pr_err() is a kernel function for printing.
ioprio_value() is a function in a uapi header, so this function will be
used by user space programs.
There is a reason:
$ git grep pr_err include/uapi/
Gives no results.
I think you should fix mq-deadline instead.
It looks like the problem comes from:
ioprio_value() will set class to IOPRIO_CLASS_INVALID (value 7),
if the user specified an class/level/hint that was invalid.
ioprio_class_to_prio[] array in mq-deadline.c does currently not have an
entry in to translate IOPRIO_CLASS_INVALID (7) to a valid DD_*_PRIO value.
Although, why does this I/O even reach the scheduler, shouldn't this I/O
get rejected even earlier?
Both io_uring and libaio will call ioprio_check_cap(), which should fail
the I/O before it reaches the I/O scheduler, but in your case, you are
submitting the I/O from the filesystem.
Should we perhaps add a call to ioprio_check_cap() or similar in some
path used to submit I/O by filesystems?
Kind regards,
Niklas
On Wed, Jan 31, 2024 at 01:52:51PM +0100, Niklas Cassel wrote:
> On Wed, Jan 31, 2024 at 08:14:01PM +0800, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <[email protected]>
> >
> > Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio
> > is a open macro to set bio->ioprio directly.It is confused for the developer
> > who run across kernel panic[1] but can find nothing in previous kernel log.
> > Add a pr_err here to dump the information.
> >
> > [1]
> > Here is the kernel panic I run across which caused by a out of bounds check
> > introduced by CONFIG_FOTIFY_SOURCE.
> >
> > [exception_serialno]:
> > [exception_kernel_version]:
> > [exception_reboot_reason]: kernel_crash
> > [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception
> > [exception_time]: 1970-01-01_08-00-23
> > [exception_file_info]: not-bugon
> > [exception_task_id]: 409
> > [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2]
> > [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110
> > [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc
> > [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc
> > [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc
> > [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc
> > [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc
> > [<ffffffc0810f0eb4>] panic+0x194/0x37cgc
> > [<ffffffc0800a638c>] die+0x300/0x310gc
> > [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc
> > [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc
> > [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc
> > [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc
> > [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc
> > [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc
> > [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound
> > here caused by the value of class transferred from ioprio
> > [<ffffffc080707f28>] elv_merge+0x248/0x270gc
> > [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc
> > [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc
> > [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc
> > [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc
> > [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc
> > [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc
> > [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc
> > [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc
> > [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc
> > [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc
> > [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc
> > [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38
> > [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc
> > [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc
> > [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc
> > [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc
> > [<ffffffc080167c20>] kthread+0x110/0x1b8gc
> >
> > Signed-off-by: Zhaoyang Huang <[email protected]>
> > ---
> > include/uapi/linux/ioprio.h | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h
> > index bee2bdb0eedb..73c420a0df72 100644
> > --- a/include/uapi/linux/ioprio.h
> > +++ b/include/uapi/linux/ioprio.h
> > @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel,
> > {
> > if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) ||
> > IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) ||
> > - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS))
> > + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) {
> > + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n",
> > + __func__, prioclass, priolevel, priohint);
> > return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT;
> > + }
> >
> > return (prioclass << IOPRIO_CLASS_SHIFT) |
> > (priohint << IOPRIO_HINT_SHIFT) | priolevel;
> > --
> > 2.25.1
> >
>
> Adding linux-block to CC.
>
> pr_err() is a kernel function for printing.
> ioprio_value() is a function in a uapi header, so this function will be
> used by user space programs.
>
> There is a reason:
> $ git grep pr_err include/uapi/
>
> Gives no results.
>
>
>
> I think you should fix mq-deadline instead.
> It looks like the problem comes from:
> ioprio_value() will set class to IOPRIO_CLASS_INVALID (value 7),
> if the user specified an class/level/hint that was invalid.
>
> ioprio_class_to_prio[] array in mq-deadline.c does currently not have an
> entry in to translate IOPRIO_CLASS_INVALID (7) to a valid DD_*_PRIO value.
>
> Although, why does this I/O even reach the scheduler, shouldn't this I/O
> get rejected even earlier?
>
> Both io_uring and libaio will call ioprio_check_cap(), which should fail
> the I/O before it reaches the I/O scheduler, but in your case, you are
> submitting the I/O from the filesystem.
>
> Should we perhaps add a call to ioprio_check_cap() or similar in some
> path used to submit I/O by filesystems?
On second thought, how can the FS have a ioprio class stored that would
have been rejected at I/O submission time by the user?
This sound like either a bug in the FS or by some of your local changes
that you did for your other patch (ioprio based on activity).
Kind regards,
Niklas
On Wed, Jan 31, 2024 at 9:07 PM Niklas Cassel <[email protected]> wrote:
>
> On Wed, Jan 31, 2024 at 01:52:51PM +0100, Niklas Cassel wrote:
> > On Wed, Jan 31, 2024 at 08:14:01PM +0800, zhaoyang.huang wrote:
> > > From: Zhaoyang Huang <[email protected]>
> > >
> > > Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio
> > > is a open macro to set bio->ioprio directly.It is confused for the developer
> > > who run across kernel panic[1] but can find nothing in previous kernel log.
> > > Add a pr_err here to dump the information.
> > >
> > > [1]
> > > Here is the kernel panic I run across which caused by a out of bounds check
> > > introduced by CONFIG_FOTIFY_SOURCE.
> > >
> > > [exception_serialno]:
> > > [exception_kernel_version]:
> > > [exception_reboot_reason]: kernel_crash
> > > [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception
> > > [exception_time]: 1970-01-01_08-00-23
> > > [exception_file_info]: not-bugon
> > > [exception_task_id]: 409
> > > [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2]
> > > [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110
> > > [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc
> > > [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc
> > > [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc
> > > [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc
> > > [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc
> > > [<ffffffc0810f0eb4>] panic+0x194/0x37cgc
> > > [<ffffffc0800a638c>] die+0x300/0x310gc
> > > [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc
> > > [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc
> > > [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc
> > > [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc
> > > [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc
> > > [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc
> > > [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound
> > > here caused by the value of class transferred from ioprio
> > > [<ffffffc080707f28>] elv_merge+0x248/0x270gc
> > > [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc
> > > [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc
> > > [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc
> > > [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc
> > > [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc
> > > [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc
> > > [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc
> > > [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc
> > > [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc
> > > [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc
> > > [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc
> > > [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38
> > > [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc
> > > [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc
> > > [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc
> > > [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc
> > > [<ffffffc080167c20>] kthread+0x110/0x1b8gc
> > >
> > > Signed-off-by: Zhaoyang Huang <[email protected]>
> > > ---
> > > include/uapi/linux/ioprio.h | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h
> > > index bee2bdb0eedb..73c420a0df72 100644
> > > --- a/include/uapi/linux/ioprio.h
> > > +++ b/include/uapi/linux/ioprio.h
> > > @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel,
> > > {
> > > if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) ||
> > > IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) ||
> > > - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS))
> > > + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) {
> > > + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n",
> > > + __func__, prioclass, priolevel, priohint);
> > > return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT;
> > > + }
> > >
> > > return (prioclass << IOPRIO_CLASS_SHIFT) |
> > > (priohint << IOPRIO_HINT_SHIFT) | priolevel;
> > > --
> > > 2.25.1
> > >
> >
> > Adding linux-block to CC.
> >
> > pr_err() is a kernel function for printing.
> > ioprio_value() is a function in a uapi header, so this function will be
> > used by user space programs.
> >
> > There is a reason:
> > $ git grep pr_err include/uapi/
> >
> > Gives no results.
> >
> >
> >
> > I think you should fix mq-deadline instead.
> > It looks like the problem comes from:
> > ioprio_value() will set class to IOPRIO_CLASS_INVALID (value 7),
> > if the user specified an class/level/hint that was invalid.
> >
> > ioprio_class_to_prio[] array in mq-deadline.c does currently not have an
> > entry in to translate IOPRIO_CLASS_INVALID (7) to a valid DD_*_PRIO value.
> >
> > Although, why does this I/O even reach the scheduler, shouldn't this I/O
> > get rejected even earlier?
> >
> > Both io_uring and libaio will call ioprio_check_cap(), which should fail
> > the I/O before it reaches the I/O scheduler, but in your case, you are
> > submitting the I/O from the filesystem.
> >
> > Should we perhaps add a call to ioprio_check_cap() or similar in some
> > path used to submit I/O by filesystems?
>
> On second thought, how can the FS have a ioprio class stored that would
> have been rejected at I/O submission time by the user?
>
> This sound like either a bug in the FS or by some of your local changes
> that you did for your other patch (ioprio based on activity).
Yes. That's why I would like to suggest adding some information here
to help developers find the clue quickly.
>
>
> Kind regards,
> Niklas
On 2/1/24 08:34, Zhaoyang Huang wrote:
> On Wed, Jan 31, 2024 at 9:07 PM Niklas Cassel <[email protected]> wrote:
>> This sound like either a bug in the FS or by some of your local changes
>> that you did for your other patch (ioprio based on activity).
> Yes. That's why I would like to suggest adding some information here
> to help developers find the clue quickly.
The backtrace was not clear enough ?
When a request reaches mq-deadline, the request priority is supposed to be
correct already. Your changes had a bug and broke that assumption. Please fix
that instead of adding error messages for errors that should never happen in the
first place.
--
Damien Le Moal
Western Digital Research