On 22/02/21 04:09PM, Steven Rostedt wrote:
> While working on libtraceevent, it stumbled over a "->" in the print
> formats that wasn't parsing properly. Well, the reason it does not
> handle this particular case is because it is A MAJOR BUG IN THE KERNEL!
>
> DO NOT DEREFERENCE ANYTHING FROM THE RING BUFFER.
>
> Sorry, for yelling, but I really want to stress this is not something
> to do. The ring buffer is referenced seconds, minutes, hours, days,
> months after the data has been loaded. This is the same as using
> something after it is freed. Just don't do it.
>
> The culprit is:
>
> ext4_fc_stats
>
> Introduced by: aa75f4d3daaeb ("ext4: main fast-commit commit path")
>
> That has in its print fmt in include/trace/events/ext4.h:
>
>
> TP_printk("dev %d:%d fc ineligible reasons:\n"
> "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
> "num_commits:%ld, ineligible: %ld, numblks: %ld",
> MAJOR(__entry->dev), MINOR(__entry->dev),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> __entry->sbi->s_fc_stats.fc_num_commits,
> __entry->sbi->s_fc_stats.fc_ineligible_commits,
> __entry->sbi->s_fc_stats.fc_numblks)
>
> That __entry->sbi can be LONG GONE by the time it is read. You have no
> idea what it is pointing to. And even if it is still around, there's no
> way that any of those numbers are accurate from the time the event
> triggered. Might as well just expose them by some other interface.
Thanks Steven for spotting this and providing details of the problem.
I have submitted a patch to fix this problem at [1].
But I am facing a small challenge w.r.t __array field type w.r.t perf record
which I have explained in the cover letter.
Would it be possible for you to take a look at it [2] and also at the patch [1].
Thanks a lot for your help!!
[1]: https://lore.kernel.org/all/9a8c359270a6330ed384ea0a75441e367ecde924.1645558375.git.riteshh@linux.ibm.com/
[2]: https://lore.kernel.org/all/[email protected]/
-ritesh
>
> This event must be modified or removed from the current release and all
> stable kernels that have it.
>
> In the mean time, I need to update my event verifier to detect this and
> fail to register the event if it has such cases.
>
> -- Steve