v4 --> v5:
1. In scripts/kallsyms.c, we use an extra field to hold type and eventually
put it together with name in write_src().
2. Generate a new table kallsyms_best_token_table[], so that we compress a
symbol in the kernel using a process similar to compress_symbols().
3. Remove helper sym_name(), and rename field 'sym[]' to 'name[]' in
scripts/kallsyms.c
4. Add helper __kallsyms_lookup_compressed_name() to avoid duplicate code in
functions kallsyms_lookup_name() and kallsyms_on_each_match_symbol().
5. Add a new parameter "const char *modname" to module_kallsyms_on_each_symbol().
This makes the code logic clearer.
6. Delete the parameter 'struct module *' in the hook function associated with
kallsyms_on_each_symbol(), because it is now unused.
v3 --> v4:
1. Move the declaration of function kallsyms_sym_address() to linux/kallsyms.h
to fix a build warning.
v2 --> v3:
1. Improve test cases, perform complete functional tests on functions
kallsyms_lookup_name(), kallsyms_on_each_symbol() and
kallsyms_on_each_match_symbol().
2. Add patch [PATCH v3 2/8] scripts/kallsyms: ensure that all possible
combinations are compressed.
3. The symbol type is not compressed regardless of whether
CONFIG_KALLSYMS_ALL is set or not. The memory overhead is increased
by less than 20KiB if CONFIG_KALLSYMS_ALL=n.
4. Discard [PATCH v2 3/8] kallsyms: Adjust the types of some local variables
v1 --> v2:
Add self-test facility
v1:
Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.
In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.
This patch series optimizes the performance of function kallsyms_lookup_name(),
and function klp_find_object_symbol() in the livepatch module. Based on the
test results, the lookup time is reduced to about 5% of the original. That
is, these functions become about 20 times faster.
To avoid increasing the kernel size in non-debug mode, the optimization is only
for the case CONFIG_KALLSYMS_ALL=y.
Zhen Lei (10):
scripts/kallsyms: rename build_initial_tok_table()
scripts/kallsyms: don't compress symbol types
scripts/kallsyms: remove helper sym_name() and cleanup
scripts/kallsyms: generate kallsyms_best_token_table[]
kallsyms: Improve the performance of kallsyms_lookup_name()
kallsyms: Add helper kallsyms_on_each_match_symbol()
livepatch: Use kallsyms_on_each_match_symbol() to improve performance
livepatch: Improve the search performance of
module_kallsyms_on_each_symbol()
kallsyms: Delete an unused parameter related to
kallsyms_on_each_symbol()
kallsyms: Add self-test facility
include/linux/kallsyms.h | 12 +-
include/linux/module.h | 4 +-
init/Kconfig | 13 ++
kernel/Makefile | 1 +
kernel/kallsyms.c | 167 ++++++++++++++-
kernel/kallsyms_internal.h | 1 +
kernel/kallsyms_selftest.c | 421 +++++++++++++++++++++++++++++++++++++
kernel/livepatch/core.c | 31 ++-
kernel/module/kallsyms.c | 15 +-
kernel/trace/ftrace.c | 3 +-
scripts/kallsyms.c | 88 +++++---
11 files changed, 694 insertions(+), 62 deletions(-)
create mode 100644 kernel/kallsyms_selftest.c
--
2.25.1
Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is because we do not know the symbol type, and the symbol
type may be combined with the following characters to form a token.
So if we don't compress the symbol type, we can first compress the
searched symbol and then make a quick comparison based on the compressed
length and content. In this way, entries with mismatched lengths need
neither expansion nor string comparison, and entries with matching lengths
can be compared in compressed form, again without expanding the symbol.
This saves a lot of time.
According to my test results, the average performance of
kallsyms_lookup_name() can be improved by 20 to 30 times.
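The comparison described above can be sketched in user space as follows.
This is only an illustration under stated assumptions: find_compressed()
is a hypothetical helper, not a kernel function, and it assumes each entry
is stored as one length byte, one uncompressed type byte and then the
compressed name bytes, with the length byte counting the type byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * 'names' holds 'num' entries laid out back to back as:
 *   [len][type][compressed name bytes]
 * where 'len' counts the type byte plus the name bytes.
 * Returns the index of the first entry whose compressed name equals
 * 'key' (key_len bytes), or -1 if there is none.
 */
static int find_compressed(const unsigned char *names, size_t num,
			   const unsigned char *key, size_t key_len)
{
	size_t i, off = 0;

	for (i = 0; i < num; i++) {
		size_t entry_len = names[off];		/* type + name */
		const unsigned char *name = &names[off + 2];

		off += entry_len + 1;			/* step to the next entry */

		if (entry_len - 1 != key_len)
			continue;	/* length mismatch: nothing to expand */
		if (!memcmp(name, key, key_len))
			return (int)i;	/* compared in compressed form */
	}

	return -1;
}
```

The point of the sketch is that no entry is ever expanded: most entries
are rejected by the length check alone, and the rest by a single memcmp().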
Of course, because the symbol type is never compressed, the compression
ratio gets slightly worse. Here are the test results with defconfig:
arm64: <<<<<<
        ---------------------------------------------------------------
       | ALL | nr_symbols | compressed size | original size | ratio(%) |
       |-----|------------|-----------------|---------------|----------|
Before |  Y  |     174094 |         1884938 |       3750653 |    50.25 |
After  |  Y  |     174099 |         1960154 |       3750756 |    52.26 |
Before |  N  |      61744 |          725507 |       1222737 |    59.33 |
After  |  N  |      61747 |          745733 |       1222801 |    60.98 |
        ---------------------------------------------------------------
The memory overhead is increased by:
73.5KiB and 4.0% if CONFIG_KALLSYMS_ALL=y.
19.8KiB and 2.8% if CONFIG_KALLSYMS_ALL=n.
x86: <<<<<<<<
        ---------------------------------------------------------------
       | ALL | nr_symbols | compressed size | original size | ratio(%) |
       |-----|------------|-----------------|---------------|----------|
Before |  Y  |     131415 |         1697542 |       3161216 |    53.69 |
After  |  Y  |     131540 |         1747769 |       3163933 |    55.24 |
Before |  N  |      60695 |          737627 |       1283046 |    57.49 |
After  |  N  |      60699 |          754797 |       1283149 |    58.82 |
        ---------------------------------------------------------------
The memory overhead is increased by:
49.0KiB and 3.0% if CONFIG_KALLSYMS_ALL=y.
16.8KiB and 2.3% if CONFIG_KALLSYMS_ALL=n.
I think this additional memory overhead is worth the performance
improvement.
Let's use an extra field to hold type and eventually put it together with
name in write_src().
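As a rough illustration of the resulting entry layout, the sketch below
writes one entry into a byte buffer instead of printing assembler
directives. emit_entry() is a hypothetical name used only here, and the
sketch assumes the length fits in a single byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical serializer, for illustration only. It produces:
 *   [len + 1][type][name bytes]
 * The type is stored next to the name but is not part of the
 * compressed name itself.
 * Returns the number of bytes written.
 */
static size_t emit_entry(unsigned char *out, unsigned char type,
			 const unsigned char *name, unsigned char len)
{
	out[0] = (unsigned char)(len + 1);	/* the type byte is counted */
	out[1] = type;
	memcpy(&out[2], name, len);

	/* fields 'len' and 'type' occupy one byte each */
	return (size_t)len + 2;
}
```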
Signed-off-by: Zhen Lei <[email protected]>
---
scripts/kallsyms.c | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 8caccc8f4a23703..296277128d450ff 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -34,6 +34,7 @@ struct sym_entry {
unsigned int len;
unsigned int start_pos;
unsigned int percpu_absolute;
+ unsigned char type;
unsigned char sym[];
};
@@ -77,7 +78,7 @@ static void usage(void)
static char *sym_name(const struct sym_entry *s)
{
- return (char *)s->sym + 1;
+ return (char *)s->sym;
}
static bool is_ignored_symbol(const char *name, char type)
@@ -227,11 +228,7 @@ static struct sym_entry *read_symbol(FILE *in)
check_symbol_range(name, addr, text_ranges, ARRAY_SIZE(text_ranges));
check_symbol_range(name, addr, &percpu_range, 1);
- /* include the type field in the symbol name, so that it gets
- * compressed together */
-
- len = strlen(name) + 1;
-
+ len = strlen(name);
sym = malloc(sizeof(*sym) + len + 1);
if (!sym) {
fprintf(stderr, "kallsyms failure: "
@@ -240,7 +237,7 @@ static struct sym_entry *read_symbol(FILE *in)
}
sym->addr = addr;
sym->len = len;
- sym->sym[0] = type;
+ sym->type = type;
strcpy(sym_name(sym), name);
sym->percpu_absolute = 0;
@@ -471,12 +468,18 @@ static void write_src(void)
if ((i & 0xFF) == 0)
markers[i >> 8] = off;
- printf("\t.byte 0x%02x", table[i]->len);
+ /*
+ * Store the symbol type together with symbol name.
+ * It helps to reduce the size.
+ */
+ printf("\t.byte 0x%02x", table[i]->len + 1);
+ printf(", 0x%02x", table[i]->type);
for (k = 0; k < table[i]->len; k++)
printf(", 0x%02x", table[i]->sym[k]);
printf("\n");
- off += table[i]->len + 1;
+ /* fields 'len' and 'type' occupy one byte each */
+ off += table[i]->len + 1 + 1;
}
printf("\n");
@@ -637,14 +640,18 @@ static void optimize_result(void)
/* start by placing the symbols that are actually used on the table */
static void insert_real_symbols_in_table(void)
{
- unsigned int i, j, c;
+ unsigned int i, j;
+ unsigned char c;
for (i = 0; i < table_cnt; i++) {
for (j = 0; j < table[i]->len; j++) {
c = table[i]->sym[j];
- best_table[c][0]=c;
- best_table_len[c]=1;
+ best_table[c][0] = c;
+ best_table_len[c] = 1;
}
+ c = table[i]->type;
+ best_table[c][0] = c;
+ best_table_len[c] = 1;
}
}
@@ -661,7 +668,7 @@ static void optimize_token_table(void)
static int may_be_linker_script_provide_symbol(const struct sym_entry *se)
{
const char *symbol = sym_name(se);
- int len = se->len - 1;
+ int len = se->len;
if (len < 8)
return 0;
@@ -705,8 +712,8 @@ static int compare_symbols(const void *a, const void *b)
return -1;
/* sort by "weakness" type */
- wa = (sa->sym[0] == 'w') || (sa->sym[0] == 'W');
- wb = (sb->sym[0] == 'w') || (sb->sym[0] == 'W');
+ wa = (sa->type == 'w') || (sa->type == 'W');
+ wb = (sb->type == 'w') || (sb->type == 'W');
if (wa != wb)
return wa - wb;
@@ -742,7 +749,7 @@ static void make_percpus_absolute(void)
* ensure consistent behavior compared to older
* versions of this tool.
*/
- table[i]->sym[0] = 'A';
+ table[i]->type = 'A';
table[i]->percpu_absolute = 1;
}
}
--
2.25.1
The abbreviation 'tok' is used only in the name of the function
build_initial_tok_table(). Everywhere else the full word 'token' is used.
$ cat scripts/kallsyms.c | grep tok | wc -l
33
$ cat scripts/kallsyms.c | grep token | wc -l
31
Here, it would be clearer to use the full name.
Signed-off-by: Zhen Lei <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
---
scripts/kallsyms.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index f18e6dfc68c5839..8caccc8f4a23703 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -525,7 +525,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
}
/* do the initial token count */
-static void build_initial_tok_table(void)
+static void build_initial_token_table(void)
{
unsigned int i;
@@ -650,7 +650,7 @@ static void insert_real_symbols_in_table(void)
static void optimize_token_table(void)
{
- build_initial_tok_table();
+ build_initial_token_table();
insert_real_symbols_in_table();
--
2.25.1
Function kallsyms_on_each_symbol() traverses all symbols and submits each
symbol to the hook 'fn' for judgment and processing. In some cases, such
as livepatch, the hook only handles the symbols that match a given name.
In these cases, we can first compress the name being looked up and then
use it for comparison when traversing 'kallsyms_names'. This greatly
reduces the time consumed by traversing.
The pseudo code of the test case is as follows:
	static int tst_find(void *data, const char *name,
			    struct module *mod, unsigned long addr)
	{
		if (strcmp(name, "vmap") == 0)
			*(unsigned long *)data = addr;
		return 0;
	}

	static int tst_match(void *data, unsigned long addr)
	{
		*(unsigned long *)data = addr;
		return 0;
	}

	start = sched_clock();
	kallsyms_on_each_match_symbol(tst_match, "vmap", &addr);
	end = sched_clock();

	start = sched_clock();
	kallsyms_on_each_symbol(tst_find, &addr);
	end = sched_clock();
The test results are as follows (in ns, two runs):
kallsyms_on_each_match_symbol: 557400, 583900
kallsyms_on_each_symbol : 16659500, 16113950
kallsyms_on_each_match_symbol() consumes only 3.48% of
kallsyms_on_each_symbol()'s time.
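The resumable search behind this helper can be sketched in user space as
follows. All names here (next_match(), on_each_match(), count_match()) are
hypothetical, and the entry layout [len][type][name bytes] with a one-byte
length is an assumption made for the illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Continue the search at (*index, *offset). On a match, save the position
 * of the following entry so that the next call resumes from there.
 * Returns the index of the matching entry, or -1 at the end of the table.
 */
static int next_match(const unsigned char *names, size_t num,
		      const unsigned char *key, size_t key_len,
		      size_t *index, size_t *offset)
{
	size_t i = *index, off = *offset;

	for (; i < num; i++) {
		size_t entry_len = names[off];		/* type + name */
		const unsigned char *name = &names[off + 2];

		off += entry_len + 1;
		if (entry_len - 1 == key_len && !memcmp(name, key, key_len)) {
			*index = i + 1;
			*offset = off;
			return (int)i;
		}
	}

	return -1;
}

/* Call 'fn' once per matching entry; stop early if 'fn' returns non-zero. */
static int on_each_match(const unsigned char *names, size_t num,
			 const unsigned char *key, size_t key_len,
			 int (*fn)(void *data, size_t idx), void *data)
{
	size_t i = 0, off = 0;
	int idx, ret;

	while ((idx = next_match(names, num, key, key_len, &i, &off)) >= 0) {
		ret = fn(data, (size_t)idx);
		if (ret)
			return ret;
	}

	return 0;
}

/* Example hook: count the matches. */
static int count_match(void *data, size_t idx)
{
	(void)idx;
	(*(int *)data)++;
	return 0;
}
```

Because the hook is invoked only for matching entries, it no longer needs
the symbol name at all, which is why the hook in this patch takes just the
data pointer and the address.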
Signed-off-by: Zhen Lei <[email protected]>
---
include/linux/kallsyms.h | 8 ++++++++
kernel/kallsyms.c | 44 ++++++++++++++++++++++++++++++++++++----
2 files changed, 48 insertions(+), 4 deletions(-)
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index ad39636e0c3f122..2138219ae0296e9 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -69,6 +69,8 @@ static inline void *dereference_symbol_descriptor(void *ptr)
int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
unsigned long),
void *data);
+int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+ const char *name, void *data);
/* Lookup the address for a symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name);
@@ -168,6 +170,12 @@ static inline int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct
{
return -EOPNOTSUPP;
}
+
+static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+ const char *name, void *data)
+{
+ return -EOPNOTSUPP;
+}
#endif /*CONFIG_KALLSYMS*/
static inline void print_ip_sym(const char *loglvl, unsigned long ip)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index dcf5bdc7309a6cc..398865f01360589 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -266,14 +266,17 @@ static bool cleanup_symbol_name(char *s)
return false;
}
-static int kallsyms_lookup_compressed_name(unsigned char *namebuf, int len,
- unsigned long *addr)
+static int __kallsyms_lookup_compressed_name(unsigned char *namebuf, int len,
+ unsigned int *index,
+ unsigned int *offset,
+ unsigned long *addr)
{
- unsigned int i, off;
+ unsigned int i = *index;
+ unsigned int off = *offset;
unsigned int name_len;
const unsigned char *name;
- for (i = 0, off = 0; len && i < kallsyms_num_syms; i++) {
+ for (; len && i < kallsyms_num_syms; i++) {
/*
* For each symbol entry, the storage format is:
* ----------------------------
@@ -290,6 +293,10 @@ static int kallsyms_lookup_compressed_name(unsigned char *namebuf, int len,
continue;
if (!memcmp(name, namebuf, len)) {
+ /* Prepare for the next iteration */
+ *index = i + 1;
+ *offset = off;
+
*addr = kallsyms_sym_address(i);
return 0;
}
@@ -298,6 +305,14 @@ static int kallsyms_lookup_compressed_name(unsigned char *namebuf, int len,
return -ENOENT;
}
+static int kallsyms_lookup_compressed_name(unsigned char *namebuf, int len,
+ unsigned long *addr)
+{
+ unsigned int i = 0, off = 0;
+
+ return __kallsyms_lookup_compressed_name(namebuf, len, &i, &off, addr);
+}
+
/* Lookup the address for this symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name)
{
@@ -348,6 +363,27 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
return 0;
}
+int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+ const char *name, void *data)
+{
+ int ret, len;
+ unsigned long addr;
+ unsigned int i = 0, off = 0;
+ char namebuf[KSYM_NAME_LEN];
+
+ len = kallsyms_compress_symbol_name(name, namebuf, ARRAY_SIZE(namebuf));
+ do {
+ ret = __kallsyms_lookup_compressed_name(namebuf, len, &i, &off, &addr);
+ if (ret)
+ return 0; /* end of lookup */
+
+ ret = fn(data, addr);
+ cond_resched();
+ } while (!ret);
+
+ return ret;
+}
+
static unsigned long get_symbol_pos(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset)
--
2.25.1
Now, the type and name of a symbol are no longer stored together. So the
helper sym_name() is no longer needed. Correspondingly, replacing the
field name 'sym[]' with 'name[]' is more accurate.
Suggested-by: Petr Mladek <[email protected]>
Signed-off-by: Zhen Lei <[email protected]>
---
scripts/kallsyms.c | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 296277128d450ff..ca378a7e9425c00 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -35,7 +35,7 @@ struct sym_entry {
unsigned int start_pos;
unsigned int percpu_absolute;
unsigned char type;
- unsigned char sym[];
+ unsigned char name[];
};
struct addr_range {
@@ -76,11 +76,6 @@ static void usage(void)
exit(1);
}
-static char *sym_name(const struct sym_entry *s)
-{
- return (char *)s->sym;
-}
-
static bool is_ignored_symbol(const char *name, char type)
{
/* Symbol names that exactly match to the following are ignored.*/
@@ -238,7 +233,7 @@ static struct sym_entry *read_symbol(FILE *in)
sym->addr = addr;
sym->len = len;
sym->type = type;
- strcpy(sym_name(sym), name);
+ strcpy((char *)sym->name, name);
sym->percpu_absolute = 0;
return sym;
@@ -262,7 +257,7 @@ static int symbol_in_range(const struct sym_entry *s,
static int symbol_valid(const struct sym_entry *s)
{
- const char *name = sym_name(s);
+ const char *name = (char *)s->name;
/* if --all-symbols is not specified, then symbols outside the text
* and inittext sections are discarded */
@@ -475,7 +470,7 @@ static void write_src(void)
printf("\t.byte 0x%02x", table[i]->len + 1);
printf(", 0x%02x", table[i]->type);
for (k = 0; k < table[i]->len; k++)
- printf(", 0x%02x", table[i]->sym[k]);
+ printf(", 0x%02x", table[i]->name[k]);
printf("\n");
/* fields 'len' and 'type' occupy one byte each */
@@ -533,7 +528,7 @@ static void build_initial_token_table(void)
unsigned int i;
for (i = 0; i < table_cnt; i++)
- learn_symbol(table[i]->sym, table[i]->len);
+ learn_symbol(table[i]->name, table[i]->len);
}
static unsigned char *find_token(unsigned char *str, int len,
@@ -558,14 +553,14 @@ static void compress_symbols(const unsigned char *str, int idx)
for (i = 0; i < table_cnt; i++) {
len = table[i]->len;
- p1 = table[i]->sym;
+ p1 = table[i]->name;
/* find the token on the symbol */
p2 = find_token(p1, len, str);
if (!p2) continue;
/* decrease the counts for this symbol's tokens */
- forget_symbol(table[i]->sym, len);
+ forget_symbol(table[i]->name, len);
size = len;
@@ -587,7 +582,7 @@ static void compress_symbols(const unsigned char *str, int idx)
table[i]->len = len;
/* increase the counts for this symbol's new tokens */
- learn_symbol(table[i]->sym, len);
+ learn_symbol(table[i]->name, len);
}
}
@@ -645,7 +640,7 @@ static void insert_real_symbols_in_table(void)
for (i = 0; i < table_cnt; i++) {
for (j = 0; j < table[i]->len; j++) {
- c = table[i]->sym[j];
+ c = table[i]->name[j];
best_table[c][0] = c;
best_table_len[c] = 1;
}
@@ -667,7 +662,7 @@ static void optimize_token_table(void)
/* guess for "linker script provide" symbol */
static int may_be_linker_script_provide_symbol(const struct sym_entry *se)
{
- const char *symbol = sym_name(se);
+ const char *symbol = (char *)se->name;
int len = se->len;
if (len < 8)
@@ -724,8 +719,8 @@ static int compare_symbols(const void *a, const void *b)
return wa - wb;
/* sort by the number of prefix underscores */
- wa = strspn(sym_name(sa), "_");
- wb = strspn(sym_name(sb), "_");
+ wa = strspn((char *)sa->name, "_");
+ wb = strspn((char *)sb->name, "_");
if (wa != wb)
return wa - wb;
--
2.25.1
The parameter 'struct module *' in the hook function associated with
kallsyms_on_each_symbol() is no longer used. Delete it.
Suggested-by: Petr Mladek <[email protected]>
Signed-off-by: Zhen Lei <[email protected]>
---
include/linux/kallsyms.h | 3 +--
kernel/kallsyms.c | 5 ++---
kernel/trace/ftrace.c | 3 +--
3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 2138219ae0296e9..015c7685765978e 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -66,8 +66,7 @@ static inline void *dereference_symbol_descriptor(void *ptr)
}
#ifdef CONFIG_KALLSYMS
-int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
- unsigned long),
+int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
void *data);
int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
const char *name, void *data);
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 398865f01360589..69e040204ed4ebd 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -344,8 +344,7 @@ unsigned long kallsyms_lookup_name(const char *name)
* Iterate over all symbols in vmlinux. For symbols from modules use
* module_kallsyms_on_each_symbol instead.
*/
-int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
- unsigned long),
+int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
void *data)
{
char namebuf[KSYM_NAME_LEN];
@@ -355,7 +354,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
- ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
+ ret = fn(data, namebuf, kallsyms_sym_address(i));
if (ret != 0)
return ret;
cond_resched();
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 439e2ab6905ee1e..f135a0a334a3fcb 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -8250,8 +8250,7 @@ struct kallsyms_data {
size_t found;
};
-static int kallsyms_callback(void *data, const char *name,
- struct module *mod, unsigned long addr)
+static int kallsyms_callback(void *data, const char *name, unsigned long addr)
{
struct kallsyms_data *args = data;
const char **sym;
--
2.25.1
To speed up the lookup of a symbol in the kernel, we'd better compress
the searched symbol first and then make a quick comparison based on the
compressed length and content. But the tokens in kallsyms_token_table[]
have already been expanded, so compressing a symbol with that table would
require a more complex process. So generate kallsyms_best_token_table[],
which lets us compress a symbol in the kernel using a process similar to
compress_symbols().
Some minor changes have been made to reduce memory usage and improve
compression performance.
1. Some entries in best_table[] are single characters, and most of them
are clustered together, such as a-z, A-Z, 0-9. These individual
characters are not used in the process of compressing a symbol. Let
kallsyms_best_token_table[i][0] = 0x00, [i][1] = number of consecutive
single characters (for example, a-z is 26). When [i][0] = 0x00 is
encountered, we can skip to the next token with two elements.
2. Because ARRAY_SIZE(kallsyms_best_token_table) is no longer fixed, we
store the content of best_table[] to kallsyms_best_token_table[] in reverse
order. That is, the higher the frequency, the lower the index.
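The encoding described in the two points above can be sketched as a
function that fills a byte buffer. encode_best_tokens() is a hypothetical
name used only for this illustration. The sketch assumes at least one
two-character token exists, so that a run of single characters never
exceeds 255, and that 'out' has room for 512 bytes.

```c
#include <stddef.h>

/*
 * Hypothetical encoder, for illustration only.
 * Walk best_table[] from index 255 down to 0. Each two-character token is
 * written as the pair {c0, c1}. Each run of consecutive entries that are
 * not two-character tokens is written as the pair {0x00, run length}.
 * Returns the number of bytes written to 'out'.
 */
static size_t encode_best_tokens(unsigned char best_table[256][2],
				 const unsigned char best_table_len[256],
				 unsigned char *out)
{
	size_t n = 0;
	unsigned int run = 0;
	int i;

	for (i = 255; i >= 0; i--) {
		if (best_table_len[i] <= 1) {
			run++;		/* single character or unused entry */
			continue;
		}

		if (run) {
			out[n++] = 0x00;
			out[n++] = (unsigned char)run;
			run = 0;
		}

		out[n++] = best_table[i][0];
		out[n++] = best_table[i][1];
	}

	if (run) {
		out[n++] = 0x00;
		out[n++] = (unsigned char)run;
	}

	return n;
}
```

A reader of the table can treat a leading 0x00 as "skip this many token
indexes", which is valid because 0x00 never appears as the first byte of a
real two-character token in a symbol name.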
Signed-off-by: Zhen Lei <[email protected]>
---
kernel/kallsyms_internal.h | 1 +
scripts/kallsyms.c | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h
index 2d0c6f2f0243a28..d9672ede8cfc215 100644
--- a/kernel/kallsyms_internal.h
+++ b/kernel/kallsyms_internal.h
@@ -26,5 +26,6 @@ extern const char kallsyms_token_table[] __weak;
extern const u16 kallsyms_token_index[] __weak;
extern const unsigned int kallsyms_markers[] __weak;
+extern const unsigned char kallsyms_best_token_table[] __weak;
#endif // LINUX_KALLSYMS_INTERNAL_H_
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ca378a7e9425c00..40a6fe6d14ef03f 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -499,6 +499,24 @@ static void write_src(void)
for (i = 0; i < 256; i++)
printf("\t.short\t%d\n", best_idx[i]);
printf("\n");
+
+ output_label("kallsyms_best_token_table");
+ for (i = 255, k = 0; (int)i >= 0; i--) {
+ if (best_table_len[i] <= 1) {
+ k++;
+ continue;
+ }
+
+ if (k) {
+ printf("\t.byte 0x00, 0x%02x\n", k);
+ k = 0;
+ }
+
+ printf("\t.byte 0x%02x, 0x%02x\n", best_table[i][0], best_table[i][1]);
+ }
+ if (k)
+ printf("\t.byte 0x00, 0x%02x\n", k);
+ printf("\n");
}
--
2.25.1
Currently we traverse all symbols of all modules to find the specified
function for the specified module. But in reality, we just need to find
the given module and then traverse all the symbols in it.
To achieve this, split the call to hook 'fn' into two phases:
1. Find the given module. Pass pointer 'mod'. Hook 'fn' directly returns
the comparison result of the module name without comparing the function
name.
2. Find the given function in that module. Pass pointer 'mod = NULL'.
Hook 'fn' skips the comparison of the module name and directly compares
function names.
Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
|
Phase2: -->f1-->f2-->f3
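The idea of finding the module first and then traversing only its symbols
can be sketched in user space as follows. The types and functions here
(struct demo_sym, struct demo_module, demo_module_on_each_symbol(),
demo_find()) are hypothetical stand-ins and do not exist in the kernel.
Locking and module state checks are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a module and its symbol table. */
struct demo_sym {
	const char *name;
	unsigned long addr;
};

struct demo_module {
	const char *name;
	const struct demo_sym *syms;
	size_t num_syms;
};

struct demo_find_arg {
	const char *name;
	unsigned long addr;
	unsigned long count;
};

/* Visit the symbols of the module called 'modname' only. */
static int demo_module_on_each_symbol(const struct demo_module *mods,
				      size_t num_mods, const char *modname,
				      int (*fn)(void *, const char *, unsigned long),
				      void *data)
{
	size_t m, i;
	int ret;

	for (m = 0; m < num_mods; m++) {
		if (strcmp(modname, mods[m].name))
			continue;	/* not the given module */

		for (i = 0; i < mods[m].num_syms; i++) {
			ret = fn(data, mods[m].syms[i].name, mods[m].syms[i].addr);
			if (ret)
				return ret;
		}

		/* The given module is found, later modules are not visited. */
		break;
	}

	return 0;
}

/* Example hook: record the address of each symbol with a matching name. */
static int demo_find(void *data, const char *name, unsigned long addr)
{
	struct demo_find_arg *arg = data;

	if (strcmp(arg->name, name))
		return 0;

	arg->addr = addr;
	arg->count++;
	return 0;
}
```

With this shape the hook only compares function names, since the module
name has already been checked by the caller.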
Signed-off-by: Zhen Lei <[email protected]>
---
include/linux/module.h | 4 ++--
kernel/livepatch/core.c | 13 ++-----------
kernel/module/kallsyms.c | 15 ++++++++++++---
3 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/include/linux/module.h b/include/linux/module.h
index 518296ea7f73af6..6e1a531d78e7e8b 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -879,8 +879,8 @@ static inline bool module_sig_ok(struct module *module)
}
#endif /* CONFIG_MODULE_SIG */
-int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
- struct module *, unsigned long),
+int module_kallsyms_on_each_symbol(const char *modname,
+ int (*fn)(void *, const char *, unsigned long),
void *data);
#endif /* _LINUX_MODULE_H */
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 31b57ccf908017e..b02de4cb311c703 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -118,27 +118,19 @@ static struct klp_object *klp_find_object(struct klp_patch *patch,
}
struct klp_find_arg {
- const char *objname;
const char *name;
unsigned long addr;
unsigned long count;
unsigned long pos;
};
-static int klp_find_callback(void *data, const char *name,
- struct module *mod, unsigned long addr)
+static int klp_find_callback(void *data, const char *name, unsigned long addr)
{
struct klp_find_arg *args = data;
- if ((mod && !args->objname) || (!mod && args->objname))
- return 0;
-
if (strcmp(args->name, name))
return 0;
- if (args->objname && strcmp(args->objname, mod->name))
- return 0;
-
args->addr = addr;
args->count++;
@@ -175,7 +167,6 @@ static int klp_find_object_symbol(const char *objname, const char *name,
unsigned long sympos, unsigned long *addr)
{
struct klp_find_arg args = {
- .objname = objname,
.name = name,
.addr = 0,
.count = 0,
@@ -183,7 +174,7 @@ static int klp_find_object_symbol(const char *objname, const char *name,
};
if (objname)
- module_kallsyms_on_each_symbol(klp_find_callback, &args);
+ module_kallsyms_on_each_symbol(objname, klp_find_callback, &args);
else
kallsyms_on_each_match_symbol(klp_match_callback, name, &args);
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
index f5c5c9175333df7..329cef573675d49 100644
--- a/kernel/module/kallsyms.c
+++ b/kernel/module/kallsyms.c
@@ -495,8 +495,8 @@ unsigned long module_kallsyms_lookup_name(const char *name)
}
#ifdef CONFIG_LIVEPATCH
-int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
- struct module *, unsigned long),
+int module_kallsyms_on_each_symbol(const char *modname,
+ int (*fn)(void *, const char *, unsigned long),
void *data)
{
struct module *mod;
@@ -510,6 +510,9 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
if (mod->state == MODULE_STATE_UNFORMED)
continue;
+ if (strcmp(modname, mod->name))
+ continue;
+
/* Use rcu_dereference_sched() to remain compliant with the sparse tool */
preempt_disable();
kallsyms = rcu_dereference_sched(mod->kallsyms);
@@ -522,10 +525,16 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
continue;
ret = fn(data, kallsyms_symbol_name(kallsyms, i),
- mod, kallsyms_symbol_value(sym));
+ kallsyms_symbol_value(sym));
if (ret != 0)
goto out;
}
+
+ /*
+ * The given module is found, the subsequent modules do not
+ * need to be compared.
+ */
+ break;
}
out:
mutex_unlock(&module_mutex);
--
2.25.1
Add test cases for the basic functions and performance of functions
kallsyms_lookup_name(), kallsyms_on_each_symbol() and
kallsyms_on_each_match_symbol(). It also calculates the compression rate
of the kallsyms compression algorithm for the current symbol set.
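The compression rate can be computed with integer arithmetic only, as in
the sketch below. ratio_x100() and format_ratio() are hypothetical names
used for illustration. The fractional part is zero-padded so that a value
such as 52.05 is not shown as 52.5.

```c
#include <stdint.h>
#include <stdio.h>

/* Ratio in hundredths of a percent, for example 5226 means 52.26%. */
static uint32_t ratio_x100(uint32_t compressed, uint32_t original)
{
	/* 64-bit intermediate avoids overflow of 10000 * compressed */
	return (uint32_t)(10000ULL * compressed / original);
}

/* Format the ratio as "NN.NN" without using floating point. */
static void format_ratio(char *buf, size_t size, uint32_t ratio)
{
	snprintf(buf, size, "%2u.%02u",
		 (unsigned int)(ratio / 100), (unsigned int)(ratio % 100));
}
```

For the arm64 figures quoted in this series, 1960154 compressed bytes for
3750756 original bytes give 5226, that is 52.26%.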
The basic functions test begins by testing a set of symbols whose address
values are known. Then, traverse all symbol addresses and find the
corresponding symbol name based on the address. It's impossible to
determine whether these addresses are correct, but we can use the above
three functions along with the addresses to test each other. Because the
traversal operation of kallsyms_on_each_symbol() is too slow (only 60
symbols can be tested in one second), it is tested on average once
every 128 symbols. The other two functions validate all symbols.
If the basic functions test is passed, print only performance test
results. If the test fails, print error information, but do not perform
subsequent performance tests.
Start self-test automatically after system startup if
CONFIG_KALLSYMS_SELFTEST=y.
Example of output content: (prefix 'kallsyms_selftest:' is omitted)
start
---------------------------------------------------------
| nr_symbols | compressed size | original size | ratio(%) |
|---------------------------------------------------------|
| 174099 | 1960154 | 3750756 | 52.26 |
---------------------------------------------------------
kallsyms_lookup_name() looked up 174099 symbols
The time spent on each symbol is (ns): min=5250, max=726560, avg=302132
kallsyms_on_each_symbol() traverse all: 16659500 ns
kallsyms_on_each_match_symbol() traverse all: 557400 ns
finish
Signed-off-by: Zhen Lei <[email protected]>
---
include/linux/kallsyms.h | 1 +
init/Kconfig | 13 ++
kernel/Makefile | 1 +
kernel/kallsyms.c | 2 +-
kernel/kallsyms_selftest.c | 421 +++++++++++++++++++++++++++++++++++++
5 files changed, 437 insertions(+), 1 deletion(-)
create mode 100644 kernel/kallsyms_selftest.c
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 015c7685765978e..c7219d74e29000c 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -66,6 +66,7 @@ static inline void *dereference_symbol_descriptor(void *ptr)
}
#ifdef CONFIG_KALLSYMS
+unsigned long kallsyms_sym_address(int idx);
int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
void *data);
int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
diff --git a/init/Kconfig b/init/Kconfig
index 532362fcfe31fd3..60193fd185fb6e6 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1716,6 +1716,19 @@ config KALLSYMS
symbolic stack backtraces. This increases the size of the kernel
somewhat, as all symbols have to be loaded into the kernel image.
+config KALLSYMS_SELFTEST
+ bool "Test the basic functions and performance of kallsyms"
+ depends on KALLSYMS
+ default n
+ help
+ Test the basic functions and performance of some interfaces, such as
+ kallsyms_lookup_name. It also calculates the compression rate of the
+ kallsyms compression algorithm for the current symbol set.
+
+ Start self-test automatically after system startup. Suggest executing
+ "dmesg | grep kallsyms_selftest" to collect test results. "finish" is
+ displayed in the last line, indicating that the test is complete.
+
config KALLSYMS_ALL
bool "Include all symbols in kallsyms"
depends on DEBUG_KERNEL && KALLSYMS
diff --git a/kernel/Makefile b/kernel/Makefile
index 318789c728d3290..122a5fed457bd98 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -68,6 +68,7 @@ endif
obj-$(CONFIG_UID16) += uid16.o
obj-$(CONFIG_MODULE_SIG_FORMAT) += module_signature.o
obj-$(CONFIG_KALLSYMS) += kallsyms.o
+obj-$(CONFIG_KALLSYMS_SELFTEST) += kallsyms_selftest.o
obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
obj-$(CONFIG_CRASH_CORE) += crash_core.o
obj-$(CONFIG_KEXEC_CORE) += kexec_core.o
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 69e040204ed4ebd..19cd9c56df6aecc 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -208,7 +208,7 @@ static unsigned int get_symbol_offset(unsigned long pos)
return name - kallsyms_names;
}
-static unsigned long kallsyms_sym_address(int idx)
+unsigned long kallsyms_sym_address(int idx)
{
if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
return kallsyms_addresses[idx];
diff --git a/kernel/kallsyms_selftest.c b/kernel/kallsyms_selftest.c
new file mode 100644
index 000000000000000..f7538a70d36c531
--- /dev/null
+++ b/kernel/kallsyms_selftest.c
@@ -0,0 +1,421 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Test the function and performance of kallsyms
+ *
+ * Copyright (C) Huawei Technologies Co., Ltd., 2022
+ *
+ * Authors: Zhen Lei <[email protected]> Huawei
+ */
+
+#define pr_fmt(fmt) "kallsyms_selftest: " fmt
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/random.h>
+#include <linux/sched/clock.h>
+#include <linux/kthread.h>
+#include <linux/vmalloc.h>
+
+#include "kallsyms_internal.h"
+
+
+#define MAX_NUM_OF_RECORDS 64
+
+struct test_stat {
+ int min;
+ int max;
+ int save_cnt;
+ int real_cnt;
+ u64 sum;
+ char *name;
+ unsigned long addr;
+ unsigned long addrs[MAX_NUM_OF_RECORDS];
+};
+
+struct test_item {
+ char *name;
+ unsigned long addr;
+};
+
+#define ITEM_FUNC(s) \
+ { \
+ .name = #s, \
+ .addr = (unsigned long)s, \
+ }
+
+#define ITEM_DATA(s) \
+ { \
+ .name = #s, \
+ .addr = (unsigned long)&s, \
+ }
+
+static int test_var_bss_static;
+static int test_var_data_static = 1;
+int test_var_bss;
+int test_var_data = 1;
+
+static int test_func_static(void)
+{
+ test_var_bss_static++;
+ test_var_data_static++;
+
+ return 0;
+}
+
+int test_func(void)
+{
+ return test_func_static();
+}
+
+__weak int test_func_weak(void)
+{
+ test_var_bss++;
+ test_var_data++;
+ return 0;
+}
+
+static struct test_item test_items[] = {
+ ITEM_FUNC(test_func_static),
+ ITEM_FUNC(test_func),
+ ITEM_FUNC(test_func_weak),
+ ITEM_FUNC(vmalloc),
+ ITEM_FUNC(vfree),
+#ifdef CONFIG_KALLSYMS_ALL
+ ITEM_DATA(test_var_bss_static),
+ ITEM_DATA(test_var_data_static),
+ ITEM_DATA(test_var_bss),
+ ITEM_DATA(test_var_data),
+ ITEM_DATA(vmap_area_list),
+#endif
+};
+
+static char stub_name[KSYM_NAME_LEN];
+
+static int stat_symbol_len(void *data, const char *name, unsigned long addr)
+{
+ *(u32 *)data += strlen(name);
+
+ return 0;
+}
+
+static void test_kallsyms_compression_ratio(void)
+{
+ int i;
+ const u8 *name;
+ u32 pos;
+ u32 ratio, total_size, total_len = 0;
+
+ kallsyms_on_each_symbol(stat_symbol_len, &total_len);
+
+ /*
+ * A symbol name cannot start with a number. This stub name helps us
+ * traverse the entire symbol table without finding a match. It's used
+ * for subsequent performance tests, and its length is the average
+ * length of all symbol names.
+ */
+ memset(stub_name, '4', sizeof(stub_name));
+ pos = total_len / kallsyms_num_syms;
+ stub_name[pos] = 0;
+
+ pos = kallsyms_num_syms - 1;
+ name = &kallsyms_names[kallsyms_markers[pos >> 8]];
+ for (i = 0; i <= (pos & 0xff); i++)
+ name = name + (*name) + 1;
+
+ /*
+ * 1. The length fields are not counted
+ * 2. The memory occupied by array kallsyms_token_table[] and
+ * kallsyms_token_index[] needs to be counted.
+ */
+ total_size = (name - kallsyms_names) - kallsyms_num_syms;
+ pos = kallsyms_token_index[0xff];
+ total_size += pos + strlen(&kallsyms_token_table[pos]) + 1;
+ total_size += 0x100 * sizeof(u16);
+
+ pr_info(" ---------------------------------------------------------\n");
+ pr_info("| nr_symbols | compressed size | original size | ratio(%%) |\n");
+ pr_info("|---------------------------------------------------------|\n");
+ ratio = 10000ULL * total_size / total_len;
+ pr_info("| %10d | %10d | %10d | %2d.%02d |\n",
+ kallsyms_num_syms, total_size, total_len, ratio / 100, ratio % 100);
+ pr_info(" ---------------------------------------------------------\n");
+}
+
+static int lookup_name(void *data, const char *name, unsigned long addr)
+{
+ u64 t0, t1, t;
+ unsigned long flags;
+ struct test_stat *stat = (struct test_stat *)data;
+
+ local_irq_save(flags);
+ t0 = sched_clock();
+ (void)kallsyms_lookup_name(name);
+ t1 = sched_clock();
+ local_irq_restore(flags);
+
+ t = t1 - t0;
+ if (t < stat->min)
+ stat->min = t;
+
+ if (t > stat->max)
+ stat->max = t;
+
+ stat->real_cnt++;
+ stat->sum += t;
+
+ return 0;
+}
+
+static void test_perf_kallsyms_lookup_name(void)
+{
+ struct test_stat stat;
+
+ memset(&stat, 0, sizeof(stat));
+ stat.min = INT_MAX;
+ kallsyms_on_each_symbol(lookup_name, &stat);
+ pr_info("kallsyms_lookup_name() looked up %d symbols\n", stat.real_cnt);
+ pr_info("The time spent on each symbol is (ns): min=%d, max=%d, avg=%lld\n",
+ stat.min, stat.max, stat.sum / stat.real_cnt);
+}
+
+static int find_symbol(void *data, const char *name, unsigned long addr)
+{
+ struct test_stat *stat = (struct test_stat *)data;
+
+ if (strcmp(name, stat->name) == 0) {
+ stat->real_cnt++;
+ stat->addr = addr;
+
+ if (stat->save_cnt < MAX_NUM_OF_RECORDS) {
+ stat->addrs[stat->save_cnt] = addr;
+ stat->save_cnt++;
+ }
+
+ if (stat->real_cnt == stat->max)
+ return 1;
+ }
+
+ return 0;
+}
+
+static void test_perf_kallsyms_on_each_symbol(void)
+{
+ u64 t0, t1;
+ unsigned long flags;
+ struct test_stat stat;
+
+ memset(&stat, 0, sizeof(stat));
+ stat.max = INT_MAX;
+ stat.name = stub_name;
+ local_irq_save(flags);
+ t0 = sched_clock();
+ kallsyms_on_each_symbol(find_symbol, &stat);
+ t1 = sched_clock();
+ local_irq_restore(flags);
+ pr_info("kallsyms_on_each_symbol() traverse all: %lld ns\n", t1 - t0);
+}
+
+static int match_symbol(void *data, unsigned long addr)
+{
+ struct test_stat *stat = (struct test_stat *)data;
+
+ stat->real_cnt++;
+ stat->addr = addr;
+
+ if (stat->save_cnt < MAX_NUM_OF_RECORDS) {
+ stat->addrs[stat->save_cnt] = addr;
+ stat->save_cnt++;
+ }
+
+ if (stat->real_cnt == stat->max)
+ return 1;
+
+ return 0;
+}
+
+static void test_perf_kallsyms_on_each_match_symbol(void)
+{
+ u64 t0, t1;
+ unsigned long flags;
+ struct test_stat stat;
+
+ memset(&stat, 0, sizeof(stat));
+ stat.max = INT_MAX;
+ stat.name = stub_name;
+ local_irq_save(flags);
+ t0 = sched_clock();
+ kallsyms_on_each_match_symbol(match_symbol, stat.name, &stat);
+ t1 = sched_clock();
+ local_irq_restore(flags);
+ pr_info("kallsyms_on_each_match_symbol() traverse all: %lld ns\n", t1 - t0);
+}
+
+static int test_kallsyms_basic_function(void)
+{
+ int i, j, ret;
+ int next = 0, nr_failed = 0;
+ char *prefix;
+ unsigned short rand;
+ unsigned long addr;
+ char namebuf[KSYM_NAME_LEN];
+ struct test_stat stat, stat1, stat2;
+
+ prefix = "kallsyms_lookup_name() for";
+ for (i = 0; i < ARRAY_SIZE(test_items); i++) {
+ addr = kallsyms_lookup_name(test_items[i].name);
+ if (addr != test_items[i].addr) {
+ nr_failed++;
+ pr_info("%s %s failed: addr=%lx, expect %lx\n",
+ prefix, test_items[i].name, addr, test_items[i].addr);
+ }
+ }
+
+ prefix = "kallsyms_on_each_symbol() for";
+ for (i = 0; i < ARRAY_SIZE(test_items); i++) {
+ memset(&stat, 0, sizeof(stat));
+ stat.max = INT_MAX;
+ stat.name = test_items[i].name;
+ kallsyms_on_each_symbol(find_symbol, &stat);
+ if (stat.addr != test_items[i].addr || stat.real_cnt != 1) {
+ nr_failed++;
+ pr_info("%s %s failed: count=%d, addr=%lx, expect %lx\n",
+ prefix, test_items[i].name,
+ stat.real_cnt, stat.addr, test_items[i].addr);
+ }
+ }
+
+ prefix = "kallsyms_on_each_match_symbol() for";
+ for (i = 0; i < ARRAY_SIZE(test_items); i++) {
+ memset(&stat, 0, sizeof(stat));
+ stat.max = INT_MAX;
+ stat.name = test_items[i].name;
+ kallsyms_on_each_match_symbol(match_symbol, test_items[i].name, &stat);
+ if (stat.addr != test_items[i].addr || stat.real_cnt != 1) {
+ nr_failed++;
+ pr_info("%s %s failed: count=%d, addr=%lx, expect %lx\n",
+ prefix, test_items[i].name,
+ stat.real_cnt, stat.addr, test_items[i].addr);
+ }
+ }
+
+ if (nr_failed)
+ return -ESRCH;
+
+ for (i = 0; i < kallsyms_num_syms; i++) {
+ addr = kallsyms_sym_address(i);
+ if (!is_ksym_addr(addr))
+ continue;
+
+ ret = lookup_symbol_name(addr, namebuf);
+ if (unlikely(ret)) {
+ namebuf[0] = 0;
+ goto failed;
+ }
+
+ stat.addr = kallsyms_lookup_name(namebuf);
+
+ memset(&stat1, 0, sizeof(stat1));
+ stat1.max = INT_MAX;
+ kallsyms_on_each_match_symbol(match_symbol, namebuf, &stat1);
+
+ /*
+ * kallsyms_on_each_symbol() is too slow, so only a randomly
+ * selected subset of symbols is tested with it.
+ */
+ if (i >= next) {
+ memset(&stat2, 0, sizeof(stat2));
+ stat2.max = INT_MAX;
+ stat2.name = namebuf;
+ kallsyms_on_each_symbol(find_symbol, &stat2);
+
+ /*
+ * kallsyms_on_each_symbol() and kallsyms_on_each_match_symbol()
+ * need to get the same traversal result.
+ */
+ if (stat1.addr != stat2.addr ||
+ stat1.real_cnt != stat2.real_cnt ||
+ memcmp(stat1.addrs, stat2.addrs,
+ stat1.save_cnt * sizeof(stat1.addrs[0])))
+ goto failed;
+
+ /*
+ * The average of random increments is 128, that is, on average
+ * one symbol is tested every 128 symbols.
+ */
+ get_random_bytes(&rand, sizeof(rand));
+ next = i + (rand & 0xff) + 1;
+ }
+
+ /* Need to be found at least once */
+ if (!stat1.real_cnt)
+ goto failed;
+
+ /*
+ * kallsyms_lookup_name() returns the address of the first
+ * symbol found and cannot be NULL.
+ */
+ if (!stat.addr || stat.addr != stat1.addrs[0])
+ goto failed;
+
+ /*
+ * If the addresses of all matching symbols are recorded, the
+ * target address must be among them.
+ */
+ if (stat1.real_cnt <= MAX_NUM_OF_RECORDS) {
+ for (j = 0; j < stat1.save_cnt; j++) {
+ if (stat1.addrs[j] == addr)
+ break;
+ }
+
+ if (j == stat1.save_cnt)
+ goto failed;
+ }
+ }
+
+ return 0;
+
+failed:
+ pr_info("Test for %dth symbol failed: (%s) addr=%lx\n", i, namebuf, addr);
+ return -ESRCH;
+}
+
+static int test_entry(void *p)
+{
+ int ret;
+
+ do {
+ schedule_timeout(5 * HZ);
+ } while (system_state != SYSTEM_RUNNING);
+
+ pr_info("start\n");
+ ret = test_kallsyms_basic_function();
+ if (ret) {
+ pr_info("abort\n");
+ return 0;
+ }
+
+ test_kallsyms_compression_ratio();
+ test_perf_kallsyms_lookup_name();
+ test_perf_kallsyms_on_each_symbol();
+ test_perf_kallsyms_on_each_match_symbol();
+ pr_info("finish\n");
+
+ return 0;
+}
+
+static int __init kallsyms_test_init(void)
+{
+ struct task_struct *t;
+
+ t = kthread_create(test_entry, NULL, "kallsyms_test");
+ if (IS_ERR(t)) {
+ pr_info("Create kallsyms selftest task failed\n");
+ return PTR_ERR(t);
+ }
+ kthread_bind(t, 0);
+ wake_up_process(t);
+
+ return 0;
+}
+late_initcall(kallsyms_test_init);
--
2.25.1
Based on the test results of kallsyms_on_each_match_symbol() and
kallsyms_on_each_symbol(), the average performance can be improved by 20
to 30 times.
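For illustration only, the stop condition used by klp_match_callback() in
this patch can be modelled in user space as below. This is a sketch under
assumptions: struct find_arg and for_each_match() are hypothetical
stand-ins for struct klp_find_arg and kallsyms_on_each_match_symbol(), and
the driver assumes the iterator stops as soon as the callback returns
non-zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct klp_find_arg (name field omitted). */
struct find_arg {
	unsigned long addr;
	unsigned long count;
	unsigned long pos;
};

/* Same stop condition as klp_match_callback() in this patch. */
static int match_cb(void *data, unsigned long addr)
{
	struct find_arg *args = data;

	args->addr = addr;
	args->count++;

	/* Stop at the requested position, or at the second hit if pos == 0. */
	if ((args->pos && args->count == args->pos) ||
	    (!args->pos && args->count > 1))
		return 1;

	return 0;
}

/*
 * Hypothetical driver: calls fn once per matching address and stops as
 * soon as fn returns non-zero (assumed iterator behaviour).
 */
static int for_each_match(const unsigned long *addrs, size_t n,
			  int (*fn)(void *, unsigned long), void *data)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = fn(data, addrs[i]);
		if (ret)
			return ret;
	}

	return 0;
}
```

With pos == 0, a count greater than 1 after the walk means the name is not
unique. With pos == N, the walk ends at the Nth match and addr holds its
address.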
Signed-off-by: Zhen Lei <[email protected]>
---
kernel/livepatch/core.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 42f7e716d56bf72..31b57ccf908017e 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -153,6 +153,24 @@ static int klp_find_callback(void *data, const char *name,
return 0;
}
+static int klp_match_callback(void *data, unsigned long addr)
+{
+ struct klp_find_arg *args = data;
+
+ args->addr = addr;
+ args->count++;
+
+ /*
+ * Finish the search when the symbol is found for the desired position
+ * or the position is not defined for a non-unique symbol.
+ */
+ if ((args->pos && (args->count == args->pos)) ||
+ (!args->pos && (args->count > 1)))
+ return 1;
+
+ return 0;
+}
+
static int klp_find_object_symbol(const char *objname, const char *name,
unsigned long sympos, unsigned long *addr)
{
@@ -167,7 +185,7 @@ static int klp_find_object_symbol(const char *objname, const char *name,
if (objname)
module_kallsyms_on_each_symbol(klp_find_callback, &args);
else
- kallsyms_on_each_symbol(klp_find_callback, &args);
+ kallsyms_on_each_match_symbol(klp_match_callback, name, &args);
/*
* Ensure an address was found. If sympos is 0, ensure symbol is unique;
--
2.25.1
Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This process can be optimized.
Now that scripts/kallsyms no longer compresses the symbol types, each
symbol type always occupies one byte. So we can first compress the
searched symbol and then make a quick comparison based on the compressed
length and content. Entries whose compressed length differs are skipped
without being expanded, and entries with the same length are compared
in compressed form, so they do not need to be expanded either. This
saves a lot of time.
According to my test results, the average performance of
kallsyms_lookup_name() can be improved by 20 to 30 times.
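The length-then-content comparison can be sketched in user space roughly
as follows. This is a minimal model, not the kernel code: the function
name lookup_compressed() is hypothetical, the table uses the
| len(1) | type(1) | name(len - 1) | entry layout described in this
patch, and the key is assumed to already be in the same (compressed) form
as the stored names.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the lookup. Each entry in 'names' is laid out as
 * | len(1) | type(1) | name(len - 1) |. Returns the index of the first
 * entry whose name bytes equal 'key', or -1 if there is none.
 */
static int lookup_compressed(const unsigned char *names, size_t num_syms,
			     const unsigned char *key, size_t key_len)
{
	size_t i, off = 0;

	for (i = 0; i < num_syms; i++) {
		size_t name_len = names[off] - 1;	/* exclude the type byte */
		const unsigned char *name = &names[off + 2];

		off += name_len + 2;	/* len byte + type byte + name */

		if (name_len != key_len)
			continue;	/* cheap reject, nothing is expanded */

		if (!memcmp(name, key, key_len))
			return (int)i;
	}

	return -1;
}
```

The point of the design is that the per-entry cost becomes one length
comparison for most entries and one memcmp() for the rest, instead of one
string expansion plus one strcmp() for every entry.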
The pseudo code of the test case is as follows:
static int stat_find_name(...)
{
	start = sched_clock();
	(void)kallsyms_lookup_name(name);
	end = sched_clock();
	/* Update min, max, cnt, sum */
}

/*
 * Traverse all symbols in sequence and collect statistics on the time
 * taken by kallsyms_lookup_name() to look up each symbol.
 */
kallsyms_on_each_symbol(stat_find_name, NULL);
The test results are as follows (two runs each, times in ns):
After : min=5250, max= 726560, avg= 302132
After : min=5320, max= 726850, avg= 301978
Before: min=170, max=15949190, avg=7553906
Before: min=160, max=15877280, avg=7517784
The average time consumed is only 4.01% and the maximum time consumed is
only 4.57% of the time consumed before optimization.
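The percentages above can be reproduced from the four result lines by
averaging the two runs and dividing. A small sketch of that arithmetic
follows; the helper names mean2() and percent_of() are made up for this
example, and averaging the two runs before dividing is an assumption about
how the figures were derived.

```c
/* Mean of two runs. */
static double mean2(double a, double b)
{
	return (a + b) / 2.0;
}

/* 'after' expressed as a percentage of 'before'. */
static double percent_of(double after, double before)
{
	return 100.0 * after / before;
}
```

For example, percent_of(mean2(302132, 301978), mean2(7553906, 7517784))
is about 4.01, and percent_of(mean2(726560, 726850),
mean2(15949190, 15877280)) is about 4.57.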
Signed-off-by: Zhen Lei <[email protected]>
---
kernel/kallsyms.c | 124 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 120 insertions(+), 4 deletions(-)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 3e7e2c2ad2f75ef..dcf5bdc7309a6cc 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -87,6 +87,86 @@ static unsigned int kallsyms_expand_symbol(unsigned int off,
return off;
}
+static unsigned char *find_token(unsigned char *str, int len,
+ const unsigned char *token)
+{
+ int i;
+
+ for (i = 0; i < len - 1; i++) {
+ if (str[i] == token[0] && str[i+1] == token[1])
+ return &str[i];
+ }
+ return NULL;
+}
+
+static int kallsyms_compress_symbol_name(const char *name, char *buf, size_t size)
+{
+ int i, j, n, len;
+ unsigned char *p1, *p2;
+ const unsigned char *token;
+
+ len = strscpy(buf, name, size);
+ if (WARN_ON_ONCE(len <= 0))
+ return 0;
+
+ /*
+ * For each entry in kallsyms_best_token_table[], the storage
+ * format is:
+ * 1. For tokens that cannot be used to compress characters, the value
+ * at [j] is 0, and the value at [j+1] is the number of consecutive
+ * tokens with this feature.
+ * 2. For each token: the larger the token value, the higher the
+ * frequency, and the lower the index.
+ *
+ * -------------------------------
+ * | j | [j] [j+1] | token |
+ * -----|---------------|---------|
+ * | 0 | ?? ?? | 255 |
+ * | 2 | ?? ?? | 254 |
+ * | ... | ?? ?? | ... |
+ * | n-2 | ?? ?? | x |
+ * | n | 00 len | x-1 |
+ * | n+2 | ?? ?? | x-1-len |
+ * above '??' is non-zero
+ */
+ for (i = 255, j = 0; i >= 0; i--, j += 2) {
+ if (!kallsyms_best_token_table[j]) {
+ i -= kallsyms_best_token_table[j + 1];
+ if (i < 0)
+ break;
+ j += 2;
+ }
+ token = &kallsyms_best_token_table[j];
+
+ p1 = buf;
+
+ /* find the token on the symbol */
+ p2 = find_token(p1, len, token);
+ if (!p2)
+ continue;
+
+ n = len;
+
+ do {
+ *p2 = i;
+ p2++;
+ n -= (p2 - p1);
+ memmove(p2, p2 + 1, n);
+ p1 = p2;
+ len--;
+
+ if (n < 2)
+ break;
+
+ /* find the token on the symbol */
+ p2 = find_token(p1, n, token);
+
+ } while (p2);
+ }
+
+ return len;
+}
+
/*
* Get symbol type information. This is encoded as a single char at the
* beginning of the symbol name.
@@ -186,26 +266,62 @@ static bool cleanup_symbol_name(char *s)
return false;
}
+static int kallsyms_lookup_compressed_name(unsigned char *namebuf, int len,
+ unsigned long *addr)
+{
+ unsigned int i, off;
+ unsigned int name_len;
+ const unsigned char *name;
+
+ for (i = 0, off = 0; len && i < kallsyms_num_syms; i++) {
+ /*
+ * For each symbol entry, the storage format is:
+ * ----------------------------
+ * | len(1) | type(1) | name(x) |
+ * ----------------------------
+ *
+ * Number of bytes in parentheses, and: len = 1 + x
+ */
+ name_len = kallsyms_names[off] - 1;
+ name = &kallsyms_names[off + 2];
+ off += name_len + 2;
+
+ if (name_len != len)
+ continue;
+
+ if (!memcmp(name, namebuf, len)) {
+ *addr = kallsyms_sym_address(i);
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+
/* Lookup the address for this symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name)
{
char namebuf[KSYM_NAME_LEN];
- unsigned long i;
+ unsigned long i, addr;
unsigned int off;
+ int ret, len;
/* Skip the search for empty string. */
if (!*name)
return 0;
+ len = kallsyms_compress_symbol_name(name, namebuf, ARRAY_SIZE(namebuf));
+ ret = kallsyms_lookup_compressed_name(namebuf, len, &addr);
+ if (!ret)
+ return addr;
+
for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
- if (strcmp(namebuf, name) == 0)
- return kallsyms_sym_address(i);
-
if (cleanup_symbol_name(namebuf) && strcmp(namebuf, name) == 0)
return kallsyms_sym_address(i);
}
+
return module_kallsyms_lookup_name(name);
}
--
2.25.1
On 2022/9/23 19:20, Zhen Lei wrote:
> Currently we traverse all symbols of all modules to find the specified
> function for the specified module. But in reality, we just need to find
> the given module and then traverse all the symbols in it.
>
> In order to achieve this purpose, split the call to hook 'fn' into two
> phases:
> 1. Finds the given module. Pass pointer 'mod'. Hook 'fn' directly returns
> the comparison result of the module name without comparing the function
> name.
> 2. Finds the given function in that module. Pass pointer 'mod = NULL'.
> Hook 'fn' skip the comparison of module name and directly compare
> function names.
Sorry, I forgot to change the description. I will fix it in v6, after I've
collected review comments.
>
> Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
> |
> Phase2: -->f1-->f2-->f3
>
> Signed-off-by: Zhen Lei <[email protected]>
> ---
> include/linux/module.h | 4 ++--
> kernel/livepatch/core.c | 13 ++-----------
> kernel/module/kallsyms.c | 15 ++++++++++++---
> 3 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 518296ea7f73af6..6e1a531d78e7e8b 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -879,8 +879,8 @@ static inline bool module_sig_ok(struct module *module)
> }
> #endif /* CONFIG_MODULE_SIG */
>
> -int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
> - struct module *, unsigned long),
> +int module_kallsyms_on_each_symbol(const char *modname,
> + int (*fn)(void *, const char *, unsigned long),
> void *data);
>
> #endif /* _LINUX_MODULE_H */
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 31b57ccf908017e..b02de4cb311c703 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -118,27 +118,19 @@ static struct klp_object *klp_find_object(struct klp_patch *patch,
> }
>
> struct klp_find_arg {
> - const char *objname;
> const char *name;
> unsigned long addr;
> unsigned long count;
> unsigned long pos;
> };
>
> -static int klp_find_callback(void *data, const char *name,
> - struct module *mod, unsigned long addr)
> +static int klp_find_callback(void *data, const char *name, unsigned long addr)
> {
> struct klp_find_arg *args = data;
>
> - if ((mod && !args->objname) || (!mod && args->objname))
> - return 0;
> -
> if (strcmp(args->name, name))
> return 0;
>
> - if (args->objname && strcmp(args->objname, mod->name))
> - return 0;
> -
> args->addr = addr;
> args->count++;
>
> @@ -175,7 +167,6 @@ static int klp_find_object_symbol(const char *objname, const char *name,
> unsigned long sympos, unsigned long *addr)
> {
> struct klp_find_arg args = {
> - .objname = objname,
> .name = name,
> .addr = 0,
> .count = 0,
> @@ -183,7 +174,7 @@ static int klp_find_object_symbol(const char *objname, const char *name,
> };
>
> if (objname)
> - module_kallsyms_on_each_symbol(klp_find_callback, &args);
> + module_kallsyms_on_each_symbol(objname, klp_find_callback, &args);
> else
> kallsyms_on_each_match_symbol(klp_match_callback, name, &args);
>
> diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
> index f5c5c9175333df7..329cef573675d49 100644
> --- a/kernel/module/kallsyms.c
> +++ b/kernel/module/kallsyms.c
> @@ -495,8 +495,8 @@ unsigned long module_kallsyms_lookup_name(const char *name)
> }
>
> #ifdef CONFIG_LIVEPATCH
> -int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
> - struct module *, unsigned long),
> +int module_kallsyms_on_each_symbol(const char *modname,
> + int (*fn)(void *, const char *, unsigned long),
> void *data)
> {
> struct module *mod;
> @@ -510,6 +510,9 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
> if (mod->state == MODULE_STATE_UNFORMED)
> continue;
>
> + if (strcmp(modname, mod->name))
> + continue;
> +
> /* Use rcu_dereference_sched() to remain compliant with the sparse tool */
> preempt_disable();
> kallsyms = rcu_dereference_sched(mod->kallsyms);
> @@ -522,10 +525,16 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
> continue;
>
> ret = fn(data, kallsyms_symbol_name(kallsyms, i),
> - mod, kallsyms_symbol_value(sym));
> + kallsyms_symbol_value(sym));
> if (ret != 0)
> goto out;
> }
> +
> + /*
> + * The given module is found, the subsequent modules do not
> + * need to be compared.
> + */
> + break;
> }
> out:
> mutex_unlock(&module_mutex);
>
--
Regards,
Zhen Lei
On 2022/9/24 9:11, Leizhen (ThunderTown) wrote:
>
> On 2022/9/23 19:20, Zhen Lei wrote:
>> Currently we traverse all symbols of all modules to find the specified
>> function for the specified module. But in reality, we just need to find
>> the given module and then traverse all the symbols in it.
>>
>> In order to achieve this purpose, split the call to hook 'fn' into two
>> phases:
>> 1. Finds the given module. Pass pointer 'mod'. Hook 'fn' directly returns
>> the comparison result of the module name without comparing the function
>> name.
>> 2. Finds the given function in that module. Pass pointer 'mod = NULL'.
>> Hook 'fn' skip the comparison of module name and directly compare
>> function names.
> Sorry, I forgot to change the description. I will fix it in v6, after I've
> collected review comments.
Oh, it's Saturday, and I don't think anyone has seen v5 yet, so I'll post
v6 now. Please skip v5.
>
--
Regards,
Zhen Lei