LinuxLists.cc - [PATCH 0/3] fs/ntfs3: Make entry binary search faster

2021-09-02 15:44:34

Subject: [PATCH 0/3] fs/ntfs3: Make entry binary search faster

This series will make binary search faster with removing the need of
allocations. We will only use stack memory. This will also make possible
to remove linear search completely.

It is good also point out that full binary search not quite fit with
entry search because entrys are not always same size. This why we first
need linear table fill algorithm. My implementation try to use the fact
that we should not linear fill full table before not doing any checking
of the entrys. It is example 50/50 change if we are in middle that entry
is in first half. So it is very inefficient to fill table after we are
middle point.

We could also predict how many entrys there is and use this information,
but I did not do that in this point. I'm more than happy to improve this
more if someone has ideas.

I have tested this with xfstests and did not see regressions. Checkpatch
and build tests for every patch have been done. I haven't done major
bench marking with this one. Idea that this is better is just my two
cent. Paragon has hopefully done bencmarking with old binary search
compared to linear search? I'm quite certain that this will win old
binary search algorithm because no need for allocations.

Thanks Joe for let me notice this improvement.

Kari Argillander (3):
fs/ntfs3: Limit binary search table size
fs/ntfs3: Make binary search to search smaller chunks in beginning
fs/ntfs3: Always use binary search with entry search

fs/ntfs3/index.c | 153 ++++++++++++++---------------------------------
fs/ntfs3/ntfs.h | 3 -
2 files changed, 45 insertions(+), 111 deletions(-)

base-commit: d3624466b56dd5b1886c1dff500525b544c19c83
--
2.25.1

2021-09-02 23:27:13

by Kari Argillander

[permalink] [raw]

Subject: [PATCH 3/3] fs/ntfs3: Always use binary search with entry search

We do not have any reason to keep old linear search in. Before this was
used for error path or if table was so big that it cannot be allocated.
Current binary search implementation won't need error path. Remove old
references to linear entry search.

Signed-off-by: Kari Argillander <[email protected]>
---
fs/ntfs3/index.c | 50 ++++++------------------------------------------
fs/ntfs3/ntfs.h | 3 ---
2 files changed, 6 insertions(+), 47 deletions(-)

diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index e336d5645628..9f79cff7d09e 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -672,22 +672,16 @@ static struct NTFS_DE *hdr_find_e(const struct ntfs_index *indx,
const struct INDEX_HDR *hdr, const void *key,
size_t key_len, const void *ctx, int *diff)
{
- struct NTFS_DE *e;
+ struct NTFS_DE *e, *found = NULL;
NTFS_CMP_FUNC cmp = indx->cmp;
+ int min_idx = 0, mid_idx, max_idx = 0;
+ int diff2;
+ int table_size = 8;
u32 e_size, e_key_len;
u32 end = le32_to_cpu(hdr->used);
u32 off = le32_to_cpu(hdr->de_off);
-
-#ifdef NTFS3_INDEX_BINARY_SEARCH
- struct NTFS_DE *found = NULL;
- int min_idx = 0, mid_idx, max_idx = 0;
- int table_size = 8;
- int diff2;
u16 offs[128];

- if (end > 0x10000)
- goto next;
-
fill_table:
if (off + sizeof(struct NTFS_DE) > end)
return NULL;
@@ -721,7 +715,8 @@ static struct NTFS_DE *hdr_find_e(const struct ntfs_index *indx,
return NULL;

max_idx = 0;
- table_size = min(table_size * 2, 128);
+ table_size = min(table_size * 2,
+ (int)ARRAY_SIZE(offs));
goto fill_table;
}
} else if (diff2 < 0) {
@@ -745,39 +740,6 @@ static struct NTFS_DE *hdr_find_e(const struct ntfs_index *indx,
e = Add2Ptr(hdr, offs[mid_idx]);

goto binary_search;
-#endif
-
-next:
- /*
- * Entries index are sorted.
- * Enumerate all entries until we find entry
- * that is <= to the search value.
- */
- if (off + sizeof(struct NTFS_DE) > end)
- return NULL;
-
- e = Add2Ptr(hdr, off);
- e_size = le16_to_cpu(e->size);
-
- if (e_size < sizeof(struct NTFS_DE) || off + e_size > end)
- return NULL;
-
- off += e_size;
-
- e_key_len = le16_to_cpu(e->key_size);
-
- *diff = (*cmp)(key, key_len, e + 1, e_key_len, ctx);
- if (!*diff)
- return e;
-
- if (*diff <= 0)
- return e;
-
- if (de_is_last(e)) {
- *diff = 1;
- return e;
- }
- goto next;
}

/*
diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h
index 6bb3e595263b..a7e1b7cc7a14 100644
--- a/fs/ntfs3/ntfs.h
+++ b/fs/ntfs3/ntfs.h
@@ -12,9 +12,6 @@

/* TODO: Check 4K MFT record and 512 bytes cluster. */

-/* Activate this define to use binary search in indexes. */
-#define NTFS3_INDEX_BINARY_SEARCH
-
/* Check each run for marked clusters. */
#define NTFS3_CHECK_FREE_CLST

--
2.25.1

2021-09-13 16:58:08

by Konstantin Komarov

[permalink] [raw]

Subject: Re: [PATCH 0/3] fs/ntfs3: Make entry binary search faster

On 02.09.2021 18:40, Kari Argillander wrote:
> This series will make binary search faster with removing the need of
> allocations. We will only use stack memory. This will also make possible
> to remove linear search completely.
>
> It is good also point out that full binary search not quite fit with
> entry search because entrys are not always same size. This why we first
> need linear table fill algorithm. My implementation try to use the fact
> that we should not linear fill full table before not doing any checking
> of the entrys. It is example 50/50 change if we are in middle that entry
> is in first half. So it is very inefficient to fill table after we are
> middle point.
>
> We could also predict how many entrys there is and use this information,
> but I did not do that in this point. I'm more than happy to improve this
> more if someone has ideas.
>
> I have tested this with xfstests and did not see regressions. Checkpatch
> and build tests for every patch have been done. I haven't done major
> bench marking with this one. Idea that this is better is just my two
> cent. Paragon has hopefully done bencmarking with old binary search
> compared to linear search? I'm quite certain that this will win old
> binary search algorithm because no need for allocations.
>
> Thanks Joe for let me notice this improvement.
>
> Kari Argillander (3):
> fs/ntfs3: Limit binary search table size
> fs/ntfs3: Make binary search to search smaller chunks in beginning
> fs/ntfs3: Always use binary search with entry search
>
> fs/ntfs3/index.c | 153 ++++++++++++++---------------------------------
> fs/ntfs3/ntfs.h | 3 -
> 2 files changed, 45 insertions(+), 111 deletions(-)
>
>
> base-commit: d3624466b56dd5b1886c1dff500525b544c19c83
>

Hi Joe, Kari!

Applied, thanks!