2009-11-10 14:55:24

by André Goddard Rosa

Subject: [PATCH v2 0/2] bsearch: fix overflow and avoid unnecessary calculation

Fix a rare overflow situation which can occur when working with large arrays.
Also, remove an unneeded calculation.

changelog:
v2: fix a bug introduced into 00/01 of the v1 patch
v1: initial submission

André Goddard Rosa (2):
bsearch: avoid unneeded decrement arithmetic
bsearch: prevent overflow when computing middle comparison element

lib/bsearch.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)


2009-11-10 14:55:32

by André Goddard Rosa

Subject: [PATCH v2 1/2] bsearch: avoid unneeded decrement arithmetic

Signed-off-by: André Goddard Rosa <[email protected]>
---
lib/bsearch.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/bsearch.c b/lib/bsearch.c
index 2e70664..33cbba6 100644
--- a/lib/bsearch.c
+++ b/lib/bsearch.c
@@ -33,13 +33,13 @@
 void *bsearch(const void *key, const void *base, size_t num, size_t size,
 	      int (*cmp)(const void *key, const void *elt))
 {
-	int start = 0, end = num - 1, mid, result;
+	int start = 0, end = num, mid, result;
 
-	while (start <= end) {
+	while (start < end) {
 		mid = (start + end) / 2;
 		result = cmp(key, base + mid * size);
 		if (result < 0)
-			end = mid - 1;
+			end = mid;
 		else if (result > 0)
 			start = mid + 1;
 		else
--
1.6.5.2.153.g6e31f.dirty
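
The change above switches the loop from a closed range [start, end] to a half-open range [start, end). A minimal sketch of that invariant follows, assuming a sorted int array; find_half_open is a hypothetical helper and not part of lib/bsearch.c, and unlike the patch it uses size_t indices and the start + (end - start) / 2 midpoint. One side effect worth noting: with an exclusive upper bound there is no "mid - 1" or "num - 1" that could go below zero, so unsigned index types become usable.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not kernel code: binary search over a sorted int
 * array using the half-open range [start, end).
 * Returns the index of key, or -1 if key is absent.
 */
static long find_half_open(const int *a, size_t num, int key)
{
	size_t start = 0, end = num;	/* end is one past the last candidate */

	assert(a != NULL || num == 0);

	while (start < end) {		/* the range is empty once start == end */
		size_t mid = start + (end - start) / 2;

		if (key < a[mid])
			end = mid;	/* exclusive bound: mid itself is dropped */
		else if (key > a[mid])
			start = mid + 1;
		else
			return (long)mid;
	}

	return -1;
}
```

Calling find_half_open(a, 0, key) returns -1 without ever computing "num - 1", which is the decrement the patch removes.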

2009-11-10 14:55:35

by André Goddard Rosa

Subject: [PATCH v2 2/2] bsearch: prevent overflow when computing middle comparison element

This is very unlikely to occur in practice, because the sum of the lower
and upper limits must overflow an int variable, but it can happen when
working with large arrays. Better safe than sorry: avoid the overflow
when computing the middle element for comparison.

See:
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5045582

The program below demonstrates the issue:

$ ./a.out
(glibc) Element is at the last position
(patched) Element is at the last position
Segmentation fault

===
#include <search.h>
#include <stdlib.h>
#include <stdio.h>

/* A number that when doubled will be bigger than 2^31 - 1 */
#define BIG_NUMBER_TO_OVERFLOW_INT (1100000000)

static int cmp_char(const void *p1, const void *p2)
{
	char v1 = (*(char *)p1);
	char v2 = (*(char *)p2);

	if (v1 < v2)
		return -1;
	else if (v1 > v2)
		return 1;
	else
		return 0;
}

void *lib_bsearch(const void *key, const void *base, size_t num, size_t size
		, int (*cmp)(const void *key, const void *elt))
{
	int start = 0, end = num - 1, mid, result;

	while (start <= end) {
		mid = (start + end) / 2;
		result = cmp(key, base + mid * size);
		if (result < 0)
			end = mid - 1;
		else if (result > 0)
			start = mid + 1;
		else
			return (void *)base + mid * size;
	}

	return NULL;
}

void *patch_lib_bsearch(const void *key, const void *base, size_t num, size_t size
		, int (*cmp)(const void *key, const void *elt))
{
	size_t start = 0, end = num, mid;
	int result;

	while (start < end) {
		mid = (start + end) / 2;
		result = cmp(key, base + mid * size);
		if (result < 0)
			end = mid;	/* half-open range, as in patch 1/2 */
		else if (result > 0)
			start = mid + 1;
		else
			return (void *)base + mid * size;
	}

	return NULL;
}

int main(void)
{
	char key = 1;
	char *array = calloc(BIG_NUMBER_TO_OVERFLOW_INT, sizeof(char));
	void *ptr;

	if (!array)
	{
		printf("%s\n", "no memory");
		return EXIT_FAILURE;
	}
	array[BIG_NUMBER_TO_OVERFLOW_INT - 1] = 1;

	ptr = bsearch(&key, array, BIG_NUMBER_TO_OVERFLOW_INT, sizeof(char), cmp_char);
	printf("(glibc) Element is%sat the last position\n"
		, (ptr == &array[BIG_NUMBER_TO_OVERFLOW_INT - 1]) ? " " : " NOT ");

	ptr = patch_lib_bsearch(&key, array, BIG_NUMBER_TO_OVERFLOW_INT, sizeof(char), cmp_char);
	printf("(patched) Element is%sat the last position\n"
		, (ptr == &array[BIG_NUMBER_TO_OVERFLOW_INT - 1]) ? " " : " NOT ");

	ptr = lib_bsearch(&key, array, BIG_NUMBER_TO_OVERFLOW_INT, sizeof(char), cmp_char);
	printf("(unpatched) Element is%sat the last position\n"
		, (ptr == &array[BIG_NUMBER_TO_OVERFLOW_INT - 1]) ? " " : " NOT ");

	free(array);

	return EXIT_SUCCESS;
}

Signed-off-by: André Goddard Rosa <[email protected]>
---
lib/bsearch.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/lib/bsearch.c b/lib/bsearch.c
index 33cbba6..af40c03 100644
--- a/lib/bsearch.c
+++ b/lib/bsearch.c
@@ -33,7 +33,8 @@
 void *bsearch(const void *key, const void *base, size_t num, size_t size,
 	      int (*cmp)(const void *key, const void *elt))
 {
-	int start = 0, end = num, mid, result;
+	size_t start = 0, end = num, mid;
+	int result;
 
 	while (start < end) {
 		mid = (start + end) / 2;
--
1.6.5.2.153.g6e31f.dirty

2009-11-11 00:18:45

by Rusty Russell

Subject: Re: [PATCH v3 2/2] bsearch: prevent overflow when computing middle comparison element

On Tue, 10 Nov 2009 02:12:31 am André Goddard Rosa wrote:
> It's really difficult to occur in practice because the sum of the lower
> and higher limits must overflow an int variable, but it can occur when
> working with large arrays. We'd better safe than sorry by avoiding this
> overflow situation when computing the middle element for comparison.

I always thought the obvious answer was:

mid = start + (end - start)/2;

Your version does nothing for 32 bit platforms...
Rusty.
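
For illustration, the two midpoint formulas can be compared directly. This is only a sketch: mid_naive and mid_safe are hypothetical names, not functions from the patch, and the indices are assumed to be size_t with start <= end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive midpoint: start + end can wrap around when both are large. */
static size_t mid_naive(size_t start, size_t end)
{
	return (start + end) / 2;
}

/*
 * Midpoint in the form suggested above: end - start is at most end, and
 * start plus half of that difference is at most end, so nothing wraps.
 */
static size_t mid_safe(size_t start, size_t end)
{
	assert(start <= end);
	return start + (end - start) / 2;
}
```

For small indices both return the same value, for example 15 for (10, 20). For start = SIZE_MAX - 2 and end = SIZE_MAX, mid_safe returns SIZE_MAX - 1, while the sum inside mid_naive wraps and yields an index far below start.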

2009-11-11 15:00:55

by André Goddard Rosa

Subject: Re: [PATCH v3 2/2] bsearch: prevent overflow when computing middle comparison element

On Tue, Nov 10, 2009 at 10:18 PM, Rusty Russell <[email protected]> wrote:
> On Tue, 10 Nov 2009 02:12:31 am André Goddard Rosa wrote:
>> It's really difficult to occur in practice because the sum of the lower
>> and higher limits must overflow an int variable, but it can occur when
>> working with large arrays. We'd better safe than sorry by avoiding this
>> overflow situation when computing the middle element for comparison.
>
> I always thought the obvious answer was:
>
>	mid = start + (end - start)/2;

Hi, Rusty!

Yes, you're right! The previous patch only fixes the case where the
number of elements approaches the maximum int value (2^31 - 1 on my
computer). If the number of elements (parameter num) were an int (as
with Java's array length), just making those variables unsigned would be
enough, because in the worst case we would have:

(max int) * 2 < (max unsigned int)
(2^31 - 1) * 2 < (2^32 - 1)

But it does not fix the case where the number of elements approaches
the maximum unsigned value (parameter size_t num).

So, the worst case happens when the element we search for is stored at
the high end of the array. In that case 'start' moves toward 'end', and
if 'end' is near the maximum value of its data type, start + end can
still overflow.
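
Both cases can be checked with a few lines of C. This is a rough sketch, assuming the usual representation where UINT_MAX == 2 * INT_MAX + 1 (for example a 32-bit int); sum_as_unsigned and mid_unsigned are hypothetical helpers written only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/*
 * Sum of two non-negative int indices, computed in unsigned arithmetic.
 * Under the assumption above, the largest possible result is
 * 2 * INT_MAX == UINT_MAX - 1, so the sum cannot wrap.
 */
static unsigned int sum_as_unsigned(int start, int end)
{
	assert(start >= 0 && end >= 0);
	return (unsigned int)start + (unsigned int)end;
}

/*
 * Midpoint of two indices that may use the full unsigned range.
 * Here start + end can exceed UINT_MAX and wrap around.
 */
static unsigned int mid_unsigned(unsigned int start, unsigned int end)
{
	return (start + end) / 2;
}
```

Under that assumption, sum_as_unsigned(INT_MAX, INT_MAX) / 2 is still INT_MAX, whereas mid_unsigned(UINT_MAX - 1, UINT_MAX) comes out smaller than 'start' because the sum wrapped.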

I'm sending a fixed patch in a moment as per your suggestion.

Thank you,
André