2016-03-02 23:50:02

by Rasmus Villemoes

Subject: Re: [PATCH v4] sscanf: implement basic character sets

On Fri, Feb 26 2016, Jessica Yu <[email protected]> wrote:

> @@ -2714,6 +2718,57 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
> num++;
> }
> continue;
> + /*
> + * Warning: This implementation of the '[' conversion specifier
> + * deviates from its glibc counterpart in the following ways:
> + * (1) It does NOT support ranges i.e. '-' is NOT a special character
> + * (2) It cannot match the closing bracket ']' itself
> + * (3) A field width is required
> + * (4) '%*[' (discard matching input) is currently not supported
> + *
> + * Example usage:
> + * ret = sscanf("00:0a:95","%2[^:]:%2[^:]:%2[^:]", buf1, buf2, buf3);
> + * if (ret < 3)
> + * // etc..
> + */
> + case '[':
> + {
> + char *s = (char *)va_arg(args, char *);
> + DECLARE_BITMAP(set, 256) = {0};
> + unsigned int len = 0;
> + bool negate = (*fmt == '^');
> +
> + /* field width is required */
> + if (field_width == -1)
> + return num;
> +
> + if (negate)
> + ++fmt;
> +
> + for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
> + set_bit((u8)*fmt, set);
> +
> + /* no ']' or no character set found */
> + if (!*fmt || !len)
> + return num;
> + ++fmt;
> +

I think it might be useful to be able to do [^] to match any sequence of
characters. If the user passed [] the code below won't match anything,
so we'll return num anyway. In other words, I'd just omit the test for
empty character set. Other than that, LGTM.

Rasmus
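
To make the two options concrete, here is a minimal user-space sketch of the set-building step from the quoted hunk, followed by a matching loop. The matching loop is not shown in the quoted patch, so its shape here is an assumption; `match_set` and its `allow_empty` flag are hypothetical names used only for illustration, and a plain `bool` array stands in for the kernel bitmap.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not kernel code. 'fmt' points just past the '['.
 * Copies at most 'width' matching bytes from 'buf' into 'out' (which must
 * hold width + 1 bytes) and returns the number copied, or -1 if the set
 * is unterminated or (when allow_empty is false) empty.
 */
static int match_set(const char *buf, const char *fmt, int width,
                     char *out, bool allow_empty)
{
    bool set[256] = { false };
    bool negate = (*fmt == '^');
    unsigned int len = 0;
    int n = 0;

    if (negate)
        ++fmt;

    /* Same loop as the quoted hunk: every byte up to ']' joins the set. */
    for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
        set[(unsigned char)*fmt] = true;

    /* allow_empty == false reproduces the v4 check under review. */
    if (!*fmt || (!len && !allow_empty))
        return -1;

    /* Assumed matching loop: stop at the first byte outside the set. */
    while (*buf && n < width && set[(unsigned char)*buf] != negate)
        out[n++] = *buf++;
    out[n] = '\0';

    return n;
}
```

With `allow_empty` set, `match_set("abc", "^]", 8, out, true)` copies all of "abc" because the negated empty set accepts every byte, while `match_set("abc", "]", 8, out, true)` copies nothing. With the v4 check in place, both calls return -1.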


2016-03-07 23:09:48

by Jessica Yu

Subject: Re: sscanf: implement basic character sets

+++ Rasmus Villemoes [03/03/16 00:49 +0100]:
>On Fri, Feb 26 2016, Jessica Yu <[email protected]> wrote:
>
>> @@ -2714,6 +2718,57 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
>> num++;
>> }
>> continue;
>> + /*
>> + * Warning: This implementation of the '[' conversion specifier
>> + * deviates from its glibc counterpart in the following ways:
>> + * (1) It does NOT support ranges i.e. '-' is NOT a special character
>> + * (2) It cannot match the closing bracket ']' itself
>> + * (3) A field width is required
>> + * (4) '%*[' (discard matching input) is currently not supported
>> + *
>> + * Example usage:
>> + * ret = sscanf("00:0a:95","%2[^:]:%2[^:]:%2[^:]", buf1, buf2, buf3);
>> + * if (ret < 3)
>> + * // etc..
>> + */
>> + case '[':
>> + {
>> + char *s = (char *)va_arg(args, char *);
>> + DECLARE_BITMAP(set, 256) = {0};
>> + unsigned int len = 0;
>> + bool negate = (*fmt == '^');
>> +
>> + /* field width is required */
>> + if (field_width == -1)
>> + return num;
>> +
>> + if (negate)
>> + ++fmt;
>> +
>> + for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
>> + set_bit((u8)*fmt, set);
>> +
>> + /* no ']' or no character set found */
>> + if (!*fmt || !len)
>> + return num;
>> + ++fmt;
>> +
>
>I think it might be useful to be able to do [^] to match any sequence of
>characters. If the user passed [] the code below won't match anything,
>so we'll return num anyway. In other words, I'd just omit the test for
>empty character set. Other than that, LGTM.

Thanks for the review. My only concern would be that that behavior
(i.e., have [^] match any sequence of characters) would also deviate
from glibc sscanf behavior (which matches nothing), and would need to
be documented as well. Perhaps it is best to keep these differences
to a minimum, so as to avoid surprises.

Jessica
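
For background on the user-space behavior: in the standard C scanset syntax, a ']' that comes directly after '[' or '[^' is a literal member of the set rather than the terminator. The sketch below shows this with ordinary user-space `sscanf`; the wrapper names are hypothetical and the buffer size of 8 is an arbitrary choice for the example.

```c
#include <stdio.h>

/*
 * Hypothetical wrappers around user-space sscanf, for illustration only.
 * Each returns sscanf's result: the number of conversions stored.
 */

/* "%7[^]]": the first ']' is a set member, the second closes the set,
 * so this reads the leading run of bytes that are not ']'. */
static int scan_until_bracket(const char *in, char out[8])
{
    return sscanf(in, "%7[^]]", out);
}

/* "%7[]a]": the set contains ']' and 'a'. */
static int scan_bracket_or_a(const char *in, char out[8])
{
    return sscanf(in, "%7[]a]", out);
}
```

Under that rule the ']' in a bare `%[^]` is consumed as a set member and the set is never closed, so the standard syntax has no way to spell "an empty negated set".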