2002-02-06 08:22:19

by [email protected]

Subject: Дляглавногобухгалтера (Russian, sent as KOI8-R: "For the chief accountant")

??? ???????? ??????????.

?????? ? ????????? ? ??????? ???????? ?? 2002 ??? (? ???????? ?? 1 ??????? 2002 ????),
??????????? ??????????? ? 25 ????? ?????????? ??????? ?? ????? ?? ??????? ??????????? (? ???????? ?? 28 ?????? 2002 ????),
????? ??????? ????? ???????????? ????????.

?? ?????? ??????????? ???????? ?????? ????????? ????????? ?? ??????????? ?????????, ? ???? ?? 15 ??????? 2002 ????. ??????????? ?????????? ??????? ??????????? ??????? ??????????? ?????.

?????? ?????????? E-mail: [email protected]


????? ??????: ???????????? ???????????, ???????? ?????, ??????? ??? ???????? ? ??????? ???????? ?????????? ?????????, E-mail, ???????? ??????????? ??? ??????????.

?????? ? ?????????? (??????????????) ??????? ?? ???????????????!!!

??????????? ?????????? ?????????? ? ???? ? 10 ?? 15 ??????? 2002 ????.

(????????: ???? ??? ?? ?????????????? ???? ??????????? ???????? ???? ????????? ?? ??????????????? ??????????.)



2002-02-06 21:24:03

by Brian

Subject: Re: ?????????????????????

Can we get something like
/[\200-\377]{6}/ (6 upper ASCII characters in a row)
added to the taboo list?

-- Brian

On Tuesday 05 February 2002 06:48 pm, [email protected] wrote:
> ??? ???????? ??????????.
>
> ?????? ? ????????? ? ??????? ???????? ?? 2002 ??? (? ???????? ?? 1
>

Subject: Re: ?????????????????????

like

Subject: [ANNOUNCE] blah blah?



--On Wednesday, 06 February, 2002 4:21 PM -0500 Brian
<[email protected]> wrote:

> Can we get something like
> /[\200-\377]{6}/ (6 upper ASCII characters in a row)
> added to the taboo list?
>
> -- Brian
>
> On Tuesday 05 February 2002 06:48 pm, [email protected] wrote:
>> ??? ???????? ??????????.
>>
>> ?????? ? ????????? ? ??????? ???????? ?? 2002 ??? (? ???????? ?? 1
>>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>



--
Alex Bligh

2002-02-06 22:46:43

by Roland Dreier

Subject: Re: ?????????????????????

>>>>> "Alex" == Alex Bligh - linux-kernel <[email protected]> writes:

Alex> like Subject: [ANNOUNCE] blah blah?

Brian> Can we get something like /[\200-\377]{6}/ (6 upper ASCII
Brian> characters in a row) added to the taboo list?

Brian's pattern doesn't match upper case letters. It matches
characters with the most significant bit set. 'A' is 0101 octal,
'N' is 0116 octal, etc. so your example would not trigger the rule.
The idea of the rule is to filter out messages posted in non-Roman
character sets.

Roland
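
Roland's point can be checked with a short Python sketch. The list's actual taboo filter is not shown in this thread, so this only reproduces the byte pattern itself. The name has_high_bit_run is made up for illustration, and the spam Subject bytes are reconstructed from the Latin-1 rendering of the original Subject line on the assumption that it was sent as KOI8-R.

```python
import re

# Same byte class as /[\200-\377]{6}/: octal 200-377 is hex 80-ff.
HIGH_BIT_RUN = re.compile(rb"[\x80-\xff]{6}")


def has_high_bit_run(raw: bytes) -> bool:
    """Return True if raw contains six consecutive bytes with the top bit set."""
    return HIGH_BIT_RUN.search(raw) is not None


# 'A' is 0101 octal and 'N' is 0116 octal, both below 0200.
assert ord("A") == 0o101 and ord("N") == 0o116

announce = b"Subject: [ANNOUNCE] blah blah?"
# Raw Subject bytes of the spam that started the thread (assumed KOI8-R).
spam_subject = bytes.fromhex("e4ccd1c7ccc1d7cecfc7cfc2d5c8c7c1ccd4c5d2c1")
# Accented Latin-1 text has isolated high-bit bytes, not runs of six.
latin1_text = "na\u00efve caf\u00e9".encode("latin-1")

print(has_high_bit_run(announce))      # False
print(has_high_bit_run(spam_subject))  # True
print(has_high_bit_run(latin1_text))   # False
```

Matching on bytes rather than decoded text is deliberate here: the rule is about the high bit, not about any particular character set.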

2002-02-06 23:42:38

by Brian

Subject: Re: ?????????????????????

On Wednesday 06 February 2002 05:31 pm, Alex Bligh - linux-kernel wrote:
> like
>
> Subject: [ANNOUNCE] blah blah?
>

That would be upperCASE ASCII.
I mean 6 bytes, each higher than 127, in a row.

To my knowledge, there is no English word that would match that regex (or,
for that matter, any Romance or Germanic language word). It's the most
effective tool I've seen against Asian spam (like the one I replied to).

> --On Wednesday, 06 February, 2002 4:21 PM -0500 Brian
>
> <[email protected]> wrote:
> > Can we get something like
> > /[\200-\377]{6}/ (6 upper ASCII characters in a row)
> > added to the taboo list?
> >
> > -- Brian
>

2002-02-07 11:13:17

by Bruce Harada

Subject: Re: ?????????????????????


On Wed, 06 Feb 2002 18:42:17 -0500
Brian <[email protected]> wrote:

> To my knowledge, there is no English word that would match that regex (or,
> for that matter, any Romantic or Germanic language word). It's the most
> effective tool I've seen against Asian spam (like the one I replied to).


Just to set the record straight, that was RUSSIAN spam, not Asian spam...
(The regex should still be effective, of course.)

2002-02-07 11:46:51

by David Miller

Subject: Re: ?????????????????????

From: Bruce Harada <[email protected]>
Date: Thu, 7 Feb 2002 20:12:43 +0900

On Wed, 06 Feb 2002 18:42:17 -0500
Brian <[email protected]> wrote:

> To my knowledge, there is no English word that would match that regex (or,
> for that matter, any Romantic or Germanic language word). It's the most
> effective tool I've seen against Asian spam (like the one I replied to).

Just to set the record straight, that was RUSSIAN spam, not Asian spam...
(The regex should still be effective, of course.)

Except that it would block out uuencoded patches in postings perhaps?
Or is it just supposed to be matched in the Subject field or other
parts of the headers?

2002-02-07 12:12:33

by Pete Cervasio

Subject: Re: ?????????????????????

At 03:44 AM 2/7/2002 -0800, David S. Miller <[email protected]> wrote:
> From: Bruce Harada <[email protected]>
> Date: Thu, 7 Feb 2002 20:12:43 +0900
>
> On Wed, 06 Feb 2002 18:42:17 -0500
> Brian <[email protected]> wrote:
>
> > To my knowledge, there is no English word that would match that regex (or,
> > for that matter, any Romantic or Germanic language word). It's the most
> > effective tool I've seen against Asian spam (like the one I replied to).
>
> Just to set the record straight, that was RUSSIAN spam, not Asian spam...
> (The regex should still be effective, of course.)
>
>Except that it would block out uuencoded patches in postings perhaps?
>Or is it just supposed to be matched in the Subject field or other
>parts of the headers?

Um... no, it's just supposed to look at several characters in a row that
have the high bit set. Have another cup of coffee and think about what
happens to attachments after they're uuencoded. :)

Best regards,
Pete C.


=====================================================================
$5 $75
"this is your brain... this is your brain on ebay."
(Pat McNeil on comp.sys.tandy)

2002-02-07 12:48:27

by Oliver M. Bolzer

Subject: Re: ?????????????????????

On Wed, Feb 06, 2002 at 04:21:50PM -0500, Brian <[email protected]> wrote...
> Can we get something like
> /[\200-\377]{6}/ (6 upper ASCII characters in a row)
> added to the taboo list?

If you mean to match headers like Subject:, don't forget to decode
them first. Headers containing such characters are usually MIME-encoded.
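
A rough Python sketch of that decoding step, using the standard email.header module. The helper name header_raw_bytes is hypothetical, and the sample Subject bytes are an assumption (the spam's Subject read as KOI8-R); nothing here is taken from the list's real filter.

```python
import re
from email.header import Header, decode_header

# Same byte class as /[\200-\377]{6}/: octal 200-377 is hex 80-ff.
HIGH_BIT_RUN = re.compile(rb"[\x80-\xff]{6}")


def header_raw_bytes(value: str) -> bytes:
    """Undo MIME encoded-words and return the header's underlying bytes."""
    chunks = []
    for part, _charset in decode_header(value):
        if isinstance(part, str):
            # Headers with no encoded-words come back as plain str.
            part = part.encode("ascii", "replace")
        chunks.append(part)
    return b"".join(chunks)


# Assumed sample: the spam Subject as KOI8-R bytes.
spam_bytes = bytes.fromhex("e4ccd1c7ccc1d7cecfc7cfc2d5c8c7c1ccd4c5d2c1")

# What a MIME-aware mailer would put on the wire: pure 7-bit ASCII.
wire_subject = Header(spam_bytes.decode("koi8-r"), "koi8-r").encode()

matches_encoded = HIGH_BIT_RUN.search(wire_subject.encode("ascii")) is not None
matches_decoded = HIGH_BIT_RUN.search(header_raw_bytes(wire_subject)) is not None

print(matches_encoded)  # False: the encoded-word hides the high-bit bytes
print(matches_decoded)  # True: decoding first lets the pattern see them
```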

--
Oliver M. Bolzer
[email protected]

GPG (PGP) Fingerprint = 621B 52F6 2AC1 36DB 8761 018F 8786 87AD EF50 D1FF

2002-02-07 20:03:58

by Jesse Pollard

Subject: Re: ?????????????????????

--------- Received message begins Here ---------

>
> From: Bruce Harada <[email protected]>
> Date: Thu, 7 Feb 2002 20:12:43 +0900
>
> On Wed, 06 Feb 2002 18:42:17 -0500
> Brian <[email protected]> wrote:
>
> > To my knowledge, there is no English word that would match that regex (or,
> > for that matter, any Romantic or Germanic language word). It's the most
> > effective tool I've seen against Asian spam (like the one I replied to).
>
> Just to set the record straight, that was RUSSIAN spam, not Asian spam...
> (The regex should still be effective, of course.)
>
> Except that it would block out uuencoded patches in postings perhaps?
> Or is it just supposed to be matched in the Subject field or other
> parts of the headers?

I thought uuencode format required 7bit print ascii output, which would
never match that pattern.
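
A quick way to see this, sketched in Python with the standard binascii module. The helper name uuencode_lines and the test payload are made up for illustration; the sketch only encodes the data lines, without the "begin"/"end" wrapper a real uuencoded attachment would carry.

```python
import binascii
import re

# Same byte class as /[\200-\377]{6}/: octal 200-377 is hex 80-ff.
HIGH_BIT_RUN = re.compile(rb"[\x80-\xff]{6}")


def uuencode_lines(data: bytes) -> bytes:
    """uuencode data line by line (b2a_uu accepts at most 45 bytes per call)."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


# A binary payload that contains every byte value, including a long
# run of high-bit bytes (0x80 through 0xff).
payload = bytes(range(256)) * 2
encoded = uuencode_lines(payload)

print(HIGH_BIT_RUN.search(payload) is not None)  # True: raw binary trips the rule
print(HIGH_BIT_RUN.search(encoded) is not None)  # False: uuencoded text does not
print(max(encoded) < 0x80)                       # True: every output byte is 7-bit
```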

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: [email protected]

Any opinions expressed are solely my own.

2002-02-07 22:54:50

by Pavel Machek

Subject: Re: ?????????????????????

Hi!

> like
>
> Subject: [ANNOUNCE] blah blah?

He talked about characters >= 128.
Pavel

> --On Wednesday, 06 February, 2002 4:21 PM -0500 Brian
> <[email protected]> wrote:
>
> >Can we get something like
> > /[\200-\377]{6}/ (6 upper ASCII characters in a row)
> >added to the taboo list?
> >
> > -- Brian
> >
> >On Tuesday 05 February 2002 06:48 pm, [email protected] wrote:
> >>??? ???????? ??????????.
> >>
> >>?????? ? ????????? ? ??????? ???????? ?? 2002 ??? (? ???????? ?? 1

--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-08 12:41:22

by Martin Dalecki

Subject: Re: ?????????????????????

Brian wrote:

>On Wednesday 06 February 2002 05:31 pm, Alex Bligh - linux-kernel wrote:
>
>>like
>>
>>Subject: [ANNOUNCE] blah blah?
>>
>
>That would be upperCASE ASCII.
>I mean 6 bytes, each higher than 127, in a row.
>
>To my knowledge, there is no English word that would match that regex (or,
>for that matter, any Romantic or Germanic language word). It's the most
>effective tool I've seen against Asian spam (like the one I replied to).
>
Just for the record... the spam you replied to was in Russian.