
[Scheme-reports] Proposing amending char-numeric? definition

OK, let me restate the argument in the form of a proposal.

The draft's wording for char-numeric? is confusing, because Unicode
doesn't define a "Numeric" property explicitly, the way it defines the
"Alphabetic" and "Uppercase" properties.  So I propose changing it.

There are a few possible resolutions.

(1) Define char-numeric? to return #t if the character's Numeric_Type
property value is anything other than 'None'.  This seems a natural
interpretation of the current wording.  However, I think it is
practically useless, since it *can't* be used to pick numbers out of
a string.  The characters whose Numeric_Type isn't 'None' include
ordinary alphabetic characters (category Lo) that happen to have
meanings related to numbers.  For example, '幺' (U+5E7A) has
Numeric_Type = 'Numeric': the character means small or young, so
it can mean 1 in certain contexts (in Japanese, probably the only
place it means '1' is in some Mah-jong terms).  So, suppose I'm
scanning a string and char-numeric? returns #t for a character, and
that character happens to be '幺' (U+5E7A).  What do I do then?  It is
probably part of some other word, so I should treat it as an
alphabetic character.  And even if I want to make use of it, I need a
separate database to look up which number '幺' represents.
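
To make the scanning scenario in (1) concrete, here is a minimal
sketch in portable Scheme.  The procedure name leading-run is made up
for this illustration and is not part of any report.  It collects the
longest prefix of a string whose characters satisfy a given predicate,
which is roughly how char-numeric? gets used to pick a number out of
text.

```scheme
;; Hypothetical helper (not from any report): return the longest
;; prefix of STR whose characters all satisfy PRED?.
(define (leading-run pred? str)
  (let loop ((i 0))
    (if (and (< i (string-length str))
             (pred? (string-ref str i)))
        (loop (+ i 1))
        (substring str 0 i))))

;; With a predicate that accepts only the ASCII digits, the run
;; stops at the first letter.
(leading-run char-numeric? "42nd")
```

With a predicate defined as in (1), the same loop would also consume
'幺' whenever it directly follows the digits, even though it is most
likely the start of an ordinary word.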

(2) Drop char-numeric?, and add char-numeric-type and
char-numeric-value.  The former returns the value of the Numeric_Type
property, and the latter returns the value of the Numeric_Value
property.  This would be the proper way to provide access to a
character's Unicode "Numeric" properties.

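As a rough illustration of what (2) might look like, the sketch below
backs the two proposed procedures with a tiny hand-written excerpt of
the Unicode data, keyed by code point.  Everything about it is an
assumption made for the example: the table name, the use of symbols
for Numeric_Type, and the choice of #f for a missing Numeric_Value.
A real implementation would generate the full table from the Unicode
Character Database.

```scheme
;; Hypothetical excerpt only; nearly all characters are missing.
;; Each entry is (code-point numeric-type numeric-value).
(define numeric-excerpt
  '((#x0035 decimal 5)      ; DIGIT FIVE
    (#x00B2 digit   2)      ; SUPERSCRIPT TWO
    (#x00BD numeric 1/2)    ; VULGAR FRACTION ONE HALF
    (#x5E7A numeric 1)))    ; the ideograph discussed in (1)

;; Find the table entry for character C, or #f if it is not listed.
(define (numeric-entry c)
  (assv (char->integer c) numeric-excerpt))

;; Numeric_Type as a symbol; 'none when C is not in the excerpt.
(define (char-numeric-type c)
  (let ((e (numeric-entry c)))
    (if e (cadr e) 'none)))

;; Numeric_Value as an exact number; #f when C is not in the excerpt.
(define (char-numeric-value c)
  (let ((e (numeric-entry c)))
    (if e (caddr e) #f)))
```

A caller that wants only decimal digits can then test
(eq? (char-numeric-type c) 'decimal) and ignore characters such as
'幺', whose type is 'numeric'.
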
(3) Define char-numeric? to return #t only for 0, 1, 2, 3, 4, 5, 6, 7,
8 and 9.  This retains compatibility with R5RS: we can still use
char-numeric? to parse numbers, and can safely use
(- (char->integer c) (char->integer #\0)) to obtain the digit value
the character represents.  (Note: R5RS programs that use
char-numeric? to parse numbers will break if we adopt the current
draft's definition of