
[Scheme-reports] Opinion about R7RS



    Please find hereafter my comments about the draft for R7RS.

    My main comments are about characters and encodings.

    The description of the "read-char" function does not mention any
encoding.  Is there a default encoding?  Is it ASCII, since
implementations of Scheme are not required to deal with the full range
of Unicode, but only with the part U+0000..U+007F?  Of course, the same
question holds for the "write-char" function.
    I think that Scheme should provide input procedures that deal with
encodings such as Latin-1, UTF-8, UTF-16, and others.  Some exotic
encodings may not be provided by some Scheme interpreters, which is why
a function returning the list of encodings available in a given
interpreter/compiler would be useful.  Users could then implement other
encodings themselves, if need be, by using bytes or bytevectors.
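    For instance, a Latin-1 decoder can be sketched with bytevectors
alone, since each Latin-1 byte maps to the code point of the same value.
The name "latin-1->string" is hypothetical, and the sketch assumes an
implementation whose characters cover at least U+0000..U+00FF:

```scheme
(import (scheme base))

;; Hypothetical helper, not part of the draft: decode a bytevector
;; holding Latin-1 text into a string.  Assumes that integer->char
;; accepts every value in 0..255 on this implementation.
(define (latin-1->string bv)
  (let* ((n (bytevector-length bv))
         (s (make-string n)))
    (do ((i 0 (+ i 1)))
        ((= i n) s)
      (string-set! s i (integer->char (bytevector-u8-ref bv i))))))

(latin-1->string (bytevector 72 105))   ; => "Hi"
```

An implementation restricted to ASCII could not run this on bytes above
127, which is exactly why a list of available encodings would help.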
    By the way, a useful function on text files would return the
encoding used.  On Unix and Mac OS X, this function could be implemented
by interfacing with the "file" command.  So, if we can know the encoding
of a file and the encodings processed by a Scheme program, we will know
whether a text file is directly processable.

    On another point, there is some ambiguity about functions like
"char-alphabetic?".

    Let us consider the "char-alphabetic?" function implemented by a
Scheme interpreter that only implements the Latin-1 encoding.  In
particular, it implements ASCII, so that is permitted, but it does not
implement the full range of Unicode.  What would be the answer if this
"char-alphabetic?" function is applied to the letter "e with acute
accent"?  Should this interpreter deal with Unicode properties as far as
possible, or can it answer #f since it is not Unicode-compliant?  The
second choice would mean that the answer may be
implementation-dependent.
    A second problem: let us assume that we are implementing a small
interpreter for a classical programming language, "classical" in the
sense that only letters belonging to the ASCII encoding can be used to
build identifiers.  A "char-alphabetic?" function that also accepts
non-ASCII letters will be unusable when we write the lexical analyzer of
this language in Scheme.  From my point of view, a better choice would
be:
    - a "char-alphabetic?" function retaining only the ASCII letters,
compatibly with the namesake function of previous standards;
    - a new "u-char-alphabetic?" function retaining letters, as far as
possible, depending on the range of characters provided; in other words,
it will retain the letters of Latin-1 (resp. Latin-2) if the
implementation provides the Latin-1 (resp. Latin-2) encoding; in
particular, if the full range of Unicode is implemented, all the letters
of Unicode will be retained.
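    As a minimal sketch of the first item, an ASCII-only predicate
needs nothing beyond character comparisons.  The name
"ascii-alphabetic?" is hypothetical and is used here only to avoid
redefining the standard procedure:

```scheme
(import (scheme base))

;; Hypothetical name: true only for the 52 ASCII letters, whatever
;; range of characters the implementation supports.
(define (ascii-alphabetic? c)
  (or (char<=? #\a c #\z)
      (char<=? #\A c #\Z)))

(ascii-alphabetic? #\q)   ; => #t
(ascii-alphabetic? #\7)   ; => #f
(ascii-alphabetic? #\_)   ; => #f
```

A lexical analyzer for an ASCII-only language could rely on such a
predicate without depending on how much of Unicode is implemented.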
    Of course, the same remarks hold for the functions "char-numeric?",
"char-whitespace?", "char-upper-case?", "char-lower-case?".

    The last two points already existed in previous standards, but I
mention them anyway.

    The description of the "call/cc" function begins with:
It is an error if "proc" does not accept one argument.
    That is the standard case, but as it is recalled later in the text:
Except for the continuations created by the "call-with-values" procedure
[...], all continuations take exactly one value.
    I think that the first sentence should be reformulated.

    The "values" function should be defined more precisely with regard
to the equivalence:

       (values X) == X
The report reads:
    The "values" procedure might be defined as follows:
(define (values . things)
   (call/cc (lambda (cont) (apply cont things))))
In such a case, we can easily prove that (values X) ==> X.  But "might
be defined" can be interpreted as "might be defined or might not be
defined".  A rough implementation of this function is:
(define (values . things)
   (lambda (f) (apply f things)))
This implementation seems to me to be correct w.r.t. the description
of the "values" function, but with it the equivalence obviously does
not hold.
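    To make the difference concrete, here is a small sketch that puts
the two definitions side by side.  The names "values/cc" and
"values/closure" are hypothetical; they only avoid shadowing the
standard "values":

```scheme
(import (scheme base))

;; The definition quoted from the report, under another name.
(define (values/cc . things)
  (call/cc (lambda (cont) (apply cont things))))

;; The rough closure-based definition, under another name.
(define (values/closure . things)
  (lambda (f) (apply f things)))

(+ 1 (values/cc 2))                         ; => 3
(procedure? (values/closure 2))             ; => #t
((values/closure 2) (lambda (x) (+ 1 x)))   ; => 3
```

With the first definition a single value can be used directly; with the
second, the result is a procedure that must be applied to a consumer, so
it cannot stand for the identity function.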

    If the equivalence is true, that means that we can use the "values"
function as the identity function when it is applied to one argument.
Besides, this feature is used within the proposed implementation of some
SRFIs.  If this equivalence is false, we cannot.  So, I think that the
description of "values" should be more precise: it should state whether
the equivalence is true or implementation-dependent.

    Cheers,

J.-M.
