Re: [Scheme-reports] Scheme r7rs syntax described by ABNF
Hello again.
I have an updated draft for the formal syntax of Scheme r7rs (based on r7rs-draft-8.pdf) written in ABNF. This draft includes all sections except for quasiquotations. The datum section has undergone some quickcheck-style unit testing. I'd appreciate any review and feedback.
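For readers who have not seen quickcheck-style testing, here is a minimal sketch of the idea in Python. It is not the test suite used for this draft. It assumes a hand-written regex translation of a decimal byte literal (0-255, no leading zeros), and the names BYTE_RE, accepts_byte and check_byte_property are made up for illustration. Random inputs are generated and checked against a property that is stated independently of the grammar.

```python
import random
import re

# Hypothetical regex translation of a decimal byte literal (0-255, no leading zeros).
BYTE_RE = re.compile(r"[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]")


def accepts_byte(text: str) -> bool:
    """Return True if the whole string is a decimal literal in the range 0-255."""
    return BYTE_RE.fullmatch(text) is not None


def check_byte_property(trials: int = 2000, seed: int = 0) -> list:
    """Collect counterexamples to the property:
    a literal is accepted if and only if its value lies in 0..255."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        n = rng.randint(0, 999)
        if accepts_byte(str(n)) != (0 <= n <= 255):
            failures.append(n)
    return failures


if __name__ == "__main__":
    # An empty list means no counterexample was found in this run.
    print(check_byte_property())
```

A non-empty result lists the integers on which the recognizer and the property disagree, which usually points straight at the faulty alternative.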
I also have one *minor* suggestion. The definition for <library name part> seems inconsistent with the style used for <label>.
library-name-part = identifier / 1*digit10
The above definition would follow the same style used for <label>.
regards,
Joe N.
;;;
;;; Formal syntax for Scheme r7rs described by the following ABNF
;;; [RFC5234]. Although [RFC5234] refers to octets, the syntax
;;; described in this document is defined over sequences of character
;;; numbers (code points) taken from Unicode. The terminals in the ABNF
;;; productions are in terms of characters rather than bytes.
;;;
;;; A minimal number of delimiters (i.e. "DELIMITER") have been
;;; inserted to ensure the rules herein are parseable AS-IS by the
;;; "read" procedure.
;; r7rs Helper tokens
tab = %x09 ; \t
newline = %x0A ; \n
return = %x0D ; \r
space = %x20 ; \s
double-quote = %x22 ; "
number-sign = %x23 ; #
backslash = %x5C ; \
vertical-line = %x7C ; |
alarm-name = %x61.6C.61.72.6D ; alarm
backspace-name = %x62.61.63.6B.73.70.61.63.65 ; backspace
delete-name = %x64.65.6C.65.74.65 ; delete
escape-name = %x65.73.63.61.70.65 ; escape
newline-name = %x6E.65.77.6C.69.6E.65 ; newline
null-name = %x6E.75.6C.6C ; null
return-name = %x72.65.74.75.72.6E ; return
space-name = %x73.70.61.63.65 ; space
tab-name = %x74.61.62 ; tab
unichar-low = %x0000-D7FF
unichar-high = %xE000-10FFFF
unichar = unichar-low / unichar-high
non-backslash-or-double-quote = %x00-21 / %x23-5B / %x5D-D7FF / unichar-high
non-line-ending = %x00-09 / %x0B-0C / %x0E-D7FF / unichar-high
non-vertical-line = %x00-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-backslash = %x00-5B / %x5D-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-backslash-or-double-quote = %x00-21 / %x23-5B / %x5D-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-number-sign = %x00-22 / %x24-7B / %x7D-D7FF / unichar-high
;; r7rs Lexical structure
dot = "." DOT-DELIMITER
token = identifier / boolean / number
/ character / string
/ "(" / ")" / "#(" / "#u8(" / "'" / "`" / "," / ",@" / dot
DELIMITER = whitespace intertoken-space
DOT-DELIMITER = whitespace dot-intertoken-space
; delimiter = whitespace / vertical-line
; / "(" / ")" / double-quote / ";"
; CAUTION: approximation for <delimiter> and <intertoken-space> with
; a special case for "."
intraline-whitespace = space / tab
whitespace = intraline-whitespace / line-ending
line-ending = newline / return newline / return
comment = ";" *non-line-ending line-ending
/ nested-comment
/ "#;" datum
dot-comment = ";" *non-line-ending line-ending
/ nested-comment
nested-comment = "#|" comment-text *comment-cont "|#"
comment-text = *non-vertical-line-or-number-sign
; CAUTION: approximation for <character sequence not containing #| or |#>
comment-cont = nested-comment comment-text
directive = ("#!fold-case" / "#!no-fold-case")
atmosphere = whitespace / comment / directive
dot-atmosphere = whitespace / dot-comment / directive
intertoken-space = *atmosphere
dot-intertoken-space = *dot-atmosphere
identifier = initial *subsequent DELIMITER
/ vertical-line *symbol-element vertical-line
/ peculiar-identifier DELIMITER
initial = letter / special-initial / inline-hex-escape
letter = %x61-7A / %x41-5A ; a-z / A-Z
special-initial = "!" / "$" / "%" / "&" / "*" / "/" / ":" / "<" / "="
/ ">" / "?" / "^" / "_" / "~"
subsequent = initial / digit / special-subsequent
digit = digit10 ; 0-9
hex-digit = digit16 ; 0-9 / a-f / A-F
explicit-sign = "+" / "-"
special-subsequent = explicit-sign / "." / "@"
inline-hex-escape = "\x" hex-scalar-value ";"
digit16-lt-d = digit10 / %x61-63 / %x41-43 ; 0-9 / a-c / A-C
hex-scalar-value = *"0" ( (digit16-lt-d 3hex-digit) ; %x000000-00CFFF
/ ( "D" digit8 2hex-digit) ; %x00D000-00D7FF
/ ( ("E" / "F") 3hex-digit) ; %x00E000-00FFFF
/ ("10" 4hex-digit) ) ; %x100000-10FFFF
peculiar-identifier = explicit-sign
/ explicit-sign sign-subsequent *subsequent
/ explicit-sign "." dot-subsequent *subsequent
/ "." dot-subsequent *subsequent
; CAUTION: Note that "+i", "-i" and infnan are exceptions to the
; peculiar-identifier rule; they are parsed as numbers, not
; identifiers.
dot-subsequent = sign-subsequent / "."
sign-subsequent = initial / explicit-sign / "@"
symbol-element = non-vertical-line-or-backslash
/ symbolstring-element / double-quote / "\|"
boolean = ("#t" / "#f" / "#true" / "#false") DELIMITER
character = ("#\" (character-any / character-name / "x" hex-scalar-value)) DELIMITER
character-any = unichar
character-name = alarm-name / backspace-name / delete-name
/ escape-name / newline-name / null-name
/ return-name / space-name / tab-name
string = double-quote *string-element double-quote
string-element = non-backslash-or-double-quote
/ "\a" / "\b" / "\t" / "\n" / "\r" / ("\" double-quote) / "\\"
/ "\" *intraline-whitespace line-ending *intraline-whitespace
/ inline-hex-escape
symbolstring-element = non-vertical-line-or-backslash-or-double-quote
/ "\a" / "\b" / "\t" / "\n" / "\r" / ("\" double-quote) / "\\"
/ "\" *intraline-whitespace line-ending *intraline-whitespace
/ inline-hex-escape
bytevector = "#u8(" *byte ")"
byte = (%x30-39 ; 0-9
/ %x31-39 %x30-39 ; 10-99
/ %x31 %x30-39 %x30-39 ; 100-199
/ %x32 %x30-34 %x30-39 ; 200-249
/ %x32 %x35 %x30-35) DELIMITER ; 250-255
number = (num2 / num8 / num10 / num16) DELIMITER
num2 = prefix2 complex2
num8 = prefix8 complex8
num10 = prefix10 complex10
num16 = prefix16 complex16
complex2 = real2 / real2 "@" real2
/ real2 "+" ureal2 "i" / real2 "-" ureal2 "i"
/ real2 "+i" / real2 "-i" / real2 infnan "i"
/ "+" ureal2 "i" / "-" ureal2 "i"
/ infnan "i" / "+i" / "-i"
complex8 = real8 / real8 "@" real8
/ real8 "+" ureal8 "i" / real8 "-" ureal8 "i"
/ real8 "+i" / real8 "-i" / real8 infnan "i"
/ "+" ureal8 "i" / "-" ureal8 "i"
/ infnan "i" / "+i" / "-i"
complex10 = real10 / real10 "@" real10
/ real10 "+" ureal10 "i" / real10 "-" ureal10 "i"
/ real10 "+i" / real10 "-i" / real10 infnan "i"
/ "+" ureal10 "i" / "-" ureal10 "i"
/ infnan "i" / "+i" / "-i"
complex16 = real16 / real16 "@" real16
/ real16 "+" ureal16 "i" / real16 "-" ureal16 "i"
/ real16 "+i" / real16 "-i" / real16 infnan "i"
/ "+" ureal16 "i" / "-" ureal16 "i"
/ infnan "i" / "+i" / "-i"
real2 = sign ureal2
/ infnan
real8 = sign ureal8
/ infnan
real10 = sign ureal10
/ infnan
real16 = sign ureal16
/ infnan
ureal2 = uinteger2
/ uinteger2 "/" uinteger2
ureal8 = uinteger8
/ uinteger8 "/" uinteger8
ureal10 = uinteger10
/ uinteger10 "/" uinteger10
/ decimal10
ureal16 = uinteger16
/ uinteger16 "/" uinteger16
decimal10 = uinteger10 suffix
/ "." 1*digit10 suffix
/ 1*digit10 "." *digit10 suffix
uinteger2 = 1*digit2
uinteger8 = 1*digit8
uinteger10 = 1*digit10
uinteger16 = 1*digit16
prefix2 = radix2 exactness
/ exactness radix2
prefix8 = radix8 exactness
/ exactness radix8
prefix10 = radix10 exactness
/ exactness radix10
prefix16 = radix16 exactness
/ exactness radix16
infnan = "+inf.0" / "-inf.0" / "+nan.0" / "-nan.0"
suffix = [exponent-marker sign 1*digit10]
exponent-marker = "e" / "s" / "f" / "d" / "l"
sign = ["+" / "-"]
exactness = ["#i" / "#e"]
radix2 = "#b"
radix8 = "#o"
radix10 = ["#d"]
radix16 = "#x"
digit2 = %x30-31 ; 0-1
digit8 = %x30-37 ; 0-7
digit10 = %x30-39 ; 0-9
digit16 = digit10 / %x61-66 / %x41-46 ; 0-9 / a-f / A-F
;; r7rs External representations
datum = simple-datum / compound-datum
/ label "=" datum / label "#"
simple-datum = boolean / number
/ character / string
/ symbol / bytevector
symbol = identifier
compound-datum = list / vector / abbreviation
list = "(" *datum ")"
/ "(" 1*datum dot datum ")"
abbreviation = abbrev-prefix datum
abbrev-prefix = "'" / "`" / "," / ",@"
vector = "#(" *datum ")"
label = "#" 1*digit10
;; r7rs Expressions
expression = identifier
/ literal
/ procedure-call
/ lambda-expression
/ conditional
/ assignment
/ derived-expression
/ macro-use
/ macro-block
/ includer
literal = quotation / self-evaluating
self-evaluating = boolean / number / vector
/ character / string / bytevector
quotation = "'" datum
/ "(" "quote" DELIMITER datum ")"
procedure-call = "(" operator *operand ")"
operator = expression
operand = expression
lambda-expression = "(" "lambda" DELIMITER formals body ")"
formals = "(" *identifier ")"
/ identifier
/ "(" 1*identifier dot identifier ")"
body = *definition sequence
sequence = *command expression
command = expression
conditional = "(" "if" DELIMITER test consequent alternate ")"
test = expression
consequent = expression
alternate = [expression]
assignment = "(" "set!" DELIMITER identifier expression ")"
derived-expression = "(" "cond" DELIMITER 1*cond-clause ")"
/ "(" "cond" DELIMITER *cond-clause "(" "else" DELIMITER sequence ")" ")"
/ "(" "case" DELIMITER expression 1*case-clause ")"
/ "(" "case" DELIMITER expression *case-clause "(" "else" DELIMITER sequence ")" ")"
/ "(" "case" DELIMITER expression *case-clause "(" "else" DELIMITER "=>" DELIMITER recipient ")" ")"
/ "(" "and" DELIMITER *test ")"
/ "(" "or" DELIMITER *test ")"
/ "(" "when" DELIMITER test sequence ")"
/ "(" "unless" DELIMITER test sequence ")"
/ "(" "let" DELIMITER "(" *binding-spec ")" body ")"
/ "(" "let" DELIMITER identifier "(" *binding-spec ")" body ")"
/ "(" "let*" DELIMITER "(" *binding-spec ")" body ")"
/ "(" "letrec" DELIMITER "(" *binding-spec ")" body ")"
/ "(" "letrec*" DELIMITER "(" *binding-spec ")" body ")"
/ "(" "let-values" DELIMITER "(" *mv-binding-spec ")" body ")"
/ "(" "let*-values" DELIMITER "(" *mv-binding-spec ")" body ")"
/ "(" "begin" DELIMITER sequence ")"
/ "(" "do" DELIMITER "(" *iteration-spec ")" "(" test do-result ")"
*command ")"
/ "(" "delay" DELIMITER expression ")"
/ "(" "delay-force" DELIMITER expression ")"
/ "(" "parameterize" DELIMITER "(" *("(" expression expression ")") ")" body ")"
/ "(" "guard" DELIMITER "(" identifier *cond-clause ")" body ")"
/ quasiquotation
/ "(" "case-lambda" DELIMITER *case-lambda-clause ")"
cond-clause = "(" test sequence ")"
/ "(" test ")"
/ "(" test "=>" DELIMITER recipient ")"
recipient = expression
case-clause = "(" "(" *datum ")" sequence ")"
/ "(" "(" *datum ")" "=>" DELIMITER recipient ")"
binding-spec = "(" identifier expression ")"
mv-binding-spec = "(" formals expression ")"
iteration-spec = "(" identifier init step ")"
/ "(" identifier init ")"
case-lambda-clause = "(" formals body ")"
init = expression
step = expression
do-result = [sequence]
macro-use = "(" keyword *datum ")"
keyword = identifier
macro-block = "(" "let-syntax" DELIMITER "(" *syntax-spec ")" body ")"
/ "(" "letrec-syntax" DELIMITER "(" *syntax-spec ")" body ")"
syntax-spec = "(" keyword transformer-spec ")"
includer = "(" "include" DELIMITER 1*string ")"
/ "(" "include-ci" DELIMITER 1*string ")"
;; r7rs Quasiquotations (TBD)
quasiquotation = "`" "|TBD|"
/ "(" "quasiquote" "|TBD|" ")"
;; r7rs Transformers
transformer-spec = "(" "syntax-rules" DELIMITER "(" *identifier ")" *syntax-rule ")"
/ "(" "syntax-rules" DELIMITER identifier "(" *identifier ")"
*syntax-rule ")"
syntax-rule = "(" pattern template ")"
pattern = pattern-identifier
/ underscore
/ "(" *pattern ")"
/ "(" 1*pattern dot pattern ")"
/ "(" *pattern pattern ellipsis *pattern ")"
/ "(" *pattern pattern ellipsis *pattern
dot pattern ")"
/ "#(" *pattern ")"
/ "#(" *pattern pattern ellipsis *pattern ")"
/ pattern-datum
pattern-datum = string
/ character
/ boolean
/ number
template = pattern-identifier
/ "(" *template-element ")"
/ "(" 1*template-element dot template ")"
/ "#(" *template-element ")"
/ template-datum
template-element = template
/ template ellipsis
template-datum = pattern-datum
pattern-identifier = initial *subsequent DELIMITER
/ vertical-line *symbol-element vertical-line
/ pattern-peculiar-identifier DELIMITER
ellipsis = "..." DELIMITER
underscore = "_" DELIMITER
pattern-peculiar-identifier = explicit-sign
/ explicit-sign sign-subsequent *subsequent
/ explicit-sign "." dot-subsequent *subsequent
/ "." dot-subsequent *pattern-subsequent
; CAUTION: Note that "+i", "-i" and infnan are exceptions to the
; pattern-peculiar-identifier rule; they are parsed as numbers, not
; identifiers.
pattern-subsequent = initial / digit / pattern-special-subsequent
pattern-special-subsequent = explicit-sign / "@"
;; r7rs Programs and definitions
program = 1*import-declaration 1*command-or-definition
command-or-definition = command
/ definition
/ "(" "begin" DELIMITER 1*command-or-definition ")"
definition = "(" "define" DELIMITER identifier expression ")"
/ "(" "define" DELIMITER "(" identifier def-formals ")" body ")"
/ syntax-definition
/ "(" "define-values" DELIMITER def-formals body ")"
/ "(" "define-record-type" DELIMITER identifier
constructor identifier *field-spec ")"
/ "(" "begin" DELIMITER *definition ")"
def-formals = *identifier
/ *identifier dot identifier
constructor = "(" identifier *field-name ")"
field-spec = "(" field-name accessor ")"
/ "(" field-name accessor mutator ")"
field-name = identifier
accessor = identifier
mutator = identifier
syntax-definition = "(" "define-syntax" DELIMITER keyword transformer-spec ")"
;; r7rs Libraries
library = "(" "define-library" DELIMITER library-name
*library-declaration ")"
library-name = "(" 1*library-name-part ")"
library-name-part = identifier
/ 1*digit10 DELIMITER
; CAUTION: need to confirm correction to r7rs spec for above <uinteger 10>
library-declaration = "(" "export" DELIMITER *export-spec ")"
/ import-declaration
/ "(" "begin" DELIMITER *library-declaration ")"
/ includer
/ "(" "cond-expand" DELIMITER 1*cond-expand-clause ")"
/ "(" "cond-expand" DELIMITER 1*cond-expand-clause
"(" "else" DELIMITER *library-declaration ")" ")"
import-declaration = "(" "import" DELIMITER 1*import-set ")"
export-spec = identifier
/ "(" "rename" DELIMITER identifier identifier ")"
import-set = library-name
/ "(" "only" DELIMITER import-set 1*identifier ")"
/ "(" "except" DELIMITER import-set 1*identifier ")"
/ "(" "prefix" DELIMITER import-set identifier ")"
/ "(" "rename" DELIMITER import-set
1*("(" identifier identifier ")") ")"
cond-expand-clause = "(" feature-requirement *library-declaration ")"
feature-requirement = identifier
/ library-name
/ "(" "and" DELIMITER *feature-requirement ")"
/ "(" "or" DELIMITER *feature-requirement ")"
/ "(" "not" DELIMITER feature-requirement ")"
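
The unichar and hex-scalar-value rules above both target Unicode scalar values, that is, code points from 0 to 10FFFF excluding the surrogate range D800-DFFF. As a cross-check on the numeric range only (not on the exact digit patterns the ABNF accepts), here is a small Python sketch. The function names is_scalar_value and parse_hex_scalar are illustrative and not part of the draft.

```python
MAX_CODE_POINT = 0x10FFFF
HEX_DIGITS = "0123456789abcdefABCDEF"


def is_scalar_value(n: int) -> bool:
    """True for code points outside the surrogate range U+D800..U+DFFF."""
    return 0 <= n <= 0xD7FF or 0xE000 <= n <= MAX_CODE_POINT


def parse_hex_scalar(digits: str):
    """Return the code point written as hex digits (leading zeros allowed),
    or None if the text is not hex or does not denote a scalar value."""
    if not digits or any(c not in HEX_DIGITS for c in digits):
        return None
    n = int(digits, 16)
    return n if is_scalar_value(n) else None


if __name__ == "__main__":
    for text in ["41", "D7FF", "D800", "DFFF", "E000", "0010FFFF", "110000"]:
        print(text, parse_hex_scalar(text))
```

Comparing a checker like this against a generated recognizer for hex-scalar-value is one way to find hex strings on which the two disagree.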
On Dec 29, 2012, at 24:47 , Joseph Wayne Norton <norton@x> wrote:
>
> Hello.
>
> In the process of reviewing the r7rs draft, I decided to draft the formal syntax of Scheme r7rs written in ABNF. This draft only covers tokens (including datum).
>
> This kind of specification would be helpful to me and possibly to others. I'd appreciate any review and feedback.
>
> I intend to draft the other sections (i.e. expressions, quasiquotations, transformers, programs and definitions, and libraries) as well.
>
> thanks,
>
> Joe N.
>
> <scheme_r7rs_tokens.abnf>