Re: [Scheme-reports] Scheme r7rs syntax described by ABNF
Hello again.
I have an updated draft for the formal syntax of Scheme r7rs written in ABNF. This draft is based on updates since r7rs-draft-8.pdf and includes all sections except for quasiquotations. The datum and expression sections have undergone some quickcheck-style unit testing. I'd appreciate any review and feedback.
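For readers unfamiliar with the approach, the following Python sketch illustrates what a quickcheck-style check of two token rules could look like. It is not the test suite used for this draft. The regular expressions are hand-written transcriptions of the intended ranges of <byte> (decimal 0 to 255, no leading zeros) and <hex scalar value> (Unicode scalar values, so the surrogates D800 to DFFF are excluded), not something generated from the ABNF in this message, and the names BYTE_RE, HEX_SCALAR_RE, is_scalar_value and check are hypothetical.

```python
import random
import re

# Hand-written transcriptions of the intended ranges. These are assumptions
# of this sketch; they are not derived mechanically from the ABNF.
BYTE_RE = re.compile(r"[0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]")
HEX_SCALAR_RE = re.compile(
    r"0*(?:[0-9a-f]{1,3}"      # 0 .. FFF
    r"|[0-9a-c][0-9a-f]{3}"    # 0000 .. CFFF
    r"|d[0-7][0-9a-f]{2}"      # D000 .. D7FF
    r"|[ef][0-9a-f]{3}"        # E000 .. FFFF
    r"|[1-9a-f][0-9a-f]{4}"    # 10000 .. FFFFF
    r"|10[0-9a-f]{4})",        # 100000 .. 10FFFF
    re.IGNORECASE,
)


def is_scalar_value(n: int) -> bool:
    """True for Unicode scalar values: 0..D7FF and E000..10FFFF."""
    return 0 <= n <= 0xD7FF or 0xE000 <= n <= 0x10FFFF


def check(samples: int = 2000, seed: int = 0) -> bool:
    """Compare each regex with its integer-range oracle on random inputs."""
    rng = random.Random(seed)
    for _ in range(samples):
        n = rng.randrange(0, 1000)
        if (BYTE_RE.fullmatch(str(n)) is not None) != (n <= 255):
            return False
        m = rng.randrange(0, 0x200000)
        matched = HEX_SCALAR_RE.fullmatch(format(m, "X")) is not None
        if matched != is_scalar_value(m):
            return False
    return True
```

Calling check() returns False as soon as a randomly drawn literal is accepted or rejected contrary to its numeric range. The same pattern (random generator, recognizer, independent oracle) extends to other token rules.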
regards,
Joe N
;;;
;;; Formal syntax for Scheme r7rs described in ABNF format per
;;; [RFC5234] except for one extension - single quoted strings are
;;; case-sensitive.
;;;
;;; Although [RFC5234] refers to octets, the syntax described in this
;;; document is defined over sequences of character numbers (code
;;; points) taken from Unicode. The terminals in the ABNF productions
;;; are characters rather than bytes.
;;;
;;; A minimal number of delimiters (i.e. "DELIMITER-ANY" and
;;; "DELIMITER-DOT") have been inserted to ensure the rules herein are
;;; parseable AS-IS by the "read" procedure.
;;;
;; r7rs Helper tokens
tab = %x09 ; \t
newline = %x0A ; \n
return = %x0D ; \r
space = %x20 ; \s
double-quote = %x22 ; "
number-sign = %x23 ; #
single-quote = %x27 ; '
backslash = %x5C ; \
vertical-line = %x7C ; |
unichar-low = %x0000-D7FF
unichar-high = %xE000-10FFFF
unichar = unichar-low / unichar-high
non-double-quote-or-backslash = %x00-21 / %x23-5B / %x5D-D7FF / unichar-high
non-line-ending = %x00-09 / %x0B-0C / %x0E-D7FF / unichar-high
non-vertical-line = %x00-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-backslash = %x00-5B / %x5D-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-number-sign = %x00-22 / %x24-7B / %x7D-D7FF / unichar-high
and = 'and' DELIMITER-ANY
begin = 'begin' DELIMITER-ANY
case = 'case' DELIMITER-ANY
case-lambda = 'case-lambda' DELIMITER-ANY
cond = 'cond' DELIMITER-ANY
cond-expand = 'cond-expand' DELIMITER-ANY
define = 'define' DELIMITER-ANY
define-library = 'define-library' DELIMITER-ANY
define-record-type = 'define-record-type' DELIMITER-ANY
define-syntax = 'define-syntax' DELIMITER-ANY
define-values = 'define-values' DELIMITER-ANY
delay = 'delay' DELIMITER-ANY
delay-force = 'delay-force' DELIMITER-ANY
do = 'do' DELIMITER-ANY
dot = '.' DELIMITER-DOT
ellipsis = '...' DELIMITER-ANY
else = 'else' DELIMITER-ANY
except = 'except' DELIMITER-ANY
export = 'export' DELIMITER-ANY
guard = 'guard' DELIMITER-ANY
if = 'if' DELIMITER-ANY
implies = '=>' DELIMITER-ANY
import = 'import' DELIMITER-ANY
include = 'include' DELIMITER-ANY
include-ci = 'include-ci' DELIMITER-ANY
include-library-declarations = 'include-library-declarations' DELIMITER-ANY
lambda = 'lambda' DELIMITER-ANY
let = 'let' DELIMITER-ANY
let-syntax = 'let-syntax' DELIMITER-ANY
let-values = 'let-values' DELIMITER-ANY
letrec = 'letrec' DELIMITER-ANY
letrec-syntax = 'letrec-syntax' DELIMITER-ANY
letrecstar = 'letrec*' DELIMITER-ANY
letstar = 'let*' DELIMITER-ANY
letstar-values = 'let*-values' DELIMITER-ANY
not = 'not' DELIMITER-ANY
only = 'only' DELIMITER-ANY
or = 'or' DELIMITER-ANY
parameterize = 'parameterize' DELIMITER-ANY
prefix = 'prefix' DELIMITER-ANY
quasiquote = 'quasiquote' DELIMITER-ANY
quote = 'quote' DELIMITER-ANY
rename = 'rename' DELIMITER-ANY
setbang = 'set!' DELIMITER-ANY
syntax-rules = 'syntax-rules' DELIMITER-ANY
underscore = '_' DELIMITER-ANY
unless = 'unless' DELIMITER-ANY
when = 'when' DELIMITER-ANY
DELIMITER-ANY = whitespace intertoken-space
DELIMITER-DOT = whitespace intertoken-space-dot
comment-dot = ";" *non-line-ending line-ending
/ nested-comment
atmosphere-dot = whitespace / comment-dot / directive
intertoken-space-dot = *atmosphere-dot
; CAUTION: approximation for <delimiter> and <intertoken-space> with
; special handling for "."
;; r7rs Lexical structure
token = identifier / boolean / number
/ character / string
/ "(" / ")" / "#(" / "#u8(" / "'" / "`" / "," / ",@" / dot
delimiter = whitespace / vertical-line
/ "(" / ")" / '"' / ";"
intraline-whitespace = space / tab
whitespace = intraline-whitespace / line-ending
line-ending = newline / return newline / return
comment = ";" *non-line-ending line-ending
/ nested-comment
/ "#;" datum
nested-comment = "#|" comment-text *comment-cont "|#"
comment-text = *non-vertical-line-or-number-sign
; CAUTION: approximation for <character sequence not containing #| or |#>
comment-cont = nested-comment comment-text
directive = ("#!fold-case" / "#!no-fold-case")
atmosphere = whitespace / comment / directive
intertoken-space = *atmosphere
identifier = initial *subsequent
DELIMITER-ANY
/ vertical-line *symbol-element vertical-line
/ peculiar-identifier
DELIMITER-ANY
initial = letter / special-initial
letter = %x61-7A / %x41-5A ; a-z / A-Z
special-initial = "!" / "$" / "%" / "&" / "*" / "/" / ":" / "<" / "="
/ ">" / "?" / "^" / "_" / "~"
subsequent = initial / digit / special-subsequent
digit = digit10 ; 0-9
hex-digit = digit16 ; 0-9 / a-f / A-F
explicit-sign = "+" / "-"
special-subsequent = explicit-sign / "." / "@"
inline-hex-escape = "\x" hex-scalar-value ";"
digit16-lt-d = digit10 / %x61-63 / %x41-43 ; 0-9 / a-c / A-C
hex-scalar-value = *"0" ( (1*3hex-digit) ; %x000000-000FFF
/ (digit16-lt-d 3hex-digit) ; %x000000-00CFFF
/ ( "D" digit8 2hex-digit) ; %x00D000-00D7FF
/ ( ("E" / "F") 3hex-digit) ; %x00E000-00FFFF
/ ( (%x31-39 / %x61-66 / %x41-46) 4hex-digit) ; %x010000-0FFFFF
/ ("10" 4hex-digit) ) ; %x100000-10FFFF
mnemonic-escape = '\a' / '\b' / '\t' / '\n' / '\r'
peculiar-identifier = explicit-sign
/ explicit-sign sign-subsequent *subsequent
/ explicit-sign "." dot-subsequent *subsequent
/ "." dot-subsequent *subsequent
; CAUTION: Note that "+i", "-i" and <infnan> are exceptions to the
; peculiar-identifier rule; they are parsed as numbers, not
; identifiers.
dot-subsequent = sign-subsequent / "."
sign-subsequent = initial / explicit-sign / "@"
symbol-element = non-vertical-line-or-backslash
/ mnemonic-escape / "\|"
/ inline-hex-escape
boolean = ("#t" / "#f" / "#true" / "#false")
DELIMITER-ANY
character = ("#\" (character-any / character-name / "x" hex-scalar-value))
DELIMITER-ANY
character-any = unichar
character-name = 'alarm' / 'backspace' / 'delete'
/ 'escape' / 'newline' / 'null'
/ 'return' / 'space' / 'tab'
string = '"' *string-element '"'
string-element = non-double-quote-or-backslash
/ mnemonic-escape / '\"' / "\\"
/ "\" *intraline-whitespace line-ending *intraline-whitespace
/ inline-hex-escape
bytevector = "#u8(" *byte ")"
byte = (%x30-39 ; 0-9
/ %x31-39 %x30-39 ; 10-99
/ %x31 %x30-39 %x30-39 ; 100-199
/ %x32 %x30-34 %x30-39 ; 200-249
/ %x32 %x35 %x30-35) ; 250-255
DELIMITER-ANY
number = (num2 / num8 / num10 / num16)
DELIMITER-ANY
num2 = prefix2 complex2
num8 = prefix8 complex8
num10 = prefix10 complex10
num16 = prefix16 complex16
complex2 = real2 / real2 "@" real2
/ real2 "+" ureal2 "i" / real2 "-" ureal2 "i"
/ real2 "+i" / real2 "-i" / real2 infnan "i"
/ "+" ureal2 "i" / "-" ureal2 "i"
/ infnan "i" / "+i" / "-i"
complex8 = real8 / real8 "@" real8
/ real8 "+" ureal8 "i" / real8 "-" ureal8 "i"
/ real8 "+i" / real8 "-i" / real8 infnan "i"
/ "+" ureal8 "i" / "-" ureal8 "i"
/ infnan "i" / "+i" / "-i"
complex10 = real10 / real10 "@" real10
/ real10 "+" ureal10 "i" / real10 "-" ureal10 "i"
/ real10 "+i" / real10 "-i" / real10 infnan "i"
/ "+" ureal10 "i" / "-" ureal10 "i"
/ infnan "i" / "+i" / "-i"
complex16 = real16 / real16 "@" real16
/ real16 "+" ureal16 "i" / real16 "-" ureal16 "i"
/ real16 "+i" / real16 "-i" / real16 infnan "i"
/ "+" ureal16 "i" / "-" ureal16 "i"
/ infnan "i" / "+i" / "-i"
real2 = sign ureal2
/ infnan
real8 = sign ureal8
/ infnan
real10 = sign ureal10
/ infnan
real16 = sign ureal16
/ infnan
ureal2 = uinteger2
/ uinteger2 "/" uinteger2
ureal8 = uinteger8
/ uinteger8 "/" uinteger8
ureal10 = uinteger10
/ uinteger10 "/" uinteger10
/ decimal10
ureal16 = uinteger16
/ uinteger16 "/" uinteger16
decimal10 = uinteger10 suffix
/ "." 1*digit10 suffix
/ 1*digit10 "." *digit10 suffix
uinteger2 = 1*digit2
uinteger8 = 1*digit8
uinteger10 = 1*digit10
uinteger16 = 1*digit16
prefix2 = radix2 exactness
/ exactness radix2
prefix8 = radix8 exactness
/ exactness radix8
prefix10 = radix10 exactness
/ exactness radix10
prefix16 = radix16 exactness
/ exactness radix16
infnan = "+inf.0" / "-inf.0" / "+nan.0" / "-nan.0"
suffix = [exponent-marker sign 1*digit10]
exponent-marker = "e" / "s" / "f" / "d" / "l"
sign = ["+" / "-"]
exactness = ["#i" / "#e"]
radix2 = "#b"
radix8 = "#o"
radix10 = ["#d"]
radix16 = "#x"
digit2 = %x30-31 ; 0-1
digit8 = %x30-37 ; 0-7
digit10 = %x30-39 ; 0-9
digit16 = digit10 / %x61-66 / %x41-46 ; 0-9 / a-f / A-F
;; r7rs External representations
datum = simple-datum / compound-datum
/ label "=" datum / label "#"
simple-datum = boolean / number
/ character / string
/ symbol / bytevector
symbol = identifier
compound-datum = list / vector / abbreviation
list = "(" *datum ")"
/ "(" 1*datum dot datum ")"
abbreviation = abbrev-prefix datum
abbrev-prefix = "'" / "`" / "," / ",@"
vector = "#(" *datum ")"
label = "#" uinteger10
;; r7rs Expressions
expression = identifier
/ literal
/ procedure-call
/ lambda-expression
/ conditional
/ assignment
/ derived-expression
/ macro-use
/ macro-block
/ includer
literal = quotation / self-evaluating
self-evaluating = boolean / number / vector
/ character / string / bytevector
quotation = "'" datum
/ "(" quote datum ")"
procedure-call = "(" operator *operand ")"
operator = expression
operand = expression
lambda-expression = "(" lambda formals body ")"
formals = "(" *identifier ")"
/ identifier
/ "(" 1*identifier dot identifier ")"
body = *definition sequence
sequence = *command expression
command = expression
conditional = "(" if test consequent alternate ")"
test = expression
consequent = expression
alternate = [expression]
assignment = "(" setbang identifier expression ")"
derived-expression = "(" cond 1*cond-clause ")"
/ "(" cond *cond-clause "(" else sequence ")" ")"
/ "(" case expression 1*case-clause ")"
/ "(" case expression *case-clause "(" else sequence ")" ")"
/ "(" case expression *case-clause "(" else implies recipient ")" ")"
/ "(" and *test ")"
/ "(" or *test ")"
/ "(" when test sequence ")"
/ "(" unless test sequence ")"
/ "(" let "(" *binding-spec ")" body ")"
/ "(" let identifier "(" *binding-spec ")" body ")"
/ "(" letstar "(" *binding-spec ")" body ")"
/ "(" letrec "(" *binding-spec ")" body ")"
/ "(" letrecstar "(" *binding-spec ")" body ")"
/ "(" let-values "(" *mv-binding-spec ")" body ")"
/ "(" letstar-values "(" *mv-binding-spec ")" body ")"
/ "(" begin sequence ")"
/ "(" do "(" *iteration-spec ")" "(" test do-result ")"
*command ")"
/ "(" delay expression ")"
/ "(" delay-force expression ")"
/ "(" parameterize "(" *("(" expression expression ")") ")" body ")"
/ "(" guard "(" identifier *cond-clause ")" body ")"
/ quasiquotation
/ "(" case-lambda *case-lambda-clause ")"
cond-clause = "(" test sequence ")"
/ "(" test ")"
/ "(" test implies recipient ")"
recipient = expression
case-clause = "(" "(" *datum ")" sequence ")"
/ "(" "(" *datum ")" implies recipient ")"
binding-spec = "(" identifier expression ")"
mv-binding-spec = "(" formals expression ")"
iteration-spec = "(" identifier init step ")"
/ "(" identifier init ")"
case-lambda-clause = "(" formals body ")"
init = expression
step = expression
do-result = [sequence]
macro-use = "(" keyword *datum ")"
keyword = identifier
macro-block = "(" let-syntax "(" *syntax-spec ")" body ")"
/ "(" letrec-syntax "(" *syntax-spec ")" body ")"
syntax-spec = "(" keyword transformer-spec ")"
includer = "(" include 1*string ")"
/ "(" include-ci 1*string ")"
;; r7rs Quasiquotations (TBD)
quasiquotation = "`" "|TBD|"
/ "(" quasiquote "|TBD|" ")"
;; r7rs Transformers
transformer-spec = "(" syntax-rules "(" *identifier ")" *syntax-rule ")"
/ "(" syntax-rules identifier "(" *identifier ")"
*syntax-rule ")"
syntax-rule = "(" pattern template ")"
pattern = pattern-identifier
/ underscore
/ "(" *pattern ")"
/ "(" 1*pattern dot pattern ")"
/ "(" *pattern pattern ellipsis *pattern ")"
/ "(" *pattern pattern ellipsis *pattern
dot pattern ")"
/ "#(" *pattern ")"
/ "#(" *pattern pattern ellipsis *pattern ")"
/ pattern-datum
pattern-datum = string
/ character
/ boolean
/ number
template = pattern-identifier
/ "(" *template-element ")"
/ "(" 1*template-element dot template ")"
/ "#(" *template-element ")"
/ template-datum
template-element = template
/ template ellipsis
template-datum = pattern-datum
pattern-identifier = initial *subsequent
DELIMITER-ANY
/ vertical-line *symbol-element vertical-line
/ pattern-peculiar-identifier
DELIMITER-ANY
pattern-peculiar-identifier = explicit-sign
/ explicit-sign sign-subsequent *subsequent
/ explicit-sign "." dot-subsequent *subsequent
/ "." dot-subsequent *pattern-subsequent
; CAUTION: Note that "+i", "-i" and <infnan> are exceptions to the
; pattern-peculiar-identifier rule; they are parsed as numbers, not
; identifiers.
pattern-subsequent = initial / digit / pattern-special-subsequent
pattern-special-subsequent = explicit-sign / "@"
;; r7rs Programs and definitions
program = 1*import-declaration 1*command-or-definition
command-or-definition = command
/ definition
/ "(" begin 1*command-or-definition ")"
definition = "(" define identifier expression ")"
/ "(" define "(" identifier def-formals ")" body ")"
/ syntax-definition
/ "(" define-values formals body ")"
/ "(" define-record-type identifier
constructor identifier *field-spec ")"
/ "(" begin *definition ")"
def-formals = *identifier
/ *identifier dot identifier
constructor = "(" identifier *field-name ")"
field-spec = "(" field-name accessor ")"
/ "(" field-name accessor mutator ")"
field-name = identifier
accessor = identifier
mutator = identifier
syntax-definition = "(" define-syntax keyword transformer-spec ")"
;; r7rs Libraries
library = "(" define-library library-name
*library-declaration ")"
library-name = "(" 1*library-name-part ")"
library-name-part = identifier
/ uinteger10
DELIMITER-ANY
library-declaration = "(" export *export-spec ")"
/ import-declaration
/ "(" begin *command-or-definition ")"
/ includer
/ "(" include-library-declarations 1*string ")"
/ "(" cond-expand 1*cond-expand-clause ")"
/ "(" cond-expand 1*cond-expand-clause
"(" else *library-declaration ")" ")"
import-declaration = "(" import 1*import-set ")"
export-spec = identifier
/ "(" rename identifier identifier ")"
import-set = library-name
/ "(" only import-set 1*identifier ")"
/ "(" except import-set 1*identifier ")"
/ "(" prefix import-set identifier ")"
/ "(" rename import-set
1*("(" identifier identifier ")") ")"
cond-expand-clause = "(" feature-requirement *library-declaration ")"
feature-requirement = identifier
/ library-name
/ "(" and *feature-requirement ")"
/ "(" or *feature-requirement ")"
/ "(" not feature-requirement ")"
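The <nested comment> rules above are marked as an approximation because "a character sequence not containing #| or |#" is awkward to express in ABNF; as written, comment-text also rejects a lone "#" or "|" inside a comment. A hand-written scanner would typically use a depth counter instead. The following Python sketch is illustrative only: skip_nested_comment is a hypothetical helper, not part of this grammar or of any Scheme implementation.

```python
def skip_nested_comment(text: str, start: int = 0) -> int:
    """Return the index just past the nested comment opening at `start`.

    Raises ValueError if text does not have "#|" at `start`, or if the
    comment is never closed.
    """
    if not text.startswith("#|", start):
        raise ValueError("no nested comment at this position")
    depth = 0
    i = start
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "#|":      # opening delimiter: one level deeper
            depth += 1
            i += 2
        elif pair == "|#":    # closing delimiter: one level up
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:                 # any other character, including a lone # or |
            i += 1
    raise ValueError("unterminated nested comment")
```

For example, given text = "#| a #| b |# c |# d", the slice text[skip_nested_comment(text):] is " d", since the inner comment is consumed as part of the outer one.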
On Jan 13, 2013, at 01:32 , ノートン ジョーセフ ウェイ ン <norton@x> wrote:
>
> Hello again.
>
> I have an updated draft for the formal syntax of Scheme r7rs (based on r7rs-draft-8.pdf) written in ABNF. This draft includes all sections except for quasiquotations. The datum section has undergone some quickcheck-style unit testing. I'd appreciate any review and feedback.
>
> I also have one *minor* suggestion. The definition for <library name part> seems inconsistent with the style used for <label>.
>
> library-name-part = identifier / 1*digit10
>
> The above definition would follow the same style used for <label>.
>
> regards,
>
> Joe N.
>
> <r7rs_tokens.abnf.txt>
>
>
> On Dec 29, 2012, at 24:47 , Joseph Wayne Norton <norton@x> wrote:
>
>>
>> Hello.
>>
>> In the process of reviewing the r7rs draft, I decided to draft the formal syntax of Scheme r7rs written in ABNF. This draft only covers tokens (including datum).
>>
>> This kind of specification would be helpful to me and possibly to others. I'd appreciate any review and feedback.
>>
>> I intend to draft the other sections (i.e. expressions, quasiquotations, transformers, programs and definitions, and libraries) as well.
>>
>> thanks,
>>
>> Joe N.
>>
>> <scheme_r7rs_tokens.abnf>
>
> _______________________________________________
> Scheme-reports mailing list
> Scheme-reports@x
> http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports