Re: [Scheme-reports] Scheme r7rs syntax described by ABNF



Hello again.

I have an updated draft for the formal syntax of Scheme r7rs written in ABNF.  This draft is based on updates since r7rs-draft-8.pdf and includes all sections except for quasiquotations.  The datum and expression sections have undergone some quickcheck-style unit testing.  I'd appreciate any review and feedback.

regards,

Joe N

;;;
;;; Formal syntax for Scheme r7rs described in ABNF format per
;;; [RFC5234] except for one extension - single quoted strings are
;;; case-sensitive.
;;;
;;; Although [RFC5234] refers to octets, the syntax described in this
;;; document is defined over sequences of character numbers (code
;;; points) taken from Unicode.  The terminals in the ABNF productions
;;; are characters rather than bytes.
;;;
;;; A minimal number of delimiters (i.e. "DELIMITER-ANY" and
;;; "DELIMITER-DOT") have been inserted to ensure the rules herein are
;;; parseable AS-IS by the "read" procedure.
;;;
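
As a rough illustration of the extension described above (a sketch, not part of the grammar): in RFC 5234 a double-quoted literal matches case-insensitively, while the single-quoted literals used in this draft are meant to match exactly. The helper names below are hypothetical.

```python
def match_double_quoted(literal: str, text: str) -> bool:
    """RFC 5234 semantics: "..." literals compare case-insensitively."""
    return text.lower() == literal.lower()


def match_single_quoted(literal: str, text: str) -> bool:
    """Extension used in this draft: '...' literals compare exactly."""
    return text == literal


# "#t" is written with double quotes, so "#T" is accepted as well.
print(match_double_quoted("#t", "#T"))          # True
# 'lambda' is written with single quotes, so "LAMBDA" is rejected.
print(match_single_quoted("lambda", "LAMBDA"))  # False
```

This is why the keyword tokens are written with single quotes while punctuation and number markers use ordinary double quotes.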

;; r7rs Helper tokens

tab            = %x09            ; \t
newline        = %x0A            ; \n
return         = %x0D            ; \r
space          = %x20            ; \s
double-quote   = %x22            ; "
number-sign    = %x23            ; #
single-quote   = %x27            ; '
backslash      = %x5C            ; \
vertical-line  = %x7C            ; |

unichar-low    = %x0000-D7FF
unichar-high   = %xE000-10FFFF
unichar        = unichar-low / unichar-high

non-double-quote-or-backslash    = %x00-21 / %x23-5B / %x5D-D7FF / unichar-high
non-line-ending                  = %x00-09 / %x0B-0C / %x0E-D7FF / unichar-high
non-vertical-line                = %x00-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-backslash   = %x00-5B / %x5D-7B / %x7D-D7FF / unichar-high
non-vertical-line-or-number-sign = %x00-22 / %x24-7B / %x7D-D7FF / unichar-high

and                          = 'and' DELIMITER-ANY
begin                        = 'begin' DELIMITER-ANY
case                         = 'case' DELIMITER-ANY
case-lambda                  = 'case-lambda' DELIMITER-ANY
cond                         = 'cond' DELIMITER-ANY
cond-expand                  = 'cond-expand' DELIMITER-ANY
define                       = 'define' DELIMITER-ANY
define-library               = 'define-library' DELIMITER-ANY
define-record-type           = 'define-record-type' DELIMITER-ANY
define-syntax                = 'define-syntax' DELIMITER-ANY
define-values                = 'define-values' DELIMITER-ANY
delay                        = 'delay' DELIMITER-ANY
delay-force                  = 'delay-force' DELIMITER-ANY
do                           = 'do' DELIMITER-ANY
dot                          = '.' DELIMITER-DOT
ellipsis                     = '...' DELIMITER-ANY
else                         = 'else' DELIMITER-ANY
except                       = 'except' DELIMITER-ANY
export                       = 'export' DELIMITER-ANY
guard                        = 'guard' DELIMITER-ANY
if                           = 'if' DELIMITER-ANY
implies                      = '=>' DELIMITER-ANY
import                       = 'import' DELIMITER-ANY
include                      = 'include' DELIMITER-ANY
include-ci                   = 'include-ci' DELIMITER-ANY
include-library-declarations = 'include-library-declarations' DELIMITER-ANY
lambda                       = 'lambda' DELIMITER-ANY
let                          = 'let' DELIMITER-ANY
let-syntax                   = 'let-syntax' DELIMITER-ANY
let-values                   = 'let-values' DELIMITER-ANY
letrec                       = 'letrec' DELIMITER-ANY
letrec-syntax                = 'letrec-syntax' DELIMITER-ANY
letrecstar                   = 'letrec*' DELIMITER-ANY
letstar                      = 'let*' DELIMITER-ANY
letstar-values               = 'let*-values' DELIMITER-ANY
not                          = 'not' DELIMITER-ANY
only                         = 'only' DELIMITER-ANY
or                           = 'or' DELIMITER-ANY
parameterize                 = 'parameterize' DELIMITER-ANY
prefix                       = 'prefix' DELIMITER-ANY
quasiquote                   = 'quasiquote' DELIMITER-ANY
quote                        = 'quote' DELIMITER-ANY
rename                       = 'rename' DELIMITER-ANY
setbang                      = 'set!' DELIMITER-ANY
syntax-rules                 = 'syntax-rules' DELIMITER-ANY
underscore                   = '_' DELIMITER-ANY
unless                       = 'unless' DELIMITER-ANY
when                         = 'when' DELIMITER-ANY

DELIMITER-ANY                = whitespace intertoken-space
DELIMITER-DOT                = whitespace intertoken-space-dot
comment-dot                  = ";" *non-line-ending line-ending
                             / nested-comment
atmosphere-dot               = whitespace / comment-dot / directive
intertoken-space-dot         = *atmosphere-dot
; CAUTION: approximation for <delimiter> and <intertoken-space> with
; special handling for "."


;; r7rs Lexical structure

token                 = identifier / boolean / number
                      / character / string
                      / "(" / ")" / "#(" / "#u8(" / "'" / "`" / "," / ",@" / dot

delimiter             = whitespace / vertical-line
                      / "(" / ")" / '"' / ";"

intraline-whitespace  = space / tab

whitespace            = intraline-whitespace / line-ending

line-ending           = newline / return newline / return

comment               = ";" *non-line-ending line-ending
                      / nested-comment
                      / "#;" datum

nested-comment        = "#|" comment-text *comment-cont "|#"

comment-text          = *non-vertical-line-or-number-sign
; CAUTION: approximation for <character sequence not containing #| or |#>

comment-cont          = nested-comment comment-text
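
Because comment-text is only an approximation, a hand-written scanner would typically track nesting depth instead. A minimal sketch in Python, assuming the input is a str and that start points at an opening "#|" (the function name is hypothetical):

```python
def skip_nested_comment(text: str, start: int) -> int:
    """Return the index just past the "|#" that closes the nested
    comment opening at text[start:start + 2] == "#|".

    Raises ValueError if no comment opens there or it is never closed.
    """
    if text[start:start + 2] != "#|":
        raise ValueError("no nested comment at this position")
    depth = 0
    i = start
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "#|":      # one level deeper
            depth += 1
            i += 2
        elif pair == "|#":    # one level shallower
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated nested comment")
```

For example, on the input "#| a #| b |# c |# x" the scanner stops just before " x", having skipped both the outer and the inner comment.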

directive             = ("#!fold-case" / "#!no-fold-case")

atmosphere            = whitespace / comment / directive

intertoken-space      = *atmosphere

identifier            = initial *subsequent
                        DELIMITER-ANY
                      / vertical-line *symbol-element vertical-line
                      / peculiar-identifier
                        DELIMITER-ANY

initial               = letter / special-initial

letter                = %x61-7A / %x41-5A   ; a-z / A-Z

special-initial       = "!" / "$" / "%" / "&" / "*" / "/" / ":" / "<" / "="
                      / ">" / "?" / "^" / "_" / "~"

subsequent            = initial / digit / special-subsequent

digit                 = digit10   ; 0-9

hex-digit             = digit16   ; 0-9 / a-f / A-F

explicit-sign         = "+" / "-"

special-subsequent    = explicit-sign / "." / "@"

inline-hex-escape     = "\x" hex-scalar-value ";"

digit16-lt-d          = digit10 / %x61-63 / %x41-43   ; 0-9 / a-c / A-C

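digit16-gt-0          = %x31-39 / %x61-66 / %x41-46   ; 1-9 / a-f / A-F

hex-scalar-value      = *"0" ( 1*3hex-digit                ; %x000000-000FFF
                             / (digit16-lt-d 3hex-digit)   ; %x000000-00CFFF
                             / (  "D" digit8 2hex-digit)   ; %x00D000-00D7FF
                             / ( ("E" / "F") 3hex-digit)   ; %x00E000-00FFFF
                             / (digit16-gt-0 4hex-digit)   ; %x010000-0FFFFF
                             /         ("10" 4hex-digit) ) ; %x100000-10FFFF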
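
The range comments on hex-scalar-value suggest the rule is meant to admit the Unicode scalar values, that is, every code point except the surrogates %xD800-DFFF. A sketch of that check in Python (the function name is hypothetical, and accepting any number of hex digits is an assumption about the intended behaviour):

```python
import string


def is_hex_scalar_value(digits: str) -> bool:
    """True if `digits` is a non-empty hex string naming a Unicode scalar value."""
    if not digits or any(c not in string.hexdigits for c in digits):
        return False
    value = int(digits, 16)
    # Scalar values are 0..D7FF and E000..10FFFF (surrogates excluded).
    return value <= 0xD7FF or 0xE000 <= value <= 0x10FFFF
```

A check like this can serve as an oracle when testing the ABNF rule against generated inputs.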

mnemonic-escape       = '\a' / '\b' / '\t' / '\n' / '\r'

peculiar-identifier   = explicit-sign
                      / explicit-sign sign-subsequent *subsequent
                      / explicit-sign "." dot-subsequent *subsequent
                      / "." dot-subsequent *subsequent
; CAUTION: Note that "+i", "-i" and <infnan> are exceptions to the
; peculiar-identifier rule; they are parsed as numbers, not
; identifiers.
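
The exception in the CAUTION note can be made concrete with a small classifier. This is a sketch in Python that transcribes peculiar-identifier into a regular expression and then filters out "+i", "-i" and the infnan spellings. The names are hypothetical, and comparing in lower case is an assumption that mirrors ABNF's case-insensitive double-quoted literals.

```python
import re

# Character classes transcribed from the rules above.
INITIAL = r"A-Za-z!$%&*/:<=>?^_~"
SIGN_SUBSEQUENT = rf"[{INITIAL}+\-@]"
DOT_SUBSEQUENT = rf"[{INITIAL}+\-@.]"
SUBSEQUENT = rf"[{INITIAL}0-9+\-.@]"

PECULIAR = re.compile(
    rf"[+\-]"
    rf"|[+\-]{SIGN_SUBSEQUENT}{SUBSEQUENT}*"
    rf"|[+\-]\.{DOT_SUBSEQUENT}{SUBSEQUENT}*"
    rf"|\.{DOT_SUBSEQUENT}{SUBSEQUENT}*"
)

# Tokens that match the pattern but are read as numbers.
NUMERIC_EXCEPTIONS = {"+i", "-i", "+inf.0", "-inf.0", "+nan.0", "-nan.0"}


def is_peculiar_identifier(token: str) -> bool:
    """True if `token` is a peculiar identifier and not a numeric exception."""
    if token.lower() in NUMERIC_EXCEPTIONS:
        return False
    return PECULIAR.fullmatch(token) is not None
```

With this sketch, "+", "..." and "+soup+" are classified as identifiers, while "+i", "-inf.0" and "+1" are not.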

dot-subsequent        = sign-subsequent / "."

sign-subsequent       = initial / explicit-sign / "@"

symbol-element        = non-vertical-line-or-backslash
                      / mnemonic-escape / "\|"
                      / inline-hex-escape

boolean               = ("#t" / "#f" / "#true" / "#false")
                        DELIMITER-ANY

character             = ("#\" (character-any / character-name / "x" hex-scalar-value))
                        DELIMITER-ANY

character-any         = unichar

character-name        = 'alarm' / 'backspace' / 'delete'
                      / 'escape' / 'newline' / 'null'
                      / 'return' / 'space' / 'tab'

string                = '"' *string-element '"'

string-element        = non-double-quote-or-backslash
                      / mnemonic-escape / '\"' / "\\"
                      / "\" *intraline-whitespace line-ending *intraline-whitespace
                      / inline-hex-escape

bytevector            = "#u8(" *byte ")"

byte                  = (%x30-39                  ; 0-9
                        / %x31-39 %x30-39         ; 10-99
                        / %x31 %x30-39 %x30-39    ; 100-199
                        / %x32 %x30-34 %x30-39    ; 200-249
                        / %x32 %x35 %x30-35)      ; 250-255
                        DELIMITER-ANY
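
The byte rule is intended to match the decimal integers 0 through 255 without leading zeros. One way to gain confidence in such a rule, in the spirit of the quickcheck-style testing mentioned in the message above, is to transcribe it into a regular expression and check it exhaustively. A sketch in Python (the pattern and names are illustrative, not taken from the draft):

```python
import re

# 0-9 | 10-99 | 100-199 | 200-249 | 250-255
BYTE = re.compile(r"[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]")


def is_byte(text: str) -> bool:
    """True if `text` spells an integer in 0..255 with no leading zeros."""
    return BYTE.fullmatch(text) is not None


# Every integer in range must be accepted ...
assert all(is_byte(str(n)) for n in range(256))
# ... and out-of-range or zero-padded spellings must be rejected.
assert not any(is_byte(s) for s in ("256", "300", "007", "-1", ""))
```

Since the domain has only 256 members, an exhaustive check is cheap and leaves no gaps that random sampling might miss.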

number                = (num2 / num8 / num10 / num16)
                        DELIMITER-ANY

num2                  = prefix2 complex2
num8                  = prefix8 complex8
num10                 = prefix10 complex10
num16                 = prefix16 complex16

complex2              = real2 / real2 "@" real2
                      / real2 "+" ureal2 "i" / real2 "-" ureal2 "i"
                      / real2 "+i" / real2 "-i" / real2 infnan "i"
                      / "+" ureal2 "i" / "-" ureal2 "i"
                      / infnan "i" / "+i" / "-i"
complex8              = real8 / real8 "@" real8
                      / real8 "+" ureal8 "i" / real8 "-" ureal8 "i"
                      / real8 "+i" / real8 "-i" / real8 infnan "i"
                      / "+" ureal8 "i" / "-" ureal8 "i"
                      / infnan "i" / "+i" / "-i"
complex10             = real10 / real10 "@" real10
                      / real10 "+" ureal10 "i" / real10 "-" ureal10 "i"
                      / real10 "+i" / real10 "-i" / real10 infnan "i"
                      / "+" ureal10 "i" / "-" ureal10 "i"
                      / infnan "i" / "+i" / "-i"
complex16             = real16 / real16 "@" real16
                      / real16 "+" ureal16 "i" / real16 "-" ureal16 "i"
                      / real16 "+i" / real16 "-i" / real16 infnan "i"
                      / "+" ureal16 "i" / "-" ureal16 "i"
                      / infnan "i" / "+i" / "-i"

real2                 = sign ureal2
                      / infnan
real8                 = sign ureal8
                      / infnan
real10                = sign ureal10
                      / infnan
real16                = sign ureal16
                      / infnan

ureal2                = uinteger2
                      / uinteger2 "/" uinteger2
ureal8                = uinteger8
                      / uinteger8 "/" uinteger8
ureal10               = uinteger10
                      / uinteger10 "/" uinteger10
                      / decimal10
ureal16               = uinteger16
                      / uinteger16 "/" uinteger16

decimal10             = uinteger10 suffix
                      / "." 1*digit10 suffix
                      / 1*digit10 "." *digit10 suffix

uinteger2             = 1*digit2
uinteger8             = 1*digit8
uinteger10            = 1*digit10
uinteger16            = 1*digit16

prefix2               = radix2 exactness
                      / exactness radix2
prefix8               = radix8 exactness
                      / exactness radix8
prefix10              = radix10 exactness
                      / exactness radix10
prefix16              = radix16 exactness
                      / exactness radix16

infnan                = "+inf.0" / "-inf.0" / "+nan.0" / "-nan.0"

suffix                = [exponent-marker sign 1*digit10]

exponent-marker       = "e" / "s" / "f" / "d" / "l"

sign                  = ["+" / "-"]

exactness             = ["#i" / "#e"]

radix2                = "#b"
radix8                = "#o"
radix10               = ["#d"]
radix16               = "#x"

digit2                = %x30-31   ; 0-1
digit8                = %x30-37   ; 0-7
digit10               = %x30-39   ; 0-9
digit16               = digit10 / %x61-66 / %x41-46   ; 0-9 / a-f / A-F
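
The prefix rules allow the radix and exactness markers in either order; exactness is always optional, and the radix marker is optional only for decimal. A sketch in Python of splitting such a prefix off a numeric token (the function name and return shape are hypothetical):

```python
def split_prefix(token: str):
    """Return (radix, exactness, rest) for a numeric token.

    radix is 2, 8, 10 or 16 (10 when no radix marker is present);
    exactness is "i", "e" or None.
    """
    radixes = {"b": 2, "o": 8, "d": 10, "x": 16}
    radix = None
    exactness = None
    rest = token
    # At most two markers, one of each kind, in either order.
    for _ in range(2):
        if len(rest) >= 2 and rest[0] == "#":
            # Lower-casing mirrors ABNF's case-insensitive "..." literals.
            marker = rest[1].lower()
            if marker in radixes and radix is None:
                radix = radixes[marker]
                rest = rest[2:]
                continue
            if marker in ("i", "e") and exactness is None:
                exactness = marker
                rest = rest[2:]
                continue
        break
    return (radix if radix is not None else 10, exactness, rest)
```

For example, split_prefix("#x#eFF") and split_prefix("#e#xFF") both yield radix 16 and exactness "e", leaving "FF" to be matched against complex16.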


;; r7rs External representations

datum                 = simple-datum / compound-datum
                      / label "=" datum / label "#"

simple-datum          = boolean / number
                      / character / string
                      / symbol / bytevector

symbol                = identifier

compound-datum        = list / vector / abbreviation

list                  = "(" *datum ")"
                      / "(" 1*datum dot datum ")"

abbreviation          = abbrev-prefix datum

abbrev-prefix         = "'" / "`" / "," / ",@"

vector                = "#(" *datum ")"

label                 = "#" uinteger10


;; r7rs Expressions

expression            = identifier
                      / literal
                      / procedure-call
                      / lambda-expression
                      / conditional
                      / assignment
                      / derived-expression
                      / macro-use
                      / macro-block
                      / includer

literal               = quotation / self-evaluating

self-evaluating       = boolean / number / vector
                      / character / string / bytevector

quotation             = "'" datum
                      / "(" quote datum ")"

procedure-call        = "(" operator *operand ")"

operator              = expression

operand               = expression

lambda-expression     = "(" lambda formals body ")"

formals               = "(" *identifier ")"
                      / identifier
                      / "(" 1*identifier dot identifier ")"

body                  = *definition sequence

sequence              = *command expression

command               = expression

conditional           = "(" if test consequent alternate ")"

test                  = expression

consequent            = expression

alternate             = [expression]

assignment            = "(" setbang identifier expression ")"

derived-expression    = "(" cond 1*cond-clause ")"
                      / "(" cond *cond-clause "(" else sequence ")" ")"
                      / "(" case expression 1*case-clause ")"
                      / "(" case expression *case-clause "(" else sequence ")" ")"
                      / "(" case expression *case-clause "(" else implies recipient ")" ")"
                      / "(" and *test ")"
                      / "(" or *test ")"
                      / "(" when test sequence ")"
                      / "(" unless test sequence ")"
                      / "(" let "(" *binding-spec ")" body ")"
                      / "(" let identifier "(" *binding-spec ")" body ")"
                      / "(" letstar "(" *binding-spec ")" body ")"
                      / "(" letrec "(" *binding-spec ")" body ")"
                      / "(" letrecstar "(" *binding-spec ")" body ")"
                      / "(" let-values "(" *mv-binding-spec ")" body ")"
                      / "(" letstar-values "(" *mv-binding-spec ")" body ")"
                      / "(" begin sequence ")"
                      / "(" do "(" *iteration-spec ")" "(" test do-result ")"
                            *command ")"
                      / "(" delay expression ")"
                      / "(" delay-force expression ")"
                      / "(" parameterize "(" *("(" expression expression ")") ")" body ")"
                      / "(" guard "(" identifier *cond-clause ")" body ")"
                      / quasiquotation
                      / "(" case-lambda *case-lambda-clause ")"

cond-clause           = "(" test sequence ")"
                      / "(" test ")"
                      / "(" test implies recipient ")"

recipient             = expression

case-clause           = "(" "(" *datum ")" sequence ")"
                      / "(" "(" *datum ")" implies recipient ")"

binding-spec          = "(" identifier expression ")"

mv-binding-spec       = "(" formals expression ")"

iteration-spec        = "(" identifier init step ")"
                      / "(" identifier init ")"

case-lambda-clause    = "(" formals body ")"

init                  = expression

step                  = expression

do-result             = [sequence]

macro-use             = "(" keyword *datum ")"

keyword               = identifier

macro-block           = "(" let-syntax "(" *syntax-spec ")" body ")"
                      / "(" letrec-syntax "(" *syntax-spec ")" body ")"

syntax-spec           = "(" keyword transformer-spec ")"

includer              = "(" include 1*string ")"
                      / "(" include-ci 1*string ")"


;; r7rs Quasiquotations (TBD)

quasiquotation        = "`" "|TBD|"
                      / "(" quasiquote "|TBD|" ")"


;; r7rs Transformers

transformer-spec      = "(" syntax-rules "(" *identifier ")" *syntax-rule ")"
                      / "(" syntax-rules identifier "(" *identifier ")"
                            *syntax-rule ")"

syntax-rule           = "(" pattern template ")"

pattern               = pattern-identifier
                      / underscore
                      / "(" *pattern ")"
                      / "(" 1*pattern dot pattern ")"
                      / "(" *pattern pattern ellipsis *pattern ")"
                      / "(" *pattern pattern ellipsis *pattern
                            dot pattern ")"
                      / "#(" *pattern ")"
                      / "#(" *pattern pattern ellipsis *pattern ")"
                      / pattern-datum

pattern-datum         = string
                      / character
                      / boolean
                      / number

template              = pattern-identifier
                      / "(" *template-element ")"
                      / "(" 1*template-element dot template ")"
                      / "#(" *template-element ")"
                      / template-datum

template-element      = template
                      / template ellipsis

template-datum        = pattern-datum

pattern-identifier    = initial *subsequent
                        DELIMITER-ANY
                      / vertical-line *symbol-element vertical-line
                      / pattern-peculiar-identifier
                        DELIMITER-ANY

pattern-peculiar-identifier = explicit-sign
                      / explicit-sign sign-subsequent *subsequent
                      / explicit-sign "." dot-subsequent *subsequent
                      / "." dot-subsequent *pattern-subsequent
; CAUTION: Note that "+i", "-i" and <infnan> are exceptions to the
; pattern-peculiar-identifier rule; they are parsed as numbers, not
; identifiers.

pattern-subsequent    = initial / digit / pattern-special-subsequent

pattern-special-subsequent = explicit-sign / "@"


;; r7rs Programs and definitions

program               = 1*import-declaration 1*command-or-definition

command-or-definition = command
                      / definition
                      / "(" begin 1*command-or-definition ")"

definition            = "(" define identifier expression ")"
                      / "(" define "(" identifier def-formals ")" body ")"
                      / syntax-definition
                      / "(" define-values formals body ")"
                      / "(" define-record-type identifier
                            constructor identifier *field-spec ")"
                      / "(" begin *definition ")"

def-formals           = *identifier
                      / *identifier dot identifier

constructor           = "(" identifier *field-name ")"

field-spec            = "(" field-name accessor ")"
                      / "(" field-name accessor mutator ")"

field-name            = identifier

accessor              = identifier

mutator               = identifier

syntax-definition     = "(" define-syntax keyword transformer-spec ")"


;; r7rs Libraries

library               = "(" define-library library-name
                            *library-declaration ")"

library-name          = "(" 1*library-name-part ")"

library-name-part     = identifier
                      / uinteger10
                        DELIMITER-ANY

library-declaration   = "(" export *export-spec ")"
                      / import-declaration
                      / "(" begin *command-or-definition ")"
                      / includer
                      / "(" include-library-declarations 1*string ")"
                      / "(" cond-expand 1*cond-expand-clause ")"
                      / "(" cond-expand 1*cond-expand-clause
                            "(" else *library-declaration ")" ")"

import-declaration    = "(" import 1*import-set ")"

export-spec           = identifier
                      / "(" rename identifier identifier ")"

import-set            = library-name
                      / "(" only import-set 1*identifier ")"
                      / "(" except import-set 1*identifier ")"
                      / "(" prefix import-set identifier ")"
                      / "(" rename import-set
                            1*("(" identifier identifier ")") ")"

cond-expand-clause    = "(" feature-requirement *library-declaration ")"

feature-requirement   = identifier
                      / library-name
                      / "(" and *feature-requirement ")"
                      / "(" or *feature-requirement ")"
                      / "(" not feature-requirement ")"


On Jan 13, 2013, at 01:32 , ノートン ジョーセフ ウェイ ン <norton@x> wrote:

> 
> Hello again.
> 
> I have an updated draft for the formal syntax of Scheme r7rs (based on r7rs-draft-8.pdf) written in ABNF.  This draft includes all sections except for quasiquotations.  The datum section has undergone some quickcheck-style unit testing.  I'd appreciate any review and feedback.
> 
> I also have one *minor* suggestion.  The definition for <library name part> seems inconsistent with the style used for <label>.
> 
> library-name-part     = identifier / 1*digit10
> 
> The above definition would follow the same style used for <label>.
> 
> regards,
> 
> Joe N.
> 
> <r7rs_tokens.abnf.txt>
> 
> 
> On Dec 29, 2012, at 24:47 , Joseph Wayne Norton <norton@x> wrote:
> 
>> 
>> Hello.
>> 
>> In the process of reviewing the r7rs draft, I decided to draft the formal syntax of Scheme r7rs written in ABNF.  This draft only covers tokens (including datum).  
>> 
>> This kind of specification would be helpful to me and possibly to others.  I'd appreciate any review and feedback.
>> 
>> I intend to draft the other sections (i.e. expressions, quasiquotations, transformers, programs and definitions, and libraries) as well.
>> 
>> thanks,
>> 
>> Joe N.
>> 
>> <scheme_r7rs_tokens.abnf>
> 
> _______________________________________________
> Scheme-reports mailing list
> Scheme-reports@x
> http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports
