Re: [Scheme-reports] auxiliary syntax

On Tue, Jan 8, 2013 at 7:39 AM, Noah Lavine <noah.b.lavine@x> wrote:

Hello,

I'm still trying to understand the problem here, so please bear with me if this is incorrect.

In a nutshell, this seems to be the issue: Alex wants to distribute an SRE library. This library uses certain symbols as syntax for regular expressions. This library may analyze regular expressions either at expansion time or at run time.

Yes, specifically an SRE is just a list. Lists are great things for storing
data, a huge improvement on the strings that other languages typically
use for things like regular expressions.

For example, if you wanted to write an sregrep program, then the SRE
could be obtained as

(read (open-input-string (cadr (command-line))))

Likewise you can store SREs in data files, input them at runtime in
an editor's search function, and so on.
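
To make the "just a list" point concrete, here is a minimal sketch in R7RS Scheme. The helper name string->sre is made up for illustration and is not part of any SRE library; it only shows that reading the text gives ordinary list data that the usual list procedures work on.

```scheme
(import (scheme base) (scheme read))

;; Hypothetical helper: parse the textual form of an SRE into list data.
(define (string->sre str)
  (read (open-input-string str)))

;; The same kind of text could come from the command line or a data file.
(define sre (string->sre "(: (= 4 digit) \"-\" (= 2 digit))"))

;; Nothing regexp-specific is needed to inspect it.
(car sre)      ; the symbol :
(length sre)   ; 4
```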

This works fine and I have no complaints with it. There are no
macros involved.

At expansion time, we want (for modularity reasons) to be able to rename the symbols if they are going to be confusing. This means that the names that appear in the code may be different than the objects that the macros actually work with, and there's some layer in between that knows how to rename things so that no one gets confused.

However, at runtime, the *procedures* that compile an expression only understand a certain fixed set of symbols, and they can't be renamed. Furthermore, the macros that run at expansion time may want to call the procedures (also at expansion time).

Because of this renaming issue, it is not correct for a macro to simply quote a list it was given and pass the quoted list as an argument to the procedure. The only way to make this work is for the magic renamer (that knew about the renamings earlier) to also participate in the renaming that the macro does before it passes its list to a function. So you want some special form meaning "quote, but do renamings first".

Is that a correct summary of the issue?

Not quite. The whole problem is that I'd really rather not rename anything.

A secondary problem is that _if_ I'm forced to rename, then things will break.

A simplification of the example in question is regex-case, where we want
to write a macro which can understand (parts of) the SRE syntax, and
gracefully punt on parts that it doesn't understand:

(define (parse-date str)
  (regex-case str
    ((: (=> year (= 4 digit)) "-" (=> mon (= 2 digit)) "-" (=> day (= 2 digit)))
     (make-date (string->number year)
                (string->number mon)
                (string->number day)))
    ;; other cases ...
    ))

This takes advantage of the => named-submatch extension to SREs
(equivalent to the PCRE (?P<name>...) syntax), which lets us refer to
submatches by symbolic name rather than index. Here, the macro is
making this even simpler by automatically binding the variables in
question in the body. Note it's important that `year', `mon' and `day'
preserve hygiene, but the rest of the SRE syntax should be analyzed
unhygienically.

In practice you probably want to either require quoting of the SRE, or
treat it as an implicit quasiquote. Parts that are generated dynamically
(unquoted) would be skipped over, and the named submatches they
produce, if any, would not be bound. This is fine and intuitive because
those names are not visible in the scope anyway.
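
As a rough illustration of that traversal, here is a sketch that works on plain list data rather than inside a macro. The procedure name sre-submatch-names is hypothetical, and a real regex-case would do this walk at expansion time on syntax. It collects the names introduced by (=> name ...) forms and leaves any (unquote ...) part unexamined.

```scheme
(import (scheme base))

;; Hypothetical helper: list the names bound by (=> name sre ...) forms.
;; Unquoted parts are treated as opaque, so names they might produce
;; at runtime are not collected.
(define (sre-submatch-names sre)
  ;; Collect names from each element of a list of sub-SREs.
  (define (names-in-list lst)
    (if (pair? lst)
        (append (sre-submatch-names (car lst))
                (names-in-list (cdr lst)))
        '()))
  (cond
    ((not (pair? sre)) '())                 ; strings, symbols, numbers
    ((eq? (car sre) 'unquote) '())          ; dynamic part: skip it
    ((and (eq? (car sre) '=>)
          (pair? (cdr sre))
          (symbol? (cadr sre)))
     (cons (cadr sre) (names-in-list (cddr sre))))
    (else (names-in-list (cdr sre)))))

;; Here ,sep reads as (unquote sep) and is skipped over.
(sre-submatch-names
 '(: (=> year (= 4 digit)) ,sep (=> day (= 2 digit))))
```

The symbols : and => are compared here by name with eq?, which is the unhygienic matching in question; only the collected names would need to be handled hygienically by the macro.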

(If so, it sounds like syntax parameters may be the solution. But at this point I'm just trying to make sure I understand the problem.)

Yes, I believe syntax-parameters could be used. It would be clumsy, since the
macros involved would have to syntax-parameterize the entire SRE language on
each expansion. But if we're entertaining non-standard extensions then it's much
simpler, easier to port and more robust to just use an ER macro with unhygienic
matching.

Alex