Title: Indentation-sensitive syntax
Author: Egil Mller <redhog@redhog.org>


Abstract

This SRFI descibes a new syntax for Scheme, called I-expressions,
whith equal descriptive power as S-expressions. The syntax uses
indentation to group expressions, and has no special cases for
semantic constructs of the language. It can be used both for program
and data input.

It also allows mixing S-expressions and I-expressions freely, giving
the programmer the ability to layout the code as to maximize
readability.


Rationale

In the past, several non-S-expression syntaxes for various LISP
dialects has been tested and thrown away.  However, people seem not to
be too happy about the S-expressions, especially not beginners, who
regularly complain about "all those parentheses", so new syntaxes are
invented, and dismissed from time to time. All of those have had one
property in common which S-expressions did not share - they had
special constructs for the various _semantic_ constructs of the LISP
dialect they where constructed for.

Many languages uses parentheses, braces or backets to group statements
or expressions of the language, and allows exta spaces and newlines to
be used to arrange the expressions or stattements in a visually
pleasing way. This, however, often comes back to bite the progammer,
as an indentation leads him or her to think that the code is grouped
in a specific way, when in fact, its parentheses, braces or brackets
have been changed since the last time someone cared to reindent the
code.

Recently, using indentation as the sole grouping constuct of a
language has become popular with the advent of the Python programming
language. This solves the problem of indentation going out of sync
with the native grouping constuct of the language. Unfourtunately, the
Python syntax uses special constructs for the various semantic
constructs of the language, and the syntaxes of file input and
interactive input differs slightly. In addition, the indentation
syntax of Python only covers statements, not expessions and data.


Specification

Each line in a file is either empty (contains only whitepace and/or a
comment), or contains ome code, preceeded by some number of space
characters. For this proposal, a TAB should be counted as exactly four
spaces.

In the following syntax definition, this initial space, as well as
linebreaks, is not included in the rules. Instead, preceding any
matching, the leading space of each line is compared to the leading
space of the last non-empty line preceeding it, and then removed. If
the line is preceeded by more space than the last one was, the special
symbol INDENT is added to the beginning of the line, and if it is
preceeded by less space than the lastt one was, the special symbol
DEDENT is added to the beginning of the line.

The special non-terminal symbol sexpr expands to any valid
S-expression, and the special terminal symbol GROUP expands to the
word "group" in the input stream. The GROUP symbol is used to allow
lists whose first element is also a list. It is needed as the
indentation of an empty line is not accounted for.

Following each production is a rule to compute the value of an
expression matching the production. In those rules, the symbols $1 ...
$n are to be replaced by the 1:st ... n:th matched subexpression.

expr -> QUOTE expr
 (list 'quote $2)
expr -> QUASIQUOTE expr
 (list 'quasiquote $2)
expr -> UNQUOTE expr
 (list 'unquote $2)

expr -> head INDENT body DEDENT
 (append $1 $3)
expr -> GROUP head INDENT body DEDENT
 (append $2 $4)
expr -> GROUP INDENT body DEDENT
 $3
expr -> head
 (if (= (length $1) 1)
     (car $1)
   $1)
expr -> GROUP head
 (if (= (length $2) 1)
     (car $2)
   $2)

head-> expr head
 (append $1 $2)
head-> expr
 (list expr)

body -> expr body
  (cons $1 $2)
body ->
 '()


Examples

define
 fac x
 if
  = x 0
  1
  * x
    fac
     - x 1

let
 group
  foo
   + 1 2
  bar
   + 3 4
 + foo bar

Denser equivalents using more traditional S-expressions:

define (fac x)
 if (= x 0) 1
  * x
   fac (- x 1)

let
 group
  foo (+ 1 2)
  bar (+ 3 4)
 + foo bar



Implementation

The following code implements I-expressions as a GNU Guile module that
can be loaded with (use-modules (sugar)).

(define-module (sugar))

(define-public group 'group)

(define-public sugar-read-save read)
(define-public sugar-load-save primitive-load)

(define (readquote level port qt)
  (read-char port)
  (let ((char (peek-char port)))
    (if (or (eq? char #\space)
	    (eq? char #\newline)
	    (eq? char #\ht))
	(list qt)
	(list qt (sugar-read-save port)))))

(define (readitem level port)
  (let ((char (peek-char port)))
    (cond
     ((eq? char #\`)
      (readquote level port 'quasiquote))
     ((eq? char #\')
      (readquote level port 'quote))
     ((eq? char #\,)
      (readquote level port 'unquote))
     (t
      (sugar-read-save port)))))

(define (indentationlevel port)
  (define (indentationlevel level)
    (if (eq? (peek-char port) #\space)
	(begin
	  (read-char port)
	  (indentationlevel (+ level 1)))
	(if (eof-object? (peek-char port))
	    -1
	    level)))
  (indentationlevel 0))

(define (clean line)
  (cond
   ((not (pair? line))
    line)
   ((null? line)
    line)
   ((eq? (car line) 'group)
    (cdr line))
   ((null? (car line))
    (cdr line))
   ((list? (car line))
    (if (or (equal? (car line) '(quote))
	    (equal? (car line) '(quasiquote))
	    (equal? (car line) '(unquote)))
	(if (and (list? (cdr line))
		 (= (length (cdr line)) 1))
	    (cons
	     (car (car line))
	     (cdr line))
	    (list
	     (car (car line))
	     (cdr line)))
	(cons
	 (clean (car line))
	 (cdr line))))
   (#t
    line)))

;; Reads all subblocks of a block
(define (readblocks level port)
  (let* ((read (readblock-clean level port))
	 (next-level (car read))
	 (block (cdr read)))
    (if (= next-level level)
	(let* ((reads (readblocks level port))
	       (next-next-level (car reads))
	       (next-blocks (cdr reads)))
	  (if (eq? block '.)
	      (if (pair? next-blocks)
		  (cons next-next-level (car next-blocks))
		  (cons next-next-level next-blocks))
	      (cons next-next-level (cons block next-blocks))))
	(cons next-level (list block)))))

;; Read one block of input
(define (readblock level port)
  (let ((char (peek-char port)))
    (cond
     ((eof-object? char)
      (cons -1 char))
     ((eq? char #\newline)
      (read-char port)
      (let ((next-level (indentationlevel port)))
	(if (> next-level level)
	    (readblocks next-level port)
	    (cons next-level '()))))
     ((or (eq? char #\space)
	  (eq? char #\ht))
      (read-char port)
      (readblock level port))
     (t
      (let* ((first (readitem level port))
	     (rest (readblock level port))
	     (level (car rest))
	     (block (cdr rest)))
	(if (eq? first '.)
	    (if (pair? block)
		(cons level (car block))
		rest)
	    (cons level (cons first block))))))))

;; reads a block and handles group, (quote), (unquote) and
;; (quasiquote).
(define (readblock-clean level port)
  (let* ((read (readblock level port))
	 (next-level (car read))
	 (block (cdr read)))
    (if (or (not (list? block)) (> (length block) 1))
	(cons next-level (clean block))
	(if (= (length block) 1)
	    (cons next-level (car block))
	    (cons next-level '.)))))

(define-public (sugar-read . port)
  (let* ((read (readblock-clean 0 (if (null? port)
				      (current-input-port)
				      (car port))))
	 (level (car read))
	 (block (cdr read)))
    (cond
     ((eq? block '.)
      '())
     (t
      block))))

(define-public (sugar-load filename)
  (define (load port)
    (let ((inp (read port)))
      (if (eof-object? inp)
	  #t
	  (begin
	    (eval inp)
	    (load port)))))
  (load (open-input-file filename)))

(define-public (sugar-enable)
  (set! read sugar-read)
  (set! primitive-load sugar-load))

(define-public (sugar-disable)
  (set! read sugar-read-save)
  (set! primitive-load sugar-load-save))

(sugar-enable)


Copyright (C) Egil Mller (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Scheme Request For
Implementation process or editors, except as needed for the purpose of
developing SRFIs in which case the procedures for copyrights defined
in the SRFI process must be followed, or as required to translate it
into languages other than English. The limited permissions granted
above are perpetual and will not be revoked by the authors or their
successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE AUTHORS AND THE SRFI EDITORS DISCLAIM ALL
WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY
RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.
