This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.
Thanks, I think I understand your point, and I agree that Scheme's present specification equates the character set which may compose programs with the characters which may symbolically compose strings. It also provides a facility to specify, query, and display the numerical 8-bit value equivalent of any arbitrary character, without bias toward or assumption of any particular encoding, or of its membership in Scheme's character set; although only a subset of characters within the specified character set may compose identifier names.

However, I arrive at a very different conclusion about how to maintain the spirit of Scheme's presently specified character set while enabling Scheme to process text in extended character sets. As I interpret it, Scheme has already implied its intent, and its solution, by specifying a portable, encoding-neutral character set of only about 96 of 256 possible elements: although a character may have any of 256 (8-bit) values, only about 96 of them may be used to symbolically compose program and string text. When it is desired to specify a character value which does not correspond to a member of the specified character set, it may be specified and displayed numerically (where its representation is constrained to standard Scheme character-set members). This has enabled the development and distribution of Scheme programs that can be unambiguously and portably encoded in a wide variety of different character-set specifications, as may be required by various platforms.
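As a rough illustration of the numeric-specification idea above, here is a minimal sketch in Python rather than Scheme. Everything in it is an assumption made for the example, not part of any Scheme report or SRFI: the portable set is taken to be printable ASCII plus newline (96 characters), the helper names `to_portable` and `from_portable` are hypothetical, and the `\xNN;` escape spelling is just one possible convention. It only shows that arbitrary 8-bit values can be written, and recovered, using text composed solely of portable characters.

```python
import re

# Assumption for this sketch: the "portable" set is printable ASCII
# plus newline, which happens to total 96 characters.
PORTABLE = {chr(c) for c in range(0x20, 0x7F)} | {"\n"}

# Hypothetical escape spelling: backslash, 'x', two hex digits, ';'.
_ESCAPE = re.compile(r"\\x([0-9a-f]{2});")


def to_portable(data: bytes) -> str:
    """Spell arbitrary 8-bit values using only portable characters."""
    out = []
    for b in data:
        ch = chr(b)
        # The backslash itself is always escaped, so a backslash in the
        # output can only ever introduce a numeric escape.
        if ch in PORTABLE and ch != "\\":
            out.append(ch)
        else:
            out.append("\\x%02x;" % b)
    return "".join(out)


def from_portable(text: str) -> bytes:
    """Recover the original 8-bit values from the portable spelling."""
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(text):
        out += text[pos:m.start()].encode("ascii")
        out.append(int(m.group(1), 16))
        pos = m.end()
    out += text[pos:].encode("ascii")
    return bytes(out)
```

The escaped text stays valid under any encoding that can represent the portable set, which is the portability property argued for above; what the 8-bit values mean is left entirely to whoever decodes them.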
Therefore, by analogy, I see no reason to fundamentally change anything with respect to Scheme's portable character-set specification. Implementations are already free to encode the Scheme character set as they see fit within Scheme's presently implied 8-bit character values, and to explicitly specify, query, and display any character value, or sequence of values, as an encoded numerical equivalent in any encoding format desired, while still restricting their expression within Scheme code to portable Scheme characters (as otherwise you haven't got portable code).

If there is a fundamental desire to extend the abstraction of Scheme's character, string, and port types beyond their present, strongly implied, binary byte-oriented basis, then I see no alternative but to co-specify an alternative baseline binary port interface and data types, as Scheme requires encoding-agnostic data and I/O facilities on which more abstract and encoding-specific data types may be supported.

WRT learning English to read/write Scheme code: although an interesting implied topic, I suspect the challenge lies less with the adoption of a "universal character set" and more with specifying the linguistic equivalence and automatic translation of arbitrarily specified symbolic names, and of prose as may exist in comments, between arbitrary languages, and likely correspondingly arbitrary native character sets, as local platforms may still require.

Thanks again for your time and thoughts,
-paul-

> From: Robby Findler <robby@xxxxxxxxxxxxxxx>
>
> All I'm saying is that whatever restrictions you make on the
> "composition of Scheme code" you are also making on the values in the
> language, since Scheme program text and literals (aka string and symbol
> values (and some others)) are all the same things. By definition.
>
> It's true, you could come up with some language that allowed values
> representing unicode strings but didn't allow program text in unicode.
> It just wouldn't be faithful to Scheme. In other words, if the
> technical problems you suggest make it impossible to achieve this, to
> me that means we've failed.
>
> I have no comment on whether or not everyone should learn English to be
> able to read Scheme code -- that seems outside the scope of this forum :).
>
> Robby