%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Ginger XML: Transport Syntax for Ginger %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% :Author: Stephen Leach :Email: stephen.leach@steelypip.com Created, June 2010 Revised, August 2010, May 2011, Feb 2013, Apr 2013, Jun 2017 .. note:: Reader, please note that this is very much a work-in-progress. It is being written up as we design the relevant sections in appginger. ================================================================================ Ginger XML Overview ================================================================================ The idea behind GingerXML (\*.gnx) is to provide a representation of Ginger code that is neutral. By neutral we mean that it should be both human-readable and machine-friendly. Hence it should be easy for people with some computing background to understand the logic behind it, and easy for people to review but not necessarily write. It must also be easy for them to create programs, in their programming language of choice, that can read, transform and generate GingerXML. And it should be reasonably inexpensive to process, without that being an overriding consideration. To balance these opposed concerns we borrowed from Lisp, especially Scheme. We selected a simplified subset of XML, called Minimal-XML or MinXML, for short. MinXML is loosely based on s-expressions, which made programming very straightforward. We then designed the elements around Lisp-style primitives. This gave us a very clean design because we did not have to make concessions to programming convenience - because this is a representation that is essentially only read and written by machines, not people. For example the equivalent of Pop-11's ``if f( 0 ) then p else q endif( false )`` would be: .. code-block:: xml                                                                         There are disavantages to this approach, of course, such as it being relatively verbose and therefore slowing down compilation. Yet on the whole we believe GingerXML balances the relevant factors and is both effective and pleasing to use. ================================================================================ Element and Meta-attributes ================================================================================ Comment Attribute ----------------- Any element may have the optional 'comment' attribute. It is a free-text string intended to provide human-readable text. Its context is the node and all nested nodes. Source File Attribute --------------------- Any element may have the optional 'source' attribute that describes the source file (or other source) of the element. It is a free-text string. It is considered to apply to all nested nodes. e.g. .. code-block:: text .... Context Attributes ------------------ Any element _may_ have an optional context.* attribute which will be printed out to provide additional context. It will typically be a truncated version of the spanned text. Context - Grammar Role ...................... Any element _may_ be decorated with the grammatical role assigned by the parser. .. code-block:: text if test then x else y endif ### may be decorated as follows. Context - Parameter Position ............................ Any expression that is passed as a parameter to a function application _may_ be decorated with its positional value and the name of the function (or description of the function expression) that it is statically called by. Arguments on the left hand side of the function are numbered negatively, arguments on the right hand side are numbered positively. .. code-block:: text x.f( y, z ) ### Original expression. ================================================================================ Statements ================================================================================ Syntax ------ .. code-block:: text STMNT ::= DECLARATION EXPR ================================================================================ Expressions ================================================================================ Syntax ------ .. code-block:: text EXPR ::= CONSTANT ### any literal constant VARIABLE ### reference to a variable ASSIGNMENT ### assignment to a variable SEQ ### sequence of expressions (comma/semi separated) BLOCK ### introduces a new scope FUNCTION ### a function APP ### function application CONDITIONAL ### if/unless LOOP ### for loops LIST ### list expressions VECTOR ### vector expressions ================================================================================ Constants ================================================================================ Description ----------- Constants are characterised by having element name 'constant' and 'type' and 'value' attributes. Constants always represent a single IMMUTABLE value. N.B. The compiler is free to share instances of these constants which are equal to each other. Note that the "type" attribute doesn't correspond to the class name you may have expected. This is a hangover from early development before the class names were stablised. Syntax ------ .. code-block:: text CONSTANT ::= ### The absent singleton ### Booleans ### The indeterminate singleton ### +/- arbitrary precision ### We might unify numbers? ### Immutable strings ### Symbols ### A single character ### Named procedure ### Named class ### The undefined singleton Examples -------- .. code-block:: xml N.B. Character sequences are multi-valued constants. They are represented as a sequence of characters. .. code-block:: xml Available Named Procedures -------------------------- Note that these constants are not necessarily bound to identifiers in Ginger. These constants are intended as direct support for built-in operators (e.g. arithmetic) and syntactic forms such as list construction, string interpolation, and so on. Here are some examples:: ### } ### } ### }- standard arithmetic ### } ### Boolean negation It is intended that all the members of this list are guaranteed to be available from the "std" package. Hence they are functionally equivalent to .. code-block:: xml Furthermore, it is important to note that these constants do not have to be implemented efficiently. Compiler writers are permitted to implement these as lambda forms. For example a system function 'foo' of 1 argument might be implemented like this: .. code-block:: text ### permitted possible implementation of unary sysfn called 'foo' In particular it is explicitly permitted that each use of a sysfn _may_ return a different object. Available Named Classes ----------------------- There is a built-in class for every type of built-in value, although they are not necessarily bound to identifiers in Ginger. Examples:: ### class for absent ### class for true & false ### class for 'small' integers ### class for doubles ### class for strings ### class for characters ### class for nil ### class for list pairs ### class for vectors ### class for classes Note that classes are not exactly he same as types. All function objects share the same class but may have entirely different types. ================================================================================ Variable Reference ================================================================================ Notes: We have to add in name qualification e.g. nicknames. We also should consider a way of allocating local variables guaranteed never to clash with local variables created by the programmer. Maybe have an extra hidden dimension on names?? Note: there are three ways by which a global variable might be referred by. 1. A qualified reference, using the alias attribute 2. An unqualified reference, using the enc.pkg (enclosing package) attribute 3. An absolute reference, using the def.pkg (defining package) attribute Syntax ------ .. code-block:: text VARIABLE ::= ================================================================================ Assignments ================================================================================ Description ----------- N.B. Assignment runs from left-to-right, not following the usual convention. The destination expression may be a complex assignable expression. Syntax ------ .. code-block:: xml SRC_EXPR DST_EXPR ================================================================================ Sequences & Blocks ================================================================================ Overview -------- Sequences are used to create a sequence of expressions. Blocks are sequences with the additional property that they introduce a new scope. Syntax ------ .. code-block:: xml SEQ ::= EXPR* BLOCK ::= EXPR* ================================================================================ Function Applications ================================================================================ Syntax ------ .. code-block:: xml APP ::= EXPR EXPR EXPR* SysApps ------- SysApp's are invocations of the built-in functions. Each built-in function is named and can be referred to via * EXPR* , which compiles into a function call * , which will compile into a function object * , which will compile into a variable that references a function object. Of these three methods, only the direct function call is guaranteed to be efficient. The other two forms are permitted to be relatively inefficient. In support of this, the compiler writer is allowed to make reasonable assumptions to help performance e.g. the call may be inlined, computed at compile-time, overflow checking may be deferred until the end of the parent block, no debug information may be available, the garbage collector may be blocked, and so on. Note that it is also guaranteed that direct calls of sysfns will be as efficient as sysapps. .. code-block:: text ### This form will be treated as a sysapp. ... Effectively it turns into .. code-block:: text ... See `sysapps in detail`_ for more information. .. _`sysapps in detail`: ../help/sysapp.html ================================================================================ Conditionals ================================================================================ Notes: In progress - I am designing these as multi-part ``if/then/elseif/../else/endif`` forms. This means they are an easy target for compiling switches. Short circuits need to be fleshed out. Syntax ------ .. code-block:: text CONDITIONAL ::= ( IF_PART THEN_PART )* [ELSE_PART] EXPR* EXPR* EXPR* ### && EXPR* ### || IF_PART ::= EXPR THEN_PART ::= EXPR ELSE_PART ::= EXPR .. code-block:: text SWITCH ::= VALUE_PART ( CASE_VALUE CASE_BODY )* [ ELSE_PART ] VALUE_PART ::= EXPR CASE_VALUE ::= EXPR CASE_BODY ::= EXPR ELSE_PART ::= EXPR ================================================================================ List Expressions ================================================================================ Description ----------- Lists are implemented as singly linked chains. The list syntax is a shorthand for calling the 'newList' function. The lists that are constructed are guaranteed to be immutable and may or may not share. The empty list 'nil' is guaranteed to be a unique object. Syntax ------ .. code-block:: text LIST ::= EXPR* ================================================================================ Vector Expressions ================================================================================ Description ----------- Vectors are implemented as contiguous arrays. The vector syntax is a shorthand for calling the 'newVector' function. The vectors that are constructed are guaranteed to be immutable and may or may not share. Syntax ------ .. code-block:: text VECTOR ::= EXPR* ================================================================================ For Loops ================================================================================ Overview -------- In Ginger, looping is unified with the process of finding the solutions of a query, so all of the expressive work is carried out by the query. For each solution of the query a new set of bindings is made to the variables of the query. Queries are designed so that the familiar loops can all be easily composed and all work in the expected way. As an example, iterating a variable n from A to B becomes: .. code-block:: XML THE BODY OF THE LOOP Syntax ------ .. code-block:: text LOOP ::= QUERY ================================================================================ Queries ================================================================================ Overview -------- Queries are a signature feature of Ginger. They represent multiple alternative bindings to a set of variables and, as such, are a match-based generalisation of assignment. Syntax ------ .. code-block:: text QUERY ::= PATTERN EXPR PATTERN FROM_EXPR [ BY_EXPR [ TO_EXPR ] ] PATTERN EXPR QUERY EXPR QUERY QUERY QUERY QUERY QUERY EXPR bind ---- The bind element is described in the section on Declarations and Patterns. It corresponds to a simple match between a pattern and an expression. from ---- The from element binds the pattern successively to the numbers in the inclusive range FROM_EXPR to TO_EXPR (default +infinity) in increments of BY_EXPR (default 1). in -- The in element binds the pattern successively to the members of the iterable value computed by EXPR. do -- The do element runs EXPR each time a solution is found for QUERY. cross ----- The cross element combines two queries together finding their cross-product. The combined set of solutions is the set of solutions of the right hand query in the context of the left hand query. The right hand query may assume that the variables of the left hand query are bound (but not vice versa). zip --- The zip elements combines two queries together in parallel. Unlike the cross element, the subqueries share no bindings. ok, fail, once -------------- The ``ok`` element is an atomic query that always succeeds without binding any variables, ``fail`` never succeeds and ``once`` succeeds only once. All three can be simulated by the ``from`` element. ================================================================================ Declarations and Patterns ================================================================================ Overview -------- Declarations match a pattern with an expression - patterns being limited expressions that contain pattern variables. N.B. The intention is to fit this to the pattern/query proposal. .. code-block:: text PATTERN EXPR Syntax ------ .. code-block:: text PATTERN EXPR A PATTERN is any of the following .. code-block:: text PATTERN ::= PATTERN_VAR | PATTERN_ANON | PATTERN_SEQ | PATTERN_APP | PATTERN_CONST PATTERN_VAR ::= PATTERN_ANON ::= .. note:: Qualifier or alias? We have some terminological confusion from different rounds of discussion being exposed. .. code-block:: text PATTERN_SEQ ::= PATTERN* PATTERN_CONST ::= EXPR PATTERN_APP ::= EXPR PATTERN .. note:: At the time of writing we have not implemented PATTERN_CONST or PATTERN_APP. Pattern Variables ----------------- These are the most basic and familiar types of pattern. They introduce an optionally typed variable. The protected attribute plays the same role as in Pop-11, protecting the variable from assignment (n.b. this is shallow rather than deep protection.) ``name=NAME`` The "name" attribute is optional. If it is omitted then it is an anonymous variable. ``type=TYPE_EXPR`` The type-check will be made BEFORE assignment and a failed type-check will generate an error. ``match=TYPE_EXPR`` The type-check is made BEFORE the assignment and failure will cause the matcher to backtrack. ``protected=BOOL`` If “true” variable is protected against subsequent assignments. Generated by val and define declarations. If “false” the variable may be assigned to. If omitted the default is “true”. Top level variables may also be given tags and package qualifiers. ``tagN=TAG_VALUE`` Tags the variable. ``qualifier=ALIAS`` The name is qualified by an import alias. ``pkg=PACKAGE_NAME`` The package name is an absolute reference to a package. Comment! Qualifier or alias! Note: we also need to cope with forward declarations. As a Query ---------- A bind declaration is a type of query that either fails or succeeds once. In particular this loop would execute precisely once: .. code-block:: text STATEMENTS Examples -------- .. code-block:: text ### Note that var/val introduces a query in Ginger. The '=' operator ### is a query operator whose LHS is a pattern. Identifiers are ### parsed as pattern-variables within a pattern, taking on the ### default protection of the var/val. var x = 99; ### The identifiers can given overrides for protection or type. val [ x, var y, z : bool ] = f(); ### Ensure that p returns a single value which is an integer. val _ : int = p(); ### The 'define' form also introduces an implicit PATTERN = EXPR ### bindings where EXPR will be the arguments to the function. define K( x )( y ) => x enddefine; ================================================================================ Packages and Imports ================================================================================ .. note:: This section did not reflect the current implementation and needs further discussion. In practice the fetchgnx tool discharges the packages and imports before the Ginger Virtual Machine gets to see them. As a consequence it has been moved aside to `Packages and Imports`_. .. _`Packages and Imports`: ../help/packages_and_imports.html