Monday, January 6, 2014

Grogix for programming language error-reporting


[This is an actual case study, heavily disguised.]
The social structure


Let's imagine a programming language called Yazyka. 

Let's say the Yazyka compiler is not built to provide detailed assistance and instruction to the user regarding use of the Yazyka language. Essentially, its error messages aren't useful to a human being. This is a common problem among production language compilers. 

Say that, because of incredible external pressures on the compiler team, the compiler will never be able to generate this human-friendly instruction itself.

So it might be best to build a separate 'check-module', used by the system before or during compilation, that would play a role something like ‘lint’, but with far more sophisticated analysis and pervasive user-friendliness. Like ‘lint’, it would also help to flesh out documentation and instructional materials for the Yazyka language.

One state-of-the-art solution is to build this check-module with grogix.

Grogix is an unusual computer language, but it is well suited to this particular kind of work. 

Explanation

Say there is a type of statement in Yazyka whose form is:

(A)       notify [x] with [y] on [z] ;
           
This can look like, say:

            notify alertPort with “$D” on ALERT_STRINGS;

... and an infinite set of other similar cases with different specifics.

But there is also an infinite set of incorrect cases (a larger infinite set, actually) which do not fit this statement's form (I would hesitate to even call it syntax), any of which could be generated by the user while attempting to write a correct statement of this form.

Let’s look at a tiny variant from the correct form.

*          notify alertPort with $D on ALERT_STRINGS;

In this example, the user has used a notation ($D) which turns ‘D’ into a string. But even if ‘D’ is otherwise correct, it turns out that ‘$D’ must be used inside of quotation marks.

The existing Yazyka compiler is ‘aware’ that this is incorrect. But, in its analysis, the "reasons" it has for rejection of this case are not intelligible to a human, and cannot help a human to uncover the mistake that has been made.

For this simple mistake, the current compiler should provide helpful feedback like:

The $ operator is only for use within quotation marks, e.g.: "$D"

or:

In the notify statement, the stream ALERT_STRINGS requires a string 
in the with clause. Found $D instead.

or:

“With” should be followed by a string

… et cetera ...

But, instead, the Yazyka compiler responds with these kinds of errors:

Multiple markers at this line
            - unexpected keyword: with
            - unexpected keyword: (with) in expression
            - unexpected keyword: notify
            - unexpected keyword: (notify) in action
            - unexpected keyword: on
            - unexpected keyword: (on) in action
            - constructor $ should have 1 arguments

Of course there are valid internal reasons for these messages, related to other work the Yazyka compiler needs to do. The compiler is optimized to build fast, reliable code, not to teach people the Yazyka language. This is also true for most other compilers.

This is why we advocate a language check-module. It’s intended to serve two purposes:

1. maintain an independent, human-verified, approachable formal language definition
2. make use of this definition as the basis of a grogix program that provides user-friendly error messages

What is grogix, how is it helpful, and why is it special?

Grogix is the prototype for a new class of formal languages. A grogix program explicitly represents computational operation, in a concise way, through a simple, coherent, tree-like gradient of operational importance. We call it an “operational grammar”.

A grogix program has a uniform structural description: a cascading deductive block of statements consisting of only a single statement type (described below). This means a programmer is compelled to “push out” uninteresting implementation details from the operational description of the program. What is left is a “structural essence”: something that looks merely like an outline of operation, but is actually a tight hierarchical structure that handles all cases.

Because it is ‘syntax-centric’, a grogix program can more easily provide consistency checking and targeted human error reporting. Providing this, for all incorrect statements of the type (A) above, is accomplished by this small snippet of code, which demonstrates grogix’s brevity:

. statement(*starts_with(input,‘notify’)) -> notify_statement
. notify_statement -> notify recipient_identifier with expression
.. on stream_identifier($recipient_identifier,$expression) semicolon
. notify(*starts_with(input,‘notify’)) -> ‘notify’
. notify -> *error(‘not a “notify” statement’)
. recipient_identifier -> identifier
. with(*next(input, ‘with’)) -> ‘with’
. with -> *error(‘missing “with” ’)
. expression -> *valid_type(expr,$1,$2)
. expression -> *error(‘missing expression after “with”’)
. on(*next(input,‘on’)) -> ‘on’
. on -> *error(‘missing “on” indicator for stream name’)
. stream_identifier -> *validate($expression,$recipient_identifier,identifier)
. semicolon(*next(input,‘;’)) -> ‘;’
. semicolon -> *error(‘missing semicolon’)

Now to explain.

There is only one kind of statement in a grogix program: the conditional production. Its general form is:

. nonterminal(condition) ->
.. [combination of parameterized terminals, non-terminals and actions]

Note that every statement is preceded by a ‘.’ or continued by a ‘..’. Any other line is a comment: this is a nod to Knuth’s Literate Programming initiative.

Now let’s look at the same program, with explanatory comments added:

. statement(*starts_with(input,‘notify’)) -> notify_statement
We assume that there are more kinds of statements besides notify, which will be inserted here.

. notify_statement -> notify recipient_identifier with expression
.. on stream_identifier($recipient_identifier,$expression) semicolon
This is the structure of a notify statement. In grogix, all the expressions on the right hand side of the ‘->’ will be evaluated, in order, before this statement can return a value upwards.

Note that we’ve passed the return value of two of the non-terminals (recipient_identifier and expression) as parameters (using $) to a third non-terminal (stream_identifier) in order to check type agreement.

. notify(*starts_with(input,‘notify’)) -> ‘notify’
. notify -> *error(‘not a “notify” statement’)
This is redundant! But I wanted to give you an early flavor for the nature of the conditional production. The first notify production is invoked, and returns a string, if the line begins with ‘notify’; otherwise, the non-terminal definition falls through (in top-to-bottom order) to the second conditional production, which is the ‘default’. The first conditional production to evaluate positively is the one that ‘runs’ (i.e. the one whose right-hand side is further evaluated).

Also note that the *error operation and the *starts_with operations are both external references. Everything else in a grogix program is an internal reference: that is, defined within the grogix program. This means the grogix program represents only the operational structure, or ‘essence’ of the program … everything else is pushed out of this structure, or pulled in, via these * operations.

. recipient_identifier -> identifier
‘identifier’ will be defined another time. Also, in this consistency checker, there will be binding considerations for the different kinds of identifiers (we have to check that the streams and variables exist, for example).

. with(*next(input, ‘with’)) -> ‘with’
. with -> *error(‘missing “with” ’)
Here we look at the next token at the first level, and, if it is missing at this point in the evaluation, we issue a specific, user-friendly error (I’ll leave the error format discussion to another time.)

. expression -> *valid_type(expr,$1,$2)
. expression -> *error(‘missing expression after “with”’)
If there is an expression, it does a check against the type declared elsewhere, returning the mismatch if found. If there’s no expression at all, it reports the problem.

. on(*next(input,‘on’)) -> ‘on’
. on -> *error(‘missing “on” indicator for stream name’)
Another simple use of the conditional production to identify missing structure.

. stream_identifier -> *validate($expression,$recipient_identifier,identifier)
‘identifier’ will be defined another time. Notice that we call an external function *validate, which we assume has access to appropriate tables, to validate passed values of other non-terminals (see notify_statement above), and potentially return an error here if there is a problem. For ease of exposition, we’re presenting the case where the other non-terminals have already been evaluated. A different technique is available if the agreement is with non-terminals whose values are not yet available, but our users can usually rewrite the productions to make agreement-evaluation easy.

. semicolon(*next(input,‘;’)) -> ‘;’
. semicolon -> *error(‘missing semicolon’)
Another simple use of the conditional production to identify missing structure.
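
To make the fall-through behavior concrete for readers who think in conventional code, here is a rough Python sketch of ordered conditional productions with a default error case. It is an illustration only, not the grogix runtime: the class names, the toy token handling and the error convention are all invented for the example.

# Hypothetical sketch: ordered conditional productions with fall-through.
class Production:
    def __init__(self, name, condition, rhs):
        self.name = name            # non-terminal this production defines
        self.condition = condition  # callable(state) -> bool, or None for the default
        self.rhs = rhs              # callable(state) -> result string

class Grammar:
    def __init__(self, productions):
        self.productions = productions

    def evaluate(self, name, state):
        # Try each production for this non-terminal in top-to-bottom order;
        # the first one whose condition holds is the one that 'runs'.
        for p in self.productions:
            if p.name != name:
                continue
            if p.condition is None or p.condition(state):
                return p.rhs(state)
        raise KeyError("no production for non-terminal: " + name)

# Toy conditions and actions mirroring the 'with' productions above.
def next_token_is(token):
    return lambda state: bool(state["tokens"]) and state["tokens"][0] == token

def consume(token):
    def run(state):
        state["tokens"].pop(0)
        return token
    return run

def error(message):
    return lambda state: "ERROR: " + message

grammar = Grammar([
    Production("with", next_token_is("with"), consume("with")),
    Production("with", None, error('missing "with"')),
])

state = {"tokens": ["on", "ALERT_STRINGS"]}   # the user forgot 'with'
print(grammar.evaluate("with", state))        # -> ERROR: missing "with"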

Conclusion

The position of the proposed check-module is architecturally flexible: it can be invoked either after the Yazyka compiler encounters a problem, to provide better output to the user, or beforehand, to ensure better input to the compiler.

Socially, the proposed check-module provides an independent validation of assumptions regarding the Yazyka language. Although this module and the compiler may seem like a “divergence risk” because of “two different language definitions” in two different system modules, in fact their jobs are complementary, both helping to socialize the actual language definition. It is the grogix team's job to ensure that there are no conflicts, and that the check-module provides a world-class level of Yazyka training and support to the user.

Tuesday, March 30, 2010

Interfaces and Productions

Grogix is Turing Complete -- these days, it seems to take special effort to create a language that isn't. A Grogix 'operational grammar' can produce the same results as the language in which it's implemented, and generally, the platform upon which it executes. After all, you can do anything in the 'context interface', and write the grammar so that it gives back to the interface the permission to "do its thing". The programmer can then gradually transfer functional structure to the grammar, from the interface.


Since the 'context interface', or interface to the outside world, makes this unusual grammar into a fully functional machine, I'd like to reveal something more about the general structure of interfaces:


With the appropriate hooks in the traversal routine, the interface can determine a few other things about the handling of conditional productions. The notion of storage in the two presented Grogix programs, as well as the unusual meaning assigned to grammar and storage in the wiki_strings example, are just a few examples of the wide range of perfectly natural interpretations of productions in Grogix.
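
As a rough illustration of this traversal/interface split (not any published grogix API: the hook names and classes below are invented for the example), a Python sketch might look like this:

import getpass

class Interface:
    """The outside world: supplies values and decides what successors mean."""
    def resolve(self, name):
        # Supply the value of an interface variable such as *user.
        raise NotImplementedError
    def interpret(self, nonterminal, successor):
        # Decide what a successor means: return text, run a command, store it...
        raise NotImplementedError

class TerminalInterface(Interface):
    def resolve(self, name):
        return getpass.getuser() if name == "user" else ""
    def interpret(self, nonterminal, successor):
        # Here a successor is just text passed upward; a different interface
        # could execute it as a command or schedule it for later instead.
        return successor

def traverse(grammar, nonterminal, interface):
    # grammar: a list of (nonterminal, condition, successor) triples, in order.
    # The traversal owns the grammar; the interface owns the world.
    for nt, condition, successor in grammar:
        if nt == nonterminal and condition(interface):
            return interface.interpret(nt, successor)
    raise KeyError(nonterminal)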

Some notes on this topic:

* There's no reason for the right-hand side of the production (which is sometimes called the successor) to be an ordered set or template of symbols and non-terminals. It could be unordered, generating combinations rather than permutations. This reflects certain ideas of the Minimalist Program in linguistics, for example the possibility that the "beads on a string" order (as opposed to the order of syntactical transformations) happens rather late in the production of speech (close to the sensorimotor interface). When we avoid left-to-right order in some productions (let's call them "unordered successors"), we are clearly defining additional opportunities for parallel computations in the system.

* There's no reason the core generated behavior needs to be a returned string. For example, the right-side or successor side of the production could be a series of commands, executed immediately or later. The production could use any set of symbols or media that has a meaning given to it by the interface. It is up to the interface traversing the grammar to interpret the right-side expression.

* The operational grammar is not a 'grammar' in the particular sense that it is not about language syntax -- although it can be used to represent language in the standard way. Grogix looks like a grammar, and has computational properties and logical operations, as all systems labeled 'grammatical' do, and so I call it a grammar. But someone could differ with my calling this a grammar, reasonably. On the other hand, I believe Grogix seems so useful exactly because it has syntactic structure without referring to language, while at the same time being itself a programming language.

Sunday, March 28, 2010

Modularity within an Operational Grammar

In the previous example, I purposely left a fairly obvious section of "straight HTML" at the bottom of the grammar.

The detail in this section could be turned into a grammar "module", shrinking the "core" grammar, making it more comprehensible.

WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_strings@pattern("\<[^<>]*\>") ::
. wiki_strings@pattern("([A-Z][a-z]+)+") :: {{ a_link("/page="+*1,*1) }}
. wiki_strings@pattern("---+") :: {{ hr }}
. wiki_strings@pattern("\*") ::    •

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

. wiki_form@ :: <form action="/page={{*page}}" method="post">{{form_textarea(*page)}}{{form_button}}</form>

. form_textarea@ :: <textarea name="{{*page}}">{{wiki_text}}</textarea>
. form_button@ :: <input type="submit" name="button" value="save">

... html

It seems like the next nearly modular sub-grammar is the wiki_strings section that handles the retrieval of string content. Obviously this section would be much larger in, say, the MediaWiki codebase. So it makes sense to abstract it away from the core grammar.

WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

. wiki_form@ :: <form action="/page={{*page}}" method="post">{{form_textarea(*page)}}{{form_button}}</form>

. form_textarea@ :: <textarea name="{{*page}}">{{wiki_text}}</textarea>
. form_button@ :: <input type="submit" name="button" value="save">

... wiki_strings
... html

Although it doesn't shorten the core grammar much, moving the 3 lines of the edit form to its own module makes sense, given how complex an editor can become.

WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

... wiki_edit
... wiki_strings
... html

What we're left with is, perhaps, what we'd expect from WikiBase: the large-scale structure of a web service with two dynamic web pages.

I imagine clicking on an ellipsis (or hitting a key when it's highlighted) in order to see the sub-grammar. That sub-grammar of course will have its own module declarations etc.

Note that this makes immediate opportunities for (1) standardized definitions for interaction with modules and (2) standardized modules.
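
One can imagine the module references being resolved very simply. Here is a hypothetical Python sketch in which each '... module' line pulls in the productions of a separate .gx file; the file-naming convention and the load_grammar function are assumptions of mine, not part of any actual implementation:

import os

def load_grammar(path, seen=None):
    # Collect production lines, expanding '... module' references recursively.
    seen = seen if seen is not None else set()
    productions = []
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith("..."):
                # '... wiki_strings' -> include wiki_strings.gx (once)
                module = line[3:].strip()
                module_path = os.path.join(os.path.dirname(path), module + ".gx")
                if module_path not in seen:
                    seen.add(module_path)
                    productions.extend(load_grammar(module_path, seen))
            elif line.startswith("."):
                productions.append(line)
            # anything else is a comment line and is ignored here
    return productions

# load_grammar("WikiBase.gx") would return the core productions followed by
# those of wiki_edit.gx, wiki_strings.gx and html.gx.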

In a way, the grammar rather forced us to do this. It's slightly different from the way it forced us to separate the context from the grammar: the context interface became defined because the context wasn't part of the system derivation we focused on. That is, if you focus on the structure of a system, a context will always emerge that will make it easier to define the system grammatically.

Modules are slightly different. In the process of deriving the system grammar, we notice sub-grammars that are quite modularized, even though they may be used pervasively by the grammar. We keep them in the grammar, but we put them aside. This is different from the context interface, which would be difficult to include in the grammar.

It's really a matter of where you start. If we wrote a different grammar, with a focus on building a web server, then the context interface would be different -- it would be the operating system. A wiki, or any web service, would be something so far down from a web server grammar, that it would be running inside a standardized module (the application container).

I think it's a strength of this approach that, no matter where you start, it seems like you'll derive a rather similar layering of operational grammars. This is probably because we're focusing so intently on the structure of the functionality: there is no reason to believe that the solutions would be very different.

Thursday, March 25, 2010

Decimating the WikiBase

The Wiki was invented by a talented programmer who wanted to let his friends store patterns of software design modeled after Christopher Alexander's work in A Pattern Language.

The original Perl code is known as WikiBase, and after removing the HyperPerl structure (inspired by Donald Knuth's Literate Programming ideas), the code runs to about:

300 lines

Below is a functionally equivalent Grogix program, which I wrote in about an hour, and which could be better, but runs to about:

30 lines

The difference is Grogix, not me. Operational Grammars compact the multiplicative effects in a system, and clarify system structures and their appropriate order of emergence. Real-world context is pushed into the interface, and all that remains is the unfolding structure.

When looking at this example, please remember that an Operational Grammar represents a full executing system, not just the syntax of a language.

In this example, I've made some changes from previous posts.

1) The ellipses are gone, because people inferred that something was missing, which was not the case.

2) I've marked the three parts of a Conditional Production with '@' and '::' :

nonterminal@condition :: template

3) I've removed the keyword for the 'default' condition. So a default condition looks like:

nonterminal@ :: text

And of course the default condition goes at the end of any run of productions for the same nonterminal. They are evaluated in order.

4) I also changed the right-hand side's {{nonterminal::parameter1,parameter2}} notation to {{ nonterminal( parameter1,parameter2 ) }}. These compose, so: {{ nonterminal1( parameter1, nonterminal2( parameter1, parameter2 ) ) }} ... note the productions access parameters with $1, $2 etc ... these will be name-able in the future.

5) As before, every nonterminal is a potential persistent data type, where its access by .value triggers the persistence mechanism in the interface.

The interface is somewhat different than in the previous example, allowing usages like the patterns-parsing of "wiki_strings" -- that's the strength of the interface-separation approach. New kinds of interface functionality can be added, but it's still a grammar, representing the structure of execution of a system.
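
For concreteness, here is a small Python sketch of how the '@' / '::' notation could be split into its three parts. The regular expression and the tuple layout are my own illustration, not taken from any actual interpreter:

import re

PRODUCTION = re.compile(
    r"^\.\s*"              # a leading dot marks a production line
    r"(?P<nt>\w+)"         # non-terminal
    r"@(?P<cond>[^:]*)"    # condition; empty means 'default'
    r"::\s*"               # separator
    r"(?P<template>.*)$"   # right-hand-side template
)

def parse_production(line):
    m = PRODUCTION.match(line)
    if not m:
        return None   # any non-matching line is treated as a comment
    condition = m.group("cond").strip() or "default"
    return (m.group("nt"), condition, m.group("template"))

print(parse_production(". wiki@url_name(page) :: {{wiki_page}}"))
# ('wiki', 'url_name(page)', '{{wiki_page}}')
print(parse_production(". wiki@ :: {{wiki_page}}"))
# ('wiki', 'default', '{{wiki_page}}')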


WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_strings@pattern("\<[^<>]*\>") ::
. wiki_strings@pattern("([A-Z][a-z]+)+") :: {{ a_link("/page="+*1,*1) }}
. wiki_strings@pattern("---+") :: {{ hr }}
. wiki_strings@pattern("\*") ::    •

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

. wiki_form@ :: <form action="/page={{*page}}" method="post">{{form_textarea(*page)}}{{form_button}}</form>

. form_textarea@ :: <textarea name="{{*page}}">{{wiki_text}}</textarea>
. form_button@ :: <input type="submit" name="button" value="save">

. a_link@ :: <a href="{{$1}}">{{$2}}</a>

. hr@ :: <hr />
. br@ :: <br />

. html_page@ :: {{ html_header( $1 ) }} {{ html_body( $2 ) }}
. html_header@ :: {{doctype}} <head><title>{{ $1 }}</title>{{html_meta}}</head>
. html_body@ :: <body>{{ $1 }}</body>
. html_meta@ :: <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
. doctype@ :: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


There are tools I'll build that should make this kind of programming even easier. I imagine an Eclipse plug-in that has two easily-switchable navigation modes: (1) scrolls around the whole grammar and (2) shows only the parent and child productions of whatever production your cursor is on. I'm guessing that this mode, with state discrimination added, could help expand the overlaid trees of conditional grammars, assisting our comprehension of even more complex systems, and producing a kind of "programming by borrowing" of grammar sub-trees ... anyway, much of the problem of navigating complex systems is mitigated by the one-grammar-per-system idea. It's a great model for portability, with a built-in idea for well-defined separation and integration between engine and interface. Additional systems (especially the interfaces) will have their own grammars too ... each layer in a real-world "interface stack" can be a system with its own operational grammar.

Sunday, March 21, 2010

The Conditional Production

The primary notion in Grogix is the "extraction" of structure, or essence, or form, "from" the system being designed, and the "concentration" of that structure "into" the operational grammar, which executes the system.

The basic construct that makes this possible is The Conditional Production. The typical grammar used for parsing will pair patterns with actions, which can be transformations. In contrast, the Conditional Production in the grammar only defines the non-terminal if the condition is met at run-time. This, along with the recursion found in any grammar, allows the definition of a single, finite grammar with an infinite number of state combinations. Any programming language with iteration or recursion can do this, but only Grogix puts it into a single grammatical structure.

Grogix's "hello world" example:

. start ... default ... hello world

And a more complex version, making use of Conditional Productions:

. start ... default ... hello {{name}}
. name ... *user=="greg" ... mister G
. name ... default ... {{*user}}

"default" is the default or base condition. When a conditional expression that evaluates true is reached, moving from the top to the bottom of the list of productions, that production becomes active.

In these examples, the part on the far right, past the second ellipsis, is a text template, marked or escaped by these {{ call outs }} to further non-terminals or variable values.

The * construction represents an interface variable.

The grammar runs within an interface, and the grammar has various ways of receiving values from the interface. The interface here is not defined, making the grammar a kind of operational idealization, which can be used with multiple interfaces. One interface that we can imagine for these examples is a terminal program which has the user's login name available to it, which it then provides to the grammar. The grammar exits when the tree is fully traversed. We could imagine the interface then provides this result to the user, and quits. We could also imagine the exact same grammar for a web-based terminal service. A grammar can be written to remain abstract from such considerations, making it ideal for portability, or insertion into legacy situations. It could also contain layers of sets of productions specifically written to deal with the outside world beyond more transparent interfaces, and these layers might be swapped out for different outside contexts.
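
As an illustration of how such an interface and grammar might cooperate, here is a toy Python sketch of the hello-world grammar above, run against a terminal-style interface that knows the login name. Everything in it (the triple layout, the expand function) is invented for the example and is not the real interpreter:

import getpass
import re

interface = {"user": getpass.getuser()}   # the interface supplies *user

# (non-terminal, condition, template) triples, kept in top-to-bottom order.
grammar = [
    ("start", lambda iface: True,                    "hello {{name}}"),
    ("name",  lambda iface: iface["user"] == "greg", "mister G"),
    ("name",  lambda iface: True,                    "{{*user}}"),
]

def expand(nonterminal, iface):
    for nt, condition, template in grammar:
        if nt == nonterminal and condition(iface):
            # Replace each {{ reference }} with an interface variable (*name)
            # or with the expansion of a further non-terminal.
            def repl(match):
                ref = match.group(1).strip()
                if ref.startswith("*"):
                    return iface[ref[1:]]
                return expand(ref, iface)
            return re.sub(r"\{\{(.*?)\}\}", repl, template)
    raise KeyError(nonterminal)

print(expand("start", interface))   # "hello mister G", or "hello <login name>"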

None of this would make much sense without the key idea of a central grammar executing the system. With Grogix, and Blooming Logic, one has to think of grammars differently, as the central organizing notation for computation, rather than as a tool for defining language syntax.

Monday, March 15, 2010

Depth and Innate Judgements

This post was moved to the Chomsky & Alexander blog, along with future posts about Grogix that are strictly philosophical in nature.

Tuesday, March 9, 2010

Grammar-centric programming

Let's look at a small Grogix example. Grogix stands for "GRammar-Operated General information computing system".

Again, the notion is to keep the entire system coherent through a single, central grammar.

Around the grammar is an interface to the world. The interface allows the recursive traversal procedure, the one that interprets the grammar, to drive action outside, and receive state-change information from the outside.

This particular interface is a Python program, acting as both a Grogix interpreter, and as a liaison with the Google App Engine framework.

Here's a Grogix grammar for a simple wiki:

. start ... start ... {{default_page}}
. default_page ... default ... {{page_contents}}

. page_contents ... name_value(page) ... {{content_section}}
. page_contents ... name_value(site) ... {{page_section}}
. page_contents ... default ... {{site_section}}

. site_section ... no_item(site) ... {{site_cycle}}
. site_section ... default ... List of Sites: {{sites}}{{br}}{{site_cycle}}{{br}}{{br}}{{br}}{{hr}}{{br}}

. sites ... default ... {{sites:site}} {{br}} <a href="/site={{site.name}}"> {{site.name}}</a> : {{site.value}}
. site_cycle ... post ... {{site_cycle_post_response}}
. site_cycle_post_response ... name_value(add) ... success
. site_cycle ... get ... {{site_cycle_get_response}}
. site_cycle_get_response ... name_value(action) ... {{standard_form}}
. site_cycle_get_response ... default ... <a href="/action=add&item=site">Add a site</a>

. page_section ... no_item(page) ... Site: {{*site}}{{br}}{{page_cycle}}
. page_section ... default ... <a href="/">Back to site overview</a>{{br}}Pages:{{pages}}{{br}}{{page_cycle}}{{br}}{{br}}{{br}}{{hr}}{{br}}

. pages ... default ... {{pages:page}} {{br}} -> <a href="/site={{*site}}&page={{page.name}}">{{page.name}}</a> : {{page.value}}
. page_cycle ... post ... {{page_cycle_post_response}}
. page_cycle_post_response ... name_value(add) ... success
. page_cycle ... get ... {{page_cycle_get_response}}
. page_cycle_get_response ... name_value(action) ... {{standard_form}}
. page_cycle_get_response ... default ... <a href="/site={{*site}}&action=add&item=page">Add a page</a>

. content_section ... no_item(content) ... Page: {{*page}}{{br}}{{content_cycle}}
. content_section ... default ... <a href="/">Back to site overview</a>{{br}}Content:{{contents}}{{br}}{{content_cycle}}{{br}}{{br}}{{br}}{{hr}}{{br}}

. contents ... default ... {{contents:content}} {{br}} {{content.name}} {{br}} <b>{{content.value}}</b>
. content_cycle ... post ... {{content_cycle_post_response}}
. content_cycle_post_response ... name_value(add) ... success
. content_cycle ... get ... {{content_cycle_get_response}}
. content_cycle_get_response ... name_value(action) ... {{standard_form}}
. content_cycle_get_response ... default ... <a href="/site={{*site}}&page={{*page}}&action=add&item=content">Add content</a>

. standard_form ... default ... <form action="{{form_target}}" method="post"> {{form_name_textarea}}{{form_value_textarea}}{{standard_form_button}}{{form_hiddens}}</form>

. form_name_textarea ... default ... name: <textarea style="height:30px;width:300px;background:#eeee00;" name="name"></textarea>{{br}}

. form_value_textarea ... default ... value: <textarea style="height:30px;width:300px;background:#eeee00;" name="value"></textarea>{{br}}

. standard_form_button ... default ... <span align="left"><input type="Submit" name="button" value="{{standard_form_button_text}}"></span>

. form_target ... add(site) ... /action=add&item=site
. form_target ... add(page) ... /site={{*site}}&action=add&item=page
. form_target ... add(content) ... /site={{*site}}&page={{*page}}&action=add&item=content

. form_hiddens ... default ... {{fh::"landing_site",*site}} {{fh::"next_start","start"}} {{fh::"landing_page",*page}}

. fh ... default ... <input type="hidden" name="{{$1}}" value="{{$2}}">

. standard_form_button_text ... add(site) ... Add Site
. standard_form_button_text ... add(page) ... Add Page
. standard_form_button_text ... add(content) ... Add Content

. br ... default ... <br />

. hr ... default ... <hr />



So, what are you looking at here?

It's a grammar. You can see the "start" non-terminal at the beginning.

It's an ordered set of non-terminal productions. You can see the non-terminal on the left is defined by the phrase on the right ... and the phrase on the right is clearly a text-based template ... the production is 'guarded' by a conditional clause between the non-terminal and the template-like phrase:

. [non-terminal] ... [conditional] ... [defining phrase]

The reason I precede each production with a "dot" is that I like to have free comments, to explain what is going on. This is a nod to Donald Knuth's important observations regarding Literate Programming.

The ordered conditionals determine which productions are followed at run-time.

The defining production itself has further non-terminal references in {{ braces }}.

The productions can also inherit attributes passed from (or bequeathed by) their parent production, which uses this construction: {{ nt :: 'literal', *interface_value, $[inherited_value] }}.

The productions can use information directly from the interface, and can modify interface values, including where the interface should "start" in the grammar on various occasions.

There's an assumed general data type which associates with a non-terminal, and can be assigned values. It is wound and unraveled with the {{sites:site}} recursive construction, with constraints selected in this way: {{contents : content [site,page] }}, where site and page, in this case, constrain the data to those values, passed from the interface.
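
To suggest what the winding and unraveling might amount to, here is a hypothetical Python sketch of a {{sites:site}}-style expansion over a simple store. The storage layout and the expand_each helper are assumptions made for illustration only:

# A toy store of named items of one kind, standing in for the interface's
# persistence mechanism.
store = {
    "site": [
        {"name": "HomeSite", "value": "my first site"},
        {"name": "Docs",     "value": "documentation"},
    ]
}

def expand_each(kind, template):
    # Expand the template once per stored item, binding kind.name and kind.value.
    out = []
    for item in store.get(kind, []):
        rendered = (template
                    .replace("{{" + kind + ".name}}", item["name"])
                    .replace("{{" + kind + ".value}}", item["value"]))
        out.append(rendered)
    return " ".join(out)

# Roughly what the 'sites' production above asks the interface to do:
print(expand_each("site",
    '<br /> <a href="/site={{site.name}}">{{site.name}}</a> : {{site.value}}'))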

Note that this was my first working version ... the program itself can obviously be shortened considerably to achieve the same end (there's a repetition I was using while debugging some properties of the interpreter ... discovering this is perhaps a useful exercise for the reader). In any case, in my next installment, I'll show a better grogix grammar for a wiki closer in definition to the original, and deploy the grammar-driven Google App Engine online, so people can try it for themselves.

Monday, March 8, 2010

Grogix: the GRammar-Operated General information computing system

I'm motivated by the vast chasm between the products of human engineering and those of the biological world.

I mentioned this here. I'm using the techniques in Christopher Alexander's The Nature of Order, that is, using feeling as a test procedure. The test procedure uncovers what has strong biological coherence, and what doesn't. It is analogous to Noam Chomsky's linguistic test procedure, to determine what is, and what is not, part of the human faculty of language. The Alexandrian version of the Chomsky approach has helped me to find a strong notation, one that can actually give us a comfortable, natural, humane encoding for systems with the mind-boggling complexity of natural organisms.

At the center of Alexander's current approach is the notion of coherence. Some fields of centers ('centers' are things with a field-like property, without necessarily having a distinct boundary) work together, and support each other, holistically, with different degrees of coherence.

I considered what it was that best held everything together coherently, at least for me, in a computing system. And I know it sounds trite, but it seems that I always come back to the idea that a grammar defines the system -- it's kin to Aristotle's loosely-hierarchical formal cause, the essence of a thing, which still works to describe structure everywhere in the universe.

Now, even though programming systems always seem to have some holistic grammatical properties, for some reason we never express them grammatically. We express them as categories, functions, sets, classes, statements etc. But the only system that actually looks and feels like the grammar of language is the attribute-grammar toolset of YACC, and its derivatives.

For decades, I've said to myself: "why do we only use grammars at the input and the output of a system?" We know from various kinds of re-writing systems (L-systems and Markov algorithms) that grammars are very powerful computationally. As a consequence of Chomsky's linguistics work, it's clear that various kinds of deep grammar must be at the core of our natural enumerative capacity. So, why are we using functional notation, or trivial imperative notation, when grammar is actually the heart of the form of a system?

I also take to heart Feyerabend's prescription for progress: we've been incrementally building a body of doctrine about programming language semantics, in part to maintain logical consistency in the face of engineering demands -- yet we seem no closer to bridging the chasm between ourselves and nature. Maybe we need to throw out our carefully-constructed mathematical deductions. Maybe we need to dig more deeply, not just into what nature is doing, not just into what mathematics calls for, but into what we do best, mentally, as natural organisms. Our mathematical and symbolic ability is a consequence of our faculty of language. What could be more natural than putting language at the center?

So grogix: a grammar-operated general information computing system. The grammar is the heart of the system. You can have other systems. But if you want a coherent system, it must have one grammar, explicitly, facilitated and protected, at its heart.

Grogix 0.2

In October of 2009, I was forced to take grogix in a new direction [I now call this old direction, interesting in itself, SUPL]. The old direction was, essentially, a macro preprocessor for generating programs ... albeit one with a well-integrated notion of grammar, morphogenetic sequences, pattern-like steps in those sequences, etc. It's still a useful authoring tool for the kind of analysis described in the sample sequence at www.corememory.org.

But I began to realize: no matter how natural or well-motivated or well-described, a hierarchical expression of the form of a generated Python program was not going to make me happy. I'm just too unsatisfied with our current programming paradigm, which always puts logical consistency before all else, despite the fact that logical consistency has become so straightforward and commonplace that young programmers can create Turing-complete notations without even knowing that they are doing so.

So, one by one, I took the parts of grogix-the-macro-processor which I felt were natural and well-motivated, and integrated them into grogix-the-programming-language.

Wednesday, September 30, 2009

Mux-Demux: Coherence in Massive Generative Grammars

In the previous post, Grogix, our language for describing an operational sequence of patterns ordered by importance, was given a language structure, "mux-demux", to provide notational equivalence to the multiple application of functions. But more is needed: the higher-order function, where each of the multiple applied functions contains further functions. Our issue is that we still want these sub-functions to be disentangled, unraveled and transparent, so their patterns cannot be hidden. That keeps them available for application globally across our system, so that true minimization of principles and patterns can be possible.

In order to facilitate "higher-order mux-demux", we're going to extend the .demux directive, and turn it into a production. From our previous example (note that templates have step-only local scope):

.step 11 : 'db section'

.template : 'db' : 'NAME'
<<<
Class NAME (db.model):
>>>

.demux : 'db declarations'
.nt: this
.dnt: 'db references'

.dmP: 'main page'
.t : db ( 'Main' )

...


So, in a later pattern/step, a further .demux reference to 'db references' will be applied to a spot after each of the .dmP results at this step.

This provides a kind of "multiplexing trunk", where demultiplexing of the original group is done at any number of steps in the morphogenesis of the system.

But we need two more things to complete our analogy with everyday class and functions use:

(1) A way to merge mux groups.
(2) A way to split mux groups.

(1) is easy, and 'free' in my implementation, requiring no new syntax. We simply use the .demux target in more than one .mux statement.

(2) requires new syntax. We attach 'directed non-terminal' .dnt directives to .dmP's, in a .demux block, not just to .mP's in a .mux block.

.dmP: 'main page'
.t : db ( 'Main' )
.dnt : 'page db declarations'


This tags the production for a later .demux: 'page db declarations' directive ... other .dmP's at this level can be so tagged, splitting the mux group, allowing for differentiation of streams of otherwise similar elements. Clearly something like this is done in nature ... and we certainly need it in order to reduce notation, while maintaining global transparency.

Rethinking the fundamental aspects of functions

A Class is a "minimizing" tool, a shorthand for otherwise impossibly repetitive, incomprehensible and error-prone declarations of methods and data.

Functions fall into the same category, as do global data declarations: with them, we create a necessary shorthand. We resolve nagging details within functions, so we can move on, and think larger thoughts. Functions reflect the compression we perform naturally, while we think, and while we express our thoughts.

There is, however, a problem with large, intertwined libraries, frameworks and cultures of shorthand.

One problem is that, in applying existing Libraries and Functions to a new problem, or a new set of goals, you can't find the important features of your new set in a natural way ... natural either to the problem at hand or to your deepest thoughts about it. Instead, you need to think of it in terms of the machine, and the culture of using the machine represented by the Libraries and Functions, Classes and Methods ...

What you want to do, to quote Plato (from the frontispiece of Christopher Alexander's "Notes on a Synthesis of Form"):

" ... the separation of the Idea into parts, by dividing it at the joints, as nature directs, not breaking any limb in half as a bad carver might ..."


I have always felt that, while there are important ideas hidden in these libraries, they have a tendency to "chop up" the problem you're working on in a most unnatural way. Some are a better fit to your problem than others, to be sure. But there is something fundamentally wrong about how we apply this existing work to our own problems.

Mathematicians have understood the difference for centuries: the application of known results to new problems is either done well, and naturally, unfolding beautifully for the reader of the proof, or else it is hacked together: possibly correct, but unreadable. There needs to be a plot, a gradient of applied resolutions bearing down on the problem.

What I want to do with Grogix, is make it easy to create this gradient. But to do so, we need to think differently about Functions and Classes.

Functions are shorthand, and collected together make a kind of fantasy world. But they represent the fantasy world of an operating computer ... so there is utility in cooperating with it. Yes, the fantasy world collapses when you hit it with a sledgehammer. But it also contacts a billion people when you use it well.

The RNA->DNA->protein mechanisms of nature could also be called a fantasy logical system ... boil a cell, and the machinery disappears. But there is great power in this machinery. In fact the point of the present work is to find a natural way to use an analogy to morphogenesis as a means of cooperating with computing systems.

Let's look at one particular aspect of functions: functions that call functions, or higher-order functions. In Object-Oriented Design, these are methods that call other methods or classes. They are pervasive in computing. Every programmer uses them constantly.

One of the consequences of higher-order functions is a kind of cascading effect, or a "tree of applied functions" that are called anytime one is used.

But say we want this tree to be explicit? Say I'd like every program I write to "show its work", and also to "minimize all higher-order functions" by eliminating buried functional similarities?

Why would I want to do this? So we could write more reliable, robust, and natural programs. If we can disentangle the trees of higher-order functions, and lay them side-by-side, and order them by principles, we'll have some real understanding and clarity in computing. Instead, I find that higher-order functions simply hide more, and more, and more, until we cannot move naturally towards our goals. It's like using the specific musical knowledge of, say, a bass player, to choreograph a ballet. It's good stuff, but it requires too many unique transformations to get where you want to go.

If you look at this first working Grogix example, ListItem.gx, you can see that the morphogenetic sequence for unfolding the program (steps 1 through 19) lays the principles side-by-side ... a complete cross-cutting, holistic application ... but that's easy when there's very little structural repetition, in such a simple web application.

So, let's say that we're creating a web application with multiple features that are organized nearly identically -- a simple example is a single dynamic web page with multiple update-able and extend-able content.

I've created a structure in Grogix which I call "mux/demux" to deal with repeated distribution across several steps in a sequence.

In one step, one defines the "multiplexer" production:

.step 6: 'Global structure'

.muxP : 'global structure'
.mP : 'main page'
.mP : 'news section'
.mP : 'news item'
.mP : 'link section'
.mP : 'link item'
.mP : 'user content section'
.mP : 'user content item'
.mP : 'group content section'
.mP : 'group content item'
.dnt : 'db declarations'
.dnt : 'handler declarations'
.dnt : 'handler map entries'
.dnt : 'template dictionaries'
.dnt : 'templates'
.muxP end

And then in five other "demultiplexing" steps, one applies the application-wide pattern ('db declarations' etc.) to each of the similar application sections:

.step 9: 'handler map entries'

.demux : 'handler map entries'

.template: '' : 'inside_string', 'routine_name' , KEY
<<<
('/inside_string/KEY', routine_name)
>>>

# add (.*)/ for KEY when you need it

.dmP : 'main page'
.t : typical_entry ('','Main','')
.t : ','

.dmP : 'news section'
.t : typical_entry ('news','News','')
.t : ','

.dmP : 'news item'
.t : typical_entry ('news_item','News_Item','')
.t : ','

...


Since the .dnt's = .steps = patterns from a pattern gradient, this effectively lets me program "holistically" ... I can turn a formerly functional application inside out, with generative grammar, so that the principles are applied explicitly wherever needed. I can add unique concerns to this step, and more than one demux, thereby concentrating the global "what" and "how" together for anyone who might subsequently read the program, wanting to understand the principles applied.

It also maintains a global minimalism. Which is nature's strength: the point of Grogix.
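
To summarize what the expansion amounts to in conventional terms, here is a rough Python sketch of a mux group being 'demuxed' by a later step's template. The names and the output format below are invented for illustration; they are not the .mux/.demux syntax itself:

# One named group of sections, defined once (the .mP entries of a .muxP block).
mux = {
    "global structure": [
        "main page", "news section", "news item",
        "link section", "link item",
    ]
}

def demux(group, template, values):
    # A later step applies its template once per member of the mux group.
    lines = []
    for member in mux[group]:
        inside, name, key = values[member]
        lines.append(template.format(inside=inside, name=name, key=key))
    return ",\n".join(lines)

handler_map = demux(
    "global structure",
    "('/{inside}{key}', {name})",
    {
        "main page":    ("", "Main", ""),
        "news section": ("news", "News", ""),
        "news item":    ("news_item", "News_Item", ""),
        "link section": ("links", "Links", ""),
        "link item":    ("link_item", "Link_Item", ""),
    },
)
print(handler_map)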

This isn't quite sufficient though. We've dealt only with multiple applications of functions, not with the higher-order functions. We need these too ... you could write a complex program without one, but you still need them to minimize the encoding of action. A good example is this multiple-web-section application. There will be further patterns to apply to all the "mux" sections, long after the first "demux".

And so, we need an arbitrary number of levels of "demuxing".

Essentially, we need to apply the original sets of .muxP production targets to any production with a non-terminal that appears in the first demux section. This maintains the global principle aspect, while allowing the power of higher-order functions and multiple application of functions.

I'll provide a full example in the next post.

Wednesday, September 9, 2009

Blooming Logic

bloominglogic is a work in progress ... one of my projects is an online tool, or webapp, for viewing, editing, and running Grogix sequences.

Grogix

grogix: n. a formal notation for describing holistic, integrative sequences, or recipes.

Groggy

groggy: adj. the feeling of having consumed grog, stimulating abstract thoughts.

Tuesday, August 25, 2009

Grog

grog: n. a drink made from an ad hoc recipe, i.e. based upon principles and intuition rather than exact measurement. Consequently, the possible variations are innumerable.